@E Simulation failed: SIGSEGV continues..

shashi · March 5, 2017

Hello there,

This time I have a very simple HLS design that takes in a 3D matrix, multiplies it by a constant, and outputs the result. It works perfectly for a [64][64][64] matrix, but not for a [128][128][128] matrix. Again I get a segmentation fault.

Can somebody help me understand what is fundamentally going wrong?

Thanks in advance for your help.

I have attached my design files here:

core.cpp

core.h

core_test.cpp

D@n · March 5, 2017

@shashi,

Your problem is a simple one: malloc is failing and returning a NULL pointer. Since you aren't checking for NULL, when you try to access memory following that pointer your system crashes.
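A minimal sketch of the NULL check described above. The attached core_test.cpp is not reproduced in this thread, so the helper names `alloc_volume` and `scale_in_place` and the `float` element type are assumptions for illustration only, not the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: allocate n floats on the heap. Returns nullptr (after
// logging to stderr) if the element count is unusable or malloc fails.
float *alloc_volume(std::size_t n) {
    // Reject zero and any count whose byte size would overflow size_t.
    if (n == 0 || n > static_cast<std::size_t>(-1) / sizeof(float)) {
        std::fprintf(stderr, "alloc_volume: bad element count %zu\n", n);
        return nullptr;
    }
    float *p = static_cast<float *>(std::malloc(n * sizeof(float)));
    if (p == nullptr) {
        std::fprintf(stderr, "alloc_volume: malloc failed for %zu floats\n", n);
    }
    return p;
}

// Multiply every element by a constant; the caller must pass a valid buffer.
void scale_in_place(float *p, std::size_t n, float k) {
    assert(p != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        p[i] *= k;
    }
}
```

A test bench would call `alloc_volume`, return early if the result is `nullptr`, and release the buffer with `std::free` once it is done.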

Dan

shashi · March 6, 2017

I modified the code. Still, the same error persists. Any idea where I am going wrong fundamentally?

core_test.cpp

D@n · March 6, 2017

@shashi,

Many systems, and particularly embedded ones, won't let you place that much data on the stack (i.e. the size of your main() local variables).

You might find if you chase this problem down, that you never get past that variable allocation--'cause you are just asking for too much memory.
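To put rough numbers on this, assuming 4-byte elements (the element type is not stated in the thread): a [64][64][64] array needs 1 MiB, while a [128][128][128] array needs 8 MiB, which is already the size of a common default stack limit on Linux. The sketch below computes those sizes and shows one possible alternative, a heap-backed buffer with 3D indexing. `cube_bytes` and `Volume` are hypothetical names, and this applies to the test bench only, not to the synthesized function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes needed for an n x n x n array of float (float is an assumption).
constexpr std::size_t cube_bytes(std::size_t n) {
    return n * n * n * sizeof(float);
}

// Hypothetical heap-backed 3D volume stored as one flat, zero-filled vector.
struct Volume {
    std::size_t nx, ny, nz;
    std::vector<float> data;

    Volume(std::size_t x, std::size_t y, std::size_t z)
        : nx(x), ny(y), nz(z), data(x * y * z, 0.0f) {}

    // Row-major indexing: k varies fastest, then j, then i.
    float &at(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < nx && j < ny && k < nz);
        return data[(i * ny + j) * nz + k];
    }
};
```

Declaring the arrays `static` would be another way to keep them off the stack; the vector is used here only because it reports allocation failure with an exception instead of a silent crash.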

Dan

P.S. @shashi ... are you sure this is how you want to solve this problem? I foresee you struggling and struggling to get HLS to accomplish your task. What you are asking of HLS might be beyond its capability, and other--more reliable but perhaps less intuitive--means exist for accomplishing your task.

shashi · March 6, 2017

Hi Dan, thanks for the reply.

Basically, I need to do operations on a huge matrix (say 128x100x360) using an FPGA as an accelerator. So I am laying the groundwork for that.

Is it achievable, or tough to realize? What other tool can I use to accomplish the same task on an FPGA?

Since I can easily implement for loops in HLS, I thought it would be a perfect fit for the problem that we are addressing.

What do you say?

Regards,

shashi

D@n · March 6, 2017

@shashi,

I would recommend VHDL or Verilog for your work. I would also recommend you use fixed point if at all possible. If not, handling floating point, while doable, will just be another headache added on to your already difficult task.

The problem with the loop you wish to implement in HLS is that you are requesting the FPGA do every element of the loop all on the same clock. You can't feed the FPGA with information fast enough to make that strategy viable. (Feel free to prove me wrong here.) For example, once, years ago, I built an FPGA FFT accelerator. A good FFT algorithm on an FPGA can produce one FFT output per clock, so what could go wrong? (Using your HLS approach, you might be trying to get every FFT bin out on every clock--a waste of resources.) In my case, it took so much time to get the information into the FPGA and back out again that the "accelerator" was actually slower than the CPU it was supposed to accelerate.

Hence, I would suggest you start your design with wherever the data is coming from and wherever the data is going to. That will limit the speed that the FPGA needs to operate at. Your goal should be to run the FPGA's computations at least that fast, and perhaps a touch faster, but not a million times faster (as your algorithm above would). Why not faster? $$. Sure, you could buy an FPGA board for $10k that might just be able to do your task, but why, when you don't need to?
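As a rough illustration of that sizing exercise, the sketch below estimates how long the link alone takes to move a data set and how many elements it can deliver per FPGA clock. Both function names are hypothetical, and every number fed to them (element size, link bandwidth, clock rate) is an assumption to be replaced with figures from the actual board. For instance, the 128x100x360 matrix mentioned earlier has 4,608,000 elements; with 4-byte values over an assumed 100 MB/s link, one pass in one direction takes about 0.18 s no matter how fast the FPGA computes.

```cpp
#include <cassert>
#include <cstddef>

// Seconds needed to move `elements` values of `bytes_per_element` bytes over
// a link sustaining `bytes_per_second`. All inputs are caller assumptions.
double transfer_seconds(std::size_t elements, std::size_t bytes_per_element,
                        double bytes_per_second) {
    assert(bytes_per_second > 0.0);
    return static_cast<double>(elements) *
           static_cast<double>(bytes_per_element) / bytes_per_second;
}

// Elements the link can deliver per FPGA clock cycle. A result near 1 means
// processing one element per clock already keeps up with the link.
double elements_per_clock(double bytes_per_second,
                          std::size_t bytes_per_element, double clock_hz) {
    assert(bytes_per_element > 0 && clock_hz > 0.0);
    return bytes_per_second /
           (static_cast<double>(bytes_per_element) * clock_hz);
}
```

If `elements_per_clock` comes out at or below one, a design that handles a single element per clock is already matched to the data source, and unrolling the whole loop buys nothing.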

The other reason for leaving HLS is that ... HLS doesn't force you to think like hardware. It hides these hardware details from you. Hardware doesn't work like software. It's really different. Everything happens at once in hardware--never one thing after another, but everything at the same time. To make things work, you really have to think differently. The problem you are struggling with now is that you are thinking like a programmer: sequentially (do this, then that). HLS will try to build your program, but ... not every sequential thought/task maps well to hardware. When something doesn't map right, if you aren't aware of what's going on under the hood, you will struggle to find/fix the bugs. HLS will complain, and you won't know why ... (As in ... your posts this last week.)

Hopefully this helps,

Dan

shashi · March 6, 2017

Thanks for that long reply.

What I understand is that HLS is not recommended for huge matrix operations that involve for loops.

But the same thing can be accomplished in hardware using HDL, right? So should I go ahead with HDL? Will it at least work for this task?

Thank you.

D@n · March 6, 2017

@shashi,

Yes, the same thing can be accomplished using HDL. HDL offers "for" loops as well. They work nothing like for loops in C, despite looking nearly identical. Remember, everything happens on an FPGA all at the same time. In other words, the HDL uses the for loop to create N copies of the logic within it, all of which will operate on the same clock. It's a very different thing from anything you might be familiar with--if you've never programmed hardware before.

I think your bigger problem at this point, though, is not your matrix multiplies, but how you will get your data in and out of your FPGA. That will then dictate the type of matrix multiplication solution and speed available to you.

Dan
