shashi

Members
  • Content Count

    27

Reputation Activity

  1. Like
    shashi reacted to [email protected] in @E Simulation failed: SIGSEGV continues..   
    @shashi,
    I would recommend VHDL or Verilog for your work.  I would also recommend you use fixed point if at all possible.  If not, handling floating point while doable will just be another headache added on to your already difficult task.  Doable, but a challenge.
    The problem with the loop you wish to implement in HLS is that you are requesting the FPGA do every element of the loop all at the same clock.  You can't feed the FPGA with information fast enough to make that strategy viable.  (Feel free to prove me wrong here.)  For example, once, years ago, I built an FPGA FFT accelerator.  A good FFT algorithm on an FPGA can produce one FFT output per clock, so what could go wrong?  (Using your HLS approach, you might be trying to get every FFT bin out on every clock--a waste of resources.)  In my case, it took so much time to get the information into the FPGA and back out again that the "accelerator" was actually slower than the CPU it was supposed to accelerate.
    Hence, I would suggest you start your design with wherever the data is coming from and wherever the data is going to.  That will limit the speed that the FPGA needs to operate at.  Your goal should be to run the FPGA's computations at least that fast, and perhaps a touch faster, but not a million times faster (as your algorithm above would require).  Why not faster?  $$.  Sure, you could buy an FPGA board for $10k that might just be able to do your task, but why, when you don't need to?
    The other reason for leaving HLS is that ... HLS doesn't force you to think like hardware.  It hides these hardware details from you.  Hardware doesn't work like software.  It's really different.  Everything happens at once in hardware--never one thing after another, but everything at the same time.  To make things work, you really have to think differently.  The problem you are struggling with now is that you are thinking like a programmer: sequentially (do this, then that).  HLS will try to build your program, but ... not every sequential thought/task maps well to hardware. When something doesn't map right, if you aren't aware of what's going on under the hood, you will struggle to find/fix the bugs.  HLS will complain, and you won't know why ... (As in ... your posts this last week.)
    Hopefully this helps,
    Dan
  2. Like
    shashi reacted to [email protected] in @E Simulation failed: SIGSEGV continues..   
    @shashi,
    Many systems, and particularly embedded ones, won't let you place that much data on the stack (i.e. the size of your main() local variables).
    You might find if you chase this problem down, that you never get past that variable allocation--'cause you are just asking for too much memory.
    Dan
    P.S.  @shashi ... are you sure this is how you want to solve this problem?  I foresee you struggling and struggling to get HLS to accomplish your task.  What you are asking of HLS might be beyond its capability, and other--more reliable but perhaps less intuitive--means exist for accomplishing your task.
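One way to act on the stack-size point above is to move the large buffer out of main()'s local variables and into static storage, where the linker places it rather than the stack. This is only a minimal sketch: the name image_buf, the helper functions, and the 1024x1024 dimensions are assumptions for illustration and do not come from the original code.

```c
#include <stddef.h>

#define IMG_ROWS 1024  /* hypothetical dimensions, for illustration only */
#define IMG_COLS 1024

/* Static storage: lives in .bss, not on the stack.
 * As a local variable in main() this would need about 4 MB of stack. */
static float image_buf[IMG_ROWS][IMG_COLS];

/* Fill the buffer with one value and return the number of elements written. */
size_t fill_image(float value)
{
    size_t count = 0;
    for (size_t r = 0; r < IMG_ROWS; r++) {
        for (size_t c = 0; c < IMG_COLS; c++) {
            image_buf[r][c] = value;
            count++;
        }
    }
    return count;
}

/* Read back one element so a caller can check the contents. */
float image_at(size_t r, size_t c)
{
    return image_buf[r][c];
}
```

Whether the static buffer fits still depends on how much memory the linker script makes available, so this moves the limit rather than removing it.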
  3. Like
    shashi reacted to [email protected] in @E Simulation failed: SIGSEGV continues..   
    @shashi,
    Your problem is a simple one: malloc is failing and returning a NULL pointer.  Since you aren't checking for NULL, when you try to access memory following that pointer your system crashes.
    Dan
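The NULL check described above might look like the following. It is a sketch only: alloc_floats is a hypothetical helper name, not something from the original program.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate room for n floats.
 * Returns NULL (after printing a message) instead of letting the
 * caller dereference a bad pointer. */
float *alloc_floats(size_t n)
{
    /* Reject zero and guard against overflow in n * sizeof(float). */
    if (n == 0 || n > (size_t)-1 / sizeof(float)) {
        fprintf(stderr, "alloc_floats: invalid element count\n");
        return NULL;
    }

    float *p = (float *)malloc(n * sizeof *p);
    if (p == NULL) {
        fprintf(stderr, "alloc_floats: malloc of %zu floats failed\n", n);
    }
    return p;
}
```

The caller still has to test the returned pointer before using it; the helper only makes the failure visible at the point where it happens.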
  4. Like
    shashi reacted to jamey.hicks in File system in vivado SDK to run on ZED board   
    I don't use the Xilinx SDK because I prefer to run Ubuntu or Android on zedboard, zybo, etc. Ubuntu works really well because so many prebuilt packages are available and it's so familiar to developers.
    With either of these, filesystem access is available as usual.
  5. Like
    shashi reacted to [email protected] in c simulation error in HLS   
    @shashi,
    Gosh, HLS?  Ouch.  I am by no means an expert on HLS.  You might wish to turn around and ask your question in the Xilinx forums instead.  My own personal experience with HLS is only counseling people who've been let down by it.
    That said, I do see some problems with your HLS synthesis that might be part of what you are struggling with.  I figure these are problems simply because, were I the HLS program, I would struggle with these.
    For one, and to my knowledge, HLS doesn't give you access to the DDR3 SDRAM.  Sorry.  I know you've been asking about it, and I guess I just didn't realize the task you were trying to accomplish.  Those big huge arrays you have?  HLS would implement those in on-chip block RAM, a very limited resource.  Anything connected to SDRAM needs to be connected to an AXI4 bus, and it will only ever be able to read so many values at once--never more than 64-128 bits IIRC depending upon the size of your DDR3 SDRAM data width.
    And double precision floating point math?  Ok, sure, the FPGA can do it, but ... that's gonna cost you.  Whether or not HLS can do it, well, that's a different question--and a good one for the Xilinx forums.  I do know that while an FPGA can do floating point math, it will cost several clock cycles to do.  I'm not sure how well HLS can handle something that will cross multiple clock cycles.
    Are you trying to build an interpolator to display an image or some such in real user time?  While it sounds like a fun idea, and a very good use of an FPGA, it's not an easy task to accomplish.  Were I to build such an interpolator, I wouldn't go beyond doing a single set of inputs at any given time with HLS--although I'm not sure how I would adjust HLS to only use 6 inputs at a time.
    Gosh ... I might even want to do a bi-quadratic interpolator (it sounds like a fun challenge!), and send the results either to a VGA-type display or to display memory.  Now that would be a fun challenge of a problem, but I would never start that problem with HLS.  I just wouldn't trust HLS to handle the difficult parts of the problem properly.  I'd probably try to push back against the floating point requirement, to see if I couldn't do it in fixed point--floating point is just *really* hard.  It's not that it cannot be done, just that ... it's hard.
    Hope that helps,
    Dan
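To give a feel for what "doing it in fixed point" can mean, here is a small sketch of linear interpolation in a signed Q16.16 format. It is deliberately simpler than the bi-quadratic interpolator mentioned above, and the Q16.16 format, the type name q16_16, and the helper names are all assumptions chosen for illustration.

```c
#include <stdint.h>

typedef int32_t q16_16;                 /* signed fixed point: 16 integer bits, 16 fraction bits */
#define Q_ONE ((q16_16)1 << 16)         /* the value 1.0 in Q16.16 */

/* Convert a small integer to Q16.16. */
static inline q16_16 q_from_int(int v)
{
    return (q16_16)(v * 65536);
}

/* Multiply two Q16.16 values through a 64-bit intermediate.
 * Note: right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift. */
static inline q16_16 q_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * (int64_t)b) >> 16);
}

/* Linear interpolation a + t*(b - a), with t expected in [0, Q_ONE]. */
static inline q16_16 q_lerp(q16_16 a, q16_16 b, q16_16 t)
{
    return a + q_mul(b - a, t);
}
```

For example, q_lerp(q_from_int(10), q_from_int(20), Q_ONE / 2) evaluates to q_from_int(15). Everything here is integer adds, multiplies, and shifts, which is why fixed point tends to map onto FPGA logic more cheaply than floating point.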
  6. Like
    shashi reacted to [email protected] in Storing float values in DDR3   
    @shashi,
    When you compile your program, make a "map" file.  This will tell you where all of the static things (code, global variables, etc) are located.  Using the GNU linker, a map file "something.map" is generated with the "-Map=something.map" option.
    But ... why not use a typical C construct like "malloc"?  As in
        #include <stdlib.h>
        // then later, within your code ...
        float *imagep;
        imagep = (float *)malloc(20*1024*1024); // 20 MB image
        // Now ... operate on imagep
    If you've set up your environment well enough, this should work.  Check the result of the malloc (imagep) against NULL to know whether or not you are finding this memory properly.
    Of course, you could also do it like ...
        float global_image[20*1024*1024];
        main(...) {
            ...
            global_image[index] = value;
            ...
        }
    In this latter case, you should be able to see global_image in your map file, and know where the linker placed it.
    Dan
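A quick way to cross-check the map file is to have the program report where the array ended up. The sketch below assumes a 4-byte float and sizes the array in elements so that it occupies 20 MiB; note that malloc(20*1024*1024) requests 20 MiB in bytes, whereas an array declared with 20*1024*1024 float elements would occupy four times that. The macro and function names are hypothetical. When linking through the gcc driver, the linker option is typically passed as -Wl,-Map=something.map.

```c
#include <stdio.h>

/* 5 Mi elements * 4 bytes each = 20 MiB, assuming sizeof(float) == 4. */
#define IMAGE_FLOATS (5u * 1024u * 1024u)

float global_image[IMAGE_FLOATS];

/* Size of the array in bytes, as the linker will account for it. */
size_t global_image_bytes(void)
{
    return sizeof global_image;
}

/* Print the run-time address so it can be compared with the
 * global_image entry in the linker map file. */
void print_global_image_address(void)
{
    printf("global_image at %p, %zu bytes\n",
           (void *)global_image, sizeof global_image);
}
```

On a bare-metal target without address translation, the printed address should match the map file entry directly; under an operating system with virtual memory it is a virtual address.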
  7. Like
    shashi reacted to [email protected] in problem in storing data into DDR3   
    @shashi,
    Are you trying to get floating point numbers into the DDR3 memory separate from and apart from your program?  As in, are you hoping to leave the ARM idle and to read/write these values from your programming logic (PL/RTL) configuration rather than the ARM/CPU/processing side?
    The method that would match what you have above is something more like:
        float global_data[] = { M_PI, 2.71828 };
    You can extend the array to any length you would like.  Link this with your program, create the ELF file with this in it, and you should be able to get just about any arbitrary sized table into memory.  You can get the address of the beginning of your data from the linker map file.
    Does this help?
    Dan
  8. Like
    shashi reacted to jpeyron in problem in storing data into DDR3   
    Hi @shashi,
    It appears that [email protected] answered your question in this thread. Please let us know if that is not the case.
    cheers,
    Jon
  9. Like
    shashi reacted to [email protected] in Storing float values in DDR3   
    @shashi,
    Ouch, that looks pretty painful!  Why not something simpler, such as:
        float *a = (float *)BASE_ADDRESS;
        *a = 3.14159;
    The thing to be aware of, though, is ... what else is using that DDR3 SDRAM?  Will you be overwriting memory already allocated to something else?  What about the memory management unit?  Do you have it enabled or disabled?
    Another fundamental question: are you running Linux?  Linux tends to use the memory management unit, and there are standard ways of getting access to the memory device under Linux that are device independent and can be looked up in many text books.
    Dan
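Under Linux, one commonly used way to reach a physical address from user space is to mmap() the /dev/mem device. The sketch below assumes that /dev/mem is enabled in the kernel, that the process has permission to open it (usually root), and that the physical address passed in is safe to touch; the function names are hypothetical, and the address is left to the caller.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round a physical address down to the start of its page.
 * page_size must be a power of two. */
uint64_t page_base(uint64_t phys_addr, uint64_t page_size)
{
    return phys_addr & ~(page_size - 1);
}

/* Map the page containing phys_addr and return a pointer to that
 * address, or NULL on any failure. phys_addr should be 4-byte aligned. */
volatile float *map_physical_float(uint64_t phys_addr)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return NULL;

    uint64_t page_size = (uint64_t)psz;
    uint64_t base = page_base(phys_addr, page_size);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *m = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, (off_t)base);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (m == MAP_FAILED)
        return NULL;

    return (volatile float *)((uint8_t *)m + (phys_addr - base));
}
```

The volatile qualifier keeps the compiler from caching or reordering away accesses to memory that something else (for example, programmable logic) may also be reading or writing.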
  10. Like
    shashi reacted to jpeyron in File system in vivado SDK to run on ZED board   
    Hi @shashi,
    Looking into this issue, I found a PDF here. Pages 25 and 26 discuss open() and close(), but say that file support is limited to stdin and stdout.  You would be able to use read() and write() where the fd is either 0 (standard input) or 1 (standard output). I believe you would need to use the AXI Quad SPI and the SD card to read a file. Could embedded Linux be a better solution to this issue? This looks like an example from Xilinx's GitHub that uses fopen() with embedded Linux.
    cheers,
    Jon
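As an illustration of using write() on file descriptor 1, here is a sketch written against the POSIX-style declarations in unistd.h. Whether the standalone BSP in the SDK exposes the same header and semantics is an assumption that should be checked against the PDF mentioned above; write_all is a hypothetical helper name.

```c
#include <stddef.h>
#include <unistd.h>

/* Write all n bytes of buf to fd, retrying after short writes.
 * Returns the number of bytes written, or -1 on error. */
long write_all(int fd, const char *buf, size_t n)
{
    size_t done = 0;
    while (done < n) {
        ssize_t w = write(fd, buf + done, n - done);
        if (w <= 0)
            return -1;  /* treat an error or a zero-length write as failure */
        done += (size_t)w;
    }
    return (long)done;
}
```

Calling write_all(1, "ok\n", 3) sends three bytes to standard output; the matching read() call on descriptor 0 would pull bytes from standard input in the same way.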
  11. Like
    shashi reacted to artvvb in Measuring the time of execution on FPGA, in vivado   
    @shashi
    And as for determining evaluation time, you create a timer by running a set of counters in parallel to the module/s your algorithm is implemented in, starting the timer when you start processing the image, and stopping when it has completed. From there, you can return the elapsed time (in clock cycles, or whatever format you want to convert it to) to your PC over the serial connection you are using for data. The answers to this question may provide a good starting point.
    Hope this helps,
    Arthur
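Once the counter values have been read back, turning them into an elapsed time is simple arithmetic. The sketch below assumes a free-running 32-bit counter sampled at the start and end of processing, and a clock frequency supplied by the caller; both function names are hypothetical.

```c
#include <stdint.h>

/* Cycles elapsed between two samples of a free-running 32-bit counter.
 * Unsigned subtraction gives the right answer across a single wraparound. */
uint32_t elapsed_cycles(uint32_t start, uint32_t stop)
{
    return stop - start;
}

/* Convert a cycle count to microseconds for a clock of clk_hz hertz. */
double cycles_to_us(uint32_t cycles, double clk_hz)
{
    return (double)cycles * 1e6 / clk_hz;
}
```

With a hypothetical 100 MHz clock, 100000000 cycles corresponds to one second. If the processing can run longer than one wrap of the counter (about 43 seconds at 100 MHz for 32 bits), a wider counter is needed.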
  12. Like
    shashi reacted to artvvb in Measuring the time of execution on FPGA, in vivado   
    @shashi
    Using the Zynq and SDK will certainly make it easier to initially implement something. I still have concerns about the speed of transmission, since the AXI bus between the Zynq and your IP core can be a bit slow, but you can try to get your project running first and then look for a faster way to run it. Try running through the Getting Started with Zynq Servers tutorial, as it should get you started.
    Hope this helps,
    Arthur
  13. Like
    shashi reacted to jpeyron in Measuring the time of execution on FPGA, in vivado   
    Hi Shashi,
    Here is a thread where I included an echo.txt file that should show you how to alter the echo.c file. Hope this helps!
    cheers,
    Jon
  14. Like
    shashi reacted to artvvb in Measuring the time of execution on FPGA, in vivado   
    @shashi
    Assuming that you are bringing the image onto your ZedBoard from a PC, your best option is probably to set up an Ethernet connection to your board. This tutorial was written for the Zybo, but should still apply. Once you have a server running, and can pass received data to your algorithm, you will need to get the image data into a format that you can send. You have a few options here: you could use some sort of script to communicate with the board (for Python, possibly this library, though that looks like a significant challenge), or you could parse the image data into a format that whatever serial terminal you are using could accept. For TeraTerm, you would need to write a program to convert your image data into a Tera Term Macro; reference information is here.
    Hope this helps,
    Arthur