Question
Newport_j
I am confused about something. I have worked through the Xilinx High-Level Synthesis tutorial (UG871, Dec. 2017 version), except for the last two chapters.
But something is not clear.
Let me give an example:
I have some very large C programs whose execution speed I would like to increase. They were programmed properly, in that each program is made up of subprograms that are, of course, much smaller.
The programs are written in C, and they compile and run as they should. However, when I profile a program, it is clear that the majority of the execution time is spent in only a few subprograms. These subprograms account for about 98% of the program's execution time. So out of, say, 80 subprograms in my program, only about eight are appropriate or suitable for high-level synthesis.
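To put a number on what that profile implies, here is a minimal sketch of an Amdahl's law estimate in C. The 98% figure is from my profile; the function names and the kernel speedup factors are hypothetical, chosen only for illustration.

```c
/* Amdahl's law: overall speedup when a fraction p of the run time
   is accelerated by a factor s and the remaining (1 - p) is untouched. */
double amdahl_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}

/* Upper bound on the overall speedup as s grows without limit:
   only the untouched fraction (1 - p) remains. */
double amdahl_max_speedup(double p)
{
    return 1.0 / (1.0 - p);
}
```

Calling `amdahl_speedup(0.98, 10.0)` gives roughly 8.5x overall and `amdahl_speedup(0.98, 100.0)` roughly 33.6x, while `amdahl_max_speedup(0.98)` caps the gain at 50x. This estimate ignores any time spent moving data to and from the accelerator.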
In other words, the majority of the code is untouched and certainly not synthesized to Verilog or VHDL.
I should say that I have been programming for GPUs for a long time, and it is almost always the case that only parts of a big program are appropriate for modification. The majority of the code in a program that uses the GPU to speed up execution is untouched. It is just C code.
I am seeing the same thing now that I have started high-level synthesis programming. However, there seems to be a disconnect here.
In GPU programming we are always worrying about the bandwidth of the CPU-GPU bus: how much data can pass between the CPU and the GPU, and how fast it will pass.
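The kind of back-of-the-envelope check I mean can be sketched in C as follows. The function names and all numbers are hypothetical, and the model assumes the transfer and the computation do not overlap.

```c
/* Time in seconds to move `bytes` over a link that sustains
   `bytes_per_s` (hypothetical figure for whatever bus is used). */
double transfer_seconds(double bytes, double bytes_per_s)
{
    return bytes / bytes_per_s;
}

/* Net speedup of offloading one subprogram: CPU time divided by
   accelerator compute time plus the time to move its data.
   Assumes no overlap between transfer and compute. */
double offload_speedup(double cpu_s, double kernel_s,
                       double bytes_moved, double bytes_per_s)
{
    return cpu_s / (kernel_s + transfer_seconds(bytes_moved, bytes_per_s));
}
```

For example, a subprogram that takes 1.0 s on the CPU and 0.1 s on the accelerator, but needs 1 GB moved over a 10 GB/s link, nets `offload_speedup(1.0, 0.1, 1e9, 1e10)`, which is 5x rather than 10x.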
I see nothing like this in high-level synthesis. I am not even sure how the FPGA is interfaced with the computer. I am guessing it uses a USB cable or a PCI Express connection, but I am really not sure.
That is the reason for this post.
As far as I have gone through it, the Xilinx HLS tutorial never discusses this aspect of the process: interfacing with the main program. I assume that it is there; I just have not seen it.
For instance, how does one integrate a translated C subprogram into the rest of the program? The tutorial just seems to be very silent on this matter.
I said above that I have not done the last two chapters of the HLS tutorial in UG871.
I think these last two chapters may cover the area I am asking about. I am interested in your thoughts, since I really want to speed up a large, complex program, if for no other reason than to justify this expense to my sponsors.
I have done a lot of high-level synthesis this year, but I am seeking some practical answers to these questions.
Sorry about the long-winded post, but I just want to use HLS to speed up programs that are already running; they just are not running fast enough.
Any help appreciated. Thanks in advance.
Respectfully,
Newport_j