Everything posted by D@n

  1. D@n

    Has anyone ported GNU Radio to a zynq development board

     @miner_tom, Last I recall, the universal software radio peripheral (USRP) that would feed GNU Radio Companion was built around a Zynq. That would handle downconversion, initial filtering, and rate selection. You should be able to look this code up online. It was quite public. Dan
  2. @zygot, Last time I checked, such a capability was controlled under ITAR. Therefore, you won't find me discussing anything that good here. Dan
  3. @xc6lx45, I must not have been clear. In the multiple-system test, I just counted clocks between PPS events--there was no official "loop". Alternatively, you might say that the "loop" had infinite bandwidth. I expected some noise. I didn't expect it to be correlated. A noisy PPS might cause the clocks to appear correlated, as might other conditions (common power supply, temperature, etc.). The earlier post above, where I revealed < 1us performance, discusses the results of a proper loop filter. Dan
  4. The presentation was comparing Verilator 4.0 with the big vendors' simulators. 4.0 is a brand new release, so if you tried Verilator a while ago, it's probably 5x faster now. Dan
  5. Did you see the recent presentation showing that Verilator can beat the big vendor simulators in speed by 5x or more? Fun stuff, Dan
  6. For one of my projects I did something very similar years ago. I had four clock sources and one PPS source. I then counted the number of clock ticks between PPS events. (A rough sketch of such a counter follows below.) Much to my surprise, all four clocks adjusted their speed together over the course of several minutes, suggesting that the PPS was .. less than perfect. In the end, I think I knew less about what was going on than when I started: "The man with two watches never knows what time it is." Dan
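     A minimal Verilog sketch of that tick-counting idea, for anyone who wants to try it. The module and signal names (ppscount, i_clk, i_pps) are placeholders, not taken from the original design; you'd instantiate one copy per clock source, all fed from the same PPS.

         module ppscount(i_clk, i_pps, o_count, o_valid);
             input  wire         i_clk, i_pps;
             output reg  [31:0]  o_count;  // Ticks counted over the last second
             output reg          o_valid;  // One-clock strobe: o_count just updated

             // Bring the asynchronous PPS into this clock domain,
             // then detect its rising edge
             reg [2:0]  pps_pipe;
             always @(posedge i_clk)
                 pps_pipe <= { pps_pipe[1:0], i_pps };
             wire pps_edge = (pps_pipe[1] && !pps_pipe[2]);

             // Count local clock ticks between successive PPS edges
             reg [31:0]  counter;
             always @(posedge i_clk)
             if (pps_edge)
             begin
                 o_count <= counter + 1;   // +1 accounts for the edge cycle itself
                 o_valid <= 1'b1;
                 counter <= 0;
             end else begin
                 counter <= counter + 1;
                 o_valid <= 1'b0;
             end
         endmodule

     Comparing o_count across the four clock domains, second by second, reproduces the experiment described above.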
  7. D@n

    Hi! I'm new here

     @Austin01, Welcome to the forums! There are many types of engineers here. I tend to work on FPGA's myself. If that's your bent, then let me invite you to visit the ZipCPU blog. Dan
  8. @zygot, The project @hamster mentions above isn't really all that hard to do--I have one of my own. I use mine to calculate the absolute time of a disciplined counter within my own design. In my case, I've measured my Basys3 oscillator's frequency against the PPS to be slightly lower than 100MHz, but memory escapes me regarding just how close it is. (It was better than 100ppm as I recall, but that's being conservative.) Such a counter could easily be multiplied by a frequency and then used as a phase index for a GPS-synchronized audio output, as @hamster suggests.

     While such an output would be better than a tuning fork, that doesn't really answer your question above. Properly answering the question would require a well-disciplined OCXO or better--something I don't have. (I'm not even sure I could properly engineer the power rails for such an oscillator ...)

     Measuring time or frequency accuracy is tricky, especially since truth can be so hard to come by. After getting some counsel, I discovered that it's often done by comparing a timing measure against itself some time later. In my case, I compared my counter against the PPS one second later to see how close I was. Given my measures, I could regularly predict the next PPS within about half a microsecond or better. This was a bit of a surprise for me, since I was hoping for 10ns or better, but without better tooling I can't tell whether it's the PPS or the local clock that's responsible for not doing better. (The simulation achieved much better than 10ns resolution ...) I suspect the local on-board oscillator. (For scale, some quick arithmetic on those numbers follows below.) Dan
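     For scale, here's the half-microsecond figure worked out against the numbers in the post (my own arithmetic, not from the original measurement). At a nominal 100MHz clock,

         0.5\,\mu\mathrm{s} \times 100\,\mathrm{MHz} = 50 \text{ clock ticks}, \qquad \frac{0.5\,\mu\mathrm{s}}{1\,\mathrm{s}} = 5\times10^{-7} \approx 0.5\,\mathrm{ppm}

     so predicting the next PPS to within half a microsecond means a residual second-to-second wander of roughly 50 ticks, or about 0.5ppm, while the hoped-for 10ns target would have been a single clock tick.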
  9. @zygot, I'm still listening to your preaching, but not getting it. I just don't believe in Voodoo logic design. As background, I've helped the SymbioticEDA team build a PNR algorithm for iCE40 FPGA's, and I'm also peripherally aware of Google sponsoring similar work for the Xilinx chips as well. (Both are part of the nextpnr project.) This was how I knew to comment about registered I/O's. There are also metastability problems--problems that simulation won't necessarily reveal, post-PNR or not. (Well, you might get lucky ... The usual two-flop guard against these is sketched below.) I trust you've been around long enough to avoid these.

     My point is, having looked into the internals of the various FPGA's, and having written and debugged PNR algorithms, I'm still looking for an example of a design that passes a logic simulation, a timing check, and yet fails a post-PNR simulation. My interest is to know if there's something that needs to be done within PNR to keep this from happening. Do you have such an example that you can share? (Other than latches--you have shared about those, but we already know that latches are bad.) Even better, if so, can you explain the underlying phenomenology that caused the problem? Dan
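     For readers following along, the standard guard against that metastability hazard is a two-flop synchronizer on any asynchronous input. A generic sketch, with placeholder names; the ASYNC_REG attribute is Xilinx-specific and is only one way to constrain these registers:

         module sync2(
             input  wire i_clk,
             input  wire i_async,   // signal from another clock domain or a pin
             output wire o_sync     // safe to use in i_clk's domain
         );
             // Two back-to-back registers bound the metastability risk;
             // they don't remove the need for a false-path style constraint.
             (* ASYNC_REG = "TRUE" *) reg [1:0] pipe;
             always @(posedge i_clk)
                 pipe <= { pipe[0], i_async };
             assign o_sync = pipe[1];
         endmodule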
  10. @zygot, I've only ever seen a couple of bugs where one placed solution would work and another one that passes the same timing requirements would not. One bug was fixed by registering the outputs of the FPGA in the I/O elements. The same can be applied to the inputs: register them as soon as they enter the chip, in the I/O element. I've also been counseled to output clocks via ODDRs or OSERDES's. Together, these approaches have kept problems from cropping up in a PNR-dependent fashion. (Rough Verilog for both fixes is sketched below.) Are you unable to use these solutions? Or do these approaches not apply to your designs? Dan
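     In case the shape of those two fixes isn't familiar, here's roughly what they look like in Verilog for a Xilinx 7-series part. The IOB attribute and the ODDR primitive are Xilinx-specific; the module and signal names are placeholders, not from any particular design.

         module ioregs(
             input  wire  i_clk,
             input  wire  i_pin,            // raw input pad
             input  wire  i_next_output,    // value to drive on the next clock
             // Ask the tools to place these two registers in the I/O elements,
             // so pad timing no longer depends on placement
             (* IOB = "TRUE" *) output reg  o_pin,          // registered output pad
             (* IOB = "TRUE" *) output reg  o_pin_sampled,  // input, captured at the pad
             output wire  o_forwarded_clk   // clock copy, sent off chip via an ODDR
         );
             always @(posedge i_clk)
             begin
                 o_pin         <= i_next_output;
                 o_pin_sampled <= i_pin;
             end

             // Forward the clock through an ODDR rather than routing the
             // clock net to a plain output pin
             ODDR #(
                 .DDR_CLK_EDGE("SAME_EDGE"), .INIT(1'b0), .SRTYPE("SYNC")
             ) fwd_clock (
                 .Q(o_forwarded_clk), .C(i_clk), .CE(1'b1),
                 .D1(1'b1), .D2(1'b0),   // toggles once per i_clk period
                 .R(1'b0), .S(1'b0)
             );
         endmodule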
  11. @zygot, I must be missing something. You've described a post-place-and-route simulation, but you haven't quite said why it was required. Shouldn't this simulation provide identical results to the pre-place-and-route simulation if the design meets timing? Can you share an example, from your own experience, of a time when a design met timing but failed the post-place-and-route simulation? That's question #1. Question #2: running a full simulation of any design can be quite costly. For a CPU this might mean starting the CPU, going through the bootloader, and then running whatever program follows. For a video system, it might mean working through several frames of video. When you use this post-place-and-route simulation, do you go through this much effort in simulation, or can you cut corners anywhere? Still skeptical, Dan
  12. D@n

    working of pipelined FFT architecture

     @farhanazneen, That task cannot be done without a reference implementation. There is no "theoretical" latency value without a hardware implementation: one clock per memory access, one clock per multiply, this schedule for operations, etc. However, if you use a reference implementation, then the result is no longer "theoretical" but rather "as applied". If you wish to use the Xilinx core as a reference implementation, then start a timer when the first sample is sent into the FFT and stop it when the first valid sample comes out of the FFT. (A sketch of such a timer is below.) Dan
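     A rough Verilog sketch of that timer, to be dropped alongside the FFT instance. The AXI-stream style handshake names (s_axis_data_tvalid/tready in, m_axis_data_tvalid out) are an assumption modeled on the Xilinx FFT core's interface; adapt them to whatever core you're measuring.

         // Count clocks from the first sample accepted by the FFT until
         // the first valid output sample appears
         reg [31:0]  latency;
         reg         started, done;
         always @(posedge i_clk)
         if (i_reset)
             { started, done, latency } <= 0;
         else begin
             if (s_axis_data_tvalid && s_axis_data_tready)
                 started <= 1'b1;           // first input sample has gone in
             if (started && !done)
                 latency <= latency + 1;    // still waiting on the output
             if (m_axis_data_tvalid)
                 done <= 1'b1;              // first output sample has appeared
             // Once done is set, 'latency' holds the core's fill delay,
             // give or take a clock
         end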
  13. D@n

    working of pipelined FFT architecture

     @farhanazneen, The FFT I just pointed you at *is* a radix-2 FFT. An FFT can be pipelined and either radix-2 or radix-4 (or radix-8 and higher--but no one does that). It can also be a block FFT that is radix-2 or radix-4. The big difference between radix-2 and radix-4 is the number of inputs (and outputs) to the butterfly. A radix-2 butterfly consumes two inputs and produces two outputs. A radix-4 butterfly consumes four inputs and produces four outputs.

     If you follow that math, for the first stage of an N-point FFT using a decimation-in-frequency approach, a radix-4 algorithm will need to store the incoming values into memory until it has values k, k+N/4, k+N/2, and k+3*N/4, for k from 0 to N/4-1. The butterflies will then only operate for 1/4 of the time, and need to wait for inputs the other 3/4. Similarly, the FFT will produce four outputs at once; while one can move on to the next stage, the other three will need to go into a memory. Hence, your memory requirements for this stage will go up from N block RAM points to 2N, although this new stage will now accomplish the work of two of the radix-2 stages.

     As for delays ... aside from filling memories, I'm not sure: I've never built a radix-4 FFT butterfly in HDL (yet). I'm not sure how I'd go about handling the three complex multiplies required. Right now, for my radix-2 FFT, I only have to deal with one complex multiply, which I can then turn into three real multiplies (the identity is written out below). With a radix-4 butterfly, does that mean I'd be using 12 real multiplies? Or would those 12 somehow need to be multiplexed to share DSP hardware? I'm not sure--I've never built one.

     Normally, you just accept the delay of the FFT in your code. Why are you so concerned about the delay? May I ask what application you are trying to solve? Dan
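     For completeness, here is the standard three-real-multiply form of a single complex multiply mentioned above (a textbook identity, not code lifted from the FFT core):

         (a+jb)(c+jd) = (ac-bd) + j(ad+bc)

         m_1 = c\,(a+b), \qquad m_2 = a\,(d-c), \qquad m_3 = b\,(c+d)

         \mathrm{Re} = m_1 - m_3 = ac-bd, \qquad \mathrm{Im} = m_1 + m_2 = ad+bc

     So each complex multiply costs three real multiplies plus a few adds. By that accounting (my own arithmetic), the three twiddle multiplies in a radix-4 butterfly come to nine real multiplies, or twelve with the schoolbook four-multiply form--the trade the post is weighing against sharing DSP hardware.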
  14. D@n

    VHDL BASYS3 internal clock problems.

     @zygot, We are way off topic, and yet I'd love to hear a reason (story) illustrating why timing simulation, as you have described it, is essential. Perhaps we could take this to a new topic/post? Dan
  15. D@n

    VHDL BASYS3 internal clock problems.

     @zygot, I've spent my time doing logic simulation, and not so much timing simulation. While I can simulate (logically) any logic running even at multiple clock rates, I've never gotten into the analog side of how/when transitions actually take place. In my opinion, I haven't needed it. Maybe there's something I'm missing. That's not to say there's no use for it--I just don't feel like I've needed it. As to your first question, Verilator simulates sequential (i.e. clocked) logic, and it does so very well. While it will also do asynchronous logic, it doesn't necessarily model any timing delays in that process. Dan
  16. D@n

    VHDL BASYS3 internal clock problems.

     @Tickstart, I just watched a presentation at ORCONF comparing (two unnamed vendor simulators) with the new (open source) Verilator 4.0. The speed improvement was over 5x in Verilator's favor. Even better, if Verilator's "simulator" doesn't appear to be working, it's just C++ code. I've debugged my design within that C++ code before, usually only to discover some obscure reality of HDL that the simulator got right but that I had just misunderstood. Even better, it's easy to integrate co-simulation into a Verilator based design. Consider this design's use of co-simulation to create an on-screen image (i.e. window) of the VGA output of the given design. You sure you want to use VHDL? You are missing out on this wonderful Verilog-only capability.

     As for debugging your code ... Consider this UART 16550 design. It's been used by the OpenRISC team for years. The authors desk-checked it over and over, yet never quite realized that the transmitter could be made to send a byte that wasn't in the FIFO, due to a race condition between the reset and the FIFO read request. (The design tests whether the FIFO is empty and moves to the read-FIFO state; the FIFO is then reset; the design then "reads" from an empty FIFO, getting random garbage.) The conditions that set this bug off aren't reliable--it's a race condition. It will only happen if you reset the transmitter at just the right time. Chances are that anyone who noticed a weird byte getting sent would never manage to track it down. Typical debugging depends upon reproducibility. Race conditions are rarely reproducible.

     While I highly recommend bench testing at the module level, this design was bench tested. Yet the author never found this race condition via a simple bench test. I don't blame the author, since I also struggle to imagine all of the potential bugs I might come across when building a bench test. I also highly recommend integrated simulation. I've found lots of bugs through simulation. The bug the OP wrote about would've revealed itself after enough clocks' worth of simulation. However, you may or may not find bugs like this UART 16550 one through simulation.

     I found this particular UART transmitter bug with less than 30 minutes of staring at the core using formal methods. It was quick, and certain--once I asserted that the FIFO should not be empty upon any read. (The assertion is sketched below.) The free formal tool, SymbiYosys, works on Verilog files and finds bugs quickly and efficiently. I know of no free formal verification tool for VHDL. Again, are you sure you want to keep using VHDL? Just sayin', Dan
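     That assertion, roughly as it would appear in a Verilog core verified with SymbiYosys. The signal names here are placeholders, not the 16550 core's actual names:

         `ifdef FORMAL
             // The transmitter must never pop the FIFO while it is empty
             always @(posedge i_clk)
             if (fifo_read_request)
                 assert(!fifo_empty);
         `endif

     The solver then searches for any reachable sequence of inputs (reset included) that violates the property--exactly the kind of rare reset-versus-read race a bench test is unlikely to stumble into.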
  17. D@n

    VHDL BASYS3 internal clock problems.

     @donwazonesko, May I ask how you are verifying your code? Is the textbook you are following, or your course syllabus/professor/whatever, teaching you what you need to debug your code? I ask this because, from a simple survey, I'm finding that very few textbooks tell users how to find bugs in code like this. The right answer to your problem isn't the bug that you found, but the process you are missing for finding bugs.

     You should have a test bench of some type, or a simulation, or a formal model, that will produce a trace with all of your wires and their values/meanings within it. This trace will lead you directly to this type of bug. I'm more of a Verilog developer, so I use SymbiYosys for generating traces from formal specifications, and Verilator for my simulation. I've also got a means for pulling a trace from a live board as a last resort. For VHDL, there's a simulation program called ghdl (it's not nearly as good as Verilator) that should work similarly for you. Xilinx also provides a simulation capability, as well as an internal logic analysis capability.

     Let me take this moment to *HIGHLY ENCOURAGE* you to learn how to use the tools you have at your fingertips to avoid this problem in the future. Why? Your logic will only get more complex, and more difficult to debug. The sooner you learn how to handle the tough problems, the sooner you'll be on the road to being a successful developer. Dan
  18. D@n

    working of pipelined FFT architecture

     @farhanazneen, If you want to know "how" an FFT works, don't look at Xilinx's implementation. That's a trade secret and you aren't likely to get that answer. On the other hand, you might find the answers you are looking for by examining a similar open source FFT implementation, such as this one. It's actually a full FFT core generator, so if you don't like the example 2k FFT found in the rtl/ directory, feel free to rebuild it for the size you want.

     An FFT consists of a series of "stages" that implement "butterflies". In this implementation, based around a decimation-in-frequency approach, those stages operate on samples k and k+N/2, then on k and k+N/4, then on k and k+N/8, etc. You can see the code for each stage here, or even the top-level FFT that connects the FFT stages together here. The last two stages are special, but only because they can be implemented without any shifts or adds.

     Now, to your question: since each stage operates on two elements at a time, and since these elements are separated by 2^(stage_number-1) elements, each of these stages needs a memory equivalent to 2^(stage_number-1). Values can be initially stored into this memory. Once the memory is full, the next 2^(stage_number-1) elements plus the memory's saved values can go directly into the butterfly. Hence, the butterfly starts operating at this point. There's also a memory storage requirement at the output of the butterfly, since only the first of the two values can move forward immediately: the second value has to wait until all the first values have passed, etc. If you were to count this, there'd be 1 register for the 2-pt FFT stage, 2 registers for the 4-pt stage, 4 registers for the 8-pt stage, 8 for the 16-pt stage, and pretty soon Vivado will start using block RAMs: 16 for the 32-pt stage, 32 for the 64-pt stage, etc. If you want your FFT output in natural order, you'll also need to do a bit reversal stage. The way I do this, there's one buffer filling and one emptying at every time step, hence you have 2N sample buffers for N points.

     Now let's back up and talk about delay. For an N-point FFT, there's a delay of N/2 clocks to fill the butterfly for the first stage, plus about 3-4 clocks for the butterfly (I'm not counting--could be a bit more). The same would be true for the next stage, save that it would now be N/4 + (about) 4 clocks, then N/8 + 4 clocks, etc. The delay through the bit reversal is likely to be a full N clocks. (The rough total is summed up below.)

     All that said, I'd trust the FFT output to tell you when the first block was complete over these calculations above. I know that for my own FFT, these delays can vary significantly from one set of FFT parameters (size, bit width, etc.) to the next. Dan
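     Adding those per-stage figures up (a back-of-the-envelope sum using the numbers above, with c standing in for the roughly four-clock butterfly latency per stage), the fill delay of an N-point FFT with a natural-order (bit-reversed) output buffer comes to roughly:

         \sum_{s=1}^{\log_2 N}\left(\frac{N}{2^{s}} + c\right) + N \;=\; (N-1) + c\,\log_2 N + N \;\approx\; 2N + c\,\log_2 N \text{ clocks}

     which is consistent with the advice above: for real numbers, trust the core's own output-valid signal rather than the arithmetic.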
  19. D@n

    USB HID (Nexys Video, etc)

     @Bobo, Yes, you can access the UART ports of the USB-UART connection directly. Be careful: the naming convention isn't always intuitive (i.e., RX might reference the non-FPGA side of the link). Dan
  20. D@n

    BASYS3 - pushbutton creating another vector out of switches

     @donwazonesko, Please tell me you've debounced those buttons before coming into the code you just showed me ... there's more than one student who's been burned by not debouncing buttons when doing an exercise like this. You might also want to add an edge detect to them as well. Something like: button_press = (button & !last_button); last_button = button; or some such. (A fuller sketch combining debouncing and edge detection follows below.) Finally, if this is your first project, then let me recommend you find a good simulator and get familiar with it. While I don't use VHDL (much), I've been told that ghdl is a good simulator that you can use. It'll make your project so much easier than just debugging on your board itself, and it'll also prepare you for your next (bigger) project, where you will absolutely need it. Dan
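     A minimal Verilog sketch of that debounce-plus-edge-detect combination, in case it helps. The parameter value and names are placeholders; the same structure translates directly into VHDL for use with ghdl.

         module debounce_edge #(
             parameter STABLE_CLOCKS = 100_000   // about 1ms at 100MHz
         ) (
             input  wire  i_clk,
             input  wire  i_button,   // raw, bouncy button input
             output reg   o_press     // one-clock pulse per button press
         );
             // Synchronize the asynchronous button into i_clk's domain
             reg [1:0]  sync;
             always @(posedge i_clk)
                 sync <= { sync[0], i_button };

             // Debounce: only accept a new level once it has held steady
             reg         debounced, last_debounced;
             reg [$clog2(STABLE_CLOCKS):0]  stable_count;
             always @(posedge i_clk)
             if (sync[1] != debounced)
             begin
                 if (stable_count >= STABLE_CLOCKS)
                 begin
                     debounced    <= sync[1];
                     stable_count <= 0;
                 end else
                     stable_count <= stable_count + 1;
             end else
                 stable_count <= 0;

             // Edge detect: button_press = (button & !last_button)
             always @(posedge i_clk)
             begin
                 last_debounced <= debounced;
                 o_press        <= (debounced && !last_debounced);
             end
         endmodule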
  21. D@n

    [NEED HELP] Fast Connection FPGA-PC

     @raultricking, One item I haven't seen mentioned yet is to use a MEX file to help with the Matlab ingest. Basically, Matlab allows you to compile and build your own Matlab functions in C/C++ to get the speed of the lower-level interface without needing to fight with the parser. The resulting file is a MEX file that Matlab will run as its own command. Even better, using this approach you can still maintain access to all of your favorite Matlab internals. Hence, if the FPGA side of your interface works but not the Matlab side, then this is where I'd dig to find the problem. Following @zygot's advice, though, I wouldn't start here, but I might end up here. Dan
  22. D@n

    USB HID (Nexys Video, etc)

    @Bobo, You haven't said which FPGA board you are using, although in this case I doubt it matters. I do know this: Digilent only advertises the mouse and keyboard functionality of its USB HID devices. I would be surprised if any further functionality were supported. You might just manage to get lucky, but I'd be surprised. I also know that the USB wires feed the auxiliary microcontroller and not the FPGA. There's no way to get access to them within the FPGA without doing some soldering. Should you choose to do some soldering, tinyFPGA has posted some USB processing code you might consider examining. I was surprised, reading through it, to discover that USB processing within an FPGA was actually possible--even though it is quite complex. Dan
  23. D@n

    BASYS3 - pushbutton creating another vector out of switches

  23. @donwazonesko, I must be missing something. What's the difficulty with creating signals from pushbuttons? I mean, I'm aware of the problems with bouncing, but that's just annoying--it can be dealt with easily enough. To what are you referring? Dan
  24. I've been posting quite a few "tales from the trenches" on ZipCPU.com. Perhaps you've seen some of them? Here's an article telling of some of the times I've gotten stuck. Another article discusses reasons why simulation might not match the hardware implementation. You can read here about one of the uglier bugs I've come across, or here about all the bugs I found, and have since fixed using formal methods, in the ZipCPU. Another fun one was a student's response to one of the articles I'd written, telling his own tale from the trenches. In general, though, I embed most of my tales-from-the-trenches stories into topical articles. For example, here's a fun recent one on how to build an SPI controller--including a discussion of how I had to work through a timing issue with the final SB_IO pin controller in an ODDR type of mode. (That vendor doesn't call it an ODDR, but that's the mode the I/O was placed into.) Perhaps that'll get the discussion going for you? Dan
  25. D@n

    Hi! I'm new here

     @coldfiremc, Welcome to the forums! I've also got a Nexys Video that I'm working with. Indeed, my weekend project was to teach it to display an audio spectrogram. I managed to get an HDMI simulation working, and the design looks nice in the simulation. Feel free to ask questions about the Nexys Video in the FPGA forum, or the embedded systems forum within it. I tend to answer only the Verilog/RTL types of questions, leaving the SDK and Vivado schematic-based design methods to the paid staff. I'll also be glad to answer any questions you might have about formal verification--but I'll wait 'til someone asks. Oh, let me also invite you to browse my own blog. Perhaps you might find something there of interest. Dan