D@n

Members
  • Content count

    1455
  • Days Won

    112

Everything posted by D@n

  1. Waveforms Linux install?

    @RTC, You mean waveforms 2015, right? Digilent used to offer a product called just "waveforms" which isn't what you'd want to install. I've downloaded waveforms 2015 and installed it on Ubuntu 16. The trick was that you needed to use apt-get to install the .deb file. Don't forget to install the adept utilities and adept runtime as well. Dan
  2. Matrix reception module in vhdl

    @cristian_zanetti, The link(s) above show you how to build a readi function that you can use to read from the FPGA just like a memcpy, and an fwritei that you can use to write to your FPGA just like a memcpy. Both are very similar to an fread(). Dan
  3. Matrix reception module in vhdl

    @cristian_zanetti, Is this the sort of thing you are looking for? Something with a host interface that looks something like this in C++? I put a series of articles together discussing how to build it. While the code is all in Verilog, the articles break it down far enough that, in my most humble of opinions, you should be able to rebuild the interface in VHDL should you wish to do so. Let me know, Dan
  4. Pmod OLED

    @hananson, Having made a PMod OLEDrgb run from an FPGA, I think you'll find the documentation you need in a couple of places. Perhaps the first and primary place is in the SSD1331 datasheet. This will tell you what all of the commands are that can be issued to the device, and what those commands do. This wasn't enough for me, however, since the initialization sequence of the SSD1331 is not trivial. I also needed to examine the MPIDE code for the PModOLEDrgb to get that part up and running. You can see my project here. The Verilog portion of the OLED controller is here. The controller responds to the wishbone bus, so you might find that controller kind of sparse regarding what you need to know. That wishbone bus is controlled by the ZipCPU, rather than microBlaze, and you can find the software for a ZipCPU OLEDrgb demo here. For the most part, the only thing I ever did was copy images to the OLEDrgb using a DMA controller, so I haven't gotten so deep as to write text to the device (yet), but I may do so in the future. Hopefully this helps. If not, let me know what more you need. Dan
  5. Buttons bounce

    For all of those who have never dealt with contact bounce, I just finished measuring the response of pushing a whole slew of buttons: those on the Arty, the CMod-S6, the PMod-Keypad, and even on an icoBoard. It was a fun project. Here's an image showing one of the responses I came across. I'm hoping to post the others soon, but I thought this was just too fun not to share. Dan
  6. Non-clocked synchronous circuits

    @inflector, You might find these techniques more reliable than a logic generated clock for handling timing. Of course, judging by the PModCLP post ... you probably know most of this already, but I thought I might just point it out anyway. (There's probably someone reading this that doesn't know a better answer, so ... that's what the post is for--to keep me from repeating myself.) Dan
  7. Non-clocked synchronous circuits

    @inflector, My apologies, it looks like I missed the thrust of your article initially. A logic "clock" is defined as the result of some logic calculation being used as the posedge or negedge in an always block declaration. Further, the "logic" generated by clocked logic will be valid some time *after* the clock edge. This means that the timing of this "logic clock" will be separate and distinct from the clock used to generate it--rendering the always block a part of a new clock domain. My rule for beginners has always been never to use logic clocks. Several responses to this post included a hearty discussion of both logic clocks and synchronous vs asynchronous resets. You can read those comments here. I've looked long and hard to find Xilinx's advice on this. I think the last advice I found from their staff was, "We're all adults here. If you know what you are doing, and you know what that means in hardware, go for it." @Piasa has actually done a nice job of summarizing some of the problems. 1) While flip-flops might only change on a clock, the logic between flip-flops may take some time to settle. If you use any of this "in-between" logic as a clock to a flip-flop, you may find the clock switches erratically, and that some of your following logic may see these extraneous clock flips while some might not. In the end, you get unpredictable results. 2) Logic clocks are not synchronized with the clock of the logic that creates them. They will always be delayed, and this delay will change across operating conditions. As a result, you'll need to cross clock domains when moving from the logic clock back to your main clock. This usually costs several cycles, so it would slow down the CPU. If you'd like to read some more on the issue, Wikipedia has a nice discussion of metastability, and Clifford Cummings has written a nice paper about what clock domains are, why they are important, and how to cross between them. 
While I'm quite proud of my own post on the topic, it's not nearly as extensive of a discussion as Cummings provides. Hope this helps. Please write back if not. Dan
  8. Non-clocked synchronous circuits

    @inflector Oh, and about that posedge thing on something other than a clock .... don't do it. You'll get deep into so many headaches ... it's just not worth it. Dan
  9. Non-clocked synchronous circuits

    @inflector, I see I've gotten you thinking ... that's quite a compliment, so thank you. In general, my purpose with the ZipCPU has not been to reduce the power usage. I'm not sure I could compare with such a giant as TI's MSP430 which, if I understand correctly, was rebuilt using special technology within its gates and logic just to keep it low power. On an FPGA, you're sort of stuck there. So ... where else might you look? As I recall, power cost is measured by every changing logic level within your design. (I'm not the expert, so perhaps someone might correct me here.) I think, last I looked, it was something like a constant related to the capacitance of the wire times the number of logic levels transitioning times the core voltage squared, or some such. This helps to highlight that the faster your clock rate (i.e. the faster things transition) the higher your power will be. Hence, if your goal is lower power, you will need to 1) fix the inputs so the logic doesn't cause any changes, or 2) drop the clock rate. I've thought about fixing the inputs for the wishbone bus. This would have a lot of consequences downstream, as many of the peripherals (memory included) respond in the fashion you notice above--even when they aren't selected. Fixing inputs would minimize transitions of any of this logic--especially when the bus isn't being referenced. You can see how I've drafted this change here. (Look for the ZERO_ON_IDLE define) I've also thought about creating a sleep state for the CPU that actually suspends the clock and so also lowers power usage. The CPU would then enter this sleep state any time a user-space program issued a WAIT(for interrupt) instruction. However, you'd have to be careful about how you did this, since the peripherals that then generate the interrupt would need to continue to be clocked. Further, you'd want to guarantee that the clock would always run if there was an interrupt pending, etc. 
(I think modern PCs do this with some of their co-processors: SSE, MMX, etc, when these aren't used in order to keep their overall power down. They might even actually power down the voltage rails for those portions of the chips as well.) I think that answers your first question. To your second, well .... sort of. The logic within the ALU is always executed, but the results are only stored if i_ce is high. You might argue this means that the logic is being calculated for both i_ce and !i_ce, but the !i_ce wires are already present and their values already calculated (with the one exception). If you are interested in low power, Michael Keating, et al, wrote a text book on the topic titled the "Low Power Methodology Manual: for System on a Chip Design." You might wish to look this up. However, having looked over it, my initial review of it was that it applies more to ASIC chip development than to FPGA design--but I think you may still find the principles valuable for understanding what is going on. Dan
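The power relationship recalled in the post above is usually written as the standard CMOS dynamic-power approximation. The symbols here are the conventional textbook ones and are not taken from the post: \alpha is the activity factor (the fraction of the capacitance that switches on each clock), C is the total switched capacitance, V_core is the core voltage, and f is the clock frequency.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{\text{core}}^{2} \, f
```

Read this way, the two options in the post map onto two terms: holding bus inputs fixed when idle lowers \alpha, while dropping or suspending the clock lowers f.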
  10. Difference between BRAM, DRAm and DMA

    @gcp Yes. BRAM is located within the FPGA fabric (PL), rather than on the PS. Further, most FPGAs have BRAMs within them as well. Dan
  11. PmodCLS - missing character on SPI

    @krzysiekch, Looks like I used CPOL=1, CPHA=0 -- assuming I've got the terminology right. Be aware, though, I always "bit-banged" the port. You can see my software here, if you'd like. Perhaps that'd help. (In the example software, the GPOSETV() macro sets a pin high, whereas GPOCLRV() sets the pin low.) Dan
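For readers unsure what CPOL=1, CPHA=0 looks like on the wire, here is a minimal simulation sketch in Python. It is not the linked software: the function names, the (cs_n, sck, mosi) tuple layout, and the MSB-first bit order are all assumptions made for illustration. It only models the pin sequence: the clock idles high, the select line drops before the clock does, and the receiver samples on each falling edge.

```python
def bitbang_byte(value):
    """Pin states (cs_n, sck, mosi) for one byte with CPOL=1, CPHA=0.

    Hypothetical helper: it models the wire sequence only and touches
    no real GPIO registers.
    """
    trace = [(1, 1, 0),                 # idle: select high, clock high
             (0, 1, 0)]                 # select drops first, clock still high
    for shift in range(7, -1, -1):      # MSB first (assumed bit order)
        bit = (value >> shift) & 1
        trace.append((0, 1, bit))       # clock high: present the next bit
        trace.append((0, 0, bit))       # falling edge: receiver samples here
    trace.append((0, 1, 0))             # return the clock to its idle level
    trace.append((1, 1, 0))             # deselect
    return trace


def sample_on_falling_edges(trace):
    """Decode a trace the way a CPOL=1, CPHA=0 receiver would."""
    value = 0
    for (_, prev_sck, _), (cs_n, sck, mosi) in zip(trace, trace[1:]):
        if cs_n == 0 and prev_sck == 1 and sck == 0:
            value = (value << 1) | mosi
    return value


if __name__ == "__main__":
    for cs_n, sck, mosi in bitbang_byte(0xA5):
        print(cs_n, sck, mosi)
```

On real hardware each tuple would become a pair of pin writes with a suitable delay between them.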
  12. PmodCLS - missing character on SPI

    @krzysiekch, Why is your clock idling low? Shouldn't it idle high in CPOL=1 mode any time the select line is high as well? I also thought the clock had to drop after the select line did. I'm not sure if this makes a difference with the CLS or not ... it's just the way I've used it successfully. Dan
  13. Nexys Video HDMI Capabilities

    @Flux, I'm going to violate the "never do math in public" rule here, but let's see if I can get this right: A 1080p60 video stream contains 8 bits per color for 3 colors, for a total of 24 bits per pixel. In order to facilitate further processing, we'll store this in 32 bits. Further, there are 1920x1080 pixels per frame, and 60 frames per second, so storing this information requires storing 32 bits * 1920 * 1080 * 60 Hz or 3.98 Gbps. Reading this information to display it would also require the same bandwidth, or another 3.98 Gbps from memory. The Nexys Video memory can handle eight 16-bit transactions per 100MHz clock, or 12.8Gbps. There's a lot of latency involved, though ... on the order of 24 clocks (I can't remember the exact number--Xilinx messed up on the performance of their MIG controller). You also have to account for the system taking the SDRAM down for maintenance, so figure you can get about 12Gbps or so. That means that two 1080p60 data streams, one going to memory and one coming from memory, will consume 66% of your memory bandwidth. If you choose to fracture those transactions by having a CPU work on them via some sort of random access, you could easily and quickly consume the last of this bandwidth. My point is simply this: think through your entire design as you plan it out. As with any FPGA design, the devil lurks in the details. Dan
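The arithmetic in the post above can be checked with a short Python sketch. All figures come from the post itself, including the rough 12 Gbps usable-bandwidth estimate; the variable and function names are made up for illustration.

```python
# Figures taken from the post; names are illustrative only.
WIDTH, HEIGHT, FPS = 1920, 1080, 60
BITS_STORED_PER_PIXEL = 32          # 24 bits of color padded to 32


def stream_bps(width=WIDTH, height=HEIGHT, fps=FPS,
               bits=BITS_STORED_PER_PIXEL):
    """Bits per second needed to move one video stream through memory."""
    return bits * width * height * fps


peak_bps = 8 * 16 * 100_000_000     # eight 16-bit transfers per 100 MHz clock
usable_bps = 12_000_000_000         # the post's estimate after refresh overhead

one_stream = stream_bps()
two_streams = 2 * one_stream        # one stream written, one stream read
utilization = two_streams / usable_bps

if __name__ == "__main__":
    print(f"one stream : {one_stream / 1e9:.2f} Gbps")
    print(f"peak memory: {peak_bps / 1e9:.1f} Gbps")
    print(f"two streams: {utilization:.0%} of usable bandwidth")
```

Changing WIDTH, HEIGHT, or FPS shows how quickly other video modes eat into the same budget.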
  14. Nexys Video HDMI Capabilities

    @Flux, I have 1080p60 working on my desktop using the Nexys Video, both transmit and receive. The limitation I'm running into so far isn't a video rate bottleneck, but rather a memory bandwidth bottleneck. Dan
  15. PmodCLS - missing character on SPI

    @krzysiekch, Try setting it high as part of your initialization sequence, and only lowering it to send a single character. This is how I've interacted with the PModCLS quite successfully. Dan
  16. PmodCLS - missing character on SPI

    @krzysiekch, How are you controlling the SPI CS_n pin? That should help you frame your data properly. Dan
  17. Basys3 sine wave simulation

    As a follow-up to some of my previous comments, I've now posted some articles describing much of how to do this on ZipCPU.com. One topic I've addressed includes how to generate a sine wave. The examples provided are generic Verilog solutions, so they should work across all versions of Vivado. (As well as across FPGA vendors.) Examples include how to calculate a sine wave from a table, how to calculate a sine-wave from a quarter wave table, how to generate a sine wave using a CORDIC, how to tell if these algorithms work, etc. Another topic discussed includes how to build a generic FIR filter in Verilog. I'm still working on articles showing how to test such a filter. I'd also like to post about how to build a filter that operates on data that comes in every N clocks--I just haven't gotten that far yet. Dan
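To illustrate the quarter-wave table idea mentioned above, here is a small Python model. It is not the Verilog from the articles. The table length, the half-sample phase offset, and the use of floating point instead of fixed point are all assumptions chosen to keep the sketch short. Only the first quadrant is stored, and the other three are recovered by mirroring the index and negating the result.

```python
import math

N = 256                 # samples per full cycle (assumed; must be a multiple of 4)
QUARTER = N // 4

# First-quadrant samples only. The half-sample offset lets the table mirror
# cleanly without storing the zero crossing or the peak twice.
TABLE = [math.sin(2 * math.pi * (i + 0.5) / N) for i in range(QUARTER)]


def quarter_wave_sin(phase):
    """Approximate sin(2*pi*(phase + 0.5)/N) using only TABLE."""
    phase %= N
    quadrant, idx = divmod(phase, QUARTER)
    if quadrant & 1:                    # quadrants 1 and 3 read the table backwards
        idx = QUARTER - 1 - idx
    value = TABLE[idx]
    return -value if quadrant >= 2 else value   # second half-cycle is negated


if __name__ == "__main__":
    for p in (0, 63, 64, 128, 255):
        print(p, round(quarter_wave_sin(p), 6))
```

In hardware the same index mirroring and sign flip cost a few gates, while the table shrinks to a quarter of the full-wave size.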
  18. Arty hello world

    Have you found the link to the master Arty XDC file from the Arty-resource page? You should be able to copy that file, change pin names, comment out any pins you aren't using, and then use that file for any of your projects. Dan
  19. USB uart bridge

    @aitan, Ah, got it now. Check out this (Linux) example. It sets the port for 1MBaud, but if you change the B1000000 number to B9600, you should get 9600 Baud. I use the program to forward a UART port onto a TCP/IP connection. You might find that useful. If not, the example setup within it should be valuable for you. If you are on a Windows machine, I think the setup is similar although probably not identical. Dan
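The baud-rate change described above comes down to replacing one speed constant in the port's termios settings. The linked example is not reproduced here; the following is a separate minimal sketch using Python's standard termios module on Linux. The helper names and the device path in the comment are hypothetical.

```python
import termios


def with_baud(attrs, speed):
    """Return a copy of a tcgetattr() attribute list with both speeds replaced.

    The list layout is [iflag, oflag, cflag, lflag, ispeed, ospeed, cc].
    """
    new_attrs = list(attrs)
    new_attrs[4] = speed    # input speed
    new_attrs[5] = speed    # output speed
    return new_attrs


def set_baud(fd, speed=termios.B9600):
    """Apply a new line speed to an already-open serial port descriptor."""
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_baud(attrs, speed))


# Usage sketch (the device path is an assumption, adjust for your system):
#   import os
#   fd = os.open("/dev/ttyUSB1", os.O_RDWR | os.O_NOCTTY)
#   set_baud(fd, termios.B9600)
```

A real forwarding program would also put the port into raw mode so that control characters pass through untouched.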
  20. USB uart bridge

    @aitan, You mean ... how fast the USB side of the bridge runs? That's given by USB speeds, which are (fairly) fixed. However, if you are interested in more details, as I recall the bridge is based upon the FT2232H chip. You can find its specification here. Dan
  21. USB uart bridge

    @aitan, I've been universally successful with baud rates up to 1MBaud. On some devices, I've gone up to 4MBaud. The spec says the chip can get up to 12MBaud, but I've never gotten the baud rate quite that fast. Dan
  22. @pratikto.sulthoni.h, Since you asked, I'll answer. However, be aware, my answer comes from a design that doesn't have any microblaze within it. I use a ZipCPU instead. This also means that I've never followed the tutorial above. I instantiate my block RAM with a Verilog memory module. Vivado infers the block RAM from there. The MIG fails because you are using block RAM. The MIG is designed for SDRAM--something you don't have. Don't worry --- you don't need MIG to do this. Is it possible to just save the ELF file to QSPI flash? Yes. I used to do this for many of my projects. In the end, I adjusted my linker script to place a part of my program within the flash that would then later be placed into RAM--the CPU was just too slow otherwise. (16 flash clocks to fetch a 32-bit instruction from flash, 50MHz flash clock, so 32 clocks at 10ns each per CPU instruction ... you get the picture.) I also connect the flash to a wishbone bus, and then command that bus externally. For the Basys3, I used a TCP/IP port to command an internal bus within the Basys3, a bus that had a flash device connected, and then I just used software to read/write the flash from there. My solution to the Basys3 problem doesn't use any of the microblaze tools, or any proprietary flash loader other than the initial configuration loader. Once I have an initial configuration, I then use that to write my program to the flash and then I command the FPGA to start running the CPU. As a second step, if you want the CPU to start automatically, you can write a configuration to flash that will do that--once you know where to place the CPU instructions into the flash so that they don't sit on top of the FPGA configuration. You might wish to keep an eye on your resource utilization. A good high speed MicroBlaze CPU with all its peripherals might just consume your entire board. Something to think about. Dan
  23. Vivado is slow - speedups?

    @ntm, Is this "normal"? Not at all. The number you cite above is actually pretty fast. There are settings that you can adjust. From my standpoint, the defaults are usually already set to run pretty fast. Hence, you can slow things down if you want to play with the parameters. Is there anything else you can do? Yes. Take your Verilog design and run it through Verilator with the -cc and -Wall options. That will often find any mistakes in your design in less time than it takes for Vivado to even start looking for syntax errors. Even better, if you choose to, you may find that you can find problems using a Verilator based simulation faster than it takes Vivado to complete. Dan
  24. implementation of a filter in an FPGA

    @cristian_zanetti, While I've watched others build code with Matlab, I've never done so myself. Neither do I use VHDL. That said, you can find descriptions of how to implement fairly generic filters in Verilog here. That should go over most of the fundamentals. If the Matlab code is creating VHDL, it should be simple enough to compare to in order to understand what's going on. Dan
  25. FFT problems doesn't finish never

    @neha, This sounds fairly off topic for the current question, so I might suggest you start a new topic. To use the IP catalog IPs, click on the window menu, go to "IP Catalog", double click on the IP you are interested in and then configure it for your use. You can then use this IP in your system. As I recall, Vivado will create a stub for you, showing you all the details you need to know in order to reference it within your design. As for other FFT types of questions, check @mohamed shffat's other forum requests. He has some examples within them of how he's hooked up the FFT, as well as a discussion of how doing so was right (or wrong) and what should be done differently. I'd still recommend starting a new topic, though. Dan