D@n last won the day on August 29

D@n had the most liked content!

About D@n

  • Rank
    Prolific Poster

Profile Information

  • Gender
    Not Telling
  • Interests
    Building a resource efficient CPU, the ZipCPU!

  1. D@n

    Genesys 2 SD card slot

    @DanK, I've been using this code for working with SD Cards recently. So far, I've been successful reading and writing files. While I've used the reset, and while I start the design with the SD card in reset, I'm not quite certain that it actually resets the card like it's supposed to. I had some test failures along the way to getting this working that suggest that the reset didn't truly pull power from the card as well as removing the card from the board did. Dan
  2. Seriously? You didn't look at the material I offered then. Allow me to post it here, at the expense of any readability, since the following pictures were taken from the documentation I referenced above.
     Once you fire up the Memory Interface Generator IP product guide, it will lead you through a series of dialog boxes used to configure the core.
     Step one: Create a new design. I like to use the AXI interface for my designs. There is another interface available that I have yet to find sufficient documentation for.
     Step two: Skip picking any pin-compatible FPGAs--you'll only need to support the one that was already selected when you created your project.
     Step three: Specify that you want a DDR3 SDRAM configuration.
     Step four: Identify your chip, clock rate, voltage, data width, and any number of bank machines that you want. Do note that you cannot go faster than about 325MHz with this board--the chip won't allow it. Likewise, you can't go slower than about 300MHz since the DDR3 SDRAM spec doesn't allow that. Since this is rather high speed, and due to limitations of the Xilinx chip on the Arty, you'll need to use a 4:1 controller. Note that this is grayed out because the chip won't support the alternative 2:1 controller at this speed. The MIG controller will generate this 325MHz clock from the incoming clock you give it (100MHz), and give you an output clock which you can use for your design at 325/4 = 81.25MHz. Depending on whether or not your controller can handle out-of-order returns, you may wish to select "Normal" or "STRICT" ordering. I used Normal when this was written. I've since rewritten my AXI controller for strict ordering--it's just simpler and easier to deal with.
     You can then choose what data width you want for your AXI bus. 128 bits is the natural data width for this memory. Anything else will bloat the logic used for your memory controller with a data width converter.
     Depending on how you choose to generate your AXI requests, you may or may not want narrow burst support. (Xilinx's AXI example doesn't support narrow bursts, although they are part of the AXI spec.) This would also bloat your design, and I'm not sure it's really necessary. I have written an article on how to handle narrow burst support.
     Moving on, AXI allows out-of-order transactions. My original design used out-of-order support and allowed every transaction to have a different AXI ID and thus to get returned in a different order. It was a pain to deal with. If you want to do this though, an ID width of 5 with unique request IDs per beat will allow the controller unconstrained freedom to reorder your requests. If you choose to use strict ordering, you don't need an ID width any wider than the rest of your system requires.
     Step six, where you were asking your question, is spelled out here as well: a 100MHz input clock period works just fine. It will be multiplied up to 1.3GHz and then divided back down to 325MHz to support the clock rate we selected earlier. Read burst type really depends upon how your CPU's cache wishes to access memory. If you don't care, just pick sequential--it's easier to optimize for. Output driver and RTT impedance come straight from the Digilent documentation. This board has a Chip Select pin on its DDR3 interface, so you'll need to enable that here. I also like Row-Bank-Column ordering, since it's easier to guarantee 100% throughput, although I have no idea if the MIG controller will guarantee non-stop throughput across banks. Perhaps I'm just superstitious that way.
     Step seven, clock inputs: Since you will need a 200MHz reference clock (this isn't optional, nor is the frequency up to the user), and the only place you can get that on this board is from the 100MHz input clock, your only option is to select "No Buffer" for both of these clocks and to run your 100MHz through a PLL. You can then send the outputs of that PLL to the MIG.
     Picking 100MHz gives you greater flexibility for the rest of the clocks coming from that PLL. Reset type is up to you; I'm using Active Low. You won't need the debugging signals--if everything works well. (I have had problems before, but solved them without the debugging signals.) The board also provides you with an incoming voltage reference. I/O power reduction is the default, but not something I've experimented with. Unless you want to feed the core with the local temperature yourself, you'll need to enable the on-board XADC and give the MIG control over it.
     Step eight: You'll want to select the internal termination impedance of 50 Ohms.
     Step nine: Since Digilent already built this board, you'll want to select a "fixed pin out"--the pins are already laid out and selected for you.
     Step 10: This one's a real pain if you don't want to reference my design as a place to start from, but ... you'll need to identify each of the external wires to the MIG for layout. You can reference my design and the UCF file within it to make this part easy, or you can painstakingly dig for that information. As I recall, I found this information from the Arty project file.
     Step 11: The final step I took was to make the system reset, calibration complete, and error signals "No connect"--that keeps them from being assigned I/O buffers if they aren't directly connected to pins. You can still use these signals within your design if you want.
     Do be aware that the latency of the MIG controller is around 20+ clocks. It works ideally for high-throughput designs of 128 bits per beat, such as a cache might generate, but it's going to be a speed bump in any more random memory access pattern.
     Now, that's an awfully long post to answer a basic question which would likely have followups, but it does show in gory detail how you can set up the MIG. If you want more detail, the project itself is an example design which you can reference--or not. Take your pick.
Oh, and Yes, this information was in the project's documentation which I referenced in my first response. Dan
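The clock arithmetic in the walkthrough above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in the post (100MHz in, 1.3GHz internally, 325MHz memory clock, 4:1 controller). The function name `mig_clocks` and the multiply/divide values of 13 and 4 are assumptions inferred from those quoted frequencies, not settings read out of the MIG itself.

```python
from fractions import Fraction

def mig_clocks(input_mhz=100, multiply=13, divide=4, controller_ratio=4):
    """Return (internal_mhz, memory_mhz, user_mhz) as exact fractions.

    multiply and divide are hypothetical PLL settings chosen only so that
    the result matches the 1.3GHz and 325MHz figures quoted in the post.
    """
    internal = Fraction(input_mhz) * multiply   # 100 MHz * 13 = 1300 MHz
    memory = internal / divide                  # 1300 MHz / 4 = 325 MHz
    user = memory / controller_ratio            # 4:1 controller: 325 / 4
    return internal, memory, user

internal, memory, user = mig_clocks()

# The post gives roughly 300MHz to 325MHz as the usable memory clock range.
in_range = 300 <= memory <= 325

print(float(internal), float(memory), float(user), in_range)
```

Using `Fraction` keeps 81.25 exact, which makes it easy to try other input clocks without chasing floating-point rounding.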
  3. @binry, You can see all the settings I used here. Do take note, though, that most of those who use the Arty platform tend to use the Vivado schematic editor. I'm one of the few that does not. Yes, I choose not to use buffers for the clocks specifically because they come from internal MMCM/PLLs. The other option is for clocks coming directly from pins into the MIG core. In those cases, the MIG will attach I/O buffers to the clock pins--something that can only be done to input pins before any other logic is applied. Since I needed the 200MHz clock, I went with the PLL first--forcing the MIG clocks to avoid instantiating I/O buffers lest the design not be able to match to the Xilinx hardware. I also feed the 100MHz incoming clock, after passing it through a PLL, directly into the MIG--so the incoming system clock rate is 100MHz rather than 166.67MHz. This is the value in the IP catalog setting, as linked above. This is separate from what the core wants as a "reference clock"--that needs to be 200MHz. The MIG then produces its own clock, which I then use as the system clock throughout my design. Dan
  4. @binry,
     • I usually feed the MIG controller with the 100MHz clock as the system reference clock if possible. The MIG will handle dividing this as appropriate.
     • The reference clock must be at 200MHz. This is to support the IO delay controller within the chip, which only accepts a 200MHz clock.
     • Yes, you can connect the 100MHz system clock into the MIG.
     • The MIG will generate a reset that you can use--based off of both when the PLLs settle and when its internal calibration is complete.
     • I'm not familiar with the example project you cite. My own example Arty A7 project doesn't use the traffic generator at all.
     Dan
  5. D@n

    Advanced topics

    For CDC, consider these articles on 1) basic CDCs, 2) formally verifying asynchronous designs, and 3) Asynchronous FIFOs. For speed optimizations, you'll need to learn about pipelining. Here's an article on pipeline control. Dan
  6. Welcome to the forums! I'm an FPGA enthusiast myself, known for my blog. Feel free to check it out and let me know what you think of it. Dan
  7. @Kenny, If you are an FPGA beginner, then ... I would start somewhere else. I would recommend starting by learning how to debug FPGA designs. The problem with FPGA development is that, unlike software, you have little to no insight into what's going on within the FPGA. There are a couple keys to success:
     • Being able to ensure your design works as intended before placing it onto the FPGA. Simulation and formal methods work wonders for this task. Unlike debugging on hardware, both of these approaches to debugging offer you the ability to investigate every wire/signal/connection within your design for bugs. If you are unfamiliar with these tools, then I would recommend my own tutorial on the topic.
     • Being able to debug your design once it gets to hardware. This should be a last resort since it's so painful to do, but it is a needed resort. To do this, it helps to be able to give the hardware commands and see responses. It helps to be able to get traces from within the design showing how it is (or isn't) working. I discuss this sort of thing at length on my blog under the topic of the "debugging bus (links to articles here)", although others have used Xilinx's ILA + MicroBlaze for some of these tasks.
     Either way, welcome to the journey! Dan
  8. @Kenny, Any entry level board can and will do an FFT--the real question is how much of an FFT do you want to do. I've done FFTs on an Artix-7 35T, so you should be okay with the S7-50. That said, size and complexity are tightly coupled with both the size of the FFT and the precision of the bits within it. If you want an example, this FFT Demo was built using Digilent's Nexys Video board. It works by reading data from an external Pmod MIC3 microphone, filtering the data, windowing it, FFT'ing it, taking the log of the magnitude, and then writing it to memory. The memory is then treated as a framebuffer and used to display a scrolling raster via an HDMI output. Optional software will make the raster scroll in a simulation window (also provided, tested on Ubuntu). Dan
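For readers who want to see the window, FFT, and log-magnitude stages of that pipeline in miniature, here is a floating-point model in plain Python. It is not the demo's code (the demo runs in FPGA logic and also filters the microphone data first, which this skips). The Hann window, the 64-point size, the test tone, and the names `fft` and `spectrogram_row` are all assumptions made for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrogram_row(samples):
    """Window one block of samples, FFT it, and return log magnitudes in dB."""
    n = len(samples)
    # Hann window (an assumed choice; the post does not name the window).
    windowed = [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(samples)]
    spectrum = fft(windowed)
    # Keep the positive-frequency half; the small floor avoids log10(0).
    return [10.0 * math.log10(abs(c) ** 2 + 1e-12) for c in spectrum[: n // 2]]

N = 64
tone = [math.sin(2.0 * math.pi * 8 * i / N) for i in range(N)]  # tone in bin 8
row = spectrogram_row(tone)
peak_bin = max(range(len(row)), key=row.__getitem__)
print(peak_bin)
```

Each call to `spectrogram_row` yields one line of the raster. A hardware version would use fixed-point arithmetic, which is where the FFT size and bit precision mentioned above start to drive logic usage.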
  9. D@n

    Arty S7 Board Layout

    @ykaiwar, It sounds like there's an issue with the DDR3 SDRAM on your new board, but you've declared an issue with the ability to configure the FPGA at all. So let me ask about some basic things in between configuring an FPGA and getting the DDR3 SDRAM working:
    • Can you turn an LED on?
    • Can you turn it off?
    • Can you make it blink?
    • Can you mirror the serial port (you do have one, right?) from receive back to transmit and verify that it works?
    If these tests fail, then you aren't yet ready to discuss possible SDRAM problems. If they succeed, then you can start to bootstrap your FPGA's capability more and more until you know exactly what is and isn't working. Dan
  10. D@n

    Basys3 Memory

    @trian, Truly answering this question is rather complex and will likely depend upon more details than you have shared. You'll need to check the camera and display's I/O capabilities, the FPGA's I/O capabilities, pixel clock rates, the size of the display in pixels, your memory bandwidth and more. If you intend to have a CPU on board, you'll also need to make sure you allocate space for its memory. That's a lot of analysis you'll need to do. It's doable, but expect to make a couple mistakes along the way. We all do. Some of us seem to make more than others ...

    I personally tried to do something similar (framebuffer to VGA output) using the Basys3 only to discover in hindsight that it doesn't have a lot of internal RAM. The Basys3 doesn't come with any off-chip RAM, so if you want RAM all you have to work with is the block RAM. In my own design, I only managed to scrounge 128kB together before running out of resources on chip. I was then sadly disappointed when I couldn't fit any decent-sized framebuffer on the Basys3. What good is a VGA when you don't have a framebuffer?

    Instead, I used flash memory as a ROM-based framebuffer. Flash, however, is slow, so it took a lot of work to get it fast enough. (I compressed my images.) This allowed me to present a basic slide-show at 640x480. I haven't tried higher resolutions (yet).

    At the time, I thought I'd never do any better and gave up trying. I later discovered that someone on Digilent's staff (I'll let her remain nameless ...) and often on the forum had managed to get a lot of PacMan running in this small memory space. (The project was never complete; collision detection, I hear, didn't (yet) work by the time the project needed to be handed in. Probably why she doesn't share more ...) So, careful engineering can overcome a lot of problems. You may find yourself limited by your creativity. Certainly necessity is the mother of many inventions.
    In light of all of this, I'd recommend looking into an off-chip RAM of some type. Perhaps a hyperRAM? Be aware, though, if you are new at this then you'll have a challenge ahead of you to get something you are unfamiliar with working--even if you choose to use someone else's "proven" core. Dan
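To put rough numbers on the framebuffer problem described above, the short sketch below compares a 640x480 framebuffer at several color depths against the 128kB of block RAM mentioned in the post. It assumes "128kB" means 128 * 1024 bytes and that the framebuffer gets all of it, which leaves nothing for a CPU. The helper name `framebuffer_bytes` is made up for this example.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for a packed framebuffer, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8

# Assumption: "128kB" taken as 128 KiB, all of it given to the framebuffer.
BUDGET = 128 * 1024

results = {}
for bpp in (1, 2, 3, 4, 8):
    need = framebuffer_bytes(640, 480, bpp)
    results[bpp] = need <= BUDGET
    print(f"{bpp} bpp: {need} bytes, {'fits' if results[bpp] else 'too big'}")
```

Under these assumptions only very shallow color depths fit on chip, which is consistent with the suggestion to look at off-chip RAM.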
  11. @sab, Fascinating! Which core are you using? Is it a public core, a Xilinx core, a commercial core, or one of your own that you are evaluating? If you need a public core that can be used (mostly) cross-platform, then I can provide some that you can then reference in your study if you need to. Let me also suggest that your coefficients need to be run-time settable for performance measurements. If they aren't, the synthesis tool might remove certain multiplies (multiply by +/- 2^n) and so otherwise bias your result. Dan
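As a quick way to see which fixed coefficients would be at risk of that optimization, the sketch below flags integer coefficients of the form +/- 2^n, which a synthesis tool can implement as a shift rather than a multiply. Zero is flagged too, since a multiply by a constant zero disappears entirely; that case is an addition here, not something from the post. The function name and the sample coefficient list are hypothetical.

```python
def is_shift_only(coeff):
    """True if multiplying by coeff needs no multiplier: 0 or +/- 2^n."""
    m = abs(coeff)
    # A power of two has exactly one bit set, so m & (m - 1) clears it to 0.
    return m == 0 or (m & (m - 1)) == 0

# Hypothetical fixed-point coefficient set, for illustration only.
coeffs = [3, -4, 16, 7, 0, -1]
trivial = [c for c in coeffs if is_shift_only(c)]
print(trivial)
```

If many taps land in `trivial`, a resource count taken with those fixed coefficients will understate what a run-time programmable filter of the same length needs.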
  12. @sab, Just on its face, I'm surprised you were able to implement an FIR filter in only 430 LUTs and one DSP. This sounds rather light. Tell me, though:
      • How many coefficients did your FIR filter have? (I'm typically looking at 10+; this looks too small for that.)
      • How many bits did each of those coefficients have? (I like between 8 and 16, to match the incoming ADC.)
      • Were the coefficients fixed? (Vivado can do a lot of optimizations on fixed coefficients, not so much on run-time programmable coefficients.)
      • How many bits did each of the input samples have?
      • How many bits were in the output?
      All of these have an effect on how much logic an FIR uses. Dan
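One reason those bit widths matter is that they set the width of the filter's accumulator. A common worst-case bound, used here as an assumption rather than anything stated in the post, is input bits plus coefficient bits plus ceil(log2(number of taps)). The function name `fir_accumulator_bits` is made up for this sketch.

```python
def fir_accumulator_bits(input_bits, coeff_bits, num_taps):
    """Worst-case accumulator width for a direct-form FIR with signed data.

    Each product needs input_bits + coeff_bits bits, and summing num_taps
    products can grow the result by ceil(log2(num_taps)) more bits.
    """
    if num_taps < 1:
        raise ValueError("num_taps must be at least 1")
    growth = (num_taps - 1).bit_length()   # integer form of ceil(log2(num_taps))
    return input_bits + coeff_bits + growth

print(fir_accumulator_bits(12, 12, 10))
print(fir_accumulator_bits(16, 16, 64))
```

This is an upper bound. A design that knows its coefficients in advance can often trim bits, which is another way fixed coefficients end up looking cheaper than programmable ones.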
  13. @Tejna, Welcome to the forums! I'm an FPGA blogger, and so I'd invite you to check out any of the articles I've written. </shameless plug> Dan
  14. @ekazemi, The code I presented has user-selectable phase resolution, subject to the accuracy of the originating clock and some quantization error on the output. This can easily be adjusted to create whatever phase clock signal you want. Be aware, as with everything FPGA, the devil is in the details. For example, if you aren't careful, you could create a glitchy clock without intending to. On the other hand, the technique is simple enough as to offer lots of possibilities--which sounds just like what you are looking for. Dan
  15. If you are trying to create clock "glitches", then you definitely want to avoid using the MMCM. Try the above linked method. I think you'll find no problems at rates as slow as 16MHz. Dan
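The linked code is not reproduced in this feed, so the following is not it. As a generic illustration of generating a slower clock from a faster one without an MMCM, here is a Python model of a phase accumulator, assuming a 100MHz source clock, a 32-bit accumulator, and the 16MHz target mentioned above. The names `phase_step` and `simulate` are hypothetical.

```python
def phase_step(f_out_hz, f_clk_hz, acc_bits=32):
    """Amount added to the accumulator on every source clock tick."""
    return round(f_out_hz / f_clk_hz * (1 << acc_bits))

def simulate(step, n_cycles, acc_bits=32):
    """Return the accumulator's top bit, sampled once per source clock."""
    mask = (1 << acc_bits) - 1
    acc = 0
    out = []
    for _ in range(n_cycles):
        acc = (acc + step) & mask           # wraps like a hardware register
        out.append(acc >> (acc_bits - 1))   # the top bit is the generated clock
    return out

step = phase_step(16e6, 100e6)   # assumed 100MHz source, 16MHz target
wave = simulate(step, 100_000)   # 1ms of source clock ticks
rising = sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)
print(step, rising)
```

Because output edges can only land on source clock ticks, individual periods vary between 6 and 7 source cycles even though the average rate is right. That is one form of the quantization error on the output, and a reason to watch for unintended glitches.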