Everything posted by D@n

  1. @RCB, Did you notice the glitch in your source signal in the second plot? It's in both data[] and frame_data. You'll want to chase down where that glitch is coming from. After looking at that source signal, I noticed that the incoming frequency of your first image didn't match the 1MHz frequency you described. At 1MHz, you should have one wavelength inside of 1us. In your first plot, it appears that one wavelength fits in 20us, for a frequency closer to 50kHz? Further, I don't get your comment about holding config_tvalid = 1. If you have created an FFT that isn't configurable ... then why are you configuring it? It's been a while since I've read the book on the configuration --- did you hard code the scaling schedule into the FFT, or are you configuring that in real time? I can't tell from what you are showing. You also weren't clear about what config_tdata is. Was that the all zeros value you were sending? Finally, the difference you are seeing between natural order and bit-reversed order is not explained by the simple difference between the two orderings. There's something else going on in your design. Dan
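The frequency check in the post above is just f = 1/T. A quick sketch of that arithmetic follows; the 20us period is the value read off the plot, and the function name is only illustrative.

```python
def frequency_hz(period_s):
    """Frequency of a tone whose single wavelength spans period_s seconds."""
    return 1.0 / period_s


# A 1MHz tone should complete one cycle in 1us.
expected_period_s = 1.0 / 1e6

# One cycle appears to span roughly 20us in the first plot.
observed_period_s = 20e-6
observed_hz = frequency_hz(observed_period_s)

print(f"expected period:    {expected_period_s * 1e6:.1f} us")
print(f"observed frequency: {observed_hz / 1e3:.0f} kHz")
```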
  2. @xc6lx45, Thank you for pointing that out. That explains some things I've seen on a recent design of mine. Here I just thought it was because I was cascading so many PLL's together--you know, typical Voodoo design methodologies ... Dan
  3. @Sid Price, Oops ... this is the scopes and instrumentation forum and not one of the FPGA forums, isn't it? Ok, then your request makes more sense. I'm sure @attila will have a comment for you then--he's been very diligent about answering and resolving AD2 requests. Me? I'd probably build a basic Octave script to do what I described above with a data capture. I'd first run everything through a filter, as discussed above. I think Octave has a "conv" (or is it "convolve"?) function that can apply the filter. Remember--the filter isn't a rectangle, but a pair of rectangles with opposite polarity. I'd then make a copy and calculate something like x(t-T/2)*x(t+T/2), where x(t) is the filtered sample stream and T is the number of samples per doublet. If you filter that new stream narrowly around your sample rate, perhaps even via an FFT based filter, you should find a maximum at the mid-point of every sample. That would save you the hassle of needing to run the PLL--something not nearly as appropriate on sampled data. Once you know where to sample, you should then be able to just take a threshold to get each individual bit. It's easy once you've done it once or twice, although I can understand why you might think of it as black magic if you'd never done it before. Dan
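A minimal sketch of the filter-then-threshold idea from the post above, written in Python rather than Octave. It makes several assumptions that are not in the post: a "1" is sent as high-then-low, the capture is noise free, and the samples per doublet are known. It also swaps the delay-and-multiply timing recovery for a simpler brute-force search over sample phases, picking the phase whose matched-filter outputs carry the most energy. All function names are hypothetical.

```python
def manchester_encode(bits, samples_per_bit):
    """Build a test waveform. Assumed convention: 1 -> high then low."""
    half = samples_per_bit // 2
    out = []
    for b in bits:
        first = 1.0 if b else -1.0
        out.extend([first] * half + [-first] * half)
    return out


def matched_filter(samples, samples_per_bit):
    """Correlate against the doublet: two rectangles of opposite polarity."""
    half = samples_per_bit // 2
    template = [1.0] * half + [-1.0] * half
    n_out = len(samples) - samples_per_bit + 1
    return [
        sum(s * t for s, t in zip(samples[i:i + samples_per_bit], template))
        for i in range(n_out)
    ]


def best_phase(filtered, samples_per_bit):
    """Pick the sampling offset whose filter outputs have the most energy."""
    def energy(phase):
        return sum(v * v for v in filtered[phase::samples_per_bit])
    return max(range(samples_per_bit), key=energy)


def manchester_decode(samples, samples_per_bit):
    """Filter, choose where to sample, then threshold each symbol at zero."""
    filtered = matched_filter(samples, samples_per_bit)
    phase = best_phase(filtered, samples_per_bit)
    return [1 if v > 0 else 0 for v in filtered[phase::samples_per_bit]]


sent = [1, 0, 1, 1, 0, 0, 1, 0]
# Three leading zero samples stand in for an unknown capture delay.
capture = [0.0] * 3 + manchester_encode(sent, samples_per_bit=8)
received = manchester_decode(capture, samples_per_bit=8)
print(received)
```

The phase search needs bit transitions in the data to work: a long run of identical bits looks the same at a half-doublet offset, apart from its sign. Since the template only contains +1 and -1, the multiplies could also be replaced by adds and subtracts.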
  4. @Sid Price, Do you really need a custom decoder? Manchester is pretty easy to decode. There's lots of ways to do it too. I'd be tempted to apply a matched filter to the doublet, and then treat the signal as BPSK. That'd be the "optimal" solution, but often you don't need the "optimal" solution so a lot of other ways exist to do it as well. Let's see ... you can find an example filter here that might work well, and here's an example PLL you could use too. You could even get fancy, given that you are only ever "multiplying" by one or negative one, and so remove all the multiplies from the filter as well. You could go even fancier and adjust this symmetric filter for this purpose too ... Indeed, there's lots of options. What more do you need? Dan
  5. @HasanWAVE, Might it be because your Zynq design only supports AXI3 and not AXI4? The maximum AXI3 burst length is only 16 beats. Which board are you using? Also, this really belongs in the FPGA/embedded forum, not the microcontroller forum. Dan
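To illustrate the burst-length limit mentioned above, here is a small sketch of how a long transfer would need to be chopped up for an AXI3 port. The helper name is hypothetical, and the sketch ignores other AXI rules, such as bursts not crossing 4kB boundaries.

```python
AXI3_MAX_BEATS = 16    # AXI3 bursts are limited to 16 beats
AXI4_MAX_BEATS = 256   # AXI4 raises this to 256 beats for incrementing bursts


def split_burst(total_beats, max_beats=AXI3_MAX_BEATS):
    """Break one long transfer into bursts no longer than max_beats."""
    bursts = []
    remaining = total_beats
    while remaining > 0:
        beats = min(remaining, max_beats)
        bursts.append(beats)
        remaining -= beats
    return bursts


print(split_burst(40))                    # requested over AXI3
print(split_burst(40, AXI4_MAX_BEATS))    # same request over AXI4
```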
  6. @bitstre@m, It sounds like you are working with a MicroBlaze based design, since you mentioned a .mcs file above. Xilinx provides an AXI QSPI core that you can use for this purpose. I've personally used my own QSPI driver, and my own CPU. You can find my CMod S6 project here. Perhaps it might serve as an example. To use the QSPI flash, I had one design that I used to program the QSPI flash device on my CMod S6 with (whatever), and then a second design that would use the flash within the design. Sadly, the S6 is such a small device, that you really need to be careful with the logic you try to place onto it. I was pleased to be able to get a "multi-tasking O/S" running on it, although it was only the barest minimal one. Storing instructions on the flash was key to the success of that project. You can read about the instruction fetch that I ended up using here if you would like. Dan
  7. @Dan Lyle, I assume this is your post on Reddit as well? Rather than answering twice, I'll let you find my answer there. Dan
  8. @rddlr, It doesn't look like you've done anything to debounce those buttons. Check out the diagrams here for example waveforms of what a single button press (or release) might look like without debouncing. Indeed, some of those waveforms (not all) were even made with an Arty FPGA board. Looking over your code, I don't see anything in what you've written that will take care of button bounces. You might find this article gives the instruction you need to deal with them. Both of them are part of a series I wrote some time ago on debouncing buttons. Dan
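For reference, one common way to debounce is to require the raw input to disagree with the current output for several consecutive samples before accepting the change. The sketch below models that in Python. It is not the code from the linked articles, and the sample values and threshold are made up for illustration.

```python
def debounce(samples, stable_count):
    """Counter-based debouncer model.

    The output only changes after the raw input has differed from it for
    stable_count consecutive samples, so short bounces are ignored.
    """
    state = samples[0] if samples else 0
    counter = 0
    out = []
    for s in samples:
        if s == state:
            counter = 0              # input agrees with output: start over
        else:
            counter += 1             # input disagrees: count how long
            if counter >= stable_count:
                state = s            # held long enough, accept the change
                counter = 0
        out.append(state)
    return out


def count_edges(samples):
    """Number of times the signal changes value."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


# A made-up bouncing press: several glitches before the input settles high.
raw = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1]
clean = debounce(raw, stable_count=4)
print(count_edges(raw), "edges before,", count_edges(clean), "after")
```

In hardware the same idea becomes a counter clocked at the system rate, with stable_count chosen to cover a few milliseconds of bouncing.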
  9. @weilai, Both Digilent and Xilinx provide training materials for how to use the SDRAM on the Arty board. These materials all involve using the schematic based board design. I have not used this board design, as my own approach is an all Verilog approach. If this is the approach you are interested in, then I would not recommend an SDRAM as your next project. I would instead recommend a next project that you could use as a spring-board for other things--a project where you learn how to command/control your design externally, and to debug the design from within. Only after you've learned these lessons will you be ready to go after the SDRAM. These are all things I discuss on my blog. Once you are ready to go after the SDRAM, there are a couple of approaches. My own approach, and the one I use with my Arty board, is to instantiate Xilinx's controller via their Memory Interface Generator. I then interact with this controller using a Wishbone bus structure common to all of my designs. A Wishbone to AXI bridge renders this controller available to me. I also use a UART to Wishbone converter, which then makes reading and writing peripherals within my design from a serial port fairly easy to do. This, of course, is all before I get to the SDRAM--and becomes the infrastructure I use to figure out what is (or is not) going on when I build an SDRAM controller. In other words, if you don't want to use a canned solution, then take a deep breath, slow down, and get there methodically. Dan
  10. @HasanWAVE, You don't. You can't set a submodule's register from the top level. Verilog provides access to this register for verification purposes only, but it's not available otherwise. Dan
  11. @weilai, The .prj file @JColvin refers to is an XML (i.e. text) file. I was able to use it to create this UCF file. That's going to be the least of your problems. See my answers in the other thread for more of what you'll need to deal with. Dan
  12. Weilai, Okay, trying to build your own DDR3 SDRAM controller ... and not sure what you are getting into, yet, okay ... Dynamic RAM is not like static RAM. It's built out of capacitors, rather than FF's, so it needs periodic refreshing as the capacitors drain their charge over time. Synchronous Dynamic RAMs (SDRAMs) are typically organized into banks of memory, where each bank has a special row that may be "activated" and placed into FF memory. Only the activated row may be read or written at any given time. It takes some time to activate a row, some time to deactivate it (called pre-charging), and some time to get the memory from the active row. This all requires clock ticks. The DDR3 SDRAM's use a very fast memory clock--typically 4x faster than your logic. It's also 90 degrees offset from the logic as well. Since most SDRAM's work based upon commands given to them, you'll need to examine and get familiar with the command set--there are also very strict requirements between commands too--for example, all banks must be precharged before issuing refresh commands, rows must be activated before access, etc. You'll need an OSERDES of some type to issue these commands. The data wires are the tricky part of the interface. In general, the outgoing signals can be created with an 8:1 DDR IOSERDES. That's not the hard part. The hard part is on receive, where the "clock" for a given byte is returned in a data strobe wire. This clock is discontinuous and only activated when return data is incoming. Indeed, this was such a challenging part of the protocol to handle that Xilinx created special hard-IP blocks in most FPGAs for this purpose--the IOPLL, PHASER, and ... there's an I/O FIFO block of some type as well to handle the asynchronous clock rate conversion--since the DSTB return clock, while at the same frequency as the master clock, will have an uncontrolled phase difference to it.
To make matters worse, the key details of these components have been kept Xilinx proprietary, and so they are undocumented. This is all covered by the JESD79-3E specification. If you really want to interface with the SDRAM, you'll need to do some reading. (JEDEC charges a small fee for a copy.) You can read about my own attempts to build an open source DDR3 SDRAM controller here. The code is still public. I think I got all the logic right, up until the required Xilinx primitives and particularly the incoming DSTB logic. Of course, since it has never been successfully demonstrated on real hardware you can just about rest assured that it's broken. There is an open source DDR3 SDRAM driver that has been demonstrated. I haven't used it myself, although I have browsed through the code. It's not simple to do. So, that said, let me ask, are you sure this is what you want to do? Dan
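The command ordering rules described in this post (activate a row before accessing it, precharge every bank before a refresh) can be captured in a toy model. The sketch below tracks only which row is open in each bank. It models none of the timing requirements, which are where most of the real work lies, and the class and method names are hypothetical.

```python
class SdramBankModel:
    """Toy model of SDRAM bank state: only tracks the open row per bank."""

    def __init__(self, num_banks=8):          # DDR3 parts have 8 banks
        self.open_row = [None] * num_banks

    def activate(self, bank, row):
        if self.open_row[bank] is not None:
            raise RuntimeError("bank already has an active row; precharge first")
        self.open_row[bank] = row

    def precharge(self, bank):
        self.open_row[bank] = None            # deactivate whatever row was open

    def access(self, bank, row):
        """Stands in for either a read or a write command."""
        if self.open_row[bank] != row:
            raise RuntimeError("row is not active in this bank")
        return (bank, row)

    def refresh(self):
        if any(row is not None for row in self.open_row):
            raise RuntimeError("all banks must be precharged before refresh")


def command_is_legal(action):
    """Run a command and report whether the model accepted it."""
    try:
        action()
        return True
    except RuntimeError:
        return False


ram = SdramBankModel()
ram.activate(bank=0, row=42)
print(command_is_legal(lambda: ram.access(0, 42)))   # row 42 is open
print(command_is_legal(ram.refresh))                 # bank 0 still active
ram.precharge(0)
print(command_is_legal(ram.refresh))                 # every bank now idle
```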
  13. @weilai, I'm a bit confused. 1) I don't have MS, so I can't read your PPT image, 2) I don't remember RAM pins like WE, RE, or ADDR[2:0] on the top of the ARTY schematic, 3) you seem to have a RAM module within your code as well that, 4) checks for re && !we ... so let's back up a bit. Which board are you working with? Is it the Arty A7-35T board? Are you trying to connect with any chips which are off of the board, or are you attempting to connect to components within your design, or the DDR3 SDRAM chip on the board? If the latter, your interface is ... nowhere near the correct one. Have you simulated your design? I discuss how to go about simulating a serial port design very similar to this one in my tutorial. Have you formally verified your design? Also discussed in my tutorial. Dan
  14. @Burak Maden, That looks like an awesome example! Unless I've misread it (which is quite likely), it looks like you should be able to just modify the constraint file to match the Basys3 board and then use it. Are you expecting more trouble than that? Dan
  15. @Burak Maden, How different do you expect the two to be? Dan
  16. @globieai, Octave isn't much of a "simulation tool". It's more of an ad-hoc scripting tool that looks and feels very similar to matlab. It works well for verifying that the output of a simulation is the output it should be. For simulation tools, let me encourage you to look into ghdl (an open source simulator) in addition to xsim (Vivado's simulator). Dan
  17. @Burak Maden, I believe you can find 7-segment display code here, together with some explanation. I've never tried it myself, so I have no idea if it works or not. Dan
  18. @globieai, There's only so much I can tell by looking over your design sources. I have a rule in my own practice that any reads from memory should be done in a process of their own, with nothing else in the process. Likewise, any multiplies should be in their own process. Xilinx's DSP's support multiply and accumulates in the same cycle, but my rule is intended to 1) make inference simpler, and 2) be more portable across architectures. These two requirements will force some amount of pipelining on your filter, while also allowing you to increase your system clock speed. You are picking particular bits of your output. I cannot tell from just looking that you are picking or sign extending the right values. The error regarding the REFCLK is something you need to pay attention to. The reference clock *must* be at 200MHz. It looks like you changed some clock in your design and didn't think through all of the consequences. In any audio application, there's a relationship between the number of clocks required to process audio input samples, and the number of clocks between samples. This isn't apparent from your discussion above. The proper way to find and fix many of these bugs is through simulation. This is the source of your problems. I don't normally use VHDL myself, or I'd offer more help here. My favorite simulation tool, Verilator, works very well with Verilog--not VHDL. It has no problems with Audio signals. I would typically feed a signal into the simulator, and write the output to a file that I can then read in Octave. I'm not sure how you would do this with a VHDL tool, or which VHDL tool you might use for that task. Others on the forum who work with VHDL, such as @zygot or perhaps even @xc6lx45 (not sure if he uses VHDL) might be better at suggesting the proper simulator and how to go about accomplishing these tasks for a VHDL design. While it is possible to find and fix bugs within an FPGA design in hardware, doing so can be quite a challenge.
You'll need to set up some tooling, so that you have access to values within the FPGA. Such tooling would allow you to "see" values within an FPGA. Xilinx offers an ILA capability for this purpose. I use a Wishbone Scope myself, although doing so requires that you have a Wishbone (or AXI) bus already existing within your design. There's also a compressed version of the same that would work nicely for problems where things don't change very fast--such as in any audio problem. You should be able to simulate your design with the scope installed within it, as well as in hardware, to know that you will be able to properly capture the right values from hardware. If your simulation was working, and if you had that kind of infrastructure available, I might then suggest that you replace your filter with a square wave generator, to verify that your output works--especially since you played with the clock in the design you started from this is no longer certain. Once you've verified that a basic square wave works, you should then replace the square wave with a sine wave generator. Repeat the experiment. (Still without the filter in place) You should be able to adjust the amplitude and frequency of both square wave and sine wave, and verify that the change affects the output as expected. Only after you've verified your output in isolation, would I then turn to the filter. I would still keep it separate from the input though--only repeating the input with first a square wave, and then replacing the input of the filter with a sine wave. I would also recommend capturing audio data (at audio rates!) from the input, and going to a file. (You should be able to simulate this before trying it!) Play a note of some type into the audio, verify that you get the same note in the data produced. Repeat the test above using your filter--but capturing to a file instead of the output.
I guess the bottom line that I'm trying to get across is that there's a lot of work that needs to take place between the problem you are currently struggling with and the solution you want to achieve. Dan
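The relationship between processing clocks and sample spacing mentioned in this post can be checked with a little arithmetic. The numbers below (a 100MHz system clock, 48kHz audio, a 2047-tap filter sharing one multiplier) are hypothetical and not taken from the design under discussion.

```python
def clocks_per_sample(clock_hz, sample_rate_hz):
    """Whole system clocks available between consecutive input samples."""
    return clock_hz // sample_rate_hz


def fits_budget(clocks_needed, clock_hz, sample_rate_hz):
    """True if the per-sample processing finishes before the next sample."""
    return clocks_needed <= clocks_per_sample(clock_hz, sample_rate_hz)


CLOCK_HZ = 100_000_000     # assumed system clock
SAMPLE_RATE_HZ = 48_000    # assumed audio sample rate
budget = clocks_per_sample(CLOCK_HZ, SAMPLE_RATE_HZ)

# Assume one multiply per clock on a single shared multiplier, so a filter
# needs about one clock per tap for every sample.
taps = 2047
print(f"{budget} clocks between samples, {taps} needed:",
      "fits" if fits_budget(taps, CLOCK_HZ, SAMPLE_RATE_HZ) else "does not fit")
```

If the clock feeding the design changes, this budget changes with it, which is one reason a modified clock deserves a second look.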
  19. @globieai, Sadly, building a project that doesn't work is a very common state of affairs among beginning FPGA designers. This is why, when I built my own audio filter, I also built an FFT-based test bench that would verify that my filter worked. This included predicting what level of output would be expected as well. Have you simulated your design? Can you demonstrate that it works in a simulation? Run sine waves through it and verify that you can get sinewaves out of it? You'll find simulation based debugging much easier to do than hardware debugging. Even more, I'm a strong proponent of debugging designs using formal verification--something that often ends up faster and easier than simulation based debugging, while not missing half as many bugs. Dan
  20. @SigProcbro, There's a better way to implement an FIR than an adder tree. Basically, you apply the input to every coefficient in the FIR at the same time, and then add things together in a line, with FF's between them. I wrote about this method some time ago. Even better, DSP's are optimized for this kind of operation, so they can handle it at pretty high speeds. That said, there's a limitation associated with the bit widths you choose, so I would second @hamster's advice to read the fine manual. As for knowing what the synthesizer does, there's lots of ways to visualize that. One common method is to run the synthesis engine and then open the synthesized design. You can then examine your logic as a group of connected logic blocks. This can be very informative--for small designs. Generally, once the design gets large enough this display gets too difficult to read and so you'll likely start resorting to other methods. Still, it does have its place. Dan
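The structure described in this post, where every coefficient sees the same input and the products are summed down a chain of registers, is often called a transposed-form FIR. Below is a behavioural Python sketch of it, checked against a direct-form reference. It is a software model only, not the linked implementation, and it uses integer data so the comparison is exact.

```python
def transposed_fir(samples, coeffs):
    """Model of an FIR built as a chain of adders with a register after each."""
    n = len(coeffs)
    regs = [0] * n                 # regs[k] is the FF after tap k's adder
    out = []
    for x in samples:
        # Every coefficient is multiplied by the same, current input sample.
        products = [c * x for c in coeffs]
        # Each register takes its own product plus its neighbour's old value;
        # the last tap has no neighbour further up the chain.
        regs = [
            products[k] + (regs[k + 1] if k + 1 < n else 0)
            for k in range(n)
        ]
        out.append(regs[0])        # the end of the chain is the filter output
    return out


def direct_fir(samples, coeffs):
    """Reference: y[i] = sum over k of coeffs[k] * samples[i - k]."""
    return [
        sum(c * samples[i - k] for k, c in enumerate(coeffs) if i - k >= 0)
        for i in range(len(samples))
    ]


taps = [1, -2, 3, 4]
data = [5, 0, -1, 2, 7, 7, 0, 3]
print(transposed_fir(data, taps) == direct_fir(data, taps))
```

The appeal in hardware is that no adder ever has more than two inputs, so the critical path stays at one multiply and one add no matter how long the filter gets.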
  21. @trian, Let me back you up a bit and ask, what are you doing to simulate or otherwise verify your design before it hits hardware? What confidence do you have that this design works? Dan
  22. @zygot, That's certainly how I took it--it brought quite the smile to my face, even though it's not quite true. I've used other boards as well, but I do enjoy my Arty as a nice all around board. Other boards include: Digilent's Nexys Video, Lattice's ECP5 Versa, the TinyFPGA BX (iCE40 8k), Arrow's MAX1000 (actually Trenz's) featuring an Intel Max10, a Cyclone V DE10 Nano, Folknology's IceCore (iCE40 8k), and more. In general, I like the Xilinx Verilog flow better and find Digilent's documentation more thorough for my purposes. I find it disappointing that Arrow sold the MAX1000 before getting a JTAG loader to work for it, that Lattice did not provide a DDR3 driver for their SDRAM (they wanted more money to get their DDR3 support), that the buttons keep getting torn off of my MAX1000, etc. I guess you get what you pay for. I don't generally use VHDL, but I would note that many folks have used ghdl as a VHDL simulator and have been quite pleased by it. I'm also aware of work that's being done to take the output of the ghdl parser and put it straight into an open source synthesizer (Yosys). While this might not be a big deal in comparison to Vivado, it can be a great help when trying to recover from ISE's out-of-date language support or manage projects across multiple platforms without needing multiple synthesis tools. Dan
  23. @SigProcbro, I have the Arty A7-35T device. It has certainly kept me busy for a while. The one place I have struggled a bit is with the number of DSPs available. It hasn't stopped me, however. Instead it's taught me creativity. For example, most of the FIR filters I work with tend to be symmetric--that knocks the DSP requirement down by 2x. Further, I've been able to get away with combinations of CIC filters and half-band filters, again knocking the requirement down further--sometimes another 2x. Finally, when working with audio samples, there's plenty of time to timeshare the DSPs across filter coefficients. There's a lot you can do with it--all depending on what you want to do. If your goal is processing a 75MHz+ band of spectrum with -90dB stopbands, you might want something more powerful. If your goal is 100kHz or below, you'll be just fine. Dan
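The 2x saving from symmetry comes from adding the two samples that share a coefficient before multiplying. Here is a behavioural sketch in Python, checked against a plain FIR. The function names and the example coefficients are illustrative only.

```python
def symmetric_fir(samples, half_coeffs, odd_length):
    """FIR for a symmetric filter, using one multiply per coefficient pair.

    half_coeffs holds the first half of the taps (including the centre tap
    when the full filter length is odd).
    """
    n_taps = 2 * len(half_coeffs) - (1 if odd_length else 0)

    def x(i):
        return samples[i] if 0 <= i < len(samples) else 0

    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(half_coeffs):
            mirror = n_taps - 1 - k          # the tap sharing this coefficient
            if mirror == k:
                acc += c * x(n - k)          # centre tap has no partner
            else:
                acc += c * (x(n - k) + x(n - mirror))   # pre-add, one multiply
        out.append(acc)
    return out


def direct_fir(samples, coeffs):
    """Reference: one multiply per tap."""
    return [
        sum(c * samples[i - k] for k, c in enumerate(coeffs) if i - k >= 0)
        for i in range(len(samples))
    ]


full = [1, 3, 5, 3, 1]                       # symmetric, 5 multiplies per sample
half = [1, 3, 5]                             # same filter, 3 multiplies per sample
data = [4, -1, 0, 2, 9, 9, -3, 1]
print(symmetric_fir(data, half, odd_length=True) == direct_fir(data, full))
```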
  24. @Luke Abela, I recently had the opportunity to write a data processing application that used an FPGA as an "accelerator". Sadly, it probably slowed down processing, but the infrastructure is something you are more than welcome to examine and work with if you would like. Data was sent to the FPGA using UDP packets over ethernet, read on the FPGA, assembled into larger packets for an FFT engine, processed, and then returned. Dan
  25. @SigProcbro, These all sound like fun projects! I implemented a STFT algorithm some time ago, so feel free to look at that design for reference if you'd like. That design wrote the STFT results to a framebuffer in memory--something the Basys3 doesn't have. I used a Nexys Video--but that might not be within your budget as a beginner. Still, if you skip the framebuffer you might still do quite well with the STFT on the Basys3. Signal processing is one of my favorite applications. I've written about filter design, testing, and even (logic) PLL designs--you can read all about these at ZipCPU.com. You should be able to test/verify all of these designs in software at your desktop without any hardware. For any computer vision applications, I'd make certain you understand your memory requirements. The Basys3 is quite limited on memory, and so can't handle a lot of image processing requirements very well. Well, that and it doesn't have any video inputs. The Nexys Video is a good board for video input/output/processing, but like I said above it's a touch pricey--especially for a beginner. Can you build your designs in software without any hardware? I personally use Verilator a lot myself. Verilator translates Verilog to C++, which I can then test and simulate on my bench. I've used this method often for testing video, serial ports, and other things. Indeed, when I bought a Basys3 board for myself, I had my design running on it within about two days simply because I had all the details worked out in simulation ahead of time. I've personally found the Arty A7 more satisfying than the Basys3 board, but that's partly application specific. I'm personally very interested in CPU design and its related bus-based development. The Arty provides this capability for me. I've also used the network with success, so again--it's something I've personally been very pleased with. As for the memory interface, I've been using Xilinx's memory interface generator (MIG) generated controller.
I've used the AXI bus interface again to great success. I personally don't give Xilinx high marks for their various AXI implementations, but this one seems to have been done well--even if the latency through the core is horrible. (20+ clocks at 82MHz) If you are at all interested in Verilog, I've written a tutorial on the topic. Feel free to check it out. So my vote is for the Arty A7, but I'll readily admit a certain bias. Dan