Everything posted by D@n

  1. D@n

    ethernet communication with pc

    @PhDev, Have you tried any of those cores? If so, would you recommend them as workable? Dan
  2. D@n

    Display image using VGA from block RAM

    @khaledismail, Sounds like you've gotten yourself stuck in FPGA Hell. Looking over your code, a quick first glance shows that you are using a logically generated clock. This is, in general, a very bad idea--one that can lead to hardware/simulation mismatch. A better approach would be to use a clock-enable line. A common reason for ending up in FPGA Hell is not simulating your IP. The difficulty with the position you are in is that not many simulators will co-simulate the VGA display your design connects to. I know I had similar problems when using my Basys3 board and getting video to work faultlessly. (I was reading from flash and decompressing the video stream, since the Basys3 flash wasn't fast enough to keep up.) In the end, I needed to write a VGA simulator so that I could "see" what was going on within all the traces within my design in order to find the bugs. You can find that VGA simulator on-line here, or even read about it here. The repo that contains it even includes an example project that reads from block RAM and outputs the result onto a VGA output for the simulator. The sad part of this design is that it is in Verilog and uses Verilator--a Verilog-only simulation utility. However, I know there exist VHDL simulators that support a VPI interface--you might be able (with a bit of work) to get your design to work within that environment. That might help. (Alternatively, you might choose to re-implement your design in a real language!) Hope this helps, Dan P.S. For those others reading, you may wish to know that @khaledismail has also posted on Xilinx's forums. If you don't see a solution here, you might (eventually) find one there.
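The clock-enable idiom suggested above can be sketched in software. The following is a Python behavioral model, not HDL, and the divide-by-4 ratio is just an illustrative assumption: everything stays on the one system clock, and the slow logic only advances on cycles where a one-cycle enable strobe fires.

```python
# Behavioral sketch (Python, not HDL) of the clock-enable idiom: rather
# than deriving a slower clock in logic, keep the one system clock and
# advance the slow logic only when a one-cycle enable strobe fires.
DIVIDE = 4          # assumed ratio between system clock and pixel rate


def run(cycles):
    counter = 0
    pixels = 0      # work done only on enabled cycles
    for _ in range(cycles):            # one iteration == one clock edge
        enable = (counter == DIVIDE - 1)
        counter = 0 if enable else counter + 1
        if enable:                     # slow logic gated by the strobe
            pixels += 1
    return pixels


print(run(16))   # 16 system clocks at divide-by-4 -> 4 enabled cycles
```

In HDL terms, `enable` becomes a registered strobe, and the "slow" process checks it with an `if` rather than being clocked by a derived signal, so the whole design stays in one clock domain.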
  3. D@n

    ethernet communication with pc

    @Aamirnagra, You can find my own RMII interface work here. There's also a simple-ping program to drive it here. Don't forget you'll want an MDIO controller as well! That one is nearly as trivial as a SPI core--it's pretty easy to write. You can find my own MDIO controller here. I also have a software decoder for the results of that here. Being able to quickly read and get a status from the MDIO interface can be quite valuable. Don't let @zygot discourage you. Yes, there's a lot to learn about; however, Wikipedia does a decent job describing most of what you'll need for the various protocols, and the Ethernet data sheet for your device should describe the RMII interaction well enough. Yes, there are a couple of gotchas, and you'll find some surprising things along the way. For example, I expected the nibble order to be the other way around. In most cases, you can just debug any problems you run into with Wireshark. At this point, though, I'll agree with @zygot: network interface work is not trivial. It's a lot of fun though! Dan
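To illustrate why the MDIO controller is nearly as trivial as a SPI core, here is a rough software sketch of the bit-level frame such a controller shifts out. The field widths follow the standard IEEE 802.3 Clause 22 management-frame format for a register write; this is an illustration of the protocol, not the controller linked above.

```python
# Rough sketch of an IEEE 802.3 Clause-22 MDIO *write* frame, built as
# the list of bits a management controller would shift out MSB-first on
# MDIO while toggling MDC.  Purely illustrative.
def mdio_write_frame(phy_addr, reg_addr, data):
    bits = [1] * 32                 # preamble: 32 ones
    bits += [0, 1]                  # start of frame "01"
    bits += [0, 1]                  # opcode "01" = write ("10" = read)
    bits += [(phy_addr >> i) & 1 for i in range(4, -1, -1)]  # 5-bit PHY addr
    bits += [(reg_addr >> i) & 1 for i in range(4, -1, -1)]  # 5-bit register
    bits += [1, 0]                  # turnaround "10" for a write
    bits += [(data >> i) & 1 for i in range(15, -1, -1)]     # 16 data bits
    return bits                     # 64 bits total


frame = mdio_write_frame(phy_addr=1, reg_addr=0, data=0x8000)
print(len(frame))   # 64
```

In hardware this reduces to a 64-bit shift register and a clock divider for MDC, which is why the core is so small.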
  4. @Archana Narayanan, Try working this through from one end to the other. Do you know that your PC->FPGA link works? I usually debug this link by first making sure the transmit FPGA->PC link works. That will verify that your serial port is using the right pin for transmit. Once I know FPGA->PC works, I'll typically compose a PC->FPGA->PC design. The first time I do this, I place no logic between RX and TX. This verifies that I have the right receive pin. (Be aware, the labels on a lot of Digilent's schematics are misleading!) The second time I do this, I use a serial decoder followed by a serial encoder. This will verify that your serial receiver works. Once you know both work, then let's go back to simulation: can you simulate your entire design end to end before placing it onto your board at all? Dan
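The serial decoder/encoder pair in that second loopback test boils down to 8N1 framing. Here is a toy Python model of that framing, with one sample per bit time; a real receiver oversamples and hunts for the start edge, but the frame layout is the same.

```python
# Toy model of 8N1 serial framing (start bit, 8 data bits LSB-first,
# stop bit), sampled at exactly one sample per bit time.
def encode_8n1(byte):
    # start bit (0), eight data bits LSB first, stop bit (1)
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def decode_8n1(bits):
    assert bits[0] == 0 and bits[9] == 1, "bad framing"
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


print(decode_8n1(encode_8n1(0x41)))   # 65: the loopback returns the byte sent
```

If the loopback returns garbage, the framing (baud rate, bit order, or stop bit) is the first thing to check.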
  5. D@n

    Keyboard interfacing

    @gummadi Teja, Have you tried your design with a video simulator? A complete trace of everything going on within the design could be very valuable for determining what's going wrong. Dan
  6. D@n

    Digilent Github Demo

    @AlGee, If you are at all interested in Verilog design, then this tutorial is where I would recommend starting. You might even find other topics of interest on the associated blog as well, depending on what you are interested in doing. (Yes, that is a shameless plug.) If you want an example design, here's one I've put together for the Arty. The documentation describes getting the memory up and running, should you wish to interact with it from Verilog. (The flash controller still has issues since Digilent swapped flash chips though ...) Welcome to a fun journey! Dan
  7. D@n

    Basys-3 USB storage compatibility

    @Jonathan.O, Electrically possible? Probably. Practically possible? It would be a long shot. The Basys3 microcontroller that interacts with the FPGA contains source code that Digilent has not released. To my knowledge, they have no plans to release this source code, so you might consider this an "unsupported feature". Even if you could do it, Digilent isn't likely to answer any questions on this method of doing business. The road to doing this would involve using the JTAG port (again, not publicly described) to reprogram the device. Doable? Perhaps. Easy? Not at all. (It'd be hard with the documentation you'd need, even harder without.) My recommendation would be to use a Pmod SD (SD-card adapter) instead. That should be easier to integrate into what you want to do, while also accomplishing the same purpose. Dan
  8. D@n

    How to generate another, faster clock (CMOD S7) ?

    @TestDeveloper, I instantiate my PLL's and MMCM's like that all the time. Vivado has been fairly robust in how it handles this, so in spite of the pitfalls it has always worked for me. One thing you might consider doing is to create a counter that counts down from (CLOCK_RATE_HZ/2) and then toggles an LED. Watching the LED blink at the expected rate is a quick way to verify that you got the clocking right. Dan
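The countdown check above can be modeled in software. This Python sketch counts down from half the clock rate and toggles an LED at the terminal count, so the LED blinks at 1 Hz only if the clock really runs at the rate you think; the 1 kHz "clock" in the demo call is just to keep the model fast (on hardware you would use the real rate, e.g. 12 MHz on the CMOD S7).

```python
# Software model of the 1 Hz blinker sanity check: count down from
# half the clock rate, toggle on the terminal count.
def toggles_per(cycles, clock_rate_hz):
    half = clock_rate_hz // 2
    counter, led, toggles = half - 1, 0, 0
    for _ in range(cycles):            # one iteration == one clock edge
        if counter == 0:
            counter = half - 1         # reload the countdown
            led ^= 1                   # toggle the LED
            toggles += 1
        else:
            counter -= 1
    return toggles


# One modeled "second" of cycles yields exactly two toggles, i.e. a 1 Hz blink.
print(toggles_per(1000, 1000))   # 2
```

If the LED blinks at the wrong rate on the board, the PLL/MMCM output frequency is not what you configured.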
  9. All, If you've followed the Vivado tutorial to build an AXI-lite peripheral, you'll know to ask Vivado to generate an IP core for you, which you can then edit to your taste. What's not so commonly known is that this core has bugs in it: it does not comply with the AXI standard. Specifically, if the Vivado demonstration core receives two requests in a row while the return channel's ready line is low, one of those two requests will get dropped. This applies to both read and write channels. The failure is so severe that it may cause a processor to hang while waiting for a response. Worse, since this is caused within vendor-provided code, most users won't see any need to examine it, instead choosing to believe that their own code must somehow be at fault. The article demonstrates the bugs in the 2016.3 AXI-lite demonstration core. Since Vivado 2016.3, Xilinx has updated their AXI-lite demonstration to add another register to its logic--presumably to fix this issue. As of version 2018.3, even this updated logic continues to fail verification. Should you wish to repeat this analysis, this same article discusses how it was done. Only about 20 lines of logic need to be added to any Verilog AXI-lite core, plus the lines necessary to instantiate a submodule containing a property file. That's all it takes to verify that any AXI-lite core properly follows the rules of the road of the AXI-lite bus, using SymbiYosys--a formal verification tool. The steps necessary to correct this logic flaw are also discussed. Since writing that article, I have posted another basic AXI-lite design which doesn't have these flaws. Moreover, the updated design can process bus transactions with a higher throughput than the original design ever could. While I'm not sure quite how fast MicroBlaze or even the AXI interconnect can issue bus requests, this design at least shows how you could build a slave peripheral that can handle two requests at once.
Feel free to try it out and let me know if you find any flaws within it. Dan
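The dropped-request failure, and the extra-register fix, can be abstracted into a toy queue model: a slave with room for only one outstanding response loses one of two back-to-back requests when the return channel's ready is low, while one more entry of buffering preserves both. This is a sketch of the behavior, not the actual AXI handshake logic.

```python
# Abstract model of the AXI-lite bug described above: two back-to-back
# requests arrive while the return channel's ready is low.  With only
# one response slot the second request is dropped; with two slots (the
# extra register) both survive.
def run_slave(requests, ready_pattern, depth):
    buf, replies = [], []
    for req, rdy in zip(requests, ready_pattern):
        if req:
            if len(buf) < depth:
                buf.append(req)        # accept the request
            # else: silently dropped -- the bug
        if rdy and buf:
            replies.append(buf.pop(0))  # return channel accepts a reply
    return replies


reqs = ["A", "B", None, None, None]
ready = [False, False, True, True, True]
print(run_slave(reqs, ready, depth=1))  # ['A']      -- 'B' was dropped
print(run_slave(reqs, ready, depth=2))  # ['A', 'B'] -- extra buffer saves it
```

The real fix also has to manage the valid/ready handshakes on every channel, but this is the essence of why a master waiting on the reply to 'B' would hang forever.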
  10. D@n

    NEXYS 4 Programming Flash

    @bhall, No, this makes perfect sense. Xilinx, in their infinite wisdom, created the SPI port's clock pin to be used for configuring the device. It's controlled internally. When they then realized that customers would want to use it as well, they created a STARTUPE2 primitive that you need to use to get access to it. As such, it's often not listed in the port lists, but it is still usable. On several of the newer Digilent designs, Digilent has connected that pin to two separate ports. This allows you to control the pin like a normal I/O. However, doing this requires special adjustments at the board level--not the chip level. Dan
  11. At the invitation of @zygot, I thought I might share my own experiences using formal verification when building FPGA designs. This comes up in the context of debugging Xilinx's AXI-lite demonstration code, and from demonstrating that, with an interface property file, any AXI-lite core can be debugged with only about 20 or so lines of code. So how did I get here? I imagine that most FPGA users in this community probably start out their journey without formal verification. They either start out with VHDL/Verilog or with the schematic-entry form of design. I personally started out with Verilog only. I never learned to simulate any of my designs until a couple of years into my own journey. About two years after that, I learned about doing formal verification with yosys-smtbmc, and then with SymbiYosys. (SymbiYosys is a wrapper for several programs, including yosys-smtbmc, that has an easier to use interface than the underlying programs do.) The first design I applied formal verification to was a FIFO. By this time I was quite confident that I knew how to build FIFO's. (Imagine a puffed-out chest here ...) In other words, I was very surprised when the formal tools found a bug in my FIFO. (That puffed-out chest just got deflated ...) This was my first experience, which I would encourage you to read about here. Why didn't I find the bug with simulation? Because there was one situation/condition that I just never checked with simulation: what should happen when reading and writing an empty FIFO at the same time. Perhaps I'm too junior at writing test benches, or perhaps I'm just not creative enough to imagine the bugs I might find; either way, this just wasn't a situation I tested. Some time later, I was working with my ZipCPU and had to trace down a bug. You know the routine: late nights, bloodshot eyes, staring at GB of traces just to find that one time step with the error in it, right? You can read about the mystery bug here.
(Spoiler alert: I found the bug in a corner case of my I-cache implementation.) Wouldn't you rather use formal methods to find bugs earlier and faster? I certainly would! Anything to avoid needing to dig through GB of trace files looking for some obscure bug. You can read about the three formal properties you need for verifying memory designs (such as caches) here--had I verified my I-cache using these three properties, I would've found the bug much earlier. After that first experience with the FIFO, I started slowly working my way through all of my cores and applying formal methods to them. I discovered subtle errors when building even a simple count-down/interval timer. With every bug I found, I was motivated all the more to continue looking for more. I was haunted by the idea that someone might try out one or more of my cores, find that it doesn't work, and then write me off as a less than credible digital designer. (Remember, I was looking for work during this time and using my on-line digital designs and blog as an advertisement for my capabilities.) One ugly bug I found in my SDRAM controller would've caused the controller to read or write the wrong address in certain (rare) situations. None of my test benches had ever hit these situations, nor is it likely they would have. You'd have to hit the memory just right to get this error. In other words, had I not used formal methods, I would've found myself two years from now wondering why some ZipCPU program wasn't working, not realizing that the (proven via simulation and testbench) SDRAM controller still had a bug within it. What does design using formal methods look like? The design flow is a bit different. Instead of simulating your design through several independent tests in a row, and then looking through a long trace to find out whether the design worked, formal methods are applied to only a handful of steps of a design at a time--but across every possible input over those steps.
If your design fails, you will typically get a trace showing you how the design could be made to go into an unacceptable mode. You can then adjust your design and run again. In that sense, it is very much like simulation. However, unlike simulation, the trace generated by the formal tool stops immediately at the first sign of a bug, leading you directly to where the problem started. I said the trace tends to be short. How short? Often 20 cycles is plenty. For example, 12 steps are sufficient to verify the ZipCPU for all time, whereas 120 steps are necessary to verify a serial port receiver with an assumed (asynchronous) transmitter with a dissimilar clock rate. This is very different from running an instruction-checking program through the CPU, or playing a long speech through the serial port receiver. As you might imagine, like the other examples, when I finally got to the point where I was ready to verify the full CPU, I found many more bugs in my "working" CPU than I expected I would find. As a result of these experiences, I now apply formal verification during my design process--long before I enter any "verification" phase. Any time I need to re-visit something I've written before, I immediately add properties to the design so I can use formal methods. Any time I design something new, I now start with formal methods. I will also use formal before simulation. (It's easier to set up than test benches are anyway.) I use cover() statements to make sure my design can do what it is supposed to, and assert() statements to guarantee that it will never do what it isn't supposed to. While formal can find bugs in corner cases, and it can do so relatively quickly, it isn't the solution to every problem. The formal engines struggle with an exponential explosion of possibilities that need to be checked. As an example, both multiplies and cryptographic algorithms have been known to hit this combinatorial explosion limit relatively quickly.
(I can exhaustively verify a multiply with Verilator in 15 minutes that would take over 4 days with formal methods.) For this reason, you want to keep the design given to the formal engines relatively simple if possible. That doesn't mean that you can't formally verify large designs. As an example, I recently formally verified an FFT generator. I just did it one piece at a time. I still use simulation. Why? Because of the combinatorial explosion problem associated with formal methods, I tend to only verify components or clusters of components, but never my whole design. Using simulation, I can get the confidence I am missing that my entire design works. Besides, would you believe me if I told you my FFT worked without showing you charts of the inputs and outputs? Only after a design (component) passes both formal verification and simulation will I now place it on hardware. Why? Because it's easier to debug a design using formal than using simulation, and it's easier to debug a design using simulation than it is to debug it on the actual hardware. Eventually, SymbioticEDA approached me in January of last year, after I had been convinced to use formal verification for all my projects and after I had started writing about my experiences, and asked me to do contract work for them. If you aren't familiar with SymbioticEDA, they are the company that owns and maintains SymbiYosys. My first job for them was to create a training class to teach others how to use their tool. I was a natural fit for this job, since I was already convinced of the efficacy of the formal methods they had made available through SymbiYosys. I've had other jobs with them since, to include formally verifying a CPU that had been committed to silicon several times already. (Yes, there were bugs in it.) In other words, while this may sound like a paid advertisement, the experiences I've had and discussed above have been genuine.
If you've never tried SymbiYosys, then let me invite you to do so, if for no other reason than to determine whether or not my own experiences might play out in your designs. SymbiYosys is free and open source, and using it to verify a Verilog component will cost you nothing more than your time. Feel free to share your experiences with me if you would like. Likewise, you can find many other articles on these topics at zipcpu.com. Dan P.S. I'm slowly building a tutorial on how to design using Verilog, Verilator, and SymbiYosys. The course is intended to be (somewhat) hardware agnostic, so you should be able to use Yosys, Vivado, ISE, or even Quartus with it. Each lesson consists of a new concept and a design that will teach that concept. The lesson then goes over the steps of building the Verilog, simulating the design and co-simulating any peripherals using Verilator, as well as formally verifying the design using SymbiYosys. While the tutorial remains a work in progress, many have enjoyed it even in its incomplete state. P.P.S. In my "story" above, I didn't pay attention to chronology, so some of the events may appear to be out of order with reality.
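The FIFO corner case described above (reading and writing an empty FIFO at the same time) hints at what the formal tools buy you: exhaustive coverage of every short input sequence, not just the ones a testbench author thought of. The following toy Python checker brute-forces every short (write, read) command sequence against a small FIFO model and asserts invariants at each step. It is a caricature of bounded model checking, not SymbiYosys, and the invariants shown are illustrative.

```python
# Brute-force flavor of a bounded model check: enumerate every short
# sequence of (write, read) commands against a 2-deep FIFO model, and
# assert invariants at every step.  Enables happen simultaneously, as
# they would on one clock edge, so read+write-while-empty is covered.
from itertools import product

DEPTH = 2


def check(steps):
    for cmds in product([(0, 0), (1, 0), (0, 1), (1, 1)], repeat=steps):
        fifo, written, read = [], 0, 0
        for wr, rd in cmds:
            do_wr = wr and len(fifo) < DEPTH   # decided from pre-state,
            do_rd = rd and len(fifo) > 0       # i.e. the same clock edge
            if do_wr:
                fifo.append(written)
                written += 1
            if do_rd:
                assert fifo[0] == read, "data out of order"
                fifo.pop(0)
                read += 1
            assert 0 <= len(fifo) <= DEPTH, "fill count out of range"
    return True


print(check(5))   # True: every 5-step command sequence holds
```

An RTL FIFO would be checked the same way, except the solver explores the state space symbolically instead of enumerating it, which is why 20 steps is often enough.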
  12. D@n

    NEXYS 4 Programming Flash

    @bhall, You should thank @jpeyron for calling me out. I tend to ignore any threads with block diagrams in them--I just don't seem to be able to contribute to them that well. @jpeyron also cited the wrong reference to my article (Oops!). I think he meant to cite this article here on flash controller development. In general it's not really all that hard to do--you just need to spend some time working with the specification and your hardware, and a *really* *good* means of scoping out what's going on. The design built in this article assumes a DDR output. As such it can read a 32-bit word in (roughly) 72 system clocks. I have an older QSPI controller as well that I've used on many of my flash designs. It takes up twice as much logic. The link above should show you where and how to find it if you would like. This one doesn't use the DDR components. The other trick in what you are attempting to do will require you to read an ELF file. Check out libelf for that purpose. It's really easy to use, and should have no problems parsing your executable file--before it turns into an MCS file. Hope this helps, Dan
  13. D@n


    @Junior_jessy, Ok, I take that back then ... it sounds from your description like you might be ready to move forward. I've heard that cepstral processing works quite nicely for speech, although I've never tried it myself. So, your algorithm will have several parts ... I like how you've shown it above to work in several sections. Now imagine that each of those sections will be a module in your design. (An entity, for the VHDL types.) That module should take no more than one sample input per clock, and produce one sample output per clock. Your goal will be to do all the processing necessary one sample at a time. This applies to the cepstrum as well. As I recall, a cepstrum is created by doing an FFT, taking the log of the result (somehow), and then taking an IFFT. FFTs in FPGAs tend to be point-by-point processes: you put one sample in at a time and get one sample out. So expect to do a lot of stream processing. Alternatively, for audio frequencies, it might make sense to do some block processing ... but that's something you'll need to decide. Either way, it's likely to look to you like you are processing one sample at a time. You mentioned above that all you knew how to do were minimal counters, shift registers and such. Relax: you are in good hands. Most of the designs I just mentioned, to include the FFT, are built primarily out of shift registers, counters, and other simple logic. However, there are two other components you will need to work this task: you'll want to know how to do a (DSP-enabled) multiply, and how to use the block RAM within your FFT. The rule for both of these is that they should use a process all to themselves. One process to read from RAM, one process to write to RAM, and one process to do your multiply--don't do any other operations in any of those three types of processes. That's the tableau you have to work with. How you put it together--that's up to you and your skill as an engineer.
As for the FFT, you are welcome to use mine or you can use Xilinx's. That'll at least keep you from rebuilding that wheel. You might find it valuable to rearrange the Octave script you've highlighted above so that it works on one sample at a time as I'm describing. Think about it. Hopefully, though, that gets you going ... a bit. Keep me posted, Dan
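One sample at a time is how the FPGA will see it, but the math of the real cepstrum recalled above (FFT, then log magnitude, then inverse FFT) can be sketched in a few lines of software first. This Python sketch uses a naive O(n^2) DFT to stay dependency-free; on hardware each of these three boxes would become a streaming module.

```python
# Math sketch of the real cepstrum: DFT -> log magnitude -> inverse DFT.
# The naive O(n^2) DFT keeps this free of external dependencies; it is a
# model of the math, not of the hardware pipeline.
import cmath
import math


def dft(x, inverse=False):
    n, sign = len(x), (1 if inverse else -1)
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out


def real_cepstrum(x):
    spectrum = dft(x)
    log_mag = [math.log(abs(v) + 1e-12) for v in spectrum]  # avoid log(0)
    return [v.real for v in dft(log_mag, inverse=True)]


c = real_cepstrum([1.0, 0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0])
print(len(c))   # 8: one sample out for every sample in
```

Note that each stage consumes and produces exactly one value per input point, which is what makes the stream-processing structure described above a natural fit.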
  14. D@n


    @Junior_jessy, Well, let's start with those items that will be constant. Constant values should be declared as generics; constant vectors should be placed in some kind of RAM--either block RAM or SDRAM, depending upon the size. How you then interact with this data will be very different depending upon which you choose. I notice that you are comparing your data against a known pattern, but looking for the maximum difference in (signal - template). Have you considered the reality that the signal of interest might have a different amplitude? That the voice speaking it might be using a different pitch or a different cadence? Testing against ad-hoc recorded signals (not your templates) will help to illustrate the problems with your current method. Ask your significant other, for example, to say some of the words. Then you say them. Then see which matches. It looks like you might be doing a correlation above. (A correlation isn't recommended ... you aren't likely to be successful with it--see above.) If you find that you need to implement a correlation, your method above won't accomplish one very well. Your approach above is more appropriate for attempting to match a single vector, not for matching an ongoing stream of data. Correlations against a stream of data are often done via an FFT, a point-by-conjugate-point multiplication, followed by an inverse FFT. If you do one FFT of the input, and then three IFFT's depending upon which sequence you are correlating with, you can save yourself some FPGA resources. Be aware of circular convolution issues when using the FFT--you'll need to understand and deal with those. Once done, looking for the maximum in a stream of data is fairly simple. This change should still be made in Octave. All that said, your algorithm is not yet robust enough to work on real speech. It looks like you haven't tried it on ad-hoc recorded speech yet, instead trying it on your small recorded subset. Record an ad-hoc subset and see what happens.
I think you'll be surprised and a bit disappointed, for all the reasons I've discussed above. Have you seen my article on pipeline strategies? It would help you to be able to build some pipeline logic here when you are finally ready to move to VHDL. (You aren't ready for the move yet.) How about my example design that reads from an A/D, downsamples and filters the input, FFT's the result, and then plots the result on a simulated VGA screen? You might find it instructive as you start to think about how to go about this task. (You aren't ready for this yet either--but it might be a good design to review.) Realistically, though, the bottom line is that you really have some more work to do before moving to VHDL. In particular, you need to test your algorithm against ad-hoc sampled data separate from your training data. After that, and after you fix the bugs you'll discover by doing so, you'll have a better chance of success once you do move to VHDL. Dan
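The correlation-by-FFT recipe described above (FFT both streams, multiply point-by-conjugate-point, inverse FFT) can be prototyped in a few lines of software, and zero-padding to at least len(x)+len(y)-1 is exactly what sidesteps the circular-convolution wraparound warned about. A naive DFT again keeps this dependency-free; the signal and template values below are made up for illustration.

```python
# Sketch of correlation via the transform domain: DFT both sequences,
# multiply point-by-conjugate-point, inverse DFT.  Zero padding to at
# least len(x)+len(y)-1 avoids circular-convolution wraparound.
import cmath


def dft(x, inverse=False):
    n, sign = len(x), (1 if inverse else -1)
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out


def xcorr(x, y):
    n = 1
    while n < len(x) + len(y) - 1:   # pad to a power of two >= linear length
        n *= 2
    X = dft(list(x) + [0.0] * (n - len(x)))
    Y = dft(list(y) + [0.0] * (n - len(y)))
    r = dft([a * b.conjugate() for a, b in zip(X, Y)], inverse=True)
    return [v.real for v in r]


template = [1.0, 2.0, 1.0]
signal = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
r = xcorr(signal, template)
print(r.index(max(r)))   # 2: the template sits at offset 2 in the stream
```

The peak index is the lag where the template lines up with the stream, which is the "looking for the maximum" step mentioned above.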
  15. @zygot, Thank you! I'd given you up as a lost cause. We can continue this discussion in the forum thread we started it in. I've heard enough from others to suggest you are right in this; I just don't know enough that I could explain it convincingly, so I need to learn more about the issue. Let's start here, since you have just highlighted one of my own struggles when analyzing my own experience: I cannot separate how much of the problems I found and solved via formal were because I didn't know how to write a good test bench, and how many of them are a result of the formal tools just being a fundamentally better approach to digital design. I appreciate your insight here, and would love to hear your insight again after you've tried the tools a couple of times on your own designs--should you choose to do so. Your experiences might help me answer this question. Quick hint here: start simple, work up to complex--as with anything in life. It's a work in progress, but I'd be glad to let you know when the entire tutorial is complete if you would like. There is already a discussion on how to use SymbiYosys that starts in lesson 3 on FSMs. SymbiYosys tasks are covered in lesson 4, as is the $past() operator. My current work is focused on the block RAM chapter as well as the serial port receiver chapter--to include how to verify each. Well, not quite. Let me try to clarify. Yosys (not SymbiYosys yet) is a Verilog synthesizer. It can ingest Verilog and output one of several output types, including EDIF (for working with Xilinx tools), VQM files (for working with Altera), JSON (for working with the open source NextPNR tool), and Aiger and SMT2 (for working with formal solvers). That's one program. Next, there are several formal solvers available for download from the internet: abc, z3, yices, boolector, avy, suprove, etc. These are separate from Yosys and SymbiYosys, although SymbiYosys very much depends upon them.
These solvers accept a specially formatted property file as input--Aiger (avy, suprove) and SMT2 (z3, boolector, yices), for example. SymbiYosys is a fairly simple Python script that connects these two sets of tools together based upon a setup script. The official web site for SymbiYosys can be found here, although I'll admit I've blogged about using it quite a bit. (Try this article regarding asynchronous FIFOs for an example.) SymbiYosys itself can be found on GitHub, just like Yosys and the various solvers, together with some examples that can be used to test it. As for the comment about SymbiYosys implementing an extension of Verilog, please allow me to clarify again. The "extensions" you refer to are actually a subset of the assertion language found in the SystemVerilog standard. (No, the entire SV language is not supported in the open version; neither is the entire SVA subset supported.) Yosys supports the "immediate assertion" subset of the SVA language. In particular, it supports making assertions, assumptions, and cover statements within always/initial blocks, but not these statements on their own. Please do be skeptical. I've been very skeptical of it, but you've just read my conclusions from my own experiences above. I'd love to hear your thoughts! Also, feel free to send me a PM if you have any struggles getting started. As for your next statement, SymbiYosys doesn't understand the synthesis tool outputs from the various FPGA vendors. It understands (through Yosys) Verilog plus the immediate assertion subset of SVA (the SystemVerilog Assertion language). That is, you'll need to provide the properties together with your design in order to have SymbiYosys verify that your properties hold as your design progresses logically. Further, I normally separate my design logic from the formal properties within by `ifdef FORMAL and `endif lines.
This keeps things so that Vivado/Quartus/Yosys can still understand the rest of the logic, while SymbiYosys or rather Yosys can understand the stuff within the ifdef as well. One last item: Verilator understands many of these properties as well--they aren't unique to SymbiYosys. Hope this starts to make some more sense, Dan
  16. All, I'd like to continue an ongoing discussion that's now taken place across many forum threads, but I'd like to offer everyone a simple place to put it that ... isn't off topic. (Thank you, @JColvin, for inviting my ... rant.) For reference, the tools I use include:

    Verilator: for simulating anything from individual components (such as this UART) to entire designs (such as my Arty design, CMod S6 design, XuLA2-LX25 design, or even my basic ZipCPU design). (Read about my debugging philosophy here, or how you can use Verilator here.) Drawbacks: Verilator is Verilog and SystemVerilog only, and things that Verilate don't always synthesize using Vivado. Pros: compiling a project via Verilator, and finding synthesis errors, can be done in seconds, vice minutes with Vivado. Further, it's easy to integrate C++ hardware co-simulations into the result, to the extent that I can simulate entire designs (QSPI flash, VGA displays, OLEDrgb displays, simulated UARTs forwarded to TCP/IP ports, etc.) using Verilator, and (while it might be possible) I don't know how to do that with any other simulation tool. Time is money. Verilator is faster than Vivado.

    GTKWave: for viewing waveform (VCD) files.

    yosys: because 1) it's open source, and 2) it supports some (the iCE40 on my icoboard), though not all, of the hardware I own.

    wbscope (or its companion, wbscopc): for any internal debugging I need to do. (Requires a UART-to-wishbone bus converter, or some other way to communicate with a wishbone bus within your design.)

    Vivado: for synthesis, implementation, and any necessary JTAG programming.

    wbprogram: to program bit files onto FPGA's. I use this after Vivado has placed an initial load onto my FPGA's. I also use wbicapetwo to switch between FPGA designs contained on my flash.

    zipload: to load programs (ELF files), and sometimes bit files, onto FPGA's ... that have an initial load on them already.
While the program is designed to load ZipCPU ELF files, there are only two internal constants that restrict it to ZipCPU programs.

    ZipCPU: as an alternative to MicroBlaze (or even NiOS2, OpenRISC, picorv, etc.). (GCC, for compiling programs for the ZipCPU.)

    The only program above that requires a license to use is Vivado, although some of the above are released under the GPL. Further, while I am solidly pro-open source, I am not religiously open source. I believe the issue is open for discussion and debate. Likewise, while my work has been very much Verilog focused, I have no criticisms for anyone using VHDL. To start off the discussion, please allow me to share that I just spent yesterday and today looking for a problem in my own code, given one of Vivado's cryptic error messages. Vivado told me I had two problems: a timing loop, and a multiply-defined variable. The problem turned out to be a single problem; it's just that the wires/nets Vivado pointed me to weren't anywhere near where the error was. Indeed, I had resorted to "Voodoo hardware" (fix what isn't broken, just to see if anything changes) to see if I could find the bug. (Didn't find it; many hours wasted.) Googling sent me to Xilinx's forum. Xilinx's staff suggests that, in this case, you should find the wire on the schematic (the name it gave to the wire wasn't one I had given to any wires). My schematic, however, is ... complicated. Finding one wire out of thousands, or tens of thousands, when you don't know where to look can be frustrating, challenging, and ... not my first choice for finding the result. I then synthesized my design with yosys this morning and found the bug almost immediately. +1 for OpenSource. Time is money; I wish now I'd used yosys as soon as I knew I had a problem. Did I implement the design yosys synthesized? No. I returned to Vivado for the ultimate synthesis, implementation, and timing identification.
If you take some time to look through OpenCores, or any other OpenSource FPGA component repository for that matter, you will quickly learn that the quality of an OpenSource component varies from one component to another. Even among my own designs, not all of them are well documented. Again, your quality may vary. +1 for proprietary toolchains ... when they are well documented, and when they work as documented. There's also been more than one time where I've had a bug in my code, often because I've misunderstood the interface to the library component it is interacting with, and so I've needed to trace my logic through the library component to understand what's going on. This is not possible when using proprietary components--whether they be software libraries or hardware cores--because the vendor veils the component in an effort to increase his profit margin. Indeed, a great number of requests for help on this web site involve questions about how to make something work with a proprietary component (ex. MicroBlaze, or its libraries) that the user has no insight into. +1 for OpenSource components, in spite of their uncertain quality, for the ability you get to find problems when using them. Another digital designer explained his view of proprietary CPUs this way: "Closed source soft CPUs are the worst of two worlds. You have to worry about resource use and timing without being able to analyze it" (Olof's twitter feed). In other words, when you find a bug in a proprietary component, you are stuck. You can't fix the bug. You can request support, but getting support may take a long time (often days to weeks), and it might take you just as long to switch to another vendor's component or work around the bug. +1 for OpenSource that allows you to fix things; -1/2 for OpenSource because fixing a *large* design may be ... more work than it's worth. Incidentally, this is also a problem with Xilinx's Memory Interface Generator (MIG) solutions.
When I added a MIG component to my OpenArty design, I suddenly got lots of synthesis warnings, and it was impossible for me to tell which were (or were not) valid. +1 for OpenSource components, whose designs allow you to inspect why you are getting synthesis warnings. I could rant some more, but I'd like to hear the thoughts others of you might have. For example, @Notarobot commented at the end of this post that "using design tools introduces additional unnecessary risk." I'd like to invite him to clarify here, as well as inviting anyone else to participate in the discussion. Dan
  17. @zygot, This is very fascinating; thank you for sharing! I had been wondering how others had managed to simplify the ARM+FPGA design methodology. I used a different approach on a Cyclone-V. I created a bus bridge that I connected to the CPU. Since it was a Cyclone-V, I could bridge from an Avalon bus to a WB bus, with library code connecting the ARM's AXI interface to the Avalon interface. I had two problems with this approach. First, any bug in my Avalon->WB bridge would halt the ARM hard--requiring a power cycle to correct. This left me with no effective means of debugging the problem in hardware. ("No effective means" is a reflection of the fact that I never tried to use the vendor's JTAG scope interface ...) I wouldn't have found the problem if I had not managed to create a simulation of my design--not cycle accurate, mind you, but enough to find the bug. Second, the throughput was horrible. (I was using the "low-speed" bus interface port.) Because of the difficulties I had, I wouldn't recommend such chips to others. Perhaps your approach gets around those difficulties? Separately, I'd be curious to know if you knew about the bugs in Xilinx's demo AXI-lite peripheral code. It seems as though the interface is so complicated that even Xilinx struggled to get it right. Score another point for using formal verification. I'm still hoping to return to this problem in order to create a full AXI-to-WB bridge (burst support, ID support, etc.). So far, however, the interface protocol has been so complex that I have yet to be successful at it. Thank you again for sharing, Dan
  18. D@n


    @Junior_jessy, Please allow me to add a word or two to @zygot's advice. In particular, you've done your testing (so far) on recorded speech. Moving from recorded to live speech is a testing challenge (nightmare?). You'll want to be able not only to process the live speech, but also (just after the fact) to grab recordings of any live speech that didn't process as you wanted it to, so that you can place these recordings into your MATLAB framework and see what went right (or wrong) with them. It is possible to use your computer's microphone audio for this. You'll learn a lot from host audio, just not enough. Since @zygot mentioned them, I'll admit to having worked with the MATLAB Simulink-to-HDL tools before. I would not recommend them to anyone who is considering them. They are great for graphically designing code, horrible when trying to do a diff to see if anything changed, and horrible when digging for the details within a design. Likewise, I had bad experiences trying to find all the top-level ports and internal comments. Finally, once I stopped paying for the license (and swapped laptops), the code was lost to me. Not something I'd recommend. One other point: @zygot said "Simulate, simulate and simulate." I agree completely. Were I a professor, I would even stomp my foot at this point: it will be on the final exam. Simulate the A/D, simulate your algorithm, simulate both together! Create a simulation application, using your favorite HDL utility, that will allow you to input any audio file and test it. Examine what happens if the audio is earlier or later, stronger or weaker, faster or slower, etc. Examine two audio files in succession in this manner as well! Don't stop at one. (I've found a lot of bugs in the dead period between audio files, where the algorithm is recycling.) In addition, I'd also suggest formally verifying whatever you can. (The A/D is a known candidate for formal verification ...)
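To make the "simulate with varied inputs" advice concrete, here is a rough Python sketch of a test-vector generator. The function names and numbers are mine and purely illustrative; a real harness would feed these vectors into an HDL testbench rather than print their lengths.

```python
# Sketch of a test-vector generator for the "simulate, simulate,
# simulate" approach: take one recorded utterance and emit delayed,
# scaled, and re-timed variants to feed an HDL testbench.
# (All names and numbers here are illustrative, not from any real design.)

def delayed(samples, n):
    """Utterance arrives n samples later (n zeros prepended)."""
    return [0.0] * n + list(samples)

def scaled(samples, gain):
    """Stronger or weaker speaker."""
    return [s * gain for s in samples]

def retimed(samples, rate):
    """Crude nearest-neighbor resample: rate > 1 plays back faster."""
    n = int(len(samples) / rate)
    return [samples[int(i * rate)] for i in range(n)]

utterance = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
vectors = [
    delayed(utterance, 16),
    scaled(utterance, 0.25),
    retimed(utterance, 2.0),
    # back-to-back utterances: catches bugs in the dead period
    # between audio files, where the algorithm is recycling
    utterance + [0.0] * 8 + utterance,
]
for v in vectors:
    print(len(v))
```

The back-to-back case is the one that matters most: it is the only vector that exercises the recycling logic between utterances.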
I've been caught multiple times by bugs I'd never have found in simulation that I then found with formal verification, so I highly recommend it to you. ( @zygot, if you want to dig into this further, I'd be glad to, but let's start another topic for it if you are so inclined.) The most recent example? A UART transmitter that couldn't complete a transmission in 10*baud clocks, but rather required 10*baud clocks + 1. Just my two cents, Dan
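That UART off-by-one is worth a toy model. The Python sketch below (structure and names are mine, not from any particular core) counts the clocks an 8N1 frame takes when a baud counter wastes one dead cycle before re-asserting ready:

```python
# Sketch: why a naive baud counter yields 10*CLOCKS_PER_BAUD + 1 clocks
# per 8N1 frame instead of exactly 10*CLOCKS_PER_BAUD.
# (Illustrative only; names and structure are mine, not from any
# particular UART core.)

CLOCKS_PER_BAUD = 4      # small value so the trace is easy to follow
BITS_PER_FRAME = 10      # start + 8 data + stop

def frame_clocks(extra_idle_cycle: bool) -> int:
    """Count clocks from 'load' until the transmitter is ready again."""
    clocks = 0
    for _bit in range(BITS_PER_FRAME):
        counter = CLOCKS_PER_BAUD - 1
        while counter > 0:
            counter -= 1
            clocks += 1
        clocks += 1          # the cycle on which the counter hits zero
    if extra_idle_cycle:
        clocks += 1          # buggy core: one dead cycle before 'ready'
    return clocks

print(frame_clocks(False))   # 40  == 10 * CLOCKS_PER_BAUD
print(frame_clocks(True))    # 41  == 10 * CLOCKS_PER_BAUD + 1
```

A formal property such as "a frame completes in exactly 10*CLOCKS_PER_BAUD cycles" catches the extra cycle immediately, whereas a simulation that only checks received characters never will.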
  18. @sittinhawk, Waayy too locked down? Buggy too. It's not a total loss. I'm still using Verilog. The Borg has not gotten to me yet. As you suggested above, I now have a large library of Verilog code that I bring to any new project. This includes a CPU in a couple of implementations, several peripherals, and even an interconnect builder. I've also rejected AXI. The protocol is way too complex for its own good. I've been using Wishbone (WB) instead. For half the logic of even AXI-lite, you can get about twice the speed. (Well, that's not quite true ... I do have an AXI-lite peripheral implementation that recovers the speed loss ...) Using a WB-to-AXI bridge still gives me access to the Xilinx MIG. There is an open source DRAM controller generator out there, but I haven't tried it yet. My first ARM+FPGA project used an Avalon bus. That bus is at least easy to bridge to WB, and even to formally verify the bridge. Since that time, I've written a formally verified AXI-lite-to-WB bridge. (One that worked the first time out, too!) I could use that if I ever needed to interface with an on-board ARM processor. However, as you've noticed, there are a lot of individuals writing into the forums (both this one and Xilinx's) who are clueless about how to use the canned IP, and struggling to figure out how to integrate their own IP into it. Worse, they can't figure out how to debug it when it doesn't work--which is to be expected with any closed source solution. Once you got off the ground, Verilog never had these problems. Dan
  20. D@n

    GPS Pmod

    @HelplessGuy, The fundamental discipline of all engineering is to be able to take a problem, such as the GPS not working for you, and to break it down into pieces in order to figure out which piece is broken. This was my advice to you above. I'm a little confused by your response saying that you don't need to do this. Does that mean that things are finally working for you? Dan
  21. @Josef, I'm not sure I know enough to say the problem only exists in the case where a user places the flash in a specific configuration. I do know that I never loaded the flash using the GUI. I always loaded my design into the flash myself, so I can't really comment on the GUI approach and how well it works. I know that more than one person has struggled to load their design from the GUI, and I've always suspected that this was part of the cause. I do know enough to say that you want your flash device in a QSPI/XIP configuration for fastest access. I also know that, if you are concerned, it's not that hard to read the ID off of the flash to see who the manufacturer is, what the size of the chip is, and which chip from the manufacturer it represents. Dan
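For reference, reading the ID Dan mentions is the standard JEDEC READ ID (0x9F) transaction; the first three returned bytes identify manufacturer, device type, and capacity. A hedged Python sketch of the decode follows: the manufacturer codes are standard JEDEC assignments, but the power-of-two capacity rule, while common, isn't universal, so check the datasheet for your part.

```python
# Sketch: decode the three bytes a SPI NOR flash returns to the JEDEC
# READ ID (0x9F) command.  The manufacturer codes below are standard
# JEDEC assignments; the capacity rule (2**byte bytes) holds for most,
# though not all, SPI NOR parts -- consult the datasheet.

MANUFACTURERS = {
    0x20: "Micron/ST",
    0x01: "Spansion/Infineon",
    0xEF: "Winbond",
}

def decode_jedec_id(b0, b1, b2):
    """Return (manufacturer, device type byte, capacity in bytes)."""
    mfr = MANUFACTURERS.get(b0, "unknown (0x%02x)" % b0)
    size_bytes = 1 << b2            # e.g. 0x18 -> 2**24 = 16 MiB
    return mfr, b1, size_bytes

mfr, devtype, size = decode_jedec_id(0x20, 0xBA, 0x18)
print(mfr, hex(devtype), size // (1024 * 1024), "MiB")
```

Being able to dump and decode these three bytes is often the fastest way to confirm which flash chip a board revision actually carries.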
  22. @Josef, At the time, I was trying to build a high speed flash controller. As part of any design work I do, I start by downloading the specifications for all the parts on the board I'll be working with. Then, when I'm ready to work with a given part, I start reading the specification for it. I would recommend this to you as well. In this case, I found the "problem" by reading the specification for the Micron flash chip. In Micron's zeal for creating a faster/better/cheaper chip, they created a chip that could start in QSPI/XIP mode. This is in many ways a selling point, since a chip that can start in that mode is going to run faster than one that starts in SPI mode and then needs to transition to QSPI. The problem, though, is trying to figure out which mode the chip started in if you have no idea what the configuration register is set to. In the middle of this, there arose a question as to whether or not the FPGA could properly place the chip into the necessary SPI or QSPI mode, from whatever state the user had left it in, in order to configure the chip initially. I'm still not certain of the answer to that question. In my current work, I'm resolving this problem by sending a carefully chosen command to the flash that, if the flash is in SPI mode, will have no meaning, but that, if the flash is in QSPI mode, will return the flash to SPI mode. Once the flash has been returned to a known state, I can then place it into the state that I want it to be in--usually the QSPI/XIP mode I wanted to work with in the first place. This approach has another benefit as well. When you load a design onto an FPGA via JTAG, the flash chip doesn't change modes like it will if you load your design from the flash. In other words, the flash might already be in its QSPI/XIP mode when my design starts up, separate from whatever reset mode the flash might be in.
(This issue has caught me by surprise more than once, and not always with the flash, where some piece of hardware is already initialized upon design startup--something worth watching out for.) By first taking the flash out of whatever mode it is in initially and placing it into a known mode, the design can reliably start when loaded from JTAG as well as when loaded from flash. Dan
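The recovery sequence described above can be sketched as a toy model. Note that the opcode below is a deliberate placeholder (the real byte is part-specific, and the post deliberately doesn't name it); the point is only the state-machine logic of driving an unknown-mode flash to a known mode.

```python
# Sketch of the "return to a known state" idea: model a flash whose
# startup mode is unknown, and a recovery byte that is meaningless in
# SPI mode but drops a QSPI part back to SPI.
# EXIT_QSPI is a HYPOTHETICAL placeholder opcode, for illustration
# only -- the actual byte is part-specific; check your datasheet.

EXIT_QSPI = 0xF5   # placeholder, NOT a real opcode for any given part

class FlashModel:
    def __init__(self, starts_in_qspi):
        self.mode = "QSPI" if starts_in_qspi else "SPI"

    def send(self, opcode):
        if self.mode == "QSPI" and opcode == EXIT_QSPI:
            self.mode = "SPI"      # recognized: fall back to SPI
        # in SPI mode the same byte decodes to no meaningful command

def bring_to_known_state(flash):
    flash.send(EXIT_QSPI)          # harmless if already in SPI
    assert flash.mode == "SPI"     # now safe to configure as desired

for start in (True, False):
    f = FlashModel(start)
    bring_to_known_state(f)
    print(f.mode)
```

Because the same sequence is a no-op from one mode and a mode change from the other, the controller needn't know (or care) whether it was loaded via JTAG or from the flash itself.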
  23. @Josef, One of my personal goals is to try to build a single/universal flash driver that can work with both boards. This includes a startup script that will pull the Micron out of whatever reset configuration/state it is in. I think I've got it, but the Spansion flash will need at least two configuration changes: 1) the number of "dummy" cycles needs to be changed. These are the number of cycles between the read address and the first clock with read data in it. This change needs to be made both to the open source flash simulator I have, as well as to the RTL code and the software driver I have. All are possible, just annoying to do. 2) The startup script within the driver itself (probably) needs to change. Once done, my OpenArty design should work with the new hardware as well. At least, the different flash is the only change I know of. Since I don't have one of the newer Arty boards, it will be hard for me to know for certain. Dan
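For a sense of what the dummy-cycle change amounts to, here is a back-of-the-envelope Python sketch. The clock counts assume a quad-I/O read with an 8-clock opcode phase and the address sent four bits per clock; these are illustrative assumptions of mine, not either chip's actual figures.

```python
# Sketch: where "dummy cycles" sit in a QSPI read.  Clock counts are
# illustrative assumptions (8-clock opcode phase, address sent four
# bits per clock on the quad lines), not any specific part's numbers.

def first_data_clock(addr_bits=24, dummy_cycles=8):
    """Clock on which the first read-data nibble appears."""
    command = 8                    # opcode, one bit per clock
    address = addr_bits // 4       # quad lines: 4 address bits/clock
    return command + address + dummy_cycles

print(first_data_clock(dummy_cycles=8))    # 22
print(first_data_clock(dummy_cycles=6))    # 20
```

Changing the dummy count therefore shifts every downstream sample point, which is why the simulator, the RTL, and the software driver all have to agree on the new value.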
  24. I know of at least one difference. Dan
  25. D@n

    GPS Pmod

    @HelplessGuy, This point is actually headed in the wrong direction, taking you away from finding whatever bug you are struggling with. Your first question should be whether or not you are receiving valid GPS data, and only secondly whether or not that data is processed properly. GPS data is provided as a stream of text from the PMod GPS's UART port. It's pseudo-human-legible. At this point, you need to first demonstrate that you are able to receive the stream of text. Only after that does it make sense to talk about whether the text can be decoded properly. This sends us back to the issue with the character pointers that we were discussing above. Have you tried a simple design that just connects the serial output port to a serial input port of your Zedboard? (I assume it has one in the MIO somewhere, although I haven't used the Zedboard enough to know.) You would then need to process, by hand, enough of the NMEA messages to know that a signal exists, that it has been properly received, and that the GPS receiver has actually gotten a fix. IMHO, this is the step you are missing. Dan
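Processing the NMEA messages "by hand" is quite tractable, since the sentence format is public. A small Python sketch that verifies a sentence checksum (the XOR of the characters between '$' and '*') and reads the fix-quality field of a GGA sentence:

```python
# Sketch: validate an NMEA 0183 sentence checksum and read the GGA fix
# quality field (0 = no fix, 1 = GPS fix, 2 = DGPS fix).  The example
# sentence is a typical GGA string of the kind the PMod GPS emits.

def nmea_checksum_ok(sentence):
    """XOR every character between '$' and '*', compare to the hex tail."""
    body, _, claimed = sentence.strip().lstrip("$").partition("*")
    cksum = 0
    for ch in body:
        cksum ^= ord(ch)
    return cksum == int(claimed, 16)

def gga_fix_quality(sentence):
    """Field 6 of a GGA sentence is the fix-quality indicator."""
    fields = sentence.split(",")
    return int(fields[6])

msg = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
print(nmea_checksum_ok(msg), gga_fix_quality(msg))
```

Checking a handful of received sentences this way answers both questions above at once: a passing checksum shows the UART path is clean, and a nonzero fix quality shows the receiver actually has a fix.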