D@n last won the day on September 26 2018

D@n had the most liked content!

About D@n

  • Rank
    Prolific Poster

Profile Information

  • Gender
    Not Telling
  • Interests
    Building a resource efficient CPU, the ZipCPU!


  1. D@n

    ethernet communication with pc

    @PhDev, Have you tried any of those cores? If so, would you recommend them as workable? Dan
  2. D@n

    Display image using VGA from block RAM

    @khaledismail, Sounds like you've got yourself stuck in FPGA Hell. Looking over your code, a quick first glance shows that you are using a logically generated clock. This is, in general, a very bad idea--one that can lead to hardware/simulation mismatch. A better approach would be to use a clock-enable line.

    A common reason for ending up in FPGA Hell is not simulating your IP. The difficulty with the position you are in is that not many simulators will co-simulate the VGA display your design connects to. I know I had similar problems when using my Basys3 board and getting video to work faultlessly. (I was reading from flash and decompressing the video stream, since the Basys3 flash wasn't fast enough to keep up.) In the end, I needed to write a VGA simulator so that I could "see" what was going on within all the traces within my design in order to find the bugs.

    You can find that VGA simulator on-line here, or even read about it here. The repo that contains it even includes an example project that reads from block RAM and outputs the result onto a VGA output for the simulator. The sad part of this design is that it is in Verilog, and uses Verilator--a Verilog-only simulation utility. However, I know there exist VHDL simulators that support a VPI interface--you might be able (with a bit of work) to get your design to work within that environment. That might help. (Alternatively, you might choose to re-implement your design in a real language.)

    Hope this helps, Dan

    P.S. For those others reading, you may wish to know that @khaledismail has also posted on Xilinx's forums. If you don't see a solution here, you might (eventually) find one there.
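To illustrate the clock-enable suggestion above, here is a small Python model of the idea (the real design would of course be Verilog or VHDL, and the name `DIVIDE` is made up for the sketch): rather than deriving a new, slower clock from logic, everything runs on the one system clock and slow logic only acts on a one-cycle enable strobe.

```python
# Model of a clock-enable strobe: instead of logically generating a new
# (divided) clock, run all logic at the system clock rate and assert a
# one-cycle enable every DIVIDE cycles.  Illustrative sketch only.

DIVIDE = 4  # hypothetical ratio between system clock and pixel rate

def clock_enable(n_cycles, divide=DIVIDE):
    """Yield (cycle, ce) pairs; ce is high one cycle out of every `divide`."""
    counter = 0
    for cycle in range(n_cycles):
        ce = (counter == 0)            # single-cycle strobe
        counter = (counter + 1) % divide
        yield cycle, ce

# Pixel logic is gated on `ce`, but still sees only the one real clock,
# so simulation and hardware agree on the timing.
pixels = [cycle for cycle, ce in clock_enable(16) if ce]
```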
  3. D@n

    ethernet communication with pc

    @Aamirnagra, You can find my own RMII interface work here. There's also a simple-ping program to drive it here.

    Don't forget you'll want an MDIO controller as well! That one is nearly as trivial as a SPI core--it's pretty easy to write. You can find my own MDIO controller here. I also have a software decoder for the results of that here. Being able to quickly read a status from the MDIO interface can be quite valuable.

    Don't let @zygot discourage you. Yes, there's a lot to learn, but Wikipedia does a decent job describing most of what you'll need for the various protocols, and the Ethernet data sheet for your device should describe the RMII interaction well enough. Yeah, there are a couple of gotchas, and you'll find some surprising things along the way. For example, I expected the nibble order to be the other way around. In most cases, you can just debug any problems you run into with Wireshark.

    At this point, though, I'll agree with @zygot: network interface work is not trivial. It's a lot of fun though! Dan
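To give a feel for why the MDIO controller is "nearly as trivial as a SPI core", here is a Python sketch of a Clause-22 MDIO write frame (frame layout per IEEE 802.3 Clause 22; the helper function itself is invented for this example). The whole transaction is just 64 bits clocked out on MDC: preamble, start, opcode, two 5-bit addresses, turnaround, then the data, all MSB first.

```python
# Sketch of an IEEE 802.3 Clause-22 MDIO *write* frame, one bit per MDC cycle.

def mdio_write_frame(phy_addr, reg_addr, data):
    bits = [1] * 32                                           # preamble: 32 ones
    bits += [0, 1]                                            # start of frame
    bits += [0, 1]                                            # opcode: write (read would be 1,0)
    bits += [(phy_addr >> i) & 1 for i in range(4, -1, -1)]   # 5-bit PHY address
    bits += [(reg_addr >> i) & 1 for i in range(4, -1, -1)]   # 5-bit register address
    bits += [1, 0]                                            # turnaround
    bits += [(data >> i) & 1 for i in range(15, -1, -1)]      # 16 data bits
    return bits

# e.g. write 0x8000 (soft reset) to register 0 of the PHY at address 1
frame = mdio_write_frame(phy_addr=1, reg_addr=0, data=0x8000)
```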
  4. @Archana Narayanan, Try working this through from one end to the other. Do you know that your PC->FPGA link works? I usually debug this link by first making sure the FPGA->PC transmit link works. That will verify that your serial port is using the right pin for transmit. Once I know FPGA->PC works, I'll typically compose a PC->FPGA->PC design. The first time I do this, I place no logic between RX and TX. This verifies that I have the right receive pin. (Be aware, the labels on a lot of Digilent's schematics are misleading!) The second time I do this, I use a serial decoder followed by a serial encoder. This verifies that your serial receiver works. Once you know both work, then let's go back to simulation: can you simulate your entire design end to end before placing it onto your board at all? Dan
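The loopback step above can be modeled before ever touching hardware. A Python sketch (names invented; the real test runs on the board) of an 8N1 serial encoder, a decoder, and the "no logic between RX and TX" passthrough:

```python
# Model of the PC -> FPGA -> PC bring-up step: 8N1 serial framing with the
# receive line wired straight back to transmit.

def uart_encode(byte):
    # start bit (0), 8 data bits LSB first, stop bit (1)
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def uart_decode(bits):
    assert bits[0] == 0 and bits[9] == 1, "framing error"
    return sum(bits[1 + i] << i for i in range(8))

def loopback(byte):
    return uart_decode(uart_encode(byte))   # RX wired straight to TX

# 'U' (0x55) is the classic test byte: its LSB-first bits alternate 1,0,1,0,...
echoed = [loopback(b) for b in b"UUU"]
```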
  5. D@n

    Keyboard interfacing

    @gummadi Teja, Have you tried your design with a video simulator? A complete trace of everything going on within the design could be very valuable for determining what's going wrong. Dan
  6. D@n

    Digilent Github Demo

    @AlGee, If you are at all interested in Verilog design, then this tutorial is where I would recommend starting. You might even find other topics of interest on the associated blog as well, depending on what you are interested in doing. (Yes, that is a shameless plug.) If you want an example design, here's one I've put together for the Arty. The documentation describes getting the memory up and running, should you wish to interact with it from Verilog. (The flash controller still has issues since Digilent swapped flash chips though ...) Welcome to a fun journey! Dan
  7. D@n

    Basys-3 USB storage compatibility

    @Jonathan.O, Electrically possible? Probably. Practically possible? It would be a long shot.

    The Basys3 microcontroller that interacts with the FPGA contains source code that Digilent has not released. To my knowledge, they have no plans to release this source code, so you might consider this an "unsupported feature". Even if you could do it, Digilent isn't likely to answer any questions on this method of doing business. The road to doing this would involve using the JTAG port (again, not publicly described) to reprogram the device. Doable? Perhaps. Easy? Not at all. (It'd be hard with the documentation you'd need, even harder without.)

    My recommendation would be to use a Pmod SD (SD-card adapter) instead. That should be easier to integrate into what you want to do, while also accomplishing the same purpose. Dan
  8. D@n

    How to generate another, faster clock (CMOD S7) ?

    @TestDeveloper, I instantiate my PLLs and MMCMs like that all the time. Vivado has been fairly robust in how it handles this, so in spite of the pitfalls it has always worked for me. One thing you might consider doing is to create a counter that counts down from (CLOCK_RATE_HZ/2) and then toggles an LED. You might find that a useful way to know that you got the clocking right. Dan
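The count-down-and-toggle check above, modeled in Python (the real thing is a few lines of Verilog; the tiny clock rate here is made up so the model stays short): load CLOCK_RATE_HZ/2 - 1, count toward zero each clock, toggle the LED on reload. With a real 12 MHz clock this blinks the LED at 1 Hz.

```python
# Model of a count-down timer that toggles an LED every half second.

CLOCK_RATE_HZ = 8   # hypothetical stand-in for e.g. 12_000_000

def blink(n_cycles, clock_rate_hz=CLOCK_RATE_HZ):
    half = clock_rate_hz // 2
    led, counter = 0, half - 1
    toggles = []                       # cycles on which the LED flips
    for cycle in range(n_cycles):
        if counter == 0:
            led ^= 1                   # toggle on reload: full period = 1 second
            counter = half - 1
            toggles.append(cycle)
        else:
            counter -= 1
    return toggles

toggles = blink(20)                    # LED flips once per half period (4 cycles)
```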
  9. All, If you've followed the Vivado tutorial to build an AXI-lite peripheral, you'll know to ask Vivado to generate an IP core for you, which you can then edit to your taste. What's not so commonly known is that this core has bugs in it: it does not comply with the AXI standard. Specifically, if the Vivado demonstration core receives two requests in a row while the return channel's ready line is low, one of those two requests will get dropped. This applies to both read and write channels. The failure is so severe that it may cause a processor to hang while waiting for a response. Worse, since the fault lies within vendor-provided code, most users won't see any need to examine it, instead choosing to believe that their own code must somehow be at fault.

    The article demonstrates the bugs in the 2016.3 AXI-lite demonstration core. Since Vivado 2016.3, Xilinx has updated their AXI-lite demonstration to add another register to its logic--presumably to fix this issue. As of version 2018.3, even this updated logic continues to fail verification.

    Should you wish to repeat this analysis, the same article discusses how it was done. Only about 20 lines of logic need to be added to any Verilog AXI-lite core, plus the lines necessary to instantiate a submodule containing a property file. That's all it takes to verify that any AXI-lite core properly follows the rules of the road of the AXI-lite bus, using SymbiYosys--a formal verification tool. The steps necessary to correct this logic flaw are also discussed.

    Since writing that article, I have posted another basic AXI-lite design which doesn't have these flaws. Moreover, the updated design can process bus transactions with a higher throughput than the original design ever could. While I'm not sure quite how fast MicroBlaze or even the AXI interconnect can issue bus requests, this design at least shows how you could build a slave peripheral that can handle two requests at once.

    Feel free to try it out and let me know if you find any flaws within it. Dan
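The backpressure failure described above can be sketched abstractly in Python (all names here are invented; this is a toy of the hazard, not the actual Xilinx code): a slave must be able to hold on to requests while the response channel's ready line is low, or back-to-back requests get dropped. Buffering two outstanding requests, as a skid buffer does, delivers both.

```python
# Toy model of an AXI-lite-style slave under backpressure: requests arrive
# one per cycle, responses leave only when the downstream `ready` is high.
# With room for two outstanding requests (one in flight plus one "skid"),
# nothing is dropped even when ready is low as requests arrive.

def run_slave(requests, ready_pattern):
    pending, delivered = [], []
    req = iter(requests)
    for ready in ready_pattern:
        if len(pending) < 2:              # only accept when there is room
            nxt = next(req, None)
            if nxt is not None:
                pending.append(nxt)
        if ready and pending:             # response channel accepts this cycle
            delivered.append(pending.pop(0))
    return delivered

# Two requests arrive back to back while ready is low; both must come out,
# in order -- dropping either one would hang a waiting processor.
out = run_slave(["rd0", "rd1"], ready_pattern=[0, 0, 1, 1, 1])
```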
  10. D@n

    NEXYS 4 Programming Flash

    @bhall, No, this makes perfect sense. Xilinx, in their infinite wisdom, created the SPI port's clock pin to be used for configuring the device. It's controlled internally. When they then realized that customers would want to use it as well, they created a STARTUPE2 primitive that you need to instantiate to get access to it. As such, it's often not listed in the port lists, but it is still usable. On several of the newer Digilent designs, Digilent has connected that pin to two separate ports. This allows you to control the pin like a normal I/O. However, doing this requires special adjustments at the board level--not the chip level. Dan
  11. D@n

    NEXYS 4 Programming Flash

    @bhall, You should thank @jpeyron for calling me out. I tend to ignore any threads with block diagrams in them--I just don't seem to be able to contribute to them that well. @jpeyron also cited the wrong reference to my article (oops!). I think he meant to cite this article here on flash controller development.

    In general it's not really all that hard to do--you just need to spend some time working with the specification and your hardware, and a *really* *good* means of scoping out what's going on. The design built in this article assumes a DDR output. As such, it can read a 32-bit word in (roughly) 72 system clocks. I have an older QSPI controller as well that I've used on many of my flash designs. It takes up twice as much logic and doesn't use the DDR components. The link above should show you where and how to find it if you would like.

    The other trick in what you are attempting to do is that you will need to read an ELF file. Check out libelf for that purpose. It's really easy to use, and should have no problem parsing your executable file--before it turns into an MCS file.

    Hope this helps, Dan
  12. D@n


    @Junior_jessy, Ok, I take that back then ... it sounds from your description like you might be ready to move forward. I've heard that cepstral processing works quite nicely for speech, although I've never tried it myself.

    So, your algorithm will have several parts ... I like how you've shown it above, working in several sections. Now imagine that each of those sections will be a module in your design. (An entity, for VHDL types.) That module should take no more than one sample input per clock, and produce one sample output per clock. Your goal will be to do all the processing necessary one sample at a time.

    This applies to the cepstrum as well. As I recall, a cepstrum is created by doing an FFT, taking the log of the result (somehow), and then taking an IFFT. FFTs in FPGAs tend to be point-by-point processes: you put one sample in at a time and get one sample out. So expect to do a lot of stream processing. Alternatively, for audio frequencies, it might make sense to do some block processing ... but that's something you'll need to decide. Either way, it's likely to look to you like you are processing one sample at a time.

    You mentioned above that all you knew how to do were minimal counters, shift registers and such. Relax: you are in good hands. Most of the designs I just mentioned, to include the FFT, are built primarily out of shift registers, counters, and other simple logic. However, there are two other components you will need for this task: you'll want to know how to do a (DSP-enabled) multiply, and how to use the block RAM within your FFT. The rule for both of these is that they should each get a process all to themselves: one process to read from RAM, one process to write to RAM, and one process to do your multiply--and don't do any other operations in any of those three types of processes. That's the tableau you have to work with. How you put it together--that's up to you and your skill as an engineer.

    As for the FFT, you are welcome to use mine or you can use Xilinx's. That'll at least keep you from rebuilding that wheel. You might find it valuable to rearrange the Octave script you've highlighted above so that it works on one sample at a time, as I'm describing. Think about it. Hopefully, though, that gets you going ... a bit. Keep me posted, Dan
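The FFT -> log -> IFFT chain described above can be sketched in a few lines of NumPy (block processing, as in the Octave script; in the FPGA the same chain becomes three streaming stages fed one sample per clock):

```python
# Real cepstrum of one frame of audio: FFT, log of the magnitude, inverse FFT.

import numpy as np

def real_cepstrum(frame, eps=1e-12):
    spectrum = np.fft.fft(frame)
    log_mag  = np.log(np.abs(spectrum) + eps)   # "taking the log of the result"
    return np.fft.ifft(log_mag).real            # then taking an IFFT

# Toy 64-sample frame: a single tone.
frame = np.cos(2 * np.pi * 5 * np.arange(64) / 64)
ceps  = real_cepstrum(frame)
```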
  13. D@n


    @Junior_jessy, Well, let's start with those items that will be constant. Constant values should be declared as generics; constant vectors should be placed in some kind of RAM--either block RAM or SDRAM, depending upon the size. How you then interact with this data will be very different depending upon which you choose.

    I notice that you are comparing your data against a known pattern, but looking for the maximum difference in (signal - template). Have you considered the reality that the signal of interest might have a different amplitude? That the voice speaking it might be using a different pitch or a different cadence? Testing against ad-hoc recorded signals (not your templates) will help to illustrate the problems with your current method. Ask your significant other, for example, to say some of the words. Then you say them. Then see which matches.

    It looks like you might be doing a correlation above. (A correlation isn't recommended ... you aren't likely to be successful with it--see above.) If you find that you need to implement a correlation, your method above won't accomplish one very well. Your approach is more appropriate for attempting to match a single vector, not for matching an ongoing stream of data. Correlations against a stream of data are often done via an FFT, a point-by-conjugate-point multiplication, and an inverse FFT. If you do one FFT of the input, and then three IFFTs depending upon which sequence you are correlating with, you can save yourself some FPGA resources. Be aware of circular convolution issues when using the FFT--you'll need to understand and deal with those. Once done, looking for the maximum in a stream of data is fairly simple. This change should still be made in Octave.

    All that said, your algorithm is not yet robust enough to work on real speech. It looks like you haven't tried it on ad-hoc recorded speech yet, instead trying it only on your small recorded subset. Record an ad-hoc subset and see what happens. I think you'll be surprised, and a bit disappointed, for all the reasons I've discussed above.

    Have you seen my article on pipeline strategies? It would help you to be able to build some pipeline logic here when you are finally ready to move to VHDL. (You aren't ready for the move yet.) How about my example design that reads from an A/D, downsamples and filters the input, FFTs the result, and then plots the result on a simulated VGA screen? You might find it instructive as you start to think about how to go about this task. (You aren't ready for this yet either--but it might be a good design to review.)

    Realistically, though, the bottom line is that you really have some more work to do before moving to VHDL. In particular, you need to test your algorithm against ad-hoc sampled data separate from your training data. After that, and after you fix the bugs you'll discover by doing so, you'll have a better chance of success once you do move to VHDL. Dan
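The FFT/conjugate-multiply/IFFT correlation recipe discussed above, in NumPy: zero-padding both sequences out to la+lb-1 points sidesteps the circular convolution issue, and np.correlate serves as the reference answer.

```python
# Linear cross-correlation via the FFT: forward FFT, point-by-conjugate-point
# multiply, inverse FFT.  Zero padding to la+lb-1 avoids circular wrap-around.

import numpy as np

def fft_correlate(a, b):
    la, lb = len(a), len(b)
    n = la + lb - 1                       # enough room so nothing wraps
    A = np.fft.rfft(a, n)
    B = np.fft.rfft(b, n)
    circ = np.fft.irfft(A * np.conj(B), n)
    # Circular result holds negative lags at the top; rotate so lags run
    # from -(lb-1) up to (la-1), matching np.correlate(..., 'full').
    return np.roll(circ, lb - 1)

a = np.array([1.0, 2.0, 3.0, 4.0])        # "stream" of input samples
b = np.array([1.0, 0.0, 2.0])             # template sequence
```

For three templates, compute `np.fft.rfft(a, n)` once and reuse it against each template's conjugated spectrum, which is the resource saving mentioned above.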
  14. @zygot, Thank you! I'd given you up as a lost cause. We can continue this discussion in the forum thread we started it in. I've heard enough from others to suggest you are right in this; I just don't know enough that I could explain it convincingly, so I need to learn more about the issue.

    Let's start here, since you have just highlighted one of my own struggles when analyzing my own experience: I cannot separate how many of the problems I found and solved via formal methods were there because I didn't know how to write a good test bench, and how many were found because the formal tools are just a fundamentally better approach to digital design. I appreciate your insight here, and would love to hear it again after you've tried the tools a couple of times on your own designs--should you choose to do so. Your experiences might help me answer this question. Quick hint here: start simple and work up to complex--as with anything in life.

    It's a work in progress, but I'd be glad to let you know when the entire tutorial is complete if you would like. There is already a discussion on how to use SymbiYosys that starts in lesson 3 on FSMs. SymbiYosys tasks are covered in lesson 4, as is the $past() operator. My current work is focused on the block RAM chapter as well as the serial port receiver chapter--to include how to verify each.

    Well, not quite. Let me try to clarify. Yosys (not SymbiYosys yet) is a Verilog synthesizer. It can ingest Verilog and produce one of several output types, including EDIF (for working with Xilinx tools), VQM files (for working with Altera), JSON (for working with the open source NextPNR tool), and Aiger and SMT2 (for working with formal solvers). That's one program.

    Next, there are several formal solvers available for download from the internet: abc, z3, yices, boolector, avy, suprove, etc. These are separate from Yosys and SymbiYosys, although SymbiYosys very much depends upon them. These solvers accept a specially formatted property file as input--Aiger (avy, suprove) and SMT2 (z3, boolector, yices) being examples.

    SymbiYosys is a fairly simple Python script that connects these two sets of tools together based upon a setup script. The official web site for SymbiYosys can be found here, although I'll admit I've blogged about using it quite a bit. (Try this article regarding asynchronous FIFOs for an example.) SymbiYosys itself can be found on GitHub, just like Yosys and the various solvers, together with some examples that can be used to test it.

    As for the comment about SymbiYosys implementing an extension of Verilog, please allow me to clarify again. The "extensions" you refer to are actually a subset of the SystemVerilog Assertion (SVA) language found in the SystemVerilog standard. (No, the entire SV language is not supported in the open version; neither is the entire SVA subset supported.) Yosys supports the "immediate assertion" subset of the SVA language. In particular, it supports making assertions, assumptions, and cover statements within always/initial blocks, but not those statements on their own.

    Please do be skeptical. I've been very skeptical of it, but you've just read my conclusions from my own experiences above. I'd love to hear your thoughts! Also, feel free to send me a PM if you have any struggles getting started.

    As for your next statement, SymbiYosys doesn't understand the synthesis tool outputs from the various FPGA vendors. It understands (through Yosys) Verilog plus the immediate assertion subset of SVA (the SystemVerilog Assertion language). That is, you'll need to provide the properties together with your design in order to have SymbiYosys verify that your properties hold as your design progresses logically. Further, I normally separate my design logic from the formal properties within it by `ifdef FORMAL and `endif lines. This keeps things so that Vivado/Quartus/Yosys can still understand the rest of the logic, while SymbiYosys (or rather Yosys) can understand the stuff within the ifdef as well.

    One last item: Verilator understands many of these properties as well--they aren't unique to SymbiYosys.

    Hope this starts to make some more sense, Dan
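For the curious, the "setup script" that drives a SymbiYosys run is a small file like the following sketch (the file name fifo.v and top module fifo are placeholders; sections follow the SymbiYosys documentation). Running `sby -f fifo.sby` then invokes Yosys and the chosen solver:

```
[options]
mode prove
depth 20

[engines]
smtbmc

[script]
read -formal fifo.v
prep -top fifo

[files]
fifo.v
```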
  15. At the invitation of @zygot, I thought I might share my own experiences using formal verification when building FPGA designs. This comes up in the context of debugging Xilinx's AXI-lite demonstration code, and from demonstrating that, with an interface property file, any AXI-lite core can be debugged with only about 20 or so lines of code. So how did I get here?

    I imagine that most FPGA users in this community probably start out their journey without formal verification. They either start out with VHDL/Verilog or with the schematic entry form of design. I personally started out with Verilog only. I never learned to simulate any of my designs until a couple of years into my own journey. About two years after that, I learned about doing formal verification with yosys-smtbmc, and then with SymbiYosys. (SymbiYosys is a wrapper for several programs, including yosys-smtbmc, with an easier user interface than the underlying programs have.)

    The first design I applied formal verification to was a FIFO. By this time I was quite confident that I knew how to build FIFOs. (Imagine a puffed-out chest here ...) In other words, I was very surprised when the formal tools found a bug in my FIFO. (That puffed-out chest just got deflated ...) This was my first experience, which I would encourage you to read about here. Why didn't I find the bug with simulation? Because there was one situation/condition that I just never checked in simulation: what should happen when reading and writing an empty FIFO at the same time. Perhaps I'm too junior when using test benches, perhaps I'm just not creative enough to imagine the bugs I might find; either way, this just wasn't a situation I tested.

    Some time later, I was working with my ZipCPU and had to trace down a bug. You know the routine: late nights, bloodshot eyes, staring at GB of traces just to find that one time step with the error in it, right? You can read about the mystery bug here. (Spoiler alert: I found the bug in a corner case of my I-cache implementation.) Wouldn't you rather use formal methods to find bugs earlier and faster? I certainly would! Anything to avoid needing to dig through GB of trace files looking for some obscure bug. You can read about the three formal properties you need for verifying memory designs (such as caches) here--had I verified my I-cache using these three properties, I would've found the bug much earlier.

    After that first experience with the FIFO, I started slowly working my way through all of my cores, applying formal methods to them. I discovered subtle errors when building even a simple count-down/interval timer. With every bug I found, I was motivated all the more to continue looking for more. I was haunted by the idea that someone might try out one or more of my cores, find that it doesn't work, and then write me off as a less-than-credible digital designer. (Remember, I was looking for work during this time and using my on-line digital designs and blog as an advertisement for my capabilities.) One ugly bug I found in my SDRAM controller would've caused the controller to read or write the wrong address in certain (rare) situations. None of my test benches had ever hit these situations, nor is it likely they would have. You'd have to hit the memory just right to get this error. In other words, had I not used formal methods, I would've found myself two years from now wondering why some ZipCPU program wasn't working, never realizing that the (proven via simulation and test bench) SDRAM controller still had a bug within it.

    What does design using formal methods look like? The design flow is a bit different. Instead of simulating your design through several independent tests in a row, and then looking through a long trace to find out if the design worked, formal methods get applied to only a few steps of the design at a time.
    If your design fails, you will typically get a trace showing how the design could be made to go into an unacceptable mode. You can then adjust your design and run again. In that sense, it is very much like simulation. However, unlike simulation, the trace generated by the formal tool stops immediately at the first sign of a bug, leading you directly to where the problem started.

    I said the trace tends to be short. How short? Often 20 cycles is plenty. For example, 12 steps are sufficient to verify the ZipCPU for all time, whereas 120 steps are necessary to verify a serial port receiver with an assumed (asynchronous) transmitter running at a dissimilar clock rate. This is very different from running an instruction-checking program through the CPU, or playing a long speech through the serial port receiver. As you might imagine, like the other examples, when I finally got to the point where I was ready to verify the full CPU, I found many more bugs in my "working" CPU than I expected I would find.

    As a result of these experiences, I now apply formal verification during my design process--long before I enter any "verification" phase. Any time I need to re-visit something I've written before, I immediately add properties to the design so I can use formal methods. Any time I design something new, I now start with formal methods. I will also use formal before simulation. (It's easier to set up than test benches ever were anyway.) I use cover() statements to make sure my design can do what it is supposed to, and assert() statements to guarantee that it will never do what it isn't supposed to.

    While formal can find bugs in corner cases, and it can do so relatively quickly, it isn't the solution to every problem. The formal engines struggle with an exponential explosion of possibilities that need to be checked. As an example, both multiplies and cryptographic algorithms have been known to hit this combinatorial explosion limit relatively quickly. (I can exhaustively verify a multiply with Verilator in 15 minutes that would take over four days with formal methods.) For this reason, you want to keep the design given to the formal tools relatively simple if possible. That doesn't mean you can't formally verify large designs. As an example, I recently formally verified an FFT generator--I just did it one piece at a time.

    I still use simulation. Why? Because of the combinatorial explosion problem associated with formal methods, I tend to only verify components or clusters of components, never my whole design. Using simulation, I can get the confidence I am otherwise missing that my entire design works. Besides, would you believe me if I told you my FFT worked without showing you charts of the inputs and outputs? Only after a design (component) passes both formal verification and simulation will I now place it on hardware. Why? Because it's easier to debug a design using formal than using simulation, and it's easier to debug a design using simulation than it is to debug it on the actual hardware.

    Eventually, SymbioticEDA approached me in January of last year--after I had been convinced to use formal verification for all my projects and after I had started writing about my experiences--and asked me to do contract work for them. If you aren't familiar with SymbioticEDA, they are the company that owns and maintains SymbiYosys. My first job for them was to create a training class to teach others how to use their tool. I was a natural fit for this job, since I was already convinced of the efficacy of the formal methods they had made available through SymbiYosys. I've had other jobs with them since, to include formally verifying a CPU that had been committed to silicon several times already. (Yes, there were bugs in it.) In other words, while this may sound like a paid advertisement, the experiences I've had and discussed above have been genuine.
    If you've never tried SymbiYosys, then let me invite you to do so, if for no other reason than to determine whether or not my own experiences might play out in your designs. SymbiYosys is free and open source, and using it to verify a Verilog component will cost you nothing more than your time. Feel free to share your experiences with me if you would like. Likewise, you can find many other articles on these topics at zipcpu.com. Dan

    P.S. I'm slowly building a tutorial on how to design using Verilog, Verilator, and SymbiYosys. The course is intended to be (somewhat) hardware agnostic, so you should be able to use Yosys, Vivado, ISE, or even Quartus with it. Each lesson consists of a new concept and a design that will teach that concept. The lesson then goes over the steps of building the Verilog, simulating the design and co-simulating any peripherals using Verilator, and formally verifying the design using SymbiYosys. While the tutorial remains a work in progress, many have enjoyed it even in its incomplete state.

    P.P.S. In my "story" above, I didn't pay attention to chronology, so some of the events may appear to be out of order with reality.
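The FIFO corner case from the story above (reading and writing an empty FIFO at the same time) is easy to model abstractly. This Python sketch is a generic synchronous FIFO, not the actual buggy core; it shows the behavior a formal assertion would pin down: on an empty FIFO, a simultaneous read and write must leave the fill count at one, with the read ignored, never at zero or below.

```python
# Generic synchronous FIFO model, evaluating both ports from the state
# at the start of the cycle -- the discipline the formal tools enforce.

class Fifo:
    def __init__(self, depth=4):
        self.mem = [None] * depth
        self.depth = depth
        self.wr = self.rd = self.fill = 0

    def clock(self, write, wdata, read):
        do_write = write and self.fill < self.depth
        do_read  = read  and self.fill > 0     # a read of an empty FIFO is ignored
        rdata = None
        if do_write:
            self.mem[self.wr] = wdata
            self.wr = (self.wr + 1) % self.depth
        if do_read:
            rdata = self.mem[self.rd]
            self.rd = (self.rd + 1) % self.depth
        self.fill += do_write - do_read        # invariant: 0 <= fill <= depth
        return rdata

fifo = Fifo()
# The untested situation: read and write the FIFO while it is empty.
first = fifo.clock(write=True, wdata=42, read=True)
```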