D@n

Members

  • Content Count: 1769
  • Days Won: 126

D@n last won the day on September 26, 2018

D@n had the most liked content!

About D@n

  • Rank
    Prolific Poster

Contact Methods

  • Website URL
    http://zipcpu.com

Profile Information

  • Gender
    Not Telling
  • Interests
    Building a resource efficient CPU, the ZipCPU!


  1. @TestDeveloper, I instantiate my PLLs and MMCMs like that all the time. Vivado has been fairly robust in how it handles it, so in spite of the pitfalls it has always worked for me. One thing you might consider doing is to create a counter that counts down from (CLOCK_RATE_HZ/2) and toggles an LED each time it expires--see the sketch below. You might find that useful for confirming that you got the clocking right. Dan
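    A minimal sketch of that idea, assuming a 100 MHz clock out of the PLL/MMCM (the module name and CLOCK_RATE_HZ parameter are mine, not from the original post):

        // Toggle an LED once per second as a clocking sanity check
        module ledblink #(
            parameter CLOCK_RATE_HZ = 100_000_000
        ) (
            input  wire i_clk,
            output reg  o_led
        );
            reg [$clog2(CLOCK_RATE_HZ/2)-1:0] counter;

            initial begin
                counter = 0;
                o_led   = 1'b0;
            end

            always @(posedge i_clk)
            if (counter == 0) begin
                counter <= CLOCK_RATE_HZ/2 - 1; // reload the half-second count
                o_led   <= !o_led;              // toggle: full period = 1 s
            end else
                counter <= counter - 1;
        endmodule

    If the LED blinks at the rate you expect, the clock coming out of the PLL/MMCM is (at least roughly) the frequency you think it is.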
  2. All, If you've followed the Vivado tutorial to build an AXI-lite peripheral, you'll know to ask Vivado to generate an IP core for you which you can then edit to your taste. What's not so commonly known is that this core has bugs in it: it does not comply with the AXI standard. Specifically, if the Vivado demonstration core receives two requests in a row while the return channel's ready line is low, one of those two requests will get dropped. This applies to both read and write channels. The failure is so severe that it may cause a processor to hang while waiting for a response. Worse, since this is caused within vendor-provided code, most users won't see any need to examine it, instead choosing to believe that their own code must somehow be at fault. The article demonstrates the bugs in the 2016.3 AXI-lite demonstration core. Since Vivado 2016.3, Xilinx has updated their AXI-lite demonstration to add another register to its logic--presumably to fix this issue. As of version 2018.3, even this updated logic continues to fail verification. Should you wish to repeat this analysis, this same article discusses how it was done. Only about 20 lines of logic need to be added to any Verilog AXI-lite core, plus the lines necessary to instantiate a submodule containing a property file--see the sketch below for a taste. That's all it takes to verify that any AXI-lite core properly follows the rules of the road on the AXI-lite bus, using SymbiYosys--a formal verification tool. The steps necessary to correct this logic flaw are also discussed. Since writing that article, I have posted another basic AXI-lite design which doesn't have these flaws. Moreover, the updated design can process bus transactions with a higher throughput than the original design ever could. While I'm not sure quite how fast MicroBlaze or even the AXI interconnect can issue bus requests, this design at least shows how you could build a slave peripheral that can handle two requests at once. Feel free to try it out and let me know if you find any flaws within it. Dan
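    To give a taste of the kind of property involved: one AXI-lite rule is that once VALID is raised while the matching READY is low, VALID must stay high and the request must hold steady until it is accepted. Written as an immediate assertion against the usual slave port names (this is my own illustration with a helper register of my own, not a quote from the property file):

        `ifdef FORMAL
            reg f_past_valid;
            initial f_past_valid = 1'b0;
            always @(posedge S_AXI_ACLK)
                f_past_valid <= 1'b1;

            // Once AWVALID is raised without AWREADY, the write address
            // channel must hold its request stable
            always @(posedge S_AXI_ACLK)
            if (f_past_valid && $past(S_AXI_ARESETN)
                    && $past(S_AXI_AWVALID && !S_AXI_AWREADY))
            begin
                assert(S_AXI_AWVALID);
                assert($stable(S_AXI_AWADDR));
            end
        `endif

    The property file referenced in the article packages a full set of such rules, so each core only needs the instantiation plus a handful of core-specific assertions.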
  3. D@n

    NEXYS 4 Programming Flash

    @bhall, No, this makes perfect sense. Xilinx, in their infinite wisdom, dedicated the SPI port's clock pin to configuring the device. It's controlled internally. When they then realized that customers would want to use it as well, they created a STARTUPE2 primitive that you need to use to get access to it. As such, it's often not listed in the port lists, but it's still usable--see the sketch below. On several of the newer Digilent designs, Digilent has connected that pin to two separate FPGA pins. This allows you to control the pin like a normal I/O. However, doing this requires special adjustments at the board level--not the chip level. Dan
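    For reference, here's a minimal sketch of driving the configuration clock pin from user logic on a 7-series part via STARTUPE2 (the spi_sck signal name is mine; the unused inputs are simply tied off):

        STARTUPE2 #(
            .PROG_USR("FALSE"),
            .SIM_CCLK_FREQ(0.0)
        ) startupe2_i (
            .CLK(1'b0),         // unused configuration clock input
            .GSR(1'b0),         // global set/reset (tie off)
            .GTS(1'b0),         // global tristate (tie off)
            .KEYCLEARB(1'b0),
            .PACK(1'b0),
            .USRCCLKO(spi_sck), // user logic drives the CCLK/flash clock pin
            .USRCCLKTS(1'b0),   // 0 = actively drive the pin
            .USRDONEO(1'b1),
            .USRDONETS(1'b0),
            .CFGCLK(),          // status outputs, unconnected here
            .CFGMCLK(),
            .EOS(),
            .PREQ()
        );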
  4. D@n

    NEXYS 4 Programming Flash

    @bhall, You should thank @jpeyron for calling me out. I tend to ignore any threads with block diagrams in them--I just don't seem to be able to contribute to them that well. @jpeyron also cited the wrong reference to my article (Oops!). I think he meant to cite this article here on flash controller development. In general it's not really all that hard to do--you just need to spend some time working with the specification and your hardware, and a *really* *good* means of scoping out what's going on. The design built in this article assumes a DDR output. As such, it can read a 32-bit word in (roughly) 72 system clocks. I have an older QSPI controller as well that I've used on many of my flash designs. It takes up twice as much logic. The link above should show you where and how to find it if you would like. This one doesn't use the DDR components. The other trick in what you're attempting is that you'll need to read an ELF file. Check out libelf for that purpose. It's really easy to use, and should have no problems parsing your executable file--before it turns into an MCS file. Hope this helps, Dan
  5. D@n

    Voice-activated

    @Junior_jessy, Ok, I take that back then ... it sounds from your description like you might be ready to move forward. I've heard that Cepstral processing works quite nicely for speech, although I've never tried it myself. So, your algorithm will have several parts ... I like how you've shown it above to work in several sections. Now imagine that each of those sections will be a module in your design. (An entity, for the VHDL types.) That module should take no more than one sample input per clock, and produce one sample output per clock. Your goal will be to do all the processing necessary one sample at a time. This applies to the Cepstrum as well. As I recall, a Cepstrum is created by doing an FFT, taking the log of the result (somehow), and then taking an IFFT. FFTs in FPGAs tend to be point-by-point processes: you put one sample in at a time and get one sample out. So expect to do a lot of stream processing. Alternatively, for audio frequencies, it might make sense to do some block processing ... but that's something you'll need to decide. Either way, it's likely to look to you like you are processing one sample at a time. You mentioned above that all you knew how to do were minimal counters, shift registers and such. Relax: You are in good hands. Most of the designs I just mentioned, to include the FFT, are built out of primarily shift registers and counters and other simple logic. However, there are two other components you will need to work this task: You'll want to know how to do a (DSP-enabled) multiply, and how to use the block RAM within your FFT. The rule for both of these is that they should use a process all to themselves. One process to read from RAM, one process to write to RAM, and one process to do your multiply--and don't do any other operations in any of those three types of processes. (See the sketch below.) That's the tableau you have to work with. How you put it together--that's up to you and your skill as an engineer. As for the FFT, you are welcome to use mine or you can use Xilinx's. That'll at least keep you from rebuilding that wheel. You might find it valuable to rearrange the Octave script you've highlighted above so that it works on one sample at a time as I'm describing. Think about it. Hopefully, though, that gets you going ... a bit. Keep me posted, Dan
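    Here's a minimal Verilog sketch of that one-operation-per-process rule (all names are mine; the same structure maps directly onto VHDL processes):

        module dsp_rules #(
            parameter AW = 10, DW = 16
        ) (
            input  wire                   i_clk,
            input  wire                   i_wr,
            input  wire [AW-1:0]          i_waddr, i_raddr,
            input  wire signed [DW-1:0]   i_wdata, i_coeff,
            output reg  signed [DW-1:0]   o_rdata,
            output reg  signed [2*DW-1:0] o_product
        );
            reg signed [DW-1:0] mem [0:(1<<AW)-1];

            // One process to write to the RAM -- nothing else
            always @(posedge i_clk)
            if (i_wr)
                mem[i_waddr] <= i_wdata;

            // One process to read from the RAM -- nothing else
            always @(posedge i_clk)
                o_rdata <= mem[i_raddr];

            // One process for the multiply -- nothing else, so the
            // synthesizer can map it onto a DSP slice
            always @(posedge i_clk)
                o_product <= o_rdata * i_coeff;
        endmodule

    Keeping each of these alone in its own process is what lets the tools infer block RAM and DSP hardware rather than a sea of LUTs.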
  6. D@n

    Voice-activated

    @Junior_jessy, Well, let's start with those items that will be constant. Constant values should be declared as generics; constant vectors should be placed in some kind of RAM--either block RAM or SDRAM depending upon the size. How you then interact with this data will be very different depending upon which you choose. I notice that you are comparing your data against a known pattern, but looking for the maximum difference in (signal - template). Have you considered the reality that the signal of interest might have a different amplitude? That the voice speaking it might be using a different pitch or a different cadence? Testing against ad-hoc recorded signals (not your templates) will help to illustrate the problems with your current method. Ask your significant other, for example, to say some of the words. Then you say it. Then see which matches. It looks like you might be doing a correlation above. (A correlation isn't recommended ... you aren't likely to be successful with it--see above.) If you find that you need to implement a correlation, your method above won't accomplish one very well. Your approach above is more appropriate for attempting to match a single vector, not for matching an ongoing stream of data. Correlations against a stream of data are often done via an FFT, a point-by-conjugate-point multiplication, followed by an inverse FFT. If you do one FFT of the input, and then three IFFTs depending upon which sequence you are correlating with, you can save yourself some FPGA resources. Be aware of circular convolution issues when using the FFT--you'll need to understand and deal with those. Once done, looking for the maximum in a stream of data is fairly simple. (There's a sketch of that part below.) This change should still be made in Octave. All that said, your algorithm is not yet robust enough to work on real speech. It looks like you haven't tried it on ad-hoc recorded speech yet, instead trying it only on your small recorded subset. Record an ad-hoc subset and see what happens. I think you'll be surprised and a bit disappointed, for all the reasons I've discussed above. Have you seen my article on pipeline strategies? It would help you to be able to build some pipeline logic here when you are finally ready to move to VHDL. (You aren't ready for the move yet.) How about my example design that reads from an A/D, downsamples and filters the input, FFTs the result, and then plots the result on a simulated VGA screen? You might find it instructive as you start to think about how to go about this task. (You aren't ready for this yet either--but it might be a good design to review.) Realistically, though, the bottom line is that you really have some more work to do before moving to VHDL. In particular, you need to test your algorithm against ad-hoc sampled data separate from your training data. After that, and after you fix the bugs you'll discover by doing so, you'll have a better chance of success once you do move to VHDL. Dan
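    Since I said finding the maximum in a stream is the easy part, here's a minimal sketch of what I mean (all signal names are mine):

        // Track the running maximum of a sample stream, and when it occurred
        module streammax #(
            parameter DW = 16, TW = 32
        ) (
            input  wire                 i_clk,
            input  wire                 i_restart, // begin a new search window
            input  wire                 i_ce,      // one new sample this clock
            input  wire signed [DW-1:0] i_sample,
            output reg  signed [DW-1:0] o_max,
            output reg  [TW-1:0]        o_max_time
        );
            reg [TW-1:0] timer;

            initial begin
                timer      = 0;
                o_max      = {1'b1, {(DW-1){1'b0}}}; // most negative value
                o_max_time = 0;
            end

            always @(posedge i_clk)
            if (i_restart) begin
                timer      <= 0;
                o_max      <= {1'b1, {(DW-1){1'b0}}};
                o_max_time <= 0;
            end else if (i_ce) begin
                timer <= timer + 1;
                if (i_sample > o_max) begin
                    o_max      <= i_sample;
                    o_max_time <= timer;
                end
            end
        endmodule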
  7. @zygot, Thank you! I'd given you up as a lost cause. We can continue this discussion in the forum thread we started it in. I've heard enough from others to suggest you are right in this; I just don't know enough that I could explain it convincingly, so I need to learn more about the issue. Let's start here, since you have just highlighted one of my own struggles when analyzing my own experience. I cannot separate how many of the problems I found and solved via formal were because I didn't know how to write a good test bench, and how many of them are a result of the formal tools simply being a fundamentally better approach to digital design. I appreciate your insight here, and would love to hear from you again after you've tried the tools a couple of times on your own designs--should you choose to do so. Your experiences might help me answer this question. Quick hint here: start simple, work up to complex--as with anything in life. It's a work in progress, but I'd be glad to let you know when the entire tutorial is complete if you would like. There is already a discussion on how to use SymbiYosys that starts in lesson 3 on FSMs. SymbiYosys tasks are covered in lesson 4, as is the $past() operator. My current work is focused on the block RAM chapter as well as the serial port receiver chapter--to include how to verify each. Well, not quite. Let me try to clarify. Yosys (not SymbiYosys yet) is a Verilog synthesizer. It can ingest Verilog and output one of several output types, including EDIF (for working with Xilinx tools), VQM files (for working with Altera), JSON (for working with the open source NextPNR tool), and Aiger and SMT2 (for working with formal solvers). That's one program. Next, there are several formal solvers available for download from the internet: abc, z3, yices, boolector, avy, suprove, etc. These are separate from Yosys and SymbiYosys, although SymbiYosys very much depends upon them. These solvers accept a specially formatted property file as input--Aiger (avy, suprove) and SMT2 (z3, boolector, yices) being examples. SymbiYosys is a fairly simple Python script that connects these two sets of tools together based upon a setup script. The official web site for SymbiYosys can be found here, although I'll admit I've blogged about using it quite a bit. (Try this article regarding asynchronous FIFOs for an example.) SymbiYosys itself can be found on GitHub, just like Yosys and the various solvers, together with some examples that can be used to test it. As for the comment about SymbiYosys implementing an extension of Verilog, please allow me to clarify again. The "extensions" you refer to are actually a subset of the assertion language found in the SystemVerilog standard. (No, the entire SV language is not supported in the open version, nor is the entire SVA subset.) Yosys supports the "immediate assertion" subset of the SVA language. In particular, it supports making assertions, assumptions, and cover statements within always/initial blocks, but not these statements on their own. Please do be skeptical. I've been very skeptical of it, but you've just read my conclusions from my own experiences above. I'd love to hear your thoughts! Also, feel free to send me a PM if you have any struggles getting started. As for your next statement, SymbiYosys doesn't understand the synthesis tool outputs from the various FPGA vendors.
It understands (through Yosys) Verilog plus the immediate assertion subset of SVA (the SystemVerilog Assertion language). That is, you'll need to provide the properties together with your design in order to have SymbiYosys verify that your properties hold as your design progresses logically. Further, I normally separate my design logic from the formal properties within it by `ifdef FORMAL and `endif lines, as in the sketch below. This keeps things so that Vivado/Quartus/Yosys can still understand the rest of the logic, while SymbiYosys (or rather Yosys) can understand the properties within the ifdef as well. One last item: Verilator understands many of these properties as well--they aren't unique to SymbiYosys. Hope this starts to make some more sense, Dan
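    A minimal sketch of that separation, using a trivial counter (my own toy example, not something from this thread):

        module counter (
            input  wire       i_clk,
            input  wire       i_reset,
            input  wire       i_start,
            output reg  [3:0] o_count
        );
            initial o_count = 0;
            always @(posedge i_clk)
            if (i_reset)
                o_count <= 0;
            else if (i_start && o_count == 0)
                o_count <= 1;
            else if (o_count != 0)
                o_count <= o_count + 1;

        `ifdef FORMAL
            reg f_past_valid;
            initial f_past_valid = 1'b0;
            always @(posedge i_clk)
                f_past_valid <= 1'b1;

            // assume() constrains the inputs the solver may pick:
            // here, start from reset
            always @(posedge i_clk)
            if (!f_past_valid)
                assume(i_reset);

            // assert() must hold in every reachable state: once running
            // (and not wrapping), the counter counts up from where it was
            always @(posedge i_clk)
            if (f_past_valid && !$past(i_reset)
                    && $past(o_count) != 0 && $past(o_count) != 4'hf)
                assert(o_count == $past(o_count) + 1);

            // cover() asks the solver to show the counter reaching its top
            always @(posedge i_clk)
                cover(f_past_valid && o_count == 4'hf);
        `endif
        endmodule

    Vivado or Quartus will synthesize everything outside the `ifdef FORMAL block and never see the properties; Yosys, run with its formal option, sees both.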
  8. At the invitation of @zygot, I thought I might share about my own experiences using formal verification when building FPGA designs. This comes up in the context of debugging Xilinx's AXI-lite demonstration code, and from demonstrating that, with an interface property file, any AXI-lite core can be verified with only about 20 or so lines of code. So how did I get here? I imagine that most FPGA users in this community probably start out their journey without formal verification. They either start out with VHDL/Verilog or with the schematic entry form of design. I personally started out with Verilog only. I never learned to simulate any of my designs until a couple of years into my own journey. About two years after that, I learned about doing formal verification with yosys-smtbmc, and then with SymbiYosys. (SymbiYosys is a wrapper for several programs, including yosys-smtbmc, that has an easier-to-use interface than the underlying programs do.) The first design I applied formal verification to was a FIFO. By this time I was quite confident that I knew how to build FIFOs. (Imagine a puffed-out chest here ...) In other words, I was very surprised when the formal tools found a bug in my FIFO. (That puffed-out chest just got deflated ...) This was my first experience, which I would encourage you to read about here. Why didn't I find the bug with simulation? Because there was one situation/condition that I just never checked with simulation: what should happen when reading and writing an empty FIFO at the same time. (There's a sketch of that corner case at the end of this post.) Perhaps I'm too junior at writing test benches, perhaps I'm just not creative enough to imagine the bugs I might find; either way, this just wasn't a situation I tested. Some time later, I was working with my ZipCPU and had to trace down a bug. You know the routine: late nights, bloodshot eyes, staring at GB of traces just to find that one time step with the error in it, right? You can read about the mystery bug here. (Spoiler alert: I found the bug in a corner case of my I-cache implementation.) Wouldn't you rather instead use formal methods to find bugs earlier and faster? I certainly would! Anything to avoid needing to dig through GB of trace files looking for some obscure bug. You can read about the three formal properties you need for verifying memory designs (such as caches) here--had I verified my I-cache using these three properties, I would've found the bug much earlier. After that first experience with the FIFO, I started slowly working my way through all of my cores and applying formal methods to them. I discovered subtle errors when building even a simple count-down/interval timer. With every bug I found, I was motivated all the more to continue looking for more. I was haunted by the idea that someone might try out one or more of my cores, find that it doesn't work, and then write me off as a less than credible digital designer. (Remember, I was looking for work during this time and using my on-line digital designs and blog as an advertisement for my capabilities.) One ugly bug I found in my SDRAM controller would've caused the controller to read or write the wrong address in certain (rare) situations. None of my test benches had ever hit these situations, nor is it likely they would have. You'd have to hit the memory just right to get this error.
In other words, had I not used formal methods, I would've found myself two years from now wondering why some ZipCPU program wasn't working, never realizing that the (proven via simulation and test bench) SDRAM controller still had a bug within it. What does design using formal methods look like? The design flow is a bit different. Instead of simulating your design through several independent tests in a row, and then looking through a long trace to find out if the design worked, formal methods examine only a bounded number of steps of the design at a time. If your design fails, you will typically get a trace showing you how the design could be made to go into an unacceptable mode. You can then adjust your design, and run again. In that sense, it is very much like simulation. However, unlike simulation, the trace generated by the formal tool stops immediately at the first sign of a bug, leading you directly to where the problem started. I said the trace tends to be short. How short? Often 20 cycles is plenty. For example, 12 steps are sufficient to verify the ZipCPU for all time, whereas 120 steps are necessary to verify a serial port receiver against an assumed (asynchronous) transmitter with a dissimilar clock rate. This is very different from running an instruction-checking program through the CPU, or playing a long speech through the serial port receiver. As you might imagine, like the other examples, when I finally got to the point where I was ready to verify the full CPU, I found many more bugs in my "working" CPU than I expected I would find. As a result of these experiences, I now apply formal verification during my design process--long before I enter any "verification" phase. Any time I need to re-visit something I've written before, I immediately add properties to the design so I can use formal methods. Any time I design something new, I now start with formal methods. I will also use formal before simulation. (It's easier to set up than a test bench anyway.) I use cover() statements to make sure my design can do what it is supposed to, and assert() statements to guarantee that it will never do what it isn't supposed to. While formal can find bugs in corner cases, and it can do so relatively quickly, it isn't the solution to every problem. The formal engines struggle with an exponential explosion of possibilities that need to be checked. As an example, both multiplies and cryptographic algorithms have been known to hit this combinatorial explosion limit relatively quickly. (I can exhaustively verify a multiply with Verilator in 15 minutes that would take over 4 days with formal methods.) For this reason, you want to keep the design given to the formal engines relatively simple if possible. That doesn't mean that you can't formally verify large designs. As an example, I recently formally verified an FFT generator. I just did it one piece at a time. I still use simulation. Why? Because of the combinatorial explosion problem associated with formal methods, I tend to only verify components or clusters of components, but never my whole design. Using simulation, I can get the confidence I am otherwise missing that my entire design works. Besides, would you believe me if I told you my FFT worked without showing you charts of the inputs and outputs? Only after a design (component) passes both formal verification and simulation will I now place it on hardware. Why?
Because it's easier to debug a design using formal than using simulation, and it's easier to debug a design using simulation than it is to debug it on the actual hardware. Eventually, SymbioticEDA approached me in January of last year, after I had been convinced to use formal verification for all my projects and after I had started writing about my experiences, and asked me to do contract work for them. If you aren't familiar with SymbioticEDA, they are the company that owns and maintains SymbiYosys. My first job for them was to create a training class to teach others how to use their tool. I was a natural fit for this job, since I was already convinced of the efficacy of the formal methods they had made available through SymbiYosys. I've had other jobs with them since, to include formally verifying a CPU that had been committed to silicon several times already. (Yes, there were bugs in it.) In other words, while this may sound like a paid advertisement, the experiences I've had and discussed above have been genuine. If you've never tried SymbiYosys, then let me invite you to do so, if for no other reason than to determine whether or not my own experiences might play out in your designs. SymbiYosys is free and open source, and using it to verify a Verilog component will cost you nothing more than your time. Feel free to share with me your experiences if you would like. Likewise, you can find many other articles on these topics at zipcpu.com. Dan P.S. I'm slowly building a tutorial on how to design using Verilog, Verilator, and SymbiYosys. The course is intended to be (somewhat) hardware agnostic, so you should be able to use Yosys, Vivado, ISE, or even Quartus with it. Each lesson consists of a new concept and a design that will teach that concept. The lesson then goes over the steps of building the Verilog, simulating the design and co-simulating any peripherals using Verilator, and formally verifying the design using SymbiYosys. While the tutorial remains a work in progress, many have enjoyed it even in its incomplete state. P.P.S. In my "story" above, I didn't pay attention to chronology, so some of the events may appear to be out of order with reality.
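    As an illustration of the FIFO corner case that bit me, here's the kind of property that would have flagged it. (This is a toy FIFO written for illustration, not the core from the story above.)

        module sfifo #(
            parameter LGFLEN = 4, DW = 8
        ) (
            input  wire          i_clk,
            input  wire          i_wr,
            input  wire [DW-1:0] i_data,
            input  wire          i_rd,
            output wire [DW-1:0] o_data,
            output wire          o_empty, o_full
        );
            reg [DW-1:0]   mem [0:(1<<LGFLEN)-1];
            reg [LGFLEN:0] wraddr, rdaddr;
            initial { wraddr, rdaddr } = 0;

            wire [LGFLEN:0] fill = wraddr - rdaddr;
            assign o_empty = (fill == 0);
            assign o_full  = (fill == (1<<LGFLEN));

            always @(posedge i_clk)
            if (i_wr && !o_full) begin
                mem[wraddr[LGFLEN-1:0]] <= i_data;
                wraddr <= wraddr + 1;
            end

            always @(posedge i_clk)
            if (i_rd && !o_empty)
                rdaddr <= rdaddr + 1;

            assign o_data = mem[rdaddr[LGFLEN-1:0]];

        `ifdef FORMAL
            // The fill level must never exceed the FIFO's capacity
            always @(*)
                assert(fill <= (1<<LGFLEN));

            // Ask the solver to build the trace my test benches never did:
            // a read and a write arriving together while empty
            always @(*)
                cover(i_wr && i_rd && o_empty);
        `endif
        endmodule

    Once cover() generates that trace, you can watch what your design actually does in the corner case, rather than hoping your test bench thought to exercise it.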
  9. @zygot, This is very fascinating, thank you for sharing! I had been wondering how others had managed to simplify the ARM+FPGA design methodology. I used a different approach on a Cyclone-V. I created a bus bridge that I connected to the CPU. Since it was a Cyclone-V, I could bridge from an Avalon bus to a WB bus, with vendor library code connecting the ARM's AXI interface to the Avalon interface. I had two problems with this approach. First, any bug in my Avalon->WB bridge would halt the ARM hard--requiring a power cycle to correct. This left me with no effective means of debugging the problem in hardware. ("No effective means" is a reflection of the fact that I never tried to use the vendor's JTAG scope interface...) I wouldn't have found the problem if I had not managed to create a simulation of my design--not cycle accurate, mind you, but enough to find the bug. Second, the throughput was horrible. (I was using the "low-speed" bus interface port.) Because of the difficulties I had, I wouldn't recommend such chips to others. Perhaps your approach gets around those difficulties? Separately, I'd be curious to know if you knew about the bugs in Xilinx's demo AXI-lite peripheral code? It seems as though the interface is so complicated that even Xilinx struggled to get it right. Score another point for using formal verification. I'm still hoping to return to this problem in order to create a full AXI to WB bridge (burst support, ID support, etc.). So far, however, the interface protocol has been so complex that I have yet to be successful at it. Thank you again for sharing, Dan
  10. D@n

    Voice-activated

    @Junior_jessy, Please allow me to add a word or two to @zygot's advice. In particular, you've done your testing (so far) on recorded speech. Moving from recorded to live speech is a testing challenge (nightmare?). You'll want to be able not only to process the live speech, but also to be able (just after the fact) to grab recordings of any live speech that didn't process as you wanted it to, so that you can place these recordings into your MATLAB framework and see what went right (or wrong) with them. It is possible to use your computer microphone audio for this. You'll learn a lot from host audio, just not enough. Since @zygot mentioned them, I'll admit to having worked with the MATLAB Simulink-to-HDL tools before. I would not recommend them to anyone who is considering them. They are great for graphically designing code, horrible when trying to do a diff to see if anything changed, and horrible when digging for the details within a design. Likewise, I had bad experiences trying to find all the top-level ports and internal comments. Finally, once I stopped paying for the license (and swapped laptops), the code was lost to me. Not something I'd recommend. One other point: @zygot said "Simulate, simulate and simulate." I agree completely. Were I a professor, I would even stomp my foot at this point. It will be on the final exam. Simulate the A/D, simulate your algorithm, simulate both together! Create a simulation application, using your favorite HDL utility, that will allow you to input any audio file and test it--there's a sketch of one below. Examine what happens if the audio is earlier or later, stronger or weaker, faster or slower, etc. Examine up to two audio files in succession in this manner! Don't stop at one. (I've found a lot of bugs in the dead period between audio clips, where the algorithm is recycling.) In addition, I'd also suggest formally verifying whatever you can. (The A/D is a known candidate for formal verification ...) I've been caught multiple times over by bugs I'd never find in simulation that I then found with formal verification, so I highly recommend that to you. ( @zygot if you want to dig into this further, I'd be glad to, but let's start another topic for it if you are so inclined.) The most recent example? A UART transmitter that couldn't handle a transmission in 10*baud clocks, but instead required 10*baud clocks + 1. Just my two cents, Dan
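    A minimal sketch of the sort of test harness I mean, assuming the audio has been pre-converted to a hex file by a script (the file name, sample count, and signal names are all mine):

        // Feed a recorded audio file into the design, one sample at a time
        module audio_tb;
            reg clk = 0;
            always #5 clk = !clk;       // 100 MHz simulation clock

            localparam NSAMPLES = 16000; // e.g., one second at 16 kHz
            reg signed [15:0] audio [0:NSAMPLES-1];
            integer i;

            reg               sample_ce = 1'b0;
            reg signed [15:0] sample    = 0;

            // The device under test would connect here, e.g.:
            // voicetrigger dut(.i_clk(clk), .i_ce(sample_ce), .i_sample(sample));

            initial begin
                $readmemh("speech.hex", audio); // one hex sample per line
                $dumpfile("audio_tb.vcd");
                $dumpvars(0, audio_tb);
                for (i = 0; i < NSAMPLES; i = i + 1) begin
                    @(posedge clk);
                    sample    <= audio[i];
                    sample_ce <= 1'b1;
                    @(posedge clk);
                    sample_ce <= 1'b0;
                    // insert idle clocks here to match the true sample rate
                end
                $finish;
            end
        endmodule

    Swap in different .hex files to test earlier/later, stronger/weaker, or back-to-back recordings, without ever touching the HDL under test.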
  11. @sittinhawk, Waayy too locked down? Buggy too. It's not a total loss. I'm still using Verilog. The Borg has not gotten to me yet. As you suggested above, I now have a large library of Verilog code that I bring to any new project. This includes a CPU in a couple of implementations, several peripherals, and even an interconnect builder. I've also rejected AXI. The protocol is way too complex for its own good. I've been using Wishbone (WB) instead. For half the logic of even AXI-lite, you can get about twice the speed. (Well, that's not quite true ... I do have an AXI-lite peripheral implementation that recovers the speed loss ...) Using a WB to AXI bridge still gives me access to the Xilinx MIG. There is an open source DRAM controller generator out there, but I haven't tried it yet. My first ARM+FPGA project used an Avalon bus. That bus is at least easy to bridge to WB, and even to formally verify the bridge. Since that time, I've now written a formally verified AXI-lite to WB bridge. (One that worked the first time out, too!) I could use that if I ever needed to interface with an ARM processor on board. However, as you've noticed, there are a lot of individuals writing into the forums (both this one and Xilinx's) who are clueless about how to use the canned IP, and struggling to figure out how to integrate their own IP into it. Worse, they can't figure out how to debug it when it doesn't work--which is to be expected with any closed source solution. Once you've gotten off the ground, Verilog never has these problems. Dan
  12. D@n

    GPS Pmod

    @HelplessGuy, The fundamental discipline of all engineering is to be able to take a problem, such as the GPS not working for you, and to break it down into pieces in order to figure out which piece is broken. This was my advice to you above. I'm a little confused by your response saying that you don't need to do this. Does that mean that things are finally working for you? Dan
  13. @Josef, I'm not sure I know enough to say the problem only exists in the case where a user places the flash in a specific configuration. I do know that I never loaded the flash using the GUI. I always loaded my design into the flash myself, so I can't really comment on the GUI approach and how well it works. I know that more than one person has struggled to load their design from the GUI, and I've always suspected that this was part of the cause. I do know enough to say that you want your flash device in a QSPI/XIP configuration for the fastest access. I also know that, if you are concerned, it's not that hard to read the ID off the flash (via the JEDEC READ ID command, 0x9F) to see who the manufacturer is, what the size of the chip is, and which chip from the manufacturer it represents. Dan
  14. @Josef, At the time, I was trying to build a high speed flash controller. As part of any design work I do, I start by downloading the specifications for all the parts on the board I'll be working with. Then, when I'm ready to work with a given part, I start reading the specification for that part. I would recommend this to you as well. In this case, I found the "problem" by reading the specification for the Micron flash chip. In Micron's zeal for creating a faster/better/cheaper chip, they created a chip that could start in QSPI/XIP mode. This is in many ways a selling point, since a chip that can start in that mode is going to run faster than one that starts in SPI mode and then needs to transition to QSPI. The problem, though, is trying to figure out which mode the chip started in if you have no idea what the configuration register is set to. In the middle of this, there arose a question as to whether or not the FPGA could properly place the chip into the necessary SPI or QSPI mode, from whatever state the user had left it in, in order to configure the chip initially. I'm still not certain of the answer to that question. In my current work, I'm resolving this problem by sending a carefully chosen command to the flash that, if the flash is in SPI mode, will have no meaning, but if the flash is in QSPI mode will return the flash to SPI mode. Once the flash has been returned to a known state, I can then place it into the state that I want it to be in--usually the QSPI/XIP mode I wanted to work with in the first place. This approach has another benefit as well. When you load a design onto an FPGA via the JTAG, the flash chip doesn't change modes like it will if you load your design from the flash. In other words, the flash might already be in its QSPI/XIP mode when my design starts up, separate from whatever reset mode the flash might otherwise be in. (This issue has caught me by surprise more than once, and not always with the flash: some piece of hardware is already initialized upon design startup--something worth watching out for.) By first taking the flash out of whatever mode it is in initially and placing it into a known mode, the design can reliably start when loaded from JTAG as well as when loaded from flash. Dan
  15. @Josef, One of my personal goals is to try to build a single/universal flash driver that can work with both boards. This includes a startup script that will pull the Micron out of whatever reset configuration/state it is in. I think I've got it, but the Spansion flash will need at least two configuration changes: 1) The number of "dummy" cycles needs to be changed. These are the cycles between the read address and the first clock with read data in it. This change needs to be made to the open source flash simulator I have, to the RTL code, and to the software driver I have. All are possible, just annoying to do. 2) The startup script within the driver itself (probably) needs to change. Once done, my OpenArty design should work with the new hardware as well. At least, the different flash is the only change I know of. Since I don't have one of the newer Arty boards, it will be hard for me to know for certain. Dan