xc6lx45

Members
  • Content Count: 596
  • Days Won: 35

Everything posted by xc6lx45

  1. Hi, >> So can any be give me suggestions how i can correct the debounce concept in the code. Regular solution: count consecutive, identical input states and reset the counter on any change of the input state; signal a change of the output state when the counter reaches a predetermined value. Sneaky / suboptimal solution: sample the switch on a 100 Hz grid. I have to keep reminding myself that there is nothing inherently wrong with VHDL ... in Verilog, the whole task could be solved cleanly in one screen length of code, give or take.
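One way to model the counter-based debouncer described above (function name, sample list and threshold value are mine, purely illustrative):

```python
# Behavioral model of a counter-based debouncer: count consecutive input
# samples that disagree with the current output, reset the counter whenever
# the input agrees with the output again, and flip the output once the
# counter reaches a predetermined threshold.

def debounce(samples, threshold=3):
    """Return the debounced output for each input sample."""
    out = samples[0] if samples else 0
    count = 0
    result = []
    for s in samples:
        if s == out:
            count = 0              # input agrees with output: nothing to do
        else:
            count += 1             # input differs: count consecutive samples
            if count >= threshold:
                out = s            # stable long enough: accept the new state
                count = 0
        result.append(out)
    return result

# A bouncy press: glitches shorter than the threshold are ignored.
print(debounce([0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]))
```

In RTL this becomes a counter plus one output register per switch; the 100 Hz sampling trick works because mechanical bounce settles well within 10 ms.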
  2. Hi, you can search for the keywords "blocking" vs "non-blocking" assignment in Verilog. You can get a hint at the answer (and at the "always @(posedge clk)" construct) if you look at the schematic in your post. Note the triangle: "C" is an edge-sensitive input (which goes down to the transistor-level electronics that ultimately define the foundations of "synchronous" logic design as we know it). This is only a sneak preview of what a web search will reveal 🙂
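Not Verilog, but the blocking-vs-non-blocking distinction can be sketched in a few lines of Python (function and variable names are mine): blocking-style assignments take effect immediately within the block, while non-blocking-style assignments all sample the old values and update together at the clock edge. A two-stage shift register shows the difference:

```python
# Two registers, a -> b, shifting a new input value in on each "clock edge".

def clock_blocking(a, b, new):
    # Blocking-style (=): each assignment takes effect immediately,
    # so b sees the *new* value of a -- the two stages collapse into one.
    a = new
    b = a
    return a, b

def clock_nonblocking(a, b, new):
    # Non-blocking-style (<=): all right-hand sides are sampled first,
    # then every register updates together -- a real two-stage shift.
    next_a, next_b = new, a
    return next_a, next_b

print(clock_blocking(0, 0, 1))     # (1, 1): 'new' falls through both stages
print(clock_nonblocking(0, 0, 1))  # (1, 0): 'new' only reaches the first stage
```

This is why sequential logic in an always @(posedge clk) block conventionally uses non-blocking assignments.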
  3. I'm quite sure you can use one account (I have done so on several PCs myself with WebPack). Looking at that license file, it says HOSTID=ANY. To me, this looks like (but someone correct me if I'm wrong) the free WebPack license isn't even tied to one specific machine.
  4. Just wondering whether that is worth the trouble. Internally, the pins are all muxed to the same ADC pair. For example, this chip http://www.ti.com/product/CD74HC4051 could do the same anywhere in your design.
  5. Reality check: production ATE equipment => 6+ digit price range. Test lab equipment => 6+ digit price range. Basys 3: low 3-digit price range. We can state with confidence it will be able to test a "subset" of features.
  6. Hi, I finally got around to cleaning up my fractals pet project. It includes 30 parallel fractal engines and makes good use of the chip's 90 hardware multipliers, with up to 30 * 3 * 200 MHz = 18 billion multiplications per second under full load. The demo still works on USB power - the chip does get a little warm, though. Other than clocking and ASYNC_REG for CDC, the code is 100 % inferred logic (portable). Here you can see it in action; blurring and glitches are phone camera artifacts. There is some documentation (partly under construction), including an architecture diagram: https://github.com/mnentwig/forthytwo/tree/master/fractalsProject

     The control code runs on the 32-bit variant of James Bowman's J1B CPU. I use my own small "forthytwo.exe" compiler in C#, which is included in the repo (starting from the link, navigate up). I've tried to deliver it in a way that does not require Visual Studio to build (an .exe for Windows is included, and you can build it from a .NET runtime; this may need some editing of the makefile, though). There is a bootloader variant included, so feel free to use it as a testbed for the J1B, using UART, LEDs, and switches.

     Note, anything image-related is done in RTL - the J1B only calculates vectors that define the image location in "fractal space". If anybody wants to go ahead minus the fractals part, I've done that work for you already (refImpl project). There are also Verilator simulations included - similar to the original J1B repo - that allow interactive (UART => keyboard) experiments with the microcontroller code. The bravest can try their hands on my hand-crafted assembler floating-point implementation, which is a little rough in places but generally works nicely as long as I am aware of the limitations (e.g. no internal rounding, just truncation). Bonus: 2D matrix math feat. Bhaskara's 7th-century sine approximation for the rotation matrix. Happy hacking! -Markus
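For the curious, Bhaskara I's sine approximation mentioned above is a single rational expression, which makes it attractive for hardware. A quick sketch of its accuracy (the error-scan loop is mine, for illustration):

```python
# Bhaskara I's 7th-century sine approximation, valid for x in [0, pi]:
#   sin(x) ~ 16*x*(pi - x) / (5*pi^2 - 4*x*(pi - x))
import math

def bhaskara_sin(x):
    return 16 * x * (math.pi - x) / (5 * math.pi ** 2 - 4 * x * (math.pi - x))

# Worst-case error over [0, pi] stays below roughly 0.0017 -- plenty for
# building a rotation matrix that steers a viewport around "fractal space".
err = max(abs(bhaskara_sin(x) - math.sin(x))
          for x in [i * math.pi / 1000 for i in range(1001)])
print(round(err, 4))
```

It is exact at 0, pi/2 and pi, needs only multiplies, adds and one divide, and avoids any lookup table.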
  7. Yes but I had already proven that it works
  8. "Due to recent events": please be aware that by default, the FPGA (Xilinx 7 series) wakes up with PLLs possibly still unlocked. This is explicitly stated in UG908 p. 70 and shouldn't surprise anyone who has read the manual 🙂 See the "BITSTREAM.STARTUP.LCK_CYCLE" option. "Recent events" means that yesterday, I found that my design (which started from the first clock cycle) worked 100 % reliably on one board but failed with 100 % certainty on the other board. Identical RTL with an initial delay in SW did work reliably on both boards, so this is a shy little bug ... Adding simple reset logic based on the .locked signals from the PLLs solved the issue; now both boards are functional.
  9. OK... you might still check whether the register survives retiming. Use the (* DONT_TOUCH = "true" *) attribute on muxr and similar registers if in doubt. BTW, if you want to design a board and are not using much of the chip's floorspace, check other FPGA brands, e.g. Lattice. They seem more straightforward in the external circuitry (see the "tinyfpga" boards for a reference, or the ICE40 EVB; AFAIK it has on-chip non-volatile memory, so the flash should be optional. What's left on the board is mostly decoupling capacitors...). It's also somewhat slower, which should be more forgiving with regard to noise, signal integrity etc.
  10. Are you sure you are registering your combinational logic before the output driver? See my "digital hazards" comment a few posts back... It is a fairly common misunderstanding, where reality differs from simulation (*): synchronous design methodology guarantees correct signal levels at clock edges ONLY. It gives NO GUARANTEES WHATSOEVER between clocks: for example, when you have a sequence of 0 bits, there may be spikes between clock edges and the circuit is still functioning correctly as designed. (*) it will show to some extent if you run e.g. "post-implementation timing simulation" with actual LUTs and routing delays. But as said, the first thing to do is check whether the outputs are registered, without any combinational logic on the output side.
  11. Check the warnings. Most likely, there is one related to this construct: >> if(rising_edge(i_clk) and i_clk ='1') then The 2nd part is redundant (you can't have a rising edge without the clock being high). But, FPGA-internally, information and clock signals are kept electrically separate (clocks drive edge-sensitive inputs, information signals drive level-sensitive inputs) and are mixed only in exceptional use cases like ASIC emulation. In a nutshell, use clocks only in "rising_edge" constructs => "synchronous" design methodology. Can you reduce the test case so it shows only the error and nothing else? Most likely, when that is done, you'll see the root cause for yourself.
  12. Thanks, that's a nice idea. I don't think it would be timing critical, and the complexity could be hidden away in a module.
  13. For comparison, I got the following LUT counts for James Bowman's J1B (16-bit instruction, 32-bit ALU) CPU, which I know quite well:
     * 673 LUTs = 3.3% utilization of an A7-35, with 32 stack levels in distributed RAM (replacing the original shift-register-based stack, which does not look efficient on Xilinx 7 series)
     * 526 LUTs if reducing the +/- 32-bit barrel shifter to +/- 1 bit, but the performance penalty is severe (e.g. IMM values need to be constructed from shifts).
     * 453 LUTs if further allowing one BRAM18 for each of the two stacks.
     This includes a UART and runs at slightly more than 100 MHz, but memory/IO needs two instructions / two cycles. So the RISC "overhead" does not seem that dramatic. It's slightly bigger and somewhat slower, but has baseline opcodes (e.g. arithmetic shift and subtract, if I read it correctly) that the J1B needs to emulate in SW. It would be interesting to know where the memory footprint goes when I use (soft) floats. I've done the experiment in the recent past with MicroBlaze MCS, and did not like what I saw. On the J1B I need about 320 bytes for (non-IEEE 754) float + - * /; painfully slow without any hardware support, but it keeps the boat afloat, so to speak. Using C instead of bare-metal assembly would be tempting... I just wonder how much effort it takes to install the toolchain.
  14. I'd consider attaching a wire to the grounds, possibly from the bottom side. I'd leave the 3.3 V alone. BTW, are you aware of the problem of digital hazards? Essentially, you are not allowed to drive IO outputs from combinational logic but must register each output. This might be responsible for some of the noise and would cause weird problems when driving edge-sensitive inputs on fast external ICs, e.g. a shift register. You can use an additional higher-frequency clock to not introduce more delay / jitter than necessary (say, 200 MHz). Either use the same PLL with a 2nd output at a multiple of the signal frequency, or use a 2nd PLL at an arbitrarily high frequency and make the timing to the original clock "don't-care" with a false_path constraint between the clocks, e.g.
     #set_false_path -from [get_clocks -of_objects [get_nets clk1]] -to [get_clocks -of_objects [get_nets clk2]]
     #set_false_path -from [get_clocks -of_objects [get_nets clk2]] -to [get_clocks -of_objects [get_nets clk1]]
  15. xc6lx45

    Cmod A7 programming

    Hi, isn't there an SDK menu option "Xilinx/program flash"? I'm checking this from a Zynq project, but I think it looks the same. Hint: if you run into weird tool issues during the programming process, make sure the Vivado-side hardware manager is not interfering with the device.
  16. I doubt it will fix all the problems, but I am fairly certain that 10 µF caps are unsuitable. 100 nF is more like it. Capacitor impedance is a V-shaped curve over frequency - you want to be on the left half with the frequencies of interest (also harmonics, not just the fundamental digital switching rate). Use a capacitor that's too big and you end up on the right half, where it goes up, steeply (the capacitor has just become an inductor). All that said, the CMOD A7 is a great little board, but with a single GND pin it does have its limitations. You could have a look at the Trenz TE 0726 ... I think I've helped them sell a few of those over the years 🙂 but then it's not nearly as convenient, e.g. bring your own JTAG and power. The "current return path" topic is something you should look into (much easier with a four-layer board). As a quick hack, note that there are additional ground connectors on the PMOD connector, on the opposite side of the board (which is good).
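The V-shaped curve can be modeled with a series ESR/ESL/C equivalent circuit. The parasitic values below (1 nH, 10 mΩ) are my illustrative assumptions, not measured data, but they show why the bigger capacitor becomes inductive at much lower frequencies:

```python
# |Z|(f) of a capacitor with series parasitics: left half of the "V" is
# capacitive (1/(2*pi*f*C)), right half inductive (2*pi*f*ESL).
import math

def z_mag(f, C, esl=1e-9, esr=0.01):
    """Impedance magnitude of a capacitor with assumed ESL and ESR."""
    x = 2 * math.pi * f * esl - 1 / (2 * math.pi * f * C)
    return math.sqrt(esr ** 2 + x ** 2)

def f_res(C, esl=1e-9):
    """Self-resonant frequency: the bottom of the V-shaped |Z|(f) curve."""
    return 1 / (2 * math.pi * math.sqrt(esl * C))

print(f"10 uF  resonates near {f_res(10e-6) / 1e6:.1f} MHz")
print(f"100 nF resonates near {f_res(100e-9) / 1e6:.1f} MHz")
```

With the same parasitic inductance, the 100 nF part resonates an order of magnitude higher than the 10 µF part, so at the tens-of-MHz harmonics of an FPGA output the small capacitor still presents the lower impedance.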
  17. Yes, some details would help... Random thought: a possible workaround is to configure unused IO pins as outputs and short them to the ground plane. In RTL, define them as outputs driving zero, set DRIVE to 24 (mA, the maximum value) and SLEW to FAST. This should provide the minimum-impedance setting for your "software-defined ground pins". It's a hack, and you may even damage the module if too many "1"s are accidentally driven into short-circuited outputs. Still, taking the risk may be cheaper than the alternatives. Random thought #2: read the data sheet of your capacitors, especially the |Z|(f) curve. There can be massive differences between dedicated RF capacitors, low-ESR parts, and run-of-the-mill capacitors. An FPGA is much more challenging than an 8-bit-era digital chip since its digital transients are so much faster.
  18. This is really something to consider in the long term. X and A have a strong interest in making us use their respective processor offerings. Nothing ever is free, and we may pay the price later when e.g. some third-party vendor (think China) shows up with more competitive FPGA silicon but I'd need a year to migrate my CPU-centric design. For industrial project reality, accepting vendor lock-in may be the smaller evil, but if you have the freedom to look ahead strategically (personal competence development is maybe the most obvious reason for doing so, maybe also government funding), there may be wiser options. This is at least what keeps me interested in soft-core CPUs, even though their absolute KPIs are abysmally bad.
  19. ... some numbers. Yes, apples are not oranges; this is about orders of magnitude, not at all a scientific analysis, and maybe slightly biased. Take the Zynq 7010. It has 17600 LUTs. Let's count each as 64 bits => 1.1 MBit for the logic functions of my application (if you like, add 2.1 MBit BRAM => 3.2 MBit). Now the ARM processor: while it's probably only a small add-on in terms of silicon area / cost (compare with the equivalent Artix - it's even cheaper - weird world...), it includes 256 kB of on-chip memory and 512 kB of on-chip L2 cache, which is 6.1 MBit. So we've already got several times the amount of "on-chip floorspace" for the application logic, and it'll probably run faster than FPGA logic, as it's ASIC technology, not reprogrammable logic: it typically clocks at 666 MHz (-1 speed grade), where a non-tuned / non-pipelined design on the PL side will probably end up between 100 and 200 MHz. Needless to say, offloading application logic to DRAM or FLASH is trivial, where an RTL-only implementation hits the end of the road - somewhat stretchable by buying a bigger chip, maybe partial reconfiguration, or biting the bullet and adding a softcore CPU, which will be so pathetically slow that the ARM will hop circles around it on one leg. Right, I forgot: the above-mentioned 7010 actually has two of them.
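The back-of-the-envelope numbers above, spelled out (counting each LUT as a 64-bit function table and using decimal kB, which is how the figures in the post come out):

```python
# Rough "on-chip floorspace" comparison for the Zynq 7010 -- order of
# magnitude only, mirroring the arithmetic in the post.
luts = 17600
lut_bits = luts * 64                        # each LUT counted as 64 bits
bram_bits = 2.1e6                           # 2.1 MBit of BRAM
ocm_bits = (256 + 512) * 1000 * 8           # 256 kB OCM + 512 kB L2 cache

print(f"LUTs:        {lut_bits / 1e6:.1f} MBit")
print(f"LUTs + BRAM: {(lut_bits + bram_bits) / 1e6:.1f} MBit")
print(f"ARM memory:  {ocm_bits / 1e6:.1f} MBit")
```

That is ~1.1 MBit (or 3.2 MBit with BRAM) of reprogrammable "floorspace" versus ~6.1 MBit of hard ASIC memory, before even touching DRAM.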
  20. Hi, learning a new language well is a major investment => constant cost. Picking an inadequate language / technology / platform is a cost multiplier. Which one hurts more? For a small project the learning effort dominates, so you tend to stick with the tools you've got. Try this in a large project and the words "uphill battle" or "death march" will come to life... There's a human component to this question: say my local expert has decades of experience with FORTH coding on relay logic - you can bet what his recommendation will be, backed by some very quick prototyping within a day or two. And if you have ... >> someone good in verilog or vhdl, ... who is opposed to learning C, you have interesting days ahead...

     Ultimately, implementing non-critical, sequential functionality in FPGA fabric is a dead end for several reasons. Start with cost - a LUT is much, much more expensive than its functional equivalent in RAM on a processor. Build time is another. The "dead end" may well stretch all the way to success, but don't lose sight of it; you will see it clearly when it's right in front of your nose. Now this is highly subjective, but my first guess (knowing nothing about the job, assuming it's not small and not geared towards either side by e.g. performance requirements) is that implementation on Zynq would take me 3..10x less effort than using HDL only. This may be even more pronounced when requirements change mid-project (again, this is highly subjective, but you have considerably more freedom in C to keep things "simple and stupid", use floats where it's not critical, direct access to a debug UART, ...). On the other hand, Zynq is a very complex platform, and someone needs to act as architect - it may well be that the "someone good in verilog" will get it right the first time in an HDL-only design but need architectural iterations on Zynq, because the first design round was mainly for learning. Take your pick.

     Most likely, Zynq is the best choice if you plan medium-/long-term, and the (low-volume!) pricing seems quite attractive compared to Artix.
  21. Yes, exactly. I haven't tried it myself with Lattice, only Xilinx (I have my own USB JTAG code, e.g. to upload a bitstream to Xilinx FPGAs, here: https://forum.digilentinc.com/topic/17096-busbridge3-high-speed-ftdifpga-interface). But I wouldn't expect any surprises. Some of the basic functions, e.g. IDCODE, are standardized.
  22. Just remembered something: the "tinyFPGA" boards with Lattice devices should also work and are a third of the price of the above.
  23. Hi, I doubt that you can convert a JTAG master into a slave (there is little symmetry in the protocol), but you should be able to use just about any FPGA or microcontroller board, as long as it has a hardware (not USB) JTAG port. Off the top of my head, I can't come up with a low-end Digilent board with a hardware JTAG port. Now if I had to pull the cheapest suitable board from my own collection, it would be a tie between a Trenz DIPFORTy1, Xess Zula II, Papilio Pro and Numato Saturn, probably decided by shipping charges. If you search a bit, you might find much cheaper options for "something" with JTAG, like an obsolete WLAN or DSL router, for example. I can't think of any dedicated out-of-the-box JTAG slave hardware. This could be a typical use case for an FPGA, but it's a non-trivial design effort (not recommended). PS: if your master works at a fixed voltage only (e.g. 3.3 V), double-check that the slave voltage matches.
  24. Timing closure will fail, but read the warnings: you'll get a notification when your design lacks the extra registers before and/or after the inferred multiplier that are required for maximum speed; those registers get absorbed into the DSP48 hardware unit.