xc6lx45

Members
  • Content Count

    705
  • Joined

  • Last visited

  • Days Won

    38

Reputation Activity

  1. Like
    xc6lx45 got a reaction from Altis in Digilent A7 PMOD Input Voltage   
    You may be able to make it work with e.g. a 10 kOhm series resistor that limits the current flowing into the FPGA input through the FPGA's clamping diode to the 3.3 V supply rail.
    But please validate this for yourself, e.g. with a Google search.
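    For a rough feel of the numbers (my assumptions for illustration: a 5 V input and about 0.5 V drop across the clamping diode):
    I = (5.0 V - 3.3 V - 0.5 V) / 10 kOhm = 0.12 mA
    which is far below the clamp-diode current limits usually quoted in datasheets - but again, check against your own datasheet.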
  2. Like
    xc6lx45 got a reaction from itse me mario in Genesys 2 Audio Codec with FIR filter   
    True but you'll need some means to break down the problem into smaller pieces that can be debugged individually. Otherwise you have this big black box and the information "it doesn't work".
    I'm not trying to sell you any methodology, simple answers or miracle tools - I would cut most corners myself if I had to do a similar job but you may need a little more "scaffolding" if doing it for the first time. A lot more if learning FPGA design along the way.
    BTW, the current consumption of your codec might serve as a quick-and-dirty indicator that your register writes are going through (works for a module, not sure if this will help you).
  3. Like
    xc6lx45 got a reaction from macellan in Microblaze issues for a beginner   
    Hi,
    >> but I feel like lost in documents
    Welcome to FPGAs. This is pretty much the name of the game (but it also makes the struggle worthwhile - if it were easy, everybody would do it 🙂).
    As a general direction, a solid (basic) tutorial is good but don't expect to be led by the hand all the way. The constant version changes make this quite difficult (good news: it means there is technological progress ... well at least in theory but the guys from the marketing department said so and they'll surely know ...).
    More specific directions: have a look at Microblaze MCS. It's fairly simple - set up the most basic system with some BRAM (= memory "internal" to the FPGA fabric) and one UART. Once you've got that printing "Hello World" - which is mostly a question of baud rates and not mixing up the Tx/Rx pins - you can add features one by one, and the sky is the limit.
    Well, at least until the little girl next door pulls out her Raspberry Pi, running four cores at 10x the clock frequency - and don't complain that no one told you: by absolute standards, the performance of any softcore CPU is pathetic compared to a regular ASIC CPU on the same technology node. So eventually you'll have to move into FPGA territory, or it makes little sense except as a learning exercise.
  4. Like
    xc6lx45 got a reaction from CPerez10 in Simple question regarding Anvyl Spartan 6 clock frequency   
    Hi,
    as a simple (oversimplified?) answer, designing for higher clock speed requires higher effort (possibly "much" higher effort), and the resulting optimizations make the code harder to work with.
    Using the clocking wizard to generate a 500 MHz PLL is easy (try it). But writing logic at those frequencies is a different story (e.g. try to implement a conventional counter that divides down to 1 Hz. Why do all those XYZ_CARRY signals show up in the timing report already at synthesis?). You also need to distinguish between what is feasible in plain logic fabric and what can be done with dedicated "hard-macro" IP blocks such as SERDES.
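    For illustration, the kind of counter I mean (a minimal sketch, assuming a 500 MHz clock - the 29-bit increment is one long carry chain, and that chain is what limits timing):
      // Sketch: divide a 500 MHz clock down to a 1 Hz pulse.
      // Counting 500,000,000 cycles needs a 29-bit counter whose
      // carry chain shows up as CARRY primitives in the timing report.
      module div_1hz (
          input  wire clk_500m,
          output reg  pulse_1hz = 1'b0
      );
          reg [28:0] count = 29'd0;
          always @(posedge clk_500m) begin
              if (count == 29'd499_999_999) begin
                  count     <= 29'd0;
                  pulse_1hz <= 1'b1;   // one-cycle tick per second
              end else begin
                  count     <= count + 29'd1;
                  pulse_1hz <= 1'b0;
              end
          end
      endmodule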
  5. Like
    xc6lx45 got a reaction from [email protected] in Simple question regarding Anvyl Spartan 6 clock frequency   
    Hi,
    as a simple (oversimplified?) answer, designing for higher clock speed requires higher effort (possibly "much" higher effort), and the resulting optimizations make the code harder to work with.
    Using the clocking wizard to generate a 500 MHz PLL is easy (try it). But writing logic at those frequencies is a different story (e.g. try to implement a conventional counter that divides down to 1 Hz. Why do all those XYZ_CARRY signals show up in the timing report already at synthesis?). You also need to distinguish between what is feasible in plain logic fabric and what can be done with dedicated "hard-macro" IP blocks such as SERDES.
  6. Like
    xc6lx45 got a reaction from JColvin in spam for a good cause: COVID-19 research on your FPGA workstation   
    Hi,
    most people working with FPGAs will have a reasonably capable PC under their desk. You can make that CPU power available for COVID-19 research when it sits idle, through Rosetta@home.
    The project has apparently been around for many years doing protein research. A summary is here.
    I found that on Windows, installation is pretty straightforward:
    • Install BOINC
    • Create an account
    • Start BOINC and select Rosetta@home
    It can be configured in many ways, e.g. set it to use CPU power only if the machine has been idle for a minute. I leave it running at 100 % in the background and don't notice any obvious slowdown. From the few hours I've spent on the topic I got the impression that my old (but water-cooled and diligently overclocked) desktop still outclasses the other hardware I've tried: the newer laptop has good burst performance but goes into thermal throttling after a few seconds. ARM-based Android devices were far off; they probably have better power efficiency but lack the memory bandwidth (e.g. I see ~10 GB for 12 threads plus 2 GB storage).
    So take this as a motivational speech that your FPGA design rig could be a valuable contributor. About one day of uptime got me to position 392,924 in the Rosetta@home worldwide ranking, and I hope this post will motivate many of you to beat this 🙂
    Cheers

    Markus
  7. Like
    xc6lx45 got a reaction from Bobby29999 in Inverse Transform Sampling method on FPGA
    Playing the devil's advocate, you could put a softcore processor on an FPGA and then recompile the standard C code. Work done in a day (plus a week if you're bringing up all the tools for the first time) - but it does not make any sense.
    This leads to a very interesting question: why does it not make any sense? Because it does not use the FPGA efficiently. This is unfortunately a common problem with educational projects - "just do anything", even if it makes no sense. Which is unfair, because providing answers would require more expertise than can be expected from a student. Instructors are lazy.
    A possible answer to my own question: "yes, but we need a 1000x higher data rate than the softcore CPU delivers" or "it may use no more than 500 LUTs and only a single BRAM". An FPGA can do many things, but the design effort for what starts to look like a "real" design task will skyrocket, e.g. a pipelined implementation with multiple independent operations in flight at the same time (see the sketch at the end of this post).
    A hint: fight laziness with laziness and try to negotiate away any auxiliary tasks like starting-value generation (e.g. agree on a hardcoded seed, or have it sent via UART as a hex number).
    For example, one strategy (that might even make sense in some scenarios, e.g. when the CPU is there anyway) would be a softcore-CPU-based design with a few FPGA-optimized hardware accelerators for the critical functions.
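    To make "operations in flight" concrete, a minimal sketch of a pipelined multiply-add (illustrative only - nothing to do with inverse transform sampling as such):
      // Sketch: 3-stage pipelined multiply-add. Three independent
      // operations are in flight at once: throughput is one result
      // per clock, although each result takes three cycles of latency.
      module pipelined_mac (
          input  wire        clk,
          input  wire [15:0] a, b,
          input  wire [31:0] c,
          output reg  [31:0] y
      );
          reg [31:0] prod_s1, c_s1, sum_s2;
          always @(posedge clk) begin
              prod_s1 <= a * b;          // stage 1: multiply
              c_s1    <= c;              // keep c aligned with its product
              sum_s2  <= prod_s1 + c_s1; // stage 2: add
              y       <= sum_s2;         // stage 3: output register
          end
      endmodule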
  8. Like
    xc6lx45 got a reaction from Bobby29999 in Inverse Transform Sampling method on FPGA
    Well OK, what I wrote was maybe not accurate to the final digit ...
    What I meant is this:
    "The IEEE standard goes further than just requiring the use of a guard digit. It gives an algorithm for addition, subtraction, multiplication, division and square root, and requires that implementations produce the same result as that algorithm. Thus, when a program is moved from one machine to another, the results of the basic operations will be the same in every bit if both machines support the IEEE standard. This greatly simplifies the porting of programs. Other uses of this precise specification are given in Exactly Rounded Operations."
    Intel got quite some publicity for their division bug, so I'd assume those should work properly (note, not a word about logarithms) but yeah, this is not my specialty area.
     
  9. Like
    xc6lx45 reacted to zygot in Inverse Transform Sampling method on FPGA
    It's been too long ago, but I do remember taking the scenic side journey into investigating the performance of floating point on Intel processors. Mostly what I remember is that it was interesting, informative, had unexpected surprises, and was a valuable exercise. Just recommending the excursion to anyone interested in 'bit exactness'.
  10. Like
    xc6lx45 got a reaction from Bobby29999 in Inverse Transform Sampling method on FPGA
    I think you need to think carefully about the quality of your math functions. For example, IEEE 754 guarantees bit-exactness in many cases, but on an FPGA you usually cut corners where you can, since you're stuck on reprogrammable logic technology instead of ASIC / hard macros. The low bits rarely matter with one-size-fits-all floats (or "doubles", if one-size-fits-all floats are too small 🙂) but from a cryptographic point of view (where the statistics of an error also matter), this could be a killer.
  11. Like
    xc6lx45 got a reaction from john_joe in Powering Zybo-Z7-20 directly from USB charger???   
    one unintended consequence is that it will put your board at a potential of 115 V RMS relative to ground (the typical charger has only two mains terminals without ground, and internally uses a high-impedance voltage divider to prevent charge buildup between the primary and secondary sides). You can show it with a digital multimeter; on many units you can even feel the buzz when touching the low-voltage wire. Obviously, not all chargers are the same: on some you can't feel it, and on others it's fairly unpleasant. That said, it's done frequently, e.g. with the Raspberry Pi. I can't comment on the Zybo, but I've used chargers to run FPGA boards standalone without incident.
  12. Like
    xc6lx45 got a reaction from [email protected] in ADC to FFT using Lattice FPGA   
    are you sure that your whole architecture makes sense? FFT size / memory is limited; at such a high rate you can transform only a very short burst. For example, to isolate power-line hum (which is very likely something you'll find in the data) you need at least 20 ms...
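    To put numbers on it (my assumptions for illustration: 100 MSPS and a 4096-point FFT): one transform covers only 4096 / 100e6 ≈ 41 us of signal, while a single 50 Hz period is 20 ms - about 500 times longer than the transform window.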
  13. Like
    xc6lx45 got a reaction from [email protected] in ADC to FFT using Lattice FPGA   
    I don't know which device family you are using, but for example for MachXO3 locate this document
    http://www.latticesemi.com/-/media/LatticeSemi/Documents/ApplicationNotes/IK/ImplementingHighSpeedInterfaceswithMachXO3-Devices.ashx?document_id=50122
    The high-speed generic DDR (GDDR) interfaces are supported through the built-in gearing logic in the Programmable I/O (PIO) cells. This gearing is necessary to support high-speed I/O while reducing the performance requirement on the FPGA fabric.
    The idea is to instantiate a PIO cell that captures data on both clock edges for you. Do not think in terms of rising and falling clock edges for general logic on an FPGA: Verilog or VHDL simply offers more flexibility than the FPGA technology can deliver. There is also a lot of ASIC- or simulation-driven (or just clueless) material on the web to cause confusion.
    Unless you're into specialist applications, e.g. low power, stick with one clock edge for the whole design. (Reality check: it might seem a clever idea to use both edges, especially given that Lattice devices often have invertible clock inputs. But the hardware is never perfectly symmetrical, so duty-cycle uncertainty will always cost some timing margin compared with the equivalent single-edge design running a double-frequency clock. It also gains nothing, because standard logic cells (exception: the PIO above) with edge-sensitive inputs only trigger on one edge at a time - this goes down to the transistor-level circuitry.)
    Then, use a 24-bit FIFO. You will not be able to deinterleave the data in real time - that would mean running at 400 MHz. Do not underestimate the difficulty of reaching even 200 MHz - I suggest you do some early synthesis trials to avoid hitting the wall with a design that works in simulation but can never close timing on hardware. I don't know all the Lattice families by heart, but "Lattice", 200 MHz and what I read between the lines of your first post rings an alarm bell for me - e.g. it comes as a surprise how many logic levels your design suddenly needs when you've worked with 6-/7-input LUTs before and are now down to four.
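    To illustrate what the gearing buys you, here is a behavioral sketch of a 1:2 DDR input gearbox (illustration only - in a real design, instantiate the PIO primitive from the app note rather than writing this in fabric):
      // Behavioral model: sample D on both edges of clk and hand the
      // fabric a 2-bit word per clock cycle, so the fabric logic runs
      // at the clock rate instead of twice the clock rate.
      module iddr_gearbox_model (
          input  wire clk,
          input  wire d,
          output reg  q_rise,   // bit captured on the rising edge
          output reg  q_fall    // bit captured on the falling edge
      );
          reg d_fall;
          always @(negedge clk)
              d_fall <= d;      // falling-edge sample (lives in the PIO cell)
          always @(posedge clk) begin
              q_rise <= d;      // rising-edge sample
              q_fall <= d_fall; // retime into the rising-edge domain
          end
      endmodule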
  14. Like
    xc6lx45 reacted to zygot in How do you use verilator with Xilinix unimacro elements?   
    You already have a proper simulator in Vivado. Learn how to write a testbench. Where's the Verilator evangelist when a potential convert needs him?
  15. Like
    xc6lx45 reacted to [email protected] in need advice for budget friendly road to FPGA based soft(firm)ware defined radio   
    @apchar,
    I recently had an opportunity to do some gateware-defined radio work with an iCE40 Ultra 5LP, a Digilent MIC3 MEMS microphone Pmod, and an SX1257 Pmod for a radio.  Hardware-wise, the setup was really easy to do - and so I'd recommend it to you if you are interested.
    You can find my project, source code and references to hardware, here.
    Dan
  16. Like
    xc6lx45 got a reaction from SigProcbro in Zynq 7000 Baremetal with webpack.   
    I don't think there's anything to stop you.
    As a starting point, you could create an "FSBL" (first-stage boot loader) project and cut it down to size by deleting anything you don't need. There are some essential parts you do need, such as powering up the level shifters between PS and PL, enabling PL clocks, etc.
    IMHO, the easiest way to get there is to start from working code, even if you intend not to use a single line of Xilinx code in the long run.
    Note that when you do this, storing a faulty FSBL in flash memory can soft-brick your board by preventing JTAG access at power-up. This is easy to fix if you're aware of the issue: it takes some paper-clip acrobatics to short a flash pin at power-up so the board doesn't boot from flash. It happens rarely - I think I've had to do this once or twice in total. Some boards have boot-mode jumpers, in which case no paper clip is required...
  17. Like
    xc6lx45 got a reaction from SigProcbro in Zynq 7000 Baremetal with webpack.   
    you can single-step through the FSBL, so that's probably a "yes".
    Note that documentation may become an issue (regardless of webpack or paying license): The ARM core is Xilinx-customized, so the ARM documentation helps only to a point. For example, non-standard use of the cache controller is such a topic.
    If you do want to use Xilinx libraries (but no OS) then forget everything I've written. It's a straightforward design flow => click through the menus to generate a new e.g. "Hello World" SDK project in standalone mode. Use an unmodified FSBL that loads your application. It's the obvious way ahead if you want to use external DRAM in your project, which is probably the case.
  18. Like
    xc6lx45 got a reaction from mjacome in Vivado Design Suite for BASYS 3: Mass Installation Inquiry   
    I'm quite sure you can use one account (I have done so on several PCs myself with Webpack).
    Looking at that license file, it says
    HOSTID=ANY
    To me, this suggests (but someone correct me if I'm wrong) that the free WebPACK license isn't even tied to one specific machine.
  19. Like
    xc6lx45 reacted to asmi in Public service announcement: PLL locking   
    You can force the config logic to wait for PLL/MMCM locks before GSR deassertion and design startup. RTFM: UG472 table 3-7, parameter STARTUP_WAIT. But you've got to be careful with this option, as the design will never start if one of the clocks is not present at startup - a typical case being HDMI input, or just about any non-MGT high-speed input for that matter. So it's fine to use it for system clock(s), but it's definitely a "NO" for IO clocks.
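    For example, a sketch of a 7-series MMCME2_BASE instantiation with the attribute set (parameter values are placeholders for an assumed 100 MHz system clock; clk_in, clk_fb, clk_out and locked are signals you'd declare yourself):
      // Sketch: MMCM whose lock gates device startup. With
      // STARTUP_WAIT("TRUE"), config logic holds off the end of the
      // startup sequence (GSR release) until LOCKED asserts.
      MMCME2_BASE #(
          .CLKIN1_PERIOD   (10.0),   // 100 MHz input clock
          .CLKFBOUT_MULT_F (10.0),   // VCO at 1000 MHz
          .CLKOUT0_DIVIDE_F(10.0),   // 100 MHz output clock
          .STARTUP_WAIT    ("TRUE")  // wait for lock before startup
      ) mmcm_i (
          .CLKIN1   (clk_in),
          .CLKFBIN  (clk_fb),
          .CLKFBOUT (clk_fb),
          .CLKOUT0  (clk_out),
          .RST      (1'b0),
          .PWRDWN   (1'b0),
          .LOCKED   (locked)
      );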
  20. Like
    xc6lx45 reacted to hamster in RISC-V RV32I CPU/controller   
    I just had a look at the J1b source, and saw something of interest (well, at least to weird old me):
    4'b1001: _st0 = st1 >> st0[3:0];
    ....
    4'b1101: _st0 = st1 << st0[3:0];
    A 32-bit shifter takes two and a half levels of 4-input, 2-select MUXes per input bit PER DIRECTION (left or right), and the final selection between the two takes another half a LUT, so about 160 LUTs in total (which agrees with the numbers above).
    However, if you optionally reverse the order of bits going in, and then also reverse them going out of the shifter, then the same shifter logic can do both left and right shifts.
    This needs only three and a half levels of LUT6s, and no output MUX is needed. That is somewhere between 96 and 128 LUTs, saving maybe up to 64 LUTs.
    It's a few more lines of quite ugly code, but it might save ~10% of the logic without hurting performance (unless the shifter becomes the critical path...).
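    For example, the idea in (untested) Verilog - reverse on the way in and out, and share one right-shifter:
      // Sketch: one right-shifter serves both directions. For a left
      // shift, bit-reverse the operand going in and the result coming out.
      module shift_both #(parameter W = 32) (
          input  wire [W-1:0] din,
          input  wire [4:0]   amount,
          input  wire         left,   // 1 = shift left, 0 = shift right
          output wire [W-1:0] dout
      );
          wire [W-1:0] pre, shifted;
          genvar i;
          generate
              for (i = 0; i < W; i = i + 1) begin : rev
                  assign pre[i]  = left ? din[W-1-i]     : din[i];
                  assign dout[i] = left ? shifted[W-1-i] : shifted[i];
              end
          endgenerate
          assign shifted = pre >> amount;   // the single shared shifter
      endmodule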
  21. Like
    xc6lx45 reacted to hamster in RISC-V RV32I CPU/controller   
    I've just posted my holiday project to Github - Rudi-RV32I - https://github.com/hamsternz/Rudi-RV32I
    It is a 32-bit CPU, memory and peripherals for a simple RISC-V microcontroller-sized system for use in an FPGA.
    It is a very compact implementation, using under 750 LUTs and as little as two block RAMs - < 10% of an Artix-7 15T.
    All instructions can run in a single cycle, at around 50 MHz to 75 MHz. Actual performance currently depends on the complexity of the system bus.
    It has full support for the RISC-V RV32I instructions, and has supporting files that allow you to use the RISC-V GNU toolchain (i.e. standard GCC C compiler) to compile programs and run them on your FPGA board. 
    Here is an example of the sort of code I'm running on it - a simple echo test that counts characters on the GPIO port that I have connected to the LEDs:
    // These match the addresses of the peripherals on the system bus.
    volatile char *serial_tx       = (char *)0xE0000000;
    volatile char *serial_tx_full  = (char *)0xE0000004;
    volatile char *serial_rx       = (char *)0xE0000008;
    volatile char *serial_rx_empty = (char *)0xE000000C;
    volatile int  *gpio_value      = (int *)0xE0000010;
    volatile int  *gpio_direction  = (int *)0xE0000014;

    int getchar(void) {
       // Wait until the RX FIFO has a character
       while(*serial_rx_empty) {
       }
       // Return the received character
       return *serial_rx;
    }

    int putchar(int c) {
       // Wait until the TX FIFO has space
       while(*serial_tx_full) {
       }
       // Output character
       *serial_tx = c;
       return c;
    }

    int puts(char *s) {
       int n = 0;
       while(*s) {
          putchar(*s);
          s++;
          n++;
       }
       return n;
    }

    int test_program(void) {
       puts("System restart\r\n");
       /* Run a serial port echo */
       *gpio_direction = 0xFFFF;
       while(1) {
          putchar(getchar());
          *gpio_value = *gpio_value + 1;
       }
       return 0;
    }

    As it doesn't have interrupts it isn't really a general-purpose CPU, but somebody might find it useful for command and control of a larger FPGA project (converting button presses or serial data into control signals). It is released under the MIT license, so you can do pretty much whatever you want with it.
    Oh, and all resources are inferred, so it is easily ported to different vendor FPGAs (unlike vendor IP controllers).
  22. Like
    xc6lx45 got a reaction from JColvin in hard working FPGA...   
    Happy new year
     
  23. Like
    xc6lx45 got a reaction from Arjun in Why we need SOC (Procesor + FPGA), if we can do our all work with FPGA???   
    Hi,
    learning a new language well is a major investment => constant cost. Picking an inadequate language / technology / platform is a cost multiplier.
    Which one hurts more? For a small project the learning effort dominates so you tend to stick with the tools you've got. Try this in a large project and the words "uphill battle" or "deathmarch" will come to life...
    There's a human component to this question: say my local expert has decades of experience with FORTH coding on relay logic - you can bet what his recommendation will be, backed by some very quick prototyping within a day or two.
    And if you have ...
    >> someone good in verilog or vhdl,
    ... who is opposed to learning C, you have interesting days ahead...
    Ultimately, implementing non-critical, sequential functionality in FPGA fabric is a dead end for several reasons. Start with cost - a LUT is much, much more expensive than its functional equivalent in RAM on a processor. Build time is another. The "dead end" may well stretch all the way to success, but don't lose sight of it. You will see it clearly when it's right in front of your nose.
    Now this is highly subjective, but my first guess (knowing nothing about the job, assuming it's not small and not geared towards either side by e.g. performance requirements) is that an implementation on Zynq would take me 3..10x less effort than using HDL only. The gap may widen further when requirements change mid-project (again, this is highly subjective, but you have considerably more freedom in C to keep things "simple and stupid", use floats where it's not critical, get direct access to a debug UART, ...).
    On the other hand, Zynq is a very complex platform and someone needs to act as architect - it may well be that the "someone good in verilog" will get it right first time in a HDL-only design but need architectural iterations on Zynq because the first design round was mainly for learning. Take your pick.
    Most likely, Zynq is the best choice if you plan medium-/long term, and the (low-volume!) pricing seems quite attractive compared to Artix.
     
  24. Like
    xc6lx45 got a reaction from Arjun in Why we need SOC (Procesor + FPGA), if we can do our all work with FPGA???   
    ... some numbers. Yes, apples are not oranges; this is about orders of magnitude, not at all a scientific analysis, and maybe slightly biased.
    Take the Zynq 7010. It has 17600 LUTs. Let's count each as 64 bits => 1.1 MBit for the logic functions of my application (if you like, add 2.1 MBit BRAM => 3.2 MBit).
    Now the ARM processor: While it's probably only a small add-on in terms of silicon area / cost (compare with the equivalent Artix - it's even cheaper - weird world...) it includes
    256 kB on-chip memory
    512 kB on-chip L2 cache
    which is 6.1 MBit
    So we've already got several times the amount of "on-chip floorspace" for the application logic, and it'll probably run faster than FPGA logic since it's ASIC technology, not reprogrammable logic - typically clocked at 666 MHz (-1 speed grade), where a non-tuned / non-pipelined design on the PL side will probably end up between 100 and 200 MHz.
    Needless to say, offloading application logic to DRAM or flash is trivial where an RTL-only implementation hits the end of the road - somewhat stretchable by buying a bigger chip, maybe partial reconfiguration, or biting the bullet and adding a softcore CPU, which will be so pathetically slow that the ARM will run circles around it on one leg. Right, I forgot: the above-mentioned 7010 actually has two of them.
  25. Like
    xc6lx45 got a reaction from Arjun in Why we need SOC (Procesor + FPGA), if we can do our all work with FPGA???   
    This is really something to consider in the long term. X and A have a strong interest in making us use their respective processor offerings. Nothing ever is for free, and we may pay the price later when e.g. some third-party vendor (think China) shows up with more competitive FPGA silicon but I'd need a year to migrate my CPU-centric design.
    For industrial project reality, accepting vendor lock-in may be the smaller evil but if you have the freedom to look ahead strategically (personal competence development is maybe the most obvious reason for doing so, maybe also government funding) there may be wiser options.
    This is at least what keeps me interested in soft-core CPUs, even though their absolute KPIs are abysmally bad.