hamster

Members
  • Content Count

    543
  • Days Won

    84

Everything posted by hamster

  1. @Dan, I managed to find the 'average' option rather than 'decimate', which has stopped high-frequency noise showing up in the spectra, giving a more reasonable noise floor.
  2. BASYS3 + PMOD Breadboard + Analog Discovery 2. It was just a hack, so the table was a quick formula in a spreadsheet. Yes, I assume the + and - sides are both rounding towards zero, causing some asymmetry, but with 11 significant bits that should be somewhere about -60dB at a guess. Most of the noise is just the shoddy physical implementation: flying wires on a breadboard, on PMODs, just the shielded wires on the AD2 and so on. If I leave a wire hanging around it will pick up most of the noise too, maybe 6dB lower than on the channel that is under measurement. This was just a quick experiment, using the 100MHz clock rate. If I use a slower clock (e.g. update only every 8 cycles, so 12.5MHz) the noise floor actually drops a lot. Also, using the AD2 on a different laptop to the one programming/powering the FPGA removes a lot of noise too. I assume that this is due to voltage drops and noise on the USB cables. Plenty of room for experimentation and improvement.
  3. I played around with a 1st and 2nd order 12-bit Sigma Delta DAC implemented on the FPGA. I found the results quite interesting, as the change is pretty simple to implement and the change to the noise on the output spectrum is quite significant, with a lowered 2nd harmonic and a much smoother noise floor. VHDL code is on GitHub at https://github.com/hamsternz/second_order_sigma_delta
  4. I think of std_logic_vector the same way I would think of digits in a number... the rightmost digit is digit zero. It most likely isn't the best way to set things up for this example, but it avoids the need to swap the bit ordering in the ASCII characters. Oh, for the button synchronization... signals take time to get across the chip (speed of light, capacitance and so on), so different parts of the design can see different values for the same signal as it changes, unless it is synchronized to the clock. As you can't control when the user might press the button, you have to sample the value of the input signal on the clock edge, holding that in a register. That registered value is then used to drive the rest of your logic. There is a slight complication - if the signal from the button changes state *exactly* on the clock edge, the flip-flop might not be able to correctly register it as a 1 or a 0, but could be in some weird "metastable" state that takes a short while to become either a 1 or a 0. To stop this causing bugs in the operation of the logic deeper in the design, the output of that flip-flop is then sampled a second time to get a "known good, either 1 or 0" signal that can get to where it needs to within a clock cycle. Hence the design pattern... btn gets sampled into btn_metastable (which is a bit dodgy if you use it), and then btn_metastable gets sampled into btn_synchronized, which is then used by the rest of the logic.
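As a rough illustration of why the first-order version is so simple, here is a minimal C model of a first-order sigma-delta modulator. It is only a sketch: it assumes a 1-bit output and an unsigned 12-bit input, which may not match the linked VHDL, and the names `sd1_t`, `sd1_step` and `sd1_count_ones` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical first-order sigma-delta state: a 12-bit accumulator. */
typedef struct {
    uint16_t acc; /* only the low 12 bits are kept between clocks */
} sd1_t;

/* One clock: add the 12-bit sample to the accumulator and use the
 * carry out of bit 11 as the 1-bit output. */
int sd1_step(sd1_t *s, uint16_t sample)
{
    assert(sample < 4096);                      /* 12-bit input only */
    uint16_t sum = (uint16_t)(s->acc + sample); /* 13-bit result */
    s->acc = sum & 0x0FFF;                      /* keep the remainder */
    return (sum >> 12) & 1;                     /* carry = output bit */
}

/* Count the '1' outputs over n clocks for a constant input. */
unsigned sd1_count_ones(uint16_t sample, unsigned n)
{
    sd1_t s = {0};
    unsigned ones = 0;
    for (unsigned i = 0; i < n; i++)
        ones += (unsigned)sd1_step(&s, sample);
    return ones;
}
```

Over 4096 clocks the number of '1' outputs equals the input value, so the average of the bit stream tracks the sample. A second-order version adds another integrator in front of this one.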
  5. Had a hack at it... tested working on BASYS3

     library IEEE;
     use IEEE.STD_LOGIC_1164.ALL;
     use IEEE.NUMERIC_STD.ALL;

     entity msg_repeater is
         Port ( clk : in  STD_LOGIC;
                btn : in  STD_LOGIC;
                tx  : out STD_LOGIC;
                led : out STD_LOGIC_VECTOR (3 DOWNTO 0));
     end msg_repeater;

     architecture Behavioral of msg_repeater is
         constant char_t     : std_logic_vector (7 downto 0) := "01010100";
         constant char_e     : std_logic_vector (7 downto 0) := "01000101";
         constant char_s     : std_logic_vector (7 downto 0) := "01010011";
         constant char_space : std_logic_vector (7 downto 0) := "00100000";

         -- Note message is sent from bit 0 to the highest
         signal msg : std_logic_vector (49 downto 0) :=
             char_space & "0" &     -- Character 4, with start bit
             "1" & char_t & "0" &   -- Character 3, wrapped in start and stop bit
             "1" & char_s & "0" &   -- Character 2, wrapped in start and stop bit
             "1" & char_e & "0" &   -- Character 1, wrapped in start and stop bit
             "1" & char_t & "0" &   -- Character 0, wrapped in start and stop bit
             "1";                   -- Idle symbol and stop bit of the last character
         signal msg_index : unsigned( 7 downto 0) := (others =>'0');

         -- Should we send a message?
         signal triggered : std_logic := '0';

         -- for generating the baud tick rate
         constant clock_rate : natural := 100000000;
         constant baud_rate  : natural := 9600;
         signal baud_counter : unsigned(27 downto 0) := (others => '0');
         signal baud_tick    : std_logic := '0';

         -- For the button synchronizer
         signal btn_synchronized : std_logic := '0';
         signal btn_metastable   : std_logic := '0';
     begin

     process(clk)
         begin
             if rising_edge(clk) then
                 -- Set the serial output bit
                 tx  <= msg(to_integer(msg_index));
                 led <= "0001";

                 -- Controlling the message index
                 if baud_tick = '1' then
                     if msg_index = 0 then
                         -- We are waiting to be triggered
                         if triggered = '1' then
                             msg_index <= msg_index + 1;
                         end if;
                     elsif msg_index = msg'high then
                         -- We have finished the message
                         msg_index <= (others => '0');
                         triggered <= '0';
                     else
                         -- We are sending bits
                         msg_index <= msg_index + 1;
                     end if;
                 end if;

                 -- Generating the baud tick
                 if baud_counter < baud_rate then
                     baud_tick    <= '1';
                     baud_counter <= baud_counter - baud_rate + clock_rate;
                 else
                     baud_tick    <= '0';
                     baud_counter <= baud_counter - baud_rate;
                 end if;

                 -- Seeing if we are triggered
                 if btn_synchronized = '1' then
                     triggered <= '1';
                 end if;

                 -- Synchronize the button with the clock domain
                 btn_metastable   <= btn;
                 btn_synchronized <= btn_metastable;
             end if;
         end process;

     end Behavioral;
  6. Away from my laptop at the moment, but the way I would do this: use a register sized to hold your clock rate (28 bits for 100MHz). If it is less than the baud_rate, set 'baud_tick' to '1' and add (clock_rate - baud_rate) to the register. Otherwise set 'baud_tick' to '0' and subtract the baud rate from the register. That will give you a 'baud_tick' that is '1' for the right number of cycles per second, and allows you to keep everything in the design running in the same clock domain. You also want to have a synchronizer on your button, to make it work reliably. You also have a problem in that when the button is lifted you will stop sending data straight away, so you will most likely send an incomplete message. Due to the way that VHDL signal assignments work you will also try to send out bit 50 of your message, so that might mess things up. Would you like me to have a crack at rewriting it for you, so you can see the difference?
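The tick generator described above can be checked with a small C model. This is only a sketch of the same arithmetic in software, not hardware code; the names `baud_gen_t`, `baud_gen_step` and `baud_gen_count_ticks` are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the baud tick register. */
typedef struct {
    uint32_t clock_rate; /* clock cycles per second */
    uint32_t baud_rate;  /* ticks wanted per second */
    uint32_t counter;    /* stays in the range 0 .. clock_rate-1 */
} baud_gen_t;

/* Advance one clock cycle; returns 1 when a baud tick is produced. */
int baud_gen_step(baud_gen_t *g)
{
    assert(g->baud_rate > 0 && g->baud_rate <= g->clock_rate);
    if (g->counter < g->baud_rate) {
        g->counter += g->clock_rate - g->baud_rate;
        return 1;
    }
    g->counter -= g->baud_rate;
    return 0;
}

/* Count the ticks produced over a number of clock cycles. */
uint32_t baud_gen_count_ticks(uint32_t clock_rate, uint32_t baud_rate,
                              uint32_t cycles)
{
    baud_gen_t g = { clock_rate, baud_rate, 0 };
    uint32_t ticks = 0;
    for (uint32_t i = 0; i < cycles; i++)
        ticks += (uint32_t)baud_gen_step(&g);
    return ticks;
}
```

Running the model for exactly clock_rate cycles gives exactly baud_rate ticks, even when the two rates do not divide evenly, which is the point of using an accumulator rather than a fixed divide-by-N counter.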
  7. hamster

    Basys 3 clock question

    IC9 is the 100MHz oscillator. It is on the bottom, just to the right of center. See the bottom left of page 5 of the schematics (on the resources page). Part code is DSC1033CC1-100.0000T
  8. You need a top level module that describes all the signals that are connected to your FPGA. It will then contain all your synthesizable VHDL components that make up your design, and describe how they are connected to each other and the outside world.
  9. Ok - here's how to drive the seven segments, from 1000 feet up.

     You need to have the constraints for the segments and the anodes for the display. See the board's reference manual and master XDC file for them.

     ##7 segment display
     #set_property -dict { PACKAGE_PIN T10 IOSTANDARD LVCMOS33 } [get_ports { CA }]; #IO_L24N_T3_A00_D16_14 Sch=ca
     #set_property -dict { PACKAGE_PIN R10 IOSTANDARD LVCMOS33 } [get_ports { CB }]; #IO_25_14 Sch=cb
     #set_property -dict { PACKAGE_PIN K16 IOSTANDARD LVCMOS33 } [get_ports { CC }]; #IO_25_15 Sch=cc
     #set_property -dict { PACKAGE_PIN K13 IOSTANDARD LVCMOS33 } [get_ports { CD }]; #IO_L17P_T2_A26_15 Sch=cd
     #set_property -dict { PACKAGE_PIN P15 IOSTANDARD LVCMOS33 } [get_ports { CE }]; #IO_L13P_T2_MRCC_14 Sch=ce
     #set_property -dict { PACKAGE_PIN T11 IOSTANDARD LVCMOS33 } [get_ports { CF }]; #IO_L19P_T3_A10_D26_14 Sch=cf
     #set_property -dict { PACKAGE_PIN L18 IOSTANDARD LVCMOS33 } [get_ports { CG }]; #IO_L4P_T0_D04_14 Sch=cg
     #set_property -dict { PACKAGE_PIN H15 IOSTANDARD LVCMOS33 } [get_ports { DP }]; #IO_L19N_T3_A21_VREF_15 Sch=dp
     #set_property -dict { PACKAGE_PIN J17 IOSTANDARD LVCMOS33 } [get_ports { AN[0] }]; #IO_L23P_T3_FOE_B_15 Sch=an[0]
     #set_property -dict { PACKAGE_PIN J18 IOSTANDARD LVCMOS33 } [get_ports { AN[1] }]; #IO_L23N_T3_FWE_B_15 Sch=an[1]
     #set_property -dict { PACKAGE_PIN T9 IOSTANDARD LVCMOS33 } [get_ports { AN[2] }]; #IO_L24P_T3_A01_D17_14 Sch=an[2]
     #set_property -dict { PACKAGE_PIN J14 IOSTANDARD LVCMOS33 } [get_ports { AN[3] }]; #IO_L19P_T3_A22_15 Sch=an[3]
     #set_property -dict { PACKAGE_PIN P14 IOSTANDARD LVCMOS33 } [get_ports { AN[4] }]; #IO_L8N_T1_D12_14 Sch=an[4]
     #set_property -dict { PACKAGE_PIN T14 IOSTANDARD LVCMOS33 } [get_ports { AN[5] }]; #IO_L14P_T2_SRCC_14 Sch=an[5]
     #set_property -dict { PACKAGE_PIN K2 IOSTANDARD LVCMOS33 } [get_ports { AN[6] }]; #IO_L23P_T3_35 Sch=an[6]
     #set_property -dict { PACKAGE_PIN U13 IOSTANDARD LVCMOS33 } [get_ports { AN[7] }]; #IO_L23N_T3_A02_D18_14 Sch=an[7]

     I suggest you uncomment them and update them so the first 8 are called "segments[0]" through "segments[7]" and the last 8 are "anodes[0]" through "anodes[7]". You will also need constraints for the other signals you are using - e.g. the clock signal.

     You will then need to add the outputs to the top level design:

     ...
     segments : out std_logic_vector(7 downto 0);
     anodes   : out std_logic_vector(7 downto 0);
     ...

     In your top level design you can select which digit to light by setting one of the anodes low, and also setting the pattern you want on the segments. For example:

     anodes   <= "11111110";
     segments <= "01010101";

     should light segments B, D, F and the decimal point on digit 0 of the display (a '0' turns a segment on). See section 9.1 of the reference manual to see which segment is where on the display.

     With that going you will then want to add a new sub-module to your design that takes the 8 digits you want to display and converts them to a pattern of segments and anodes, at a fast enough rate that they don't flicker (e.g. show each digit for 1/400th of a second, then move on to the next). The interface to this component will be something like:

     entity seven_seg_display is
         port ( clk      : in  std_logic;
                digit0   : in  std_logic_vector(3 downto 0);
                digit1   : in  std_logic_vector(3 downto 0);
                digit2   : in  std_logic_vector(3 downto 0);
                digit3   : in  std_logic_vector(3 downto 0);
                digit4   : in  std_logic_vector(3 downto 0);
                digit5   : in  std_logic_vector(3 downto 0);
                digit6   : in  std_logic_vector(3 downto 0);
                digit7   : in  std_logic_vector(3 downto 0);
                segments : out std_logic_vector(7 downto 0);
                anodes   : out std_logic_vector(7 downto 0) );
     end entity;

     Once you have that module connected, and with the body of it written (it is a counter and a few case statements), you should then do a bit of testing (e.g. connect the switches to the digits) and should be ready to display your data. This testing step is vital, to ensure that your LED patterns are correct and that you have got left and right correct for the anodes.

     Another test might be to create a 32-bit long counter, and then connect the digits to bit slices within that counter.
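To make "a counter and a few case statements" a little more concrete, here is a small C model of the lookups involved. It is a sketch under assumptions: segments[0] is segment A through segments[6] as G with segments[7] as the decimal point, outputs are active low, the clock is 100MHz, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Active-low segment pattern for one hex digit.
 * Bit 0 = segment A ... bit 6 = segment G, bit 7 = decimal point (off). */
uint8_t seven_seg_pattern(uint8_t digit)
{
    /* Conventional active-high patterns for 0-9, A, b, C, d, E, F. */
    static const uint8_t active_high[16] = {
        0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,
        0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71
    };
    assert(digit < 16);
    return (uint8_t)~active_high[digit]; /* invert: '0' lights a segment */
}

/* Active-low anode pattern that enables exactly one of the 8 digits. */
uint8_t anode_select(unsigned position)
{
    assert(position < 8);
    return (uint8_t)~(1u << position);
}

/* Which digit to show on a given clock cycle, assuming a 100MHz clock
 * and 1/400th of a second per digit (250000 cycles each). */
unsigned scan_position(uint32_t clock_cycle)
{
    return (unsigned)((clock_cycle / 250000u) % 8u);
}
```

In the VHDL module the table becomes a case statement on the selected 4-bit digit, and the scan position comes from the top bits of a free-running counter.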
  10. Oh, here is a paper that might provide some insight... have a search for "Efficient FPGA Implementation of the RC4 Stream Cipher using Block RAM and Pipelining filetype:pdf" (The URL was too long to post here).
  11. Hi! RC4 was designed so that it doesn't map well to dedicated hardware, but uses only a small amount of compute power, so it can be implemented on all but the tiniest microcontrollers. Implementing it in an FPGA is therefore a great learning experience of which software problems are hard in hardware, and why. I looked into implementing RC4 a long while ago, and decided that a high performance implementation is pretty much impossible (where high performance is greater than one byte encoded/decoded per cycle).

      I've had a look at your code, and it looks like you are writing only for simulation. For example:

      -- Initialize and return an integer array 0 - 255
      function initialS256 return t_Integer_Array is
          variable S: t_Integer_Array(0 to 255);
      begin
          for i in 0 to 255 loop
              S(i) := i;
          end loop;
          return S;
      end initialS256;

      This won't turn into useful hardware if you try to implement it in an FPGA. What you want to do is restructure it around some sort of block diagram style system, with a big focus on how you intend to store the array that holds the cipher state - is it all in a block RAM, held in distributed RAM, or just in 2048 flip-flops (256 bytes of 8 bits each)? Each has a different set of restrictions that limit how you can describe the hardware.

      What are you doing the project for? Is it for a course or self-directed learning?

      As for the 7-seg, you should be able to find other posts covering that here and/or in the FPGA reference manual, or the Digilent GitHub repo.
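For reference, this is what the standard RC4 algorithm looks like in software, as a minimal C sketch (the type and function names are made up for the example). It is here to show the memory access pattern rather than as a hardware design: every output byte needs three reads and two writes to the state array, and the addresses depend on data that has just been read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RC4 cipher state: a 256-byte permutation plus two index bytes. */
typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_t;

/* Key scheduling: start from the identity permutation, then shuffle it
 * with 256 key-dependent swaps. */
void rc4_init(rc4_t *c, const uint8_t *key, size_t key_len)
{
    assert(key_len >= 1 && key_len <= 256);
    for (unsigned n = 0; n < 256; n++)
        c->s[n] = (uint8_t)n;
    uint8_t j = 0;
    for (unsigned n = 0; n < 256; n++) {
        j = (uint8_t)(j + c->s[n] + key[n % key_len]);
        uint8_t t = c->s[n];
        c->s[n] = c->s[j];
        c->s[j] = t;
    }
    c->i = 0;
    c->j = 0;
}

/* Produce one keystream byte: read S[i], read S[j], swap them,
 * then read S[S[i] + S[j]]. */
uint8_t rc4_next(rc4_t *c)
{
    c->i = (uint8_t)(c->i + 1);
    c->j = (uint8_t)(c->j + c->s[c->i]);
    uint8_t t = c->s[c->i];
    c->s[c->i] = c->s[c->j];
    c->s[c->j] = t;
    return c->s[(uint8_t)(c->s[c->i] + c->s[c->j])];
}
```

With a dual-port block RAM you only get two accesses per cycle, so the chain of dependent reads and writes above is what limits throughput in hardware.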
  12. On this topic, I've been making an audio DSP board using the CMOD A7, where additional noise is a real pain. My initial prototype board had some audio noise problems - I couldn't hear it but I could measure it. I initially thought it was due to the CMOD-A7 and could not be fixed, but eventually put it down to quite a few different causes:

      - I had nearly shorted the output of one of the DACs to GND, which was causing spikes on the power rail. Once fixed things were a lot better, but not perfect.
      - I had not made any real attempt to stitch the top fill to the ground plane on the bottom - after all it was a hack.
      - I didn't have any series resistors in the I2S lines. I added 50 ohm ones (just picked a random value out of the air - might look at this again).
      - I had a few capacitor bodges standing up in the air, which could only make things worse.
      - I was measuring very close to the FPGA, with a high impedance scope probe.

      So I addressed all of these in the next prototype, and made up a test jig allowing me to measure 30cm from the board, and things are much better - to the point I can't reliably measure any additional noise in the audio band. I guess what I am trying to say is that even with just one GND pin the CMOD-A7 can be part of a low noise audio system, but you have to put some extra thinking and work in to make it happen. This may or may not be of use to your use-case.
  13. The last of the parts came in and the new board is up and running. Here are the old and new boards side by side, and the spectrum of a 10kHz test tone going from the ADC, through the FPGA and then the DAC (top = new board, middle = old board, bottom = no board in the loop). The additional work I did on grounding on the PCB has paid off, with a very good noise floor - better than I can measure with the tools I have to hand.
  14. I just had a look at the J1b source, and saw something of interest (well, at least to weird old me):

      4'b1001: _st0 = st1 >> st0[3:0];
      ....
      4'b1101: _st0 = st1 << st0[3:0];

      A 32-bit shifter takes two and a half levels of 4-input, 2-select MUXes per input bit PER DIRECTION (left or right), and the final selection between the two takes another half a LUT, so about 160 LUTs in total (which agrees with the numbers above). However, if you optionally reverse the order of bits going in, and then also reverse them going out of the shifter, then the same shifter logic can do both left and right shifts. This needs only three and a half levels of LUT6s, and no output MUX is needed. That is somewhere between 96 and 128 LUTs, saving maybe up to 64 LUTs. It's a few more lines of quite ugly code, but might save ~10% of logic and may not hit performance (unless the shifter becomes the critical path...).
  15. The toolchain is pretty simple to build but takes a while - for me it was just clone https://github.com/riscv/riscv-gnu-toolchain, create /opt/riscv (and change ownership), then run './configure' with the correct options, then 'make'. There are a whole lot of different instruction set options and ABIs, so I definitely recommend building from source rather than downloading prebuilt images. At the moment I haven't included any of the stdlib or soft floating point. I'll add that to the "todo someday" list.
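The bit-reversal trick can be checked with a quick C model. This is a sketch of the idea only, not J1b code; `reverse_bits32` and `shift_shared` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the 32 bits in a word. In an FPGA this is
 * just wiring, so it costs no logic on its own. */
uint32_t reverse_bits32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Shift left or right using a single right shifter: for a left shift
 * the bits are reversed on the way in and again on the way out. */
uint32_t shift_shared(uint32_t value, unsigned amount, int left)
{
    assert(amount < 32);
    uint32_t in  = left ? reverse_bits32(value) : value;
    uint32_t out = in >> amount; /* the only real shifter */
    return left ? reverse_bits32(out) : out;
}
```

In hardware the two conditional reversals become 2:1 muxes on the shifter's input and output, which is where the extra LUT level comes from.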
  16. I've just posted my holiday project to GitHub - Rudi-RV32I - https://github.com/hamsternz/Rudi-RV32I

      It is a 32-bit CPU, memory and peripherals for a simple RISC-V microcontroller-sized system for use in an FPGA. It is a very compact implementation that can use under 750 LUTs and as little as two block RAMs - < 10% of an Artix-7 15T. All instructions can run in a single cycle, at around 50MHz to 75MHz. Actual performance currently depends on the complexity of the system bus.

      It has full support for the RISC-V RV32I instructions, and has supporting files that allow you to use the RISC-V GNU toolchain (i.e. the standard GCC C compiler) to compile programs and run them on your FPGA board.

      Here is an example of the sort of code I'm running on it - a simple echo test that counts characters on the GPIO port that I have connected to the LEDs.

      // These match the address of the peripherals on the system bus.
      volatile char *serial_tx       = (char *)0xE0000000;
      volatile char *serial_tx_full  = (char *)0xE0000004;
      volatile char *serial_rx       = (char *)0xE0000008;
      volatile char *serial_rx_empty = (char *)0xE000000C;
      volatile int  *gpio_value      = (int *)0xE0000010;
      volatile int  *gpio_direction  = (int *)0xE0000014;

      int getchar(void) {
        // Wait until status is zero
        while(*serial_rx_empty) {
        }
        // Read character
        return *serial_rx;
      }

      int putchar(int c) {
        // Wait until status is zero
        while(*serial_tx_full) {
        }
        // Output character
        *serial_tx = c;
        return c;
      }

      int puts(char *s) {
        int n = 0;
        while(*s) {
          putchar(*s);
          s++;
          n++;
        }
        return n;
      }

      int test_program(void) {
        puts("System restart\r\n");

        /* Run a serial port echo */
        *gpio_direction = 0xFFFF;
        while(1) {
          putchar(getchar());
          *gpio_value = *gpio_value + 1;
        }
        return 0;
      }

      As it doesn't have interrupts it isn't really a general purpose CPU, but somebody might find it useful for command and control of a larger FPGA project (converting button presses or serial data into control signals).

      It is released under the MIT license, so you can do pretty much whatever you want with it. Oh, all resources are inferred, so it is easily ported to different vendor FPGAs (unlike vendor IP controllers).
  17. Clip the oscilloscope ground lead to the probe tip, and wave it near the board. Tell us what you see...
  18. WAV files are the simplest to work with.

      1. WAV files have a small header, and then the rest is raw sample data, usually stereo pairs of 16-bit signed numbers. Just write a small program in your favorite scripting language to print out the data after about 64 bytes.
      2. For phone-quality audio, you need a bandwidth of 300Hz to 3kHz. This needs around 8000 samples per second, and about 8-bit sample depth. You could use some u-law or a-law compression to increase dynamic range (https://en.wikipedia.org/wiki/Μ-law_algorithm)
      3. 8 kilobytes per second, if you play raw 8-bit samples.

      Oh, and to convert data from a WAV file to lower sample rates (e.g. from 48kS/s to 8kS/s) you can't just drop five out of six samples - you need to first filter off the frequencies greater than half the target sample rate. It's not that challenging to actually do in code (usually just a couple of 'for' loops around something like 'out[x] += in[x+i] * filter[i]') but generating the magic values for the filter can be interesting.
  19. The "DC and Switching characteristics" tells you the delays in the primitives, but can't tell you the routing delays. The only way to truly know is to build the design in Vivado, and then look at the timing report. Inference of DSP blocks and features is pretty good as long as your design is structured to map onto the DSP slices. There are little gotchas, like not attempting to reset registers in the DSP slice that don't support it. Skim reading the DSP48 User Guide will pay off many times over in time saved from not having to redesign stuff over and over to help it map to the hardware.
  20. My views - if you want to learn low-level stuff (e.g. VHDL/Verilog coding), buy a board with lots of buttons, LEDs, switches and different I/O over a more application-specific development board. I think that the Basys3 is pretty good for this and better than the Arty. Once you have sharpened your skills, then look for a board that will support your projects. If you want to initially work at a systems level, using IP blocks and so on, then look for a board that has interfaces that support your area of interest. Debugging H/W when you are also debugging FPGA designs is no fun. A Zynq based board (e.g. Zybo) would be good, as it already has a CPU that is much better (faster, less power, better features) than a CPU you could implement in the FPGA fabric. Just be warned that with a Zynq system the SDRAM memory is usually on the far side of the processor system, so you don't get direct access to it - you need to access it over an AXI interface and compete with the CPU for bandwidth.
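The "couple of 'for' loops" for filter-then-decimate might look like the C sketch below. It is an illustration under assumptions: samples are mono 16-bit, the filter taps are supplied by the caller in Q15 fixed point (32768 = 1.0) and sum to about 1.0, and the function name `fir_decimate` is made up. It does not generate the filter values, which is the interesting part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low-pass filter the input with an FIR filter, keeping only one output
 * for every `factor` input samples (e.g. factor 6 for 48kS/s to 8kS/s).
 * `out` must have room for the results; returns how many were written. */
size_t fir_decimate(const int16_t *in, size_t in_len,
                    const int16_t *taps, size_t n_taps,
                    unsigned factor, int16_t *out)
{
    assert(factor >= 1 && n_taps >= 1);
    size_t n_out = 0;
    /* Only compute the outputs that are kept: step by `factor`. */
    for (size_t x = 0; x + n_taps <= in_len; x += factor) {
        int64_t acc = 0;
        for (size_t i = 0; i < n_taps; i++)
            acc += (int32_t)in[x + i] * taps[i];
        out[n_out++] = (int16_t)(acc / 32768); /* back from Q15 */
    }
    return n_out;
}
```

Because the outer loop steps by the decimation factor, the filtered samples that would be thrown away are never calculated, so the work scales with the output rate rather than the input rate.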
  21. hamster

    Dividing in Verilog.

    It may well do, but not knowing *all* the details of what you are doing means I can't offer you useful advice.
  22. Seen the problem. You need to define both o1[0] and o1[1] in your constraints, as o1 is a vector of two signals. At the moment you are trying to attach both signals to the same pin, hence the error. Ditto for o2, o3 and o4.
  23. hamster

    Dividing in Verilog.

    If you are dividing by a constant you can multiply by the inverse. If you only have a small number of different divisors you could consider a lookup table of inverses. Otherwise you need to implement a binary division algorithm yourself, to meet your throughput and latency needs. Division by arbitrary numbers is quite expensive - best avoided if at all possible.
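As a worked example of multiplying by the inverse, here is a C sketch that divides a 16-bit unsigned value by 10 using one multiply and a shift. The constant 52429 is 2^19 / 10 rounded up; the function names are invented for the example, and the same idea maps to a multiplier plus a bit slice in Verilog.

```c
#include <assert.h>
#include <stdint.h>

/* Divide a 16-bit unsigned value by 10 with a multiply and a shift.
 * 52429 / 2^19 is slightly more than 1/10, and the error is small
 * enough that the result is exact for every 16-bit input. */
uint16_t div10_by_reciprocal(uint16_t x)
{
    return (uint16_t)(((uint32_t)x * 52429u) >> 19);
}

/* Exhaustively compare against real division for all 65536 inputs. */
void div10_self_check(void)
{
    for (uint32_t x = 0; x <= 65535u; x++)
        assert(div10_by_reciprocal((uint16_t)x) == x / 10u);
}
```

For other divisors or wider inputs the constant and shift need to be chosen again and checked the same way, since the rounding error grows with the size of the input.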
  24. Are you showing us all of the file? What does the top level of your design look like? Do you have any other XDC files in your project?