hamster

Everything posted by hamster

  1. Hi! You are welcome - notes added: http://hamsterworks.co.nz/mediawiki/index.php/Simple_AXI_Slave#Notes_from_the_Internet Answers: 1) Might be a bug - if it is, email me the fix. 2) No idea - but you need to have the correct number of transfers to stop the bus from locking up. I suspect DMA might be needed. 3) "Where is the WRAP type necessary? How do I use the PS in WRAP mode?" Interesting question. I assume it is used to fill cache lines: the data at the address being requested is delivered first, then the rest of the cache line is filled. As this address range is uncached, I suspect it is unused.
  2. Yes, of course. But at the time I was playing with that project I was only interested in developing a working understanding of AXI. I assume there is a magical binary file that contains all the configuration bits for the PS subsystem block, which forms part of the configuration/boot image for the Zynq....
  3. 1) That is based on the address window assigned to the AXI port in the PS configuration. Look at the screen grab below the block diagram for the PS on that web page. 2) The hello program was made in the Eclipse IDE that Vivado provides for bare-metal development. Under Linux you will either need a device driver, or some sort of shared memory scheme, or - for a quick hack - I would tend to open /dev/mem and seek and write in there.
  4. Yes. An AXI slave interface would be needed to read/write one of the BRAM ports, and the PL design can read/write the second port of the BRAM (a minimal dual-port sketch follows below). Another option would be to make an AXI master that can write into and/or read from the PS's memory, but architecting it that way seems wrong. The last option would be to route some of the MIO pins into the fabric, but that gives you a very narrow, restrictive, low-bandwidth connection compared to an AXI slave.
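    Just to illustrate the shared-BRAM idea, here is a minimal sketch (names, widths and depth are all made up for this example) of a true dual-port RAM written in the style the synthesis tools usually infer into block RAM. Port A would be driven by the AXI slave logic and port B by the rest of the PL design:

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      entity shared_bram is
          Port ( -- Port A: driven by the AXI slave logic
                 clk_a  : in  STD_LOGIC;
                 we_a   : in  STD_LOGIC;
                 addr_a : in  STD_LOGIC_VECTOR(9 downto 0);
                 din_a  : in  STD_LOGIC_VECTOR(31 downto 0);
                 dout_a : out STD_LOGIC_VECTOR(31 downto 0);
                 -- Port B: driven by the rest of the PL design
                 clk_b  : in  STD_LOGIC;
                 we_b   : in  STD_LOGIC;
                 addr_b : in  STD_LOGIC_VECTOR(9 downto 0);
                 din_b  : in  STD_LOGIC_VECTOR(31 downto 0);
                 dout_b : out STD_LOGIC_VECTOR(31 downto 0));
      end shared_bram;

      architecture Behavioral of shared_bram is
          type ram_type is array (0 to 1023) of STD_LOGIC_VECTOR(31 downto 0);
          -- A shared variable is the usual template the tools accept for
          -- inferring a true dual-port block RAM with writes on both ports.
          shared variable ram : ram_type := (others => (others => '0'));
      begin
          port_a: process(clk_a)
          begin
              if rising_edge(clk_a) then
                  if we_a = '1' then
                      ram(to_integer(unsigned(addr_a))) := din_a;
                  end if;
                  dout_a <= ram(to_integer(unsigned(addr_a)));
              end if;
          end process;

          port_b: process(clk_b)
          begin
              if rising_edge(clk_b) then
                  if we_b = '1' then
                      ram(to_integer(unsigned(addr_b))) := din_b;
                  end if;
                  dout_b <= ram(to_integer(unsigned(addr_b)));
              end if;
          end process;
      end Behavioral;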
  5. Hi, I haven't been around here much lately, but I saw your post and thought I had something of use to you. Here is an AXI slave, connected to the PL, written in nothing but VHDL - it only decodes a few addresses, and doesn't decode the entire assigned address range: http://hamsterworks.co.nz/mediawiki/index.php/Simple_AXI_Slave Feel free to drop me an email, but it has been a while since I've had spare time to do any of this stuff....
  6. Depending on what you are after, a full FFT might not be appropriate. It also depends on your desired throughput rates (e.g. 48 kS/s for audio, or 100 MS/s for radio work). 'Streaming' FFT functions are possible, where the oldest sample is removed from the calculation and the new sample added - but the data can't be 'windowed', which can be a problem. Maybe you could tell us more about the end result you are after?
  7. Remember to add current-limiting resistors. The drive strength setting for an I/O pin is how much current it can source/sink while still giving valid logic voltages. Short-circuit currents are much higher - 50mA or more.
  8. Hmmm.... DATA_P and DATA_N are not mirrors of each other. You might also need to delay DATA_P and DATA_N with respect to the clock signal, so the data is sampled in the middle of the bit (see the sketch below).
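    As a very rough illustration of that idea (and only an assumption about how your interface works - the signal names are made up): if the data changes on the rising edge of the clock, registering it on the falling edge samples approximately mid-bit:

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;

      -- Hypothetical sketch only: if data_p/data_n change on the rising edge
      -- of clk, registering them on the falling edge samples roughly in the
      -- middle of each bit. A real design may need IDELAY/IDDR primitives and
      -- proper differential input buffers instead.
      entity mid_bit_sample is
          Port ( clk       : in  STD_LOGIC;
                 data_p    : in  STD_LOGIC;
                 data_n    : in  STD_LOGIC;
                 sampled_p : out STD_LOGIC;
                 sampled_n : out STD_LOGIC);
      end mid_bit_sample;

      architecture Behavioral of mid_bit_sample is
      begin
          process(clk)
          begin
              if falling_edge(clk) then
                  sampled_p <= data_p;
                  sampled_n <= data_n;
              end if;
          end process;
      end Behavioral;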
  9. I know it sounds stupid, but sorry, that is how long it takes. There is no short cut. For example, global optimization may cause logic to get 'thrown away', and that would need to get inserted back into the design somehow.
  10. Re: "Basys3 & Pmods as XADC"

    The pins have no ability to perform ADC - they are dedicated digital I/O. You could however get an ADC PMOD module, plug that in, and use that.
  11. I think it is best to view it as a car assembly line. The quickest way to assemble a single car is for each team to do their step one after the other - the car will take the combined time of all operations, but you are only building one car at a time. Fine if you are building a single McLaren F1 race car. The quickest way to assemble lots of cars is a production line. Each team does their step, and then the car moves on to the next team for the next process. Making the first car takes (number of steps) * (length of longest step). But once you have your first car, you get another car every (length of longest step). But what if mounting a motor takes five minutes, and the next longest operation takes only three minutes? You can only make one car every five minutes, and the rest of the teams are idle for at least 40% of the time. The solution might be to split mounting the engine into two steps, each taking up to three minutes. Then you can make a car every three minutes. In FPGA-speak, this is pipelining to get the highest clock speed. Big gains are easy at first, but then it gets harder and harder to get any faster, as more parts of the design become critical to the timing of the overall pipeline. The other solution might be to combine pairs of the three-minute steps so no step takes longer than five minutes. That way you only need half the resources, yet can still produce a car every five minutes... this is the "once you meet your FPGA's timing constraints, re-balancing the pipeline can save you resources" strategy. (A tiny VHDL illustration of pipelining follows below.)
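    To put the analogy into VHDL, here is a minimal, hypothetical sketch (all names and widths are made up): adding three numbers is split into two short pipeline stages. The first result appears after two clock cycles instead of one, but each cycle only has to get through a single adder, so the clock can run faster:

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      entity pipelined_sum is
          Port ( clk : in  STD_LOGIC;
                 a   : in  UNSIGNED(15 downto 0);
                 b   : in  UNSIGNED(15 downto 0);
                 c   : in  UNSIGNED(15 downto 0);
                 sum : out UNSIGNED(17 downto 0));
      end pipelined_sum;

      architecture Behavioral of pipelined_sum is
          signal ab_sum : UNSIGNED(16 downto 0) := (others => '0');
          signal c_dly  : UNSIGNED(15 downto 0) := (others => '0');
      begin
          process(clk)
          begin
              if rising_edge(clk) then
                  -- Stage 1: the first team adds a and b
                  ab_sum <= ('0' & a) + b;
                  c_dly  <= c;                   -- keep c in step with ab_sum
                  -- Stage 2: the second team adds in c
                  sum    <= ('0' & ab_sum) + c_dly;
              end if;
          end process;
      end Behavioral;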
  12. Oh, sorry - I forgot to answer. 2) This is a post-implementation view, so any optimizations that could be made have been made. I am guessing (and this is only guessing) that during the optimization things have been pushed around. Are you displaying a full range of colours, or just a couple? The 24-bit RGB seems to have been reduced to just a couple of bits. You can assign a "KEEP" attribute to the signals in the video generator block, and then they should not be optimized away (see the snippet below for how to attach it).
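    For what it is worth, this is roughly how the attribute is attached in VHDL - the module and signal names below are just placeholders standing in for your video generator:

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      entity video_gen_stub is
          Port ( clk   : in  STD_LOGIC;
                 pixel : out STD_LOGIC_VECTOR(23 downto 0));
      end video_gen_stub;

      architecture Behavioral of video_gen_stub is
          signal red, green, blue : STD_LOGIC_VECTOR(7 downto 0) := (others => '0');

          -- Ask synthesis to keep these nets so they remain visible in the
          -- post-implementation netlist (Vivado also understands MARK_DEBUG).
          attribute keep : string;
          attribute keep of red   : signal is "true";
          attribute keep of green : signal is "true";
          attribute keep of blue  : signal is "true";
      begin
          process(clk)
          begin
              if rising_edge(clk) then
                  -- Dummy pattern, just so the signals have drivers.
                  red   <= std_logic_vector(unsigned(red)   + 1);
                  green <= std_logic_vector(unsigned(green) + 2);
                  blue  <= std_logic_vector(unsigned(blue)  + 3);
              end if;
          end process;

          pixel <= red & green & blue;
      end Behavioral;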
  13. 1) It is easiest to explain why this is needed. An internal PLL has to lock onto the pixel clock. If it fails to lock (because the clock was out of range when it attempted to lock), or it becomes unlocked (e.g. the clock stops or changes frequency), it needs to be reset so it can start locking onto the now-stable input clock again. I will answer (2) when I am not on my phone :-)
  14. I don't think you have to keep it consistent within a module, but it makes sense to. The conventions I have seen and use are: signals are always "downto". This makes numbers make sense - e.g. with the literal "00110000":

      sig_downto <= "00110000";  -- assigns binary 48
      sig_to     <= "00110000";  -- assigns binary 12

    Arrays are always "to". This makes initialization make sense - the first element in the list goes into slot zero. If you do want to flip bits, this can do it for you:

      out_sig(0) <= in_sig(2) when rev = '1' else in_sig(0);
      out_sig(1) <= in_sig(1) when rev = '1' else in_sig(1);  -- middle bit maps to itself
      out_sig(2) <= in_sig(0) when rev = '1' else in_sig(2);
  15. I can't see what the issue is... can you describe exactly how you are expecting this to reverse the order of the bits? If your assignment into 'reverse_order_bits' does change the order, then when you assign that into 'out_sig' it will also reverse the order, undoing the first reversal. So if the output always equals the input, then 'rev' (and that means 'SW[4]') has no effect, and can be optimized out. And if out_sig always equals in_sig, then your design does indeed contain no logic, just three connections from input pins to output pins. All seems consistent to me. Try it with explicitly reversing the order of the bits (one way to do that is sketched below).
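    If it helps, here is a minimal, hypothetical sketch of an explicit bit reversal using a generate loop (the width and names are made up for the example):

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;

      -- Hypothetical example: reverse the bit order of in_sig when rev = '1'.
      entity bit_reverser is
          Port ( in_sig  : in  STD_LOGIC_VECTOR(2 downto 0);
                 rev     : in  STD_LOGIC;
                 out_sig : out STD_LOGIC_VECTOR(2 downto 0));
      end bit_reverser;

      architecture Behavioral of bit_reverser is
      begin
          gen_bits: for i in in_sig'range generate
              -- Bit i either passes straight through or comes from the
              -- mirrored position, depending on 'rev'.
              out_sig(i) <= in_sig(in_sig'high - i) when rev = '1' else in_sig(i);
          end generate;
      end Behavioral;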
  16. Hi, the common way would be to use slices of the std_logic_vector:

      signal seg : std_logic_vector(3 downto 0) := "0001";
      ...
      if rising_edge(clk_240) then
          seg <= seg(2 downto 0) & seg(3);
      end if;

    or the opposite way:

      if rising_edge(clk_240) then
          seg <= seg(0) & seg(3 downto 1);
      end if;

    You most likely also want a concurrent statement to assign the value to the module's output, and to allow for the fact that the displays are usually active low (i.e. digits are switched on when the output is 0):

      seg_output <= not seg;

    This also helps because, if 'seg_output' is declared as "out" in the module definition, you can't read its value to update it (yes, a little bit silly for VHDL). VHDL does have ROL, ROR, SLA and SRL operators, but as far as I know nobody uses them.
  17. I believe you can set a simulation flag to enforce range bounds checking, and it will error if you end up with an out-of-range value. It is off by default. You got lucky with your range testing on hardware - if you had a range from 0 to 5, it would most likely still count through 6 and 7. The synthesis tools have this sort of internal dialog:

    - Hmm, this integer will be used with values between 0 and 7, as that is what the HDL tells me.
    - I could replace it with an UNSIGNED that can hold values between 0 and 7, as it only needs positive numbers.
    - To hold the numbers between 0 and 7 I only need 3 bits.
    - Great! I will implement this signal as a 3-bit unsigned binary value.
    - Because this is a 3-bit value I can update it with a 3-bit adder.

    The tools don't statically inspect the rest of the code to enforce that you will only use numbers in the stated range. They also don't add any extra logic to enforce the range as stated (e.g. clamping, or wrapping of results); you have to ensure that your design stays within the stated range. For this reason I prefer to use UNSIGNED(n downto 0), so I know exactly what the tools will produce (a small example of the two styles follows below). I am sure others will disagree and prefer the higher level of abstraction, and ranges are very useful in simulation (as long as the "range check" option is turned on!).

    ... diversion... An important consideration for this is when inferring memory blocks. If you define a memory block that has 768 entries, you might end up with a physical memory block that has 1024 entries, 256 of which you are not using. Or it might be made up of three blocks with 256 entries each, plus the logic required to make them work as if they were a single memory block. So why is this important? Well, if you set the write address to an invalid index of 1023 or 768 (as only 0 to 767 are valid) you might find that you corrupt entry 511 or 256 by accident. Or maybe you won't today, depending on what the synthesis tools felt like doing with the address decode logic during the last build. The tools are designed to take as many shortcuts as possible to give you exactly what you asked for, no more and no less, with the most optimized use of FPGA resources. Don't be surprised if unexpected inputs give unexpected outputs.
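    Here is a small, hypothetical illustration of that preference - the two counters below should synthesize to the same 3-bit logic, but the unsigned version makes the width (and the wrap-around behaviour) explicit:

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      entity range_example is
          Port ( clk   : in  STD_LOGIC;
                 count : out STD_LOGIC_VECTOR(2 downto 0));
      end range_example;

      architecture Behavioral of range_example is
          -- Integer with a stated range: the tools infer 3 bits, but nothing
          -- stops the hardware from wrapping through 6 and 7 if the code only
          -- "promises" a smaller range.
          signal count_int : integer range 0 to 7 := 0;

          -- Explicit 3-bit unsigned: the width and wrap-around are obvious.
          signal count_uns : unsigned(2 downto 0) := (others => '0');
      begin
          process(clk)
          begin
              if rising_edge(clk) then
                  if count_int = 7 then
                      count_int <= 0;
                  else
                      count_int <= count_int + 1;
                  end if;
                  count_uns <= count_uns + 1;    -- wraps 7 -> 0 by definition
              end if;
          end process;

          count <= std_logic_vector(count_uns);
      end Behavioral;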
  18. With a 10k resistance only 0.26mA will be flowing from the emitter to the base of the transistor. Depending on the current gain of the transistor (maybe 100x), that still only gives 26mA to drive the motor. Use a 330 ohm resistor and see if that helps. That should allow around 10mA to pass through the base, allowing the transistor to switch around 1A. Also, it is good practice to have a snubber (flyback) diode across the motor, to prevent it from damaging your FPGA board. You may also want to use an NPN transistor, allowing the emitter to be connected to ground rather than to the positive (regulated) power rail. This would be especially helpful if the motor is to be powered by a different supply.
  19. The PMOD connector pins are not able to provide much power - at best a few mA. You will need to add something to increase the power - either a transistor or MOSFET switch, or maybe a full H-bridge driver. Also, the voltage regulators on the FPGA board are only engineered to power the FPGA and a few low-power add-ons, so you will need an additional external power source to drive the motor.
  20. In general the tools will let you try anything. If you want to "ski off piste" they let you. They may warn strongly against it, or might even need you to stamp your feet a little with directives and settings, but you might have a valid need to do what you are asking. They also can only control what you do inside the FPGA. For example, if you had "output_pins <= std_logic_vector(unsigned(input_pins)+1);" and wired your outputs and inputs together, you would get exactly what you have inside the chip (but a little slower). If you work with software you should already be used to this. Take this small C program:

      $ cat div.c
      #include <stdlib.h>
      int main(int argv, char *argc[]) {
        return atoi(argc[1])/0;
      }
      $ gcc -o div div.c -Wall -pedantic -O4
      div.c: In function 'main':
      div.c:3:23: warning: division by zero [-Wdiv-by-zero]
        return atoi(argc[1])/0;
                            ^
      $ ./div 123
      Floating point exception (core dumped)
      $

    Should dividing by a constant zero be a critical error? How is the compiler supposed to know if you are really doing this by accident, or are writing a program to test the behavior of divide by zero? Part of the learning curve for FPGAs is to develop a feeling for what is the "safe area" to work in, and to recognize when you are wandering outside of it. It is a shame that in this case you stumbled into this by accident, but you get big bonus points from me for realizing that you were in a strange & weird place, and attempting to make some sense of it before moving on. (BTW, I like playing in these areas.)
  21. Sorry to have irritated you... Here is my example. I used an unsigned datatype for the counter, and all bits are used - the low 24 on the PMODs, the high 8 on the LEDs. It was built for the Nexys2 board, using the latest version of ISE.

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      entity btn_test is
          Port ( clk_100M : in  STD_LOGIC;
                 btnC     : in  STD_LOGIC;
                 led      : out STD_LOGIC_VECTOR(7 downto 0);
                 seg      : out STD_LOGIC_VECTOR(6 downto 0);
                 an       : out STD_LOGIC_VECTOR(3 downto 0);
                 dp       : out STD_LOGIC;
                 JA       : out STD_LOGIC_VECTOR(7 downto 0);
                 JB       : out STD_LOGIC_VECTOR(7 downto 0);
                 JC       : out STD_LOGIC_VECTOR(7 downto 0));
      end btn_test;

      architecture Behavioral of btn_test is
          signal up_test : unsigned(31 downto 0) := (others => '0');
      begin
          count_button_process_triggers: process(btnC)
          begin
              up_test <= up_test + 1;
          end process;

          JA  <= std_logic_vector(up_test( 7 downto  0));
          JB  <= std_logic_vector(up_test(15 downto  8));
          JC  <= std_logic_vector(up_test(23 downto 16));
          led <= std_logic_vector(up_test(31 downto 24));
          seg <= (others => '0');
          an  <= (others => '1');
          dp  <= '0';
      end Behavioral;

    The constraints are almost the standard Nexys2 ones, so I won't include them. Although it produces a metric truckload of warnings about combinatorial loops, it does produce a BIT file, and when programmed it does do as expected - or rather, what it arguably shouldn't be doing. Without a change in state of btnC the counter is counting up. It cycles through all 32-bit values in about 5 seconds, so each addition is taking about 1.2ns (or the counter is free-running at about 800 MHz, if you like). Or maybe it doesn't - maybe it skips some values and the lowest bits are not counting properly. But it looks as if it counts, without a clock or an event on the btnC signal.
  22. I have actually done this, and the speed only partially depends on the device parameters, the biggest being propagation delay. Moving from a breadboard to hard-wired connections made a dramatic change in frequency (20%+) due to inductance and capacitance. Extra decoupling (which improved switching speed) also changed things, and supply voltage made a big difference too.
  23. The length of the sync pulses is important, but if you know what they should be, and you have a "data enable" signal that is asserted during the active pixel period, then it is logically possible to regenerate the sync pulses, using the start of the active video periods and their length as a reference point. You can count the pixels and lines, work out the format (e.g. 640x480, 800x600...), then use that to see when the sync periods should start and end for standard video resolutions, and then add your own sync signals.
  24. On some random web post I found this:

      Timing recommendation #3.... 720x576 at 50Hz
      Modeline................ "720x576" 27,000 720 732 796 864 576 581 586 625 -hsync -vsync
      ...
      Timing recommendation #5.... 720x576 at 50Hz
      Modeline................ "720x576" 27,000 720 732 796 864 576 581 586 625 -hsync -vsync

    Because it is larger than 640x480 it should work with standard HDMI. How to interpret the numbers: 27,000 is the pixel clock (27 MHz).

    Horizontal:
    - 720 active pixels
    - 732 start of horizontal blanking
    - 796 end of horizontal blanking
    - 864 total length of line

    Vertical:
    - 576 active lines
    - 581 start of vertical blanking
    - 586 end of vertical blanking
    - 625 total lines

    How to generate an hsync signal out of nothing? The easiest way is to:
    - Convert the input into three signals: Y, C and a one-bit Active_Pixel (aka 'data enable').
    - Add a horizontal pixel counter that counts 0 to 863 - reset it when you see the first active pixel of a line, otherwise leave it free-running.
    - You can then generate hsync while the horizontal counter is between 732 and 795.

    How to generate the vsync signal:
    - Detect when you have had more than 144 cycles without active data (144 = 864 clocks per line - 720 active pixels per line). This lets you know you are in the vertical blanking interval, as the horizontal blanking is only 144 cycles.
    - Then reset the line counter when you see the next active pixel, as this will be the first visible line.
    - Increment the line counter whenever the horizontal counter wraps from 863 back to 0.
    - Generate the vsync pulse while the line counter is between 581 and 585.

    A rough VHDL sketch of these counters follows below.
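    Here is a minimal, untested sketch of those counters, assuming the 720x576@50 timing above and that you already have the recovered pixel clock and a 'data enable' input. Names and widths are made up, and it is an illustration of the idea rather than a drop-in module (note the modeline also calls for negative-going syncs, so you may need to invert the outputs):

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.NUMERIC_STD.ALL;

      -- Sketch: regenerate hsync/vsync for 720x576@50 from a data-enable signal.
      entity sync_regen is
          Port ( clk    : in  STD_LOGIC;   -- 27 MHz pixel clock
                 active : in  STD_LOGIC;   -- 'data enable' - high during visible pixels
                 hsync  : out STD_LOGIC;
                 vsync  : out STD_LOGIC);
      end sync_regen;

      architecture Behavioral of sync_regen is
          signal h_count  : unsigned(9 downto 0) := (others => '0'); -- 0 to 863
          signal v_count  : unsigned(9 downto 0) := (others => '0'); -- 0 to 624
          signal idle     : unsigned(7 downto 0) := (others => '0'); -- clocks since active data
          signal active_d : std_logic := '0';
          signal in_vbi   : std_logic := '0';
      begin
          process(clk)
          begin
              if rising_edge(clk) then
                  active_d <= active;

                  -- Free-running horizontal counter, resynchronized to the start
                  -- of each active line (the rising edge of 'active').
                  if active = '1' and active_d = '0' then
                      h_count <= (others => '0');
                  elsif h_count = 863 then
                      h_count <= (others => '0');
                      if v_count /= 624 then        -- count lines on each wrap
                          v_count <= v_count + 1;
                      end if;
                  else
                      h_count <= h_count + 1;
                  end if;

                  -- More than 144 idle clocks (= 864 total - 720 active per line)
                  -- means we are somewhere in the vertical blanking interval.
                  if active = '1' then
                      idle <= (others => '0');
                  elsif idle /= 255 then
                      idle <= idle + 1;
                  end if;
                  if idle > 144 then
                      in_vbi <= '1';
                  end if;

                  -- The first active pixel after the blanking interval is the
                  -- start of the first visible line.
                  if in_vbi = '1' and active = '1' then
                      v_count <= (others => '0');
                      in_vbi  <= '0';
                  end if;
              end if;
          end process;

          hsync <= '1' when h_count >= 732 and h_count <= 795 else '0';
          vsync <= '1' when v_count >= 581 and v_count <= 585 else '0';
      end Behavioral;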
  25. Hi again - sorry, busy weekend! I had a look at the IP in Digilent's GitHub (https://github.com/Digilent/vivado-library). It looks like it can be configured to support slow pixel clocks. From CLKGEN.VHD: kClkRange : natural := 1; -- MULT_F = kClkRange*5 (choose >=120MHz=1, >=60MHz=2, >=40MHz=3, >=30MHz=4, >=25MHz=5). So to support a 27MHz pixel clock you would need to set kClkRange to 5. However.... I also had a look at the standards for the video format you mentioned. It seems to have a "data" clock of 27MHz, but each pixel has a "Y" and a "C" value, so the actual pixel clock is 13.5MHz. It is also an interlaced format (with odd and even fields to each frame). This format is structurally incompatible, so a simple direct translation is not possible. It will really need a complex project to receive this video into memory, de-interlace it, and then send it out as progressive video. The timing of "standard definition" broadcast formats (PAL, NTSC...) is not compatible with DVI or HDMI...