xc6lx45

Members
  • Content count

    165
  • Days Won

    9

xc6lx45 last won the day on May 5

xc6lx45 had the most liked content!

About xc6lx45

  • Rank
    Prolific Poster

Contact Methods

  • Website URL
    https://www.linkedin.com/in/markus-nentwig-380a4575/

Profile Information

  • Gender
    Male
  • Location
    MUC
  • Interests
    RF / DSP / algorithms / systems design / implementation / characterization / final testing and creative abuse of Pedal Steel Guitars

  1. Theory of pipelining/parallelism

    >> To me, it's not obvious how adding flipflops to everything is better. Simply chop the work into smaller pieces, then process them at a higher rate. For example, take the Horner scheme for polynomial evaluation c0 + x*(c1 + x*(c2 + x*c3)). I could do that in a single clock cycle, maybe at 10 MHz. Or I chop it into 14 operations (some to allow for hardware registers in the DSP48 blocks) and suddenly it runs at 160 MHz. I posted the code a while ago; if interested, I can look it up.
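    To illustrate the "chop into smaller pieces" idea, here is a minimal Python sketch (not the posted Verilog) that evaluates the Horner scheme as a sequence of single multiply or add steps. Each entry in `trace` corresponds to a value that a pipeline register could hold. The function name, coefficients and input value are made up for illustration.

```python
def horner_staged(coeffs, x):
    """Evaluate c0 + x*(c1 + x*(c2 + ...)) one elementary operation at a time.

    coeffs is [c0, c1, c2, ...]; trace records the value after every step,
    i.e. what a register between two pipeline stages would contain.
    """
    acc = coeffs[-1]
    trace = [acc]
    for c in reversed(coeffs[:-1]):
        acc = acc * x          # one multiplication per step
        trace.append(acc)
        acc = acc + c          # one addition per step
        trace.append(acc)
    return acc, trace


coeffs = [1.0, -2.0, 0.5, 3.0]   # hypothetical c0..c3
x = 0.25
result, trace = horner_staged(coeffs, x)
direct = sum(c * x**k for k, c in enumerate(coeffs))
print(result, direct, len(trace))
```

    In hardware, every step can work on a different input sample in the same clock cycle, so the throughput stays at one result per clock while the clock period only has to cover one small operation.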
  2. CMOD A7 Native Schematic files

    Hi, the Vivado USB-JTAG functionality is (AFAIK) licensed, it's not an open-source design. Check the schematic. For Artix, the only such thing that I know of (but I haven't really searched) is the DIPFORTy1 in its first board revision. It comes without USB JTAG. For Spartan 6 there is a bigger selection of free boards, e.g. Papilio or Pipistrello come to mind, with Eagle schematics and board files.
  3. SPI TFT Display

    In the meantime I've turned the VGA adapter into an AXI-LITE slave, attached. E.g. put it as RTL module into a ZYNQ design. Autoconnect will then add an "interconnect" to convert between AXI and AXI-Lite. Use a clock wizard (MMCM mode works best) to provide the 25.175 MHz VGA clock. The bus clock does not matter. It's tested for basic functionality. For example, this...

        int main() {
            init_platform();
            volatile char* p = (char*)(0x40000000+848);
            for (int ix = 0; ix < 20; ++ix){
                *p = *p ^ 0x80;
                ++p;
            }
            ...

    ... for a base address of 0x40000000 inverts the first 20 characters by toggling bit 7 (run it twice without reloading the bitstream => back to original). vga.v
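    The "run it twice => back to original" behaviour follows from XOR being its own inverse. A small Python sketch with a made-up text buffer (the buffer contents and the helper name are assumptions, not part of the attached vga.v):

```python
def toggle_bit7(buf, start, count):
    """Flip bit 7 of `count` bytes starting at index `start`, in place."""
    for i in range(start, start + count):
        buf[i] ^= 0x80


# Hypothetical character memory; real contents depend on the design.
screen = bytearray(b"HELLO FROM THE VGA TEXT BUFFER")
original = bytes(screen)

toggle_bit7(screen, 0, 20)
once = bytes(screen)      # first 20 characters now have bit 7 set

toggle_bit7(screen, 0, 20)
twice = bytes(screen)     # second pass restores the original
print(once != original, twice == original)
```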
  4. Not sure if I should comment on this. Well, I've done job interviews for only one (1) FPGA opening, and it was a non-standard case (though quite a few interviews in general). So my experience surely doesn't reflect the state of the FPGA industry specifically. Maybe "work hard then be honest" is a good guideline. If you have worked hard, you can afford to be honest. Doing tutorials alone does not qualify as "working hard". It's spoon-fed baby food. Well, we all needed it at some point but rarely talk about it (insert special thanks here to Numato for their excellent Zynq bare-metal two-pager...) JColvin's "engineering through problems" is spot-on but requires the "working hard" before. A large part of the world's population has their FPGA knowledge from marketing material, not from learning things the hard way. In a job interview, be careful not to be mistaken for one of those. For someone coming from school, having a published project on GitHub etc. might be worth a lot (but: quality, not quantity. This can work against you if the code shows you haven't understood the basics). Refer to that in the CV. Having your own projects also shows you have a genuine interest in this line of work, which is usually a good thing. And keep in mind, you're playing a ball game with the recruiter. Not against him / her. Give them something to talk about and be prepared to do the talking. It's surprisingly hard to come up with good questions that lead somewhere.
  5. Digilent CMOD A7 Disconnects and/or does not Program

    Remembered one thing: You were using only a single PC, is that so? Try another one. It may be that there are broken drivers somewhere in the Windows directory. Maybe the USB hardware is unreliable. A powered hub won't automatically fix it. If a full install is too much work, use the LabToy.exe file and put the FTDI drivers into the same folder (or install them system-wide via FTDI's installer). Speaking of which, updating them to the latest version (8/2017!) might be worth a shot in any case.
  6. Digilent CMOD A7 Disconnects and/or does not Program

    Hi, if you want a 2nd opinion, you can try my labToy executable. It runs completely independent JTAG code. Most likely, it'll fail the same. But if it happens to work reliably, the next question would be "why". I suspect it's USB power related - you might give it a shot to supply the boards externally.
  7. SPI TFT Display

    PS: If anybody tries this code with jumper cables to the monitor and the picture is blurry in horizontal direction, ground pin 10 (VSYNC GND). With a 15-pin breakout board this should not be an issue.
  8. Vivado is bugged up the *** per usual (no, my bad)

    10 points for hamster, case closed. It's double reversal. Both cases are identical, so the rev input does not contribute to irreducible logic, gets optimized away with a warning (see above). Everything works as designed.
  9. Vivado is bugged up the *** per usual (no, my bad)

    laughing out loud... but a lie implies intent - Vivado would plead for diminished responsibility or insanity. Well... It just tells you its view, without much interpretation. And this is post-optimization - the assumption is that all signals contribute to irreducible logic. If something gets optimized away, it gives this kind of warning, which seems weird at first but is just a statement of the "hard" facts available to the tool. Interpretation is left to the user. Did you notice that it completely optimized out the "reversed" option but apparently not the non-reversed path (no warning for sw(0..3))? The warning about "clk" is legit - you have a clock in a fully combinational design. I'd get rid of that (or register the output) just to clear the waters. I'd look at the elaborated design, this is Vivado's view of your input. If you can pinpoint the problem, I'm curious to know what exactly caused it.
  10. sine wave generation in arty 7 35t

    For the digital generation you could also try this one (in addition to above). It's based on cubic interpolation but isn't specifically optimized for sine (increase the number of segments in the wavetable, see .m file). Unless you have a DAC as a separate component, converting to analog is always a "hack" (the analog accuracy of the digital output driver becomes the bottleneck). That said, it works surprisingly well. (There once was even a Xilinx app note but it was made obsolete.)

    The "soft-core" DAC I came up with is shown below (edited, let's hope I didn't break anything). It's very simple, actually (but might take a couple of iterations with a scope, which is a good idea anyway). I used LVTTL33, DRIVE=24 and SLEW=FAST on the output. PMOD outputs come with a 200 ohms resistor so I can get a lowpass filter by adding a small capacitor. Simple in theory, but finding a good ground can be challenging. Adjusting the PWM frequency to the capacitor is one option. BTW, there are different flavors of sigma-delta converters out there, but they work best in the center of the voltage range, less so near 0 V and +3.3 V.

        // PWM-DAC Markus Nentwig 2016-17
        // This code is in the public domain
        module pn24(i_clk, o_out);
           input wire i_clk;
           output reg [23:0] o_out = 24'h1;
           always @(posedge i_clk) begin
              o_out <= o_out >> 1;
              o_out[23] <= o_out[0];
              o_out[22] <= o_out[23] ^ o_out[0];
              o_out[21] <= o_out[22] ^ o_out[0];
              o_out[16] <= o_out[17] ^ o_out[0];
           end
        endmodule
        ...
        // - try as high CLKDAC frequency as possible
        // - todo: use better PN generation than above

        // === pseudorandom dither generator ===
        wire [23:0] dither;
        pn24 iDitherGenerator(CLKDAC, dither);

        // === regular PWM ===
        reg [4:0] dacCount = 5'd0;
        always @(posedge CLKDAC) dacCount <= dacCount + 5'd1;

        // === combine ===
        wire [17:0] pwm = {dacCount, dither[12:0]};

        // === generate DAC output ===
        reg dacOutput = 1'b0;
        reg [17:0] val = <yourDataSourceHere>;
        always @(posedge CLKDAC) dacOutput <= val > pwm;

    You can get some impression of the performance here (note: don't try to read dBc values from the plot, it would be comparing discrete spikes against power density).
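    A rough way to convince yourself that the dithered PWM comparison produces the right average is a bit-level model. The Python sketch below is an approximation under my own assumptions: it mirrors the register updates of the posted Verilog (shift right, feed bit 0 back into bits 23, 22, 21 and 16), but the chosen `val`, the number of cycles and the function names are arbitrary and it says nothing about analog behaviour.

```python
def pn24_step(state):
    """One clock of pn24: shift right, XOR bit 0 into bits 23, 22, 21, 16."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xE10000   # bit mask for positions 23, 22, 21 and 16
    return state


def pwm_dac_mean(val, cycles):
    """Fraction of clock cycles in which the 1-bit output is high."""
    lfsr = 1      # same reset value as o_out in the Verilog
    count = 0     # models the 5-bit dacCount
    ones = 0
    for _ in range(cycles):
        pwm = (count << 13) | (lfsr & 0x1FFF)   # {dacCount, dither[12:0]}
        ones += val > pwm
        count = (count + 1) & 0x1F
        lfsr = pn24_step(lfsr)
    return ones / cycles


val = 100000                      # arbitrary 18-bit test value
mean = pwm_dac_mean(val, 1 << 17)
print(mean, val / (1 << 18))      # duty cycle vs. ideal val / 2**18
```

    The lowpass filter on the pin averages the bit stream, so the duty cycle is what ends up as the output voltage (scaled by the supply).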
  11. how to build satellite for radio communication?

    Can't use duct tape with satellites. Kapton tape is what keeps those things together
  12. FPGA audio - ADC and DAC

    BTW, one more thought: If all you need to generate is a 1 kHz tone at 96 ksps, a perfect wavetable needs only 96 entries for a full period (or using simple symmetry, 1/4 of that). I don't see where the 12 bits come in. If in doubt, it should be trivial to run the wave calculation in floating point on a Microblaze processor, at least for debugging. Now the THD+N reading from your screenshot is -74 dBc, which is suspiciously close to the quantization noise of a Nyquist-rate 12-bit DAC (eq 12, 6*12+1.7). A modern converter should reach -100 dBc, which would be your 0.001 % target.
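    The numbers above can be checked with a few lines of Python. This is only a sketch: it uses the textbook approximation 6.02*N + 1.76 dB for the signal-to-quantization-noise ratio of an ideal N-bit converter with a full-scale sine, and the function names are my own.

```python
import math

FS = 96_000          # sample rate in samples per second
F_TONE = 1_000       # tone frequency in Hz
N = FS // F_TONE     # samples per period: 96

# One full period; the tone repeats exactly, so no more entries are needed.
table = [math.sin(2 * math.pi * k / N) for k in range(N)]


def tone_sample(n):
    """Sample n of the 1 kHz tone, read from the 96-entry table."""
    return table[n % N]


def ideal_snr_db(bits):
    """Textbook SQNR of an ideal converter with a full-scale sine input."""
    return 6.02 * bits + 1.76


def percent_to_db(percent):
    """Convert an amplitude ratio given in percent to dB."""
    return 20 * math.log10(percent / 100)


print(len(table), ideal_snr_db(12), percent_to_db(0.001))
```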
  13. features of xilinx spartan xc3s50

    Just curious... is this from a catalog of homework questions? If so, please be aware that the FPGA community is relatively small and lecturers read those forums, too. You may find that the questions do not have a simple yes-or-no answer. As a starting point for 3-series features, go here (and sorry but unless you have me convinced this isn't a homework problem I will not answer more "constructively", it helps no one).
  14. FPGA audio - ADC and DAC

    Hi, if you like, try my code based on splines. Historically, it's written for 18x18 multipliers but nowadays Artix offers 25x18 at the same cost so it can still be optimized. I don't have any THD numbers at hand but the fixed-point spline math is about 16 bit accurate to the double precision template. If needed, simply use more segments (see the included Octave .m script, right now 16 for a full cycle).

    Link: fix the path to iverilog.exe in the .bat file; run and load the "loadThisIntoGtkWave..." file into gtkwave => one cycle sine wave. https://drive.google.com/file/d/1xChgXBTQU8rRKrukCGVDTaeEgbNpXQeX/view?usp=sharing

    Here is the main function, tested at 150 MHz (1 output sample / clock cycle). Note, this needs the "wavetable" data file from the archive above to be functional. If I clean out the bus interface and the optional (commented) "aux" tag path, it's actually quite short.

        // # spline-based cyclic waveform generator #
        // # (C) 2015-2018 Markus Nentwig #
        // # This code is in the public domain #
        module spline(input wire i_clk,
                      input wire [17:0] i_x,
                      output wire [17:0] o_y,
                      // input wire [7:0] i_aux, output wire [7:0] o_aux,
                      input wire [31:0] Si_addr,
                      input wire [31:0] Si_data,
                      input wire Si_we);
           parameter ADDRMASK = 32'h0000003F;
           parameter ADDRVAL = 32'h20000000;

           (* ram_style = "distributed" *) reg [17:0] mem[0:63];
           initial begin
        `include "wavetable_WAVEID3.v"
           end

           // === coefficient write ===
           localparam NWRDELAY = 2;
           (* DONT_TOUCH = "TRUE" *)reg [5:0] wa [1:NWRDELAY];
           (* DONT_TOUCH = "TRUE" *)reg [17:0] wd [1:NWRDELAY];
           (* DONT_TOUCH = "TRUE" *)reg we [1:NWRDELAY];
           genvar j;
           generate
              for (j = 2; j <= NWRDELAY; j = j + 1) begin
                 always @(posedge i_clk) begin
                    wa[j] <= wa[j-1];
                    wd[j] <= wd[j-1];
                    we[j] <= we[j-1];
                 end
              end
           endgenerate

           always @(posedge i_clk) begin
              wa[1] <= Si_addr;
              wd[1] <= Si_data;
              we[1] <= Si_we & ((Si_addr & ~ADDRMASK) == ADDRVAL);
              if (we[NWRDELAY])
                mem[wa[NWRDELAY]] = wd[NWRDELAY];
           end

           // MSB unsigned to index coeff bank
           wire [3:0] xMSB0 = i_x[17:14];
           // LSB half-range shift down as signed variable in polynomial
           wire signed [17:0] xLSB0 = {{5{!i_x[13]}}, i_x[12:0]};

           // === pipeline variables ===
           // Note: this is the shortest possible pipeline length that utilizes all hardware registers
           // (PREG and MREG) in the inferred DSP48 blocks (Artix 7)
           localparam PMAX = 14;
           localparam Nc0 = 13;
           localparam Nc1 = 9;
           localparam Nc2 = 5;
           localparam Nc3 = 1;
           reg signed [17:0] xLSB[1:PMAX];
           reg signed [17:0] c3[1:Nc3];
           reg signed [17:0] c2[1:Nc2];
           reg signed [17:0] c1[1:Nc1];
           reg signed [17:0] c0[1:Nc0];
           reg signed [17:0] acc[1:PMAX];
           // reg signed [7:0] aux[1:PMAX];

           // === clear aux ===
           // generate
           //   for (j = 1; j <= PMAX; j = j + 1) begin
           //      initial aux[j] = 8'd0;
           //   end
           // endgenerate

           assign o_y = {!acc[PMAX][17], acc[PMAX][16:0]};
           // assign o_aux = aux[PMAX];

        `ifdef SIM
           wire [17:0] DEBUG_acc3 = acc[3];
           wire [17:0] DEBUG_acc7 = acc[7];
           wire [17:0] DEBUG_acc11 = acc[11];
           reg [31:0] PMAX_NOM;
        `endif

           // === delay variables through pipeline ===
           genvar i;
           generate
              for (i = 2; i <= PMAX; i = i + 1) begin
                 always @(posedge i_clk) begin
                    xLSB[i] <= xLSB[i-1];
                    if (i <= Nc3) c3[i] <= c3[i-1];
                    if (i <= Nc2) c2[i] <= c2[i-1];
                    if (i <= Nc1) c1[i] <= c1[i-1];
                    if (i <= Nc0) c0[i] <= c0[i-1];
                    // aux[i] <= aux[i-1];
                 end
              end
           endgenerate

           integer NEXT;
           integer CURR;
           always @(posedge i_clk) begin
              // === input to pipeline ===
              c3[1] <= mem[(xMSB0<<2)+0];
              c2[1] <= mem[(xMSB0<<2)+1];
              c1[1] <= mem[(xMSB0<<2)+2];
              c0[1] <= mem[(xMSB0<<2)+3];
              xLSB[1] <= xLSB0;
              // aux[1] <= i_aux;

              // === calculation of pipeline stages ===
              // x has 13 fractional bits (input: 17; 4 MSBs stripped off)
              // - after multiplication with x [1, 2, 3, 4], arithmetic right-shift
              //   by 13 bits keeps the decimal point in place. This is equivalent
              //   to floor(...).
              // - for correct midpoint rounding, add 0.5 (1 << 12)
              //   now equivalent to floor(x+0.5)
              NEXT = 2; CURR = 1;
              acc[NEXT] <= c3[CURR];
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // coefficient ram
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // DSP48 input reg
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= (acc[CURR] * xLSB[CURR] + $signed(1 << 12)) >>> 13;
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR] + c2[CURR];
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // output reg
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // input reg
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= ((acc[CURR] * xLSB[CURR] + $signed(1 << 12)) >>> 13);
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR] + c1[CURR];
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // output reg
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR]; // input reg
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= ((acc[CURR] * xLSB[CURR] + $signed(1 << 12)) >>> 13);
              NEXT = NEXT + 1; CURR = CURR + 1;
              acc[NEXT] <= acc[CURR] + c0[CURR];
        `ifdef SIM
              PMAX_NOM <= NEXT;
        `endif
           end
        endmodule
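    The rounding comment in the code (add 1 << 12, then arithmetic right-shift by 13) can be checked against floor(x + 0.5) with a small integer model. The Python sketch below is a simplification under my own assumptions: the coefficient values are made up (the real ones come from the wavetable file), the function names are hypothetical, and Python integers do not wrap at 18 bits, so overflow is not modelled.

```python
import math

FRAC_BITS = 13   # x has 13 fractional bits, as stated in the code comments


def round_shift(product):
    """Add 0.5 LSB, then arithmetic right-shift: floor(product / 2**13 + 0.5)."""
    return (product + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


def horner_fixed(c3, c2, c1, c0, x):
    """Integer Horner scheme with rounding after each multiplication."""
    acc = c3
    acc = round_shift(acc * x) + c2
    acc = round_shift(acc * x) + c1
    acc = round_shift(acc * x) + c0
    return acc


def horner_float(c3, c2, c1, c0, x):
    """Floating-point reference using the same integer-scaled coefficients."""
    xf = x / (1 << FRAC_BITS)
    return ((c3 * xf + c2) * xf + c1) * xf + c0


coeffs = (1200, -3400, 56000, -7000)   # hypothetical c3, c2, c1, c0
worst = max(abs(horner_fixed(*coeffs, x) - horner_float(*coeffs, x))
            for x in range(-4096, 4096))   # signed 13-bit x, i.e. [-0.5, 0.5)
print(worst)
```

    With |x| limited to half the segment width, each rounding error is halved by the following multiplication, so the accumulated error stays below one LSB in this model.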
  15. Why is my Process getting triggered with no change to the sensitivity

    You could inspect the implemented design as a schematic. Now, typical switch bounce extends over thousands of clock cycles. It's a job for a counter, otherwise I guarantee there will be multiple triggers, "down" events on "key up" and vice versa. For the record, to properly bring in an asynchronous signal there should be a synchronizer (see the earlier ASYNC_REG pragma discussion and the related documentation). A single register will catch "almost" 100 % of metastability, but you're asking for trouble in a real-world design when it leads to a once-per-week catastrophic failure that the customer finds "almost" acceptable. See Xilinx docs. What may happen in this specific example is that Q1 is slow to settle when the input voltage creeps through the threshold at the clock edge. Then Q2 sees one result and the expression in Q_OUT sees something else. Boom. A whole clock cycle seems a long time for Q1 to settle, but the design tools may use most of it for routing / logic delay, so I need to consider Q1's slow settling relative to the timing margin, not the clock cycle. This is where the ASYNC_REG pragma comes in: it tells the tool to please put no delay in-between two vanilla registers so that the whole cycle length can be used for settling. And: the official MTBF test circuit (same docs) might be an interesting beast to study with respect to "non-standard" design.
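    As a behavioural illustration of the "job for a counter" remark, here is a small Python model of a debouncer: the output only follows the input after the input has differed from it for a given number of consecutive samples. It is a sketch under assumptions (the sample sequence and the threshold of 4 are made up, real bounce lasts thousands of cycles) and it does not model the synchronizer or metastability at all.

```python
def debounce(samples, stable_cycles):
    """Return the debounced output for each input sample.

    The output changes only after the input has differed from it for
    `stable_cycles` consecutive samples; any glitch resets the counter.
    """
    out = samples[0]
    count = 0
    result = []
    for s in samples:
        if s == out:
            count = 0
        else:
            count += 1
            if count >= stable_cycles:
                out = s
                count = 0
        result.append(out)
    return result


# Hypothetical key press and release, with bounce at both edges.
raw = [0] * 5 + [1, 0, 1, 0, 1, 1, 0] + [1] * 10 + [0, 1, 0] + [0] * 10
clean = debounce(raw, stable_cycles=4)

raw_edges = sum(a != b for a, b in zip(raw, raw[1:]))
clean_edges = sum(a != b for a, b in zip(clean, clean[1:]))
print(raw_edges, clean_edges)
```

    The raw signal shows many edges per key event, while the debounced signal has exactly one edge for "down" and one for "up".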