• 0
rappysaha

Simple code for DDR3 SDRAM

Question

Hi,

I am very new to the field of FPGAs. I am currently working with the Genesys2 and have to control its DDR3 memory. I found some examples on the Digilent site for DDR3 using a MicroBlaze processor, but in my case I don't need a MicroBlaze processor. I need to send a fixed value, such as 8-bit data (X"FF"), through the DDR3 memory; that is, I will write that data into the Genesys2 DDR3 memory and read it back out. I have already gone through the Xilinx manual ug586, but it is still not clear to me how to start coding for the DDR3 memory. My questions are:

1) Is it possible to get example code for the DDR3 memory that does not use a MicroBlaze processor?

Or, failing that, do you have any suggestions for how to start writing code to control the DDR3 memory?

I have to get this done one way or another, so any helpful suggestion will be appreciated.

Thank you.


6 answers to this question

  • 1

@rappysaha,

Wow, I know what you mean!  Those are some wonderful, good, and hard questions to answer.  I'll tell you what I know:

  1. I am working on a project to handle DDR3 SDRAM memory apart from Xilinx's Memory Interface Generator (MIG).  I've been working on that project for about two months now.  If I had it working, I would recommend that approach to you.  (I know I just linked to a project on OpenCores ... and yet OpenCores has been down for a couple of days now.  Kind of sad ...  I've got what I think is a nice blog describing my efforts to date.)  The last news I have is that my "logic" for controlling the DDR3 SDRAM works in my home-made DDR3 simulator, and that (in the simulator) I get about a 9-clock delay from request to completion, where the MIG-generated core takes about 23 clocks from request to completion.  (These are 81.25MHz clock ticks, running on an Arty ...)
  2. Just as a little background:
    1. Every DDR3 memory transfer is 128 bits long.  Sure, the memory supports a 64 bit transfer mode, but that transfer mode still takes as many clocks as the 128-bit transfers, so you don't get any advantage there.  (This assumes a 16-bit bus transfer width, such as the Arty has.  Other bus transfer sizes will scale with the memory width.)
    2. Any transaction to/from the memory that is less than 128 bits is a misnomer--every request that crosses the DDR3 memory bus is 128 bits.
    3. It costs several clocks before you can write to the memory.  If you are familiar with SDRAM, there are 8 banks of memory within each chip.  To read/write memory, it must first be copied from the DRAM to an SRAM--this is called activating the bank.  If the bank was already activated with the wrong "row" of memory, then the bank must be closed, or, as the spec calls it, "precharged", before being activated.  This takes a clock.  Once the bank is activated, you may request to read/write on the memory bus.  The following clock starts a read/write, and the full read/write takes place on the clock following.  In all, a transaction may require one clock to precharge the bank, one clock to activate it on another row, one clock to issue the read/write command, one clock to start the bus going, one clock to transfer the data, and ... Xilinx's MIG adds another 20+ clocks to those 5 clocks of bus interactions.  (Keep in mind, the memory clock is going at 4x the speed of the "clock" of your interface.)
    4. The memory is very particular about what clock speeds it can and cannot support.  (This is why my own controller has, to date, been rewritten about five times ...)  The memory clock period cannot be longer than 3.3ns (roughly 303MHz), and on the Arty it cannot be shorter than 3ns (roughly 333MHz; the spec goes much faster ...)  The user-interface clock that MIG gives you is likely to run at 1/4 of the memory clock rate.
    5. The MIG can be hard to configure.  The Digilent how to's and device project files should help you do so.
    6. MIG wants to control your clock.   Therefore, source the clock for your whole design, and indeed your reset as well, from the MIG core.  MIG also wants your external clock input as well as a 200MHz clock input.  I find that I need to go through a PLL to generate these two clocks.  They then need to be passed to the core "unbuffered".
  3. When I finally got frustrated with my efforts above, I built a Wishbone to AXI4 bridge.  (I've since moved this project to GitHub from OpenCores, as you may notice from the link address ...)  This project is very similar to an earlier Wishbone to AXI3 bridge written by another author, with the one exception that it pipelines memory accesses to the extent the MIG and AXI4 allow.  This means that you can issue one read (or write) command per clock, and let the memory deal with things.  Xilinx does require, when sending "pipelined" requests across their AXI4 bus, that a request not cross a 4kB boundary.  (You'll have to start a new request at the boundary--this is what the documentation says, I haven't tried it in practice ...)
  4. If you wish to look into the pipelined bridge I mention above ...
    1. You'll need to understand a touch about the pipelined mode within the Wishbone B4 specification.  OpenCores is down, or I'd point you to that spec ... so let me give you a couple details.  To initiate a transaction, raise the CYC and STB lines, while setting the address, write-enable, and data (if it's a write transaction).  The transaction request has been made on the same clock that STB is high and STALL (from the memory peripheral) is low.  Once you've finished requesting the transactions you wish, drop the STB line.  The transaction is complete when the ACK line goes high.  On that clock, if you requested data, the data is returned to you.  You'll then want to drop the CYC line.  (Go ahead and read the AXI4 spec--it's not nearly that simple ...)
    2. The bridge core (above) uses a 6-bit transaction identifier, and a 128-bit transaction width within MIG.  You'll need those numbers as you generate the core.  It also supports natural (rather than strict) ordering.  (If you select strict ordering, the portion of the bridge core that handles it ... isn't quite ready yet.  Use the non-strict bridge option--it'll cost you some extra logic and an extra clock, but it'll work.  If you really want strict ordering, you can help me get the 10 lines of code needed to get the strict ordering code working ...)
    3. If you only wish to write 8-bits, you'll need to still fill out at least 32-bits of a transaction on the bridge core.  You can use the SEL line to select which byte within the 32-bits you give it is actually written.  Alternatively, you can gather your write requests until they fill out a 32-bit word and write them then.  Still, the memory transaction itself is 128-bits, so the 8-bits you write will turn into a 128-bit transaction--even if all you wish to write is 8-bits.
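The pipelined Wishbone handshake described in point 4.1 above can be sketched in VHDL roughly as follows. This is only a minimal sketch of a master that issues a single read: the entity name `wb_single_read`, all port names, and the 32-bit widths are assumptions made for illustration, not names taken from the bridge core.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical single-read Wishbone B4 (pipelined mode) master.
-- Entity name, port names, and widths are illustrative assumptions.
entity wb_single_read is
  generic (
    AW : positive := 32;  -- address width (assumed)
    DW : positive := 32   -- data width (assumed)
  );
  port (
    i_clk      : in  std_logic;
    i_reset    : in  std_logic;
    i_start    : in  std_logic;                        -- pulse to request a read
    i_addr     : in  std_logic_vector(AW-1 downto 0);
    o_wb_cyc   : out std_logic;
    o_wb_stb   : out std_logic;
    o_wb_we    : out std_logic;
    o_wb_addr  : out std_logic_vector(AW-1 downto 0);
    i_wb_stall : in  std_logic;
    i_wb_ack   : in  std_logic;
    i_wb_data  : in  std_logic_vector(DW-1 downto 0);
    o_data     : out std_logic_vector(DW-1 downto 0);
    o_valid    : out std_logic                         -- one-clock strobe with o_data
  );
end entity wb_single_read;

architecture rtl of wb_single_read is
  signal cyc : std_logic := '0';
  signal stb : std_logic := '0';
begin
  o_wb_cyc <= cyc;
  o_wb_stb <= stb;
  o_wb_we  <= '0';  -- this sketch only ever reads

  process (i_clk)
  begin
    if rising_edge(i_clk) then
      o_valid <= '0';
      if i_reset = '1' then
        cyc <= '0';
        stb <= '0';
      elsif cyc = '0' then
        -- Idle: raise CYC and STB together and present the address.
        if i_start = '1' then
          cyc       <= '1';
          stb       <= '1';
          o_wb_addr <= i_addr;
        end if;
      else
        -- The request is accepted on the clock where STB is high and
        -- STALL is low; drop STB so it is not issued a second time.
        if stb = '1' and i_wb_stall = '0' then
          stb <= '0';
        end if;
        -- ACK completes the transaction: capture the data, drop CYC.
        if i_wb_ack = '1' then
          o_data  <= i_wb_data;
          o_valid <= '1';
          cyc     <= '0';
          stb     <= '0';
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

A master that pipelines several requests would keep STB high across consecutive accepted clocks and count the ACKs it still expects; that bookkeeping is left out here to keep the handshake itself visible.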

If you haven't figured it out, this solution is not the one on the beaten path.  It works, though, and I'll be happy to discuss it further if you would like.

Dan

  • 1

@rappysaha,

A couple things I noticed:

  1. You don't want to increase your address unless the last command was accepted.  (i.e. app_rdy was high)
  2. Even if you don't want to use AXI4, you still might wish to take a look at that wb2axi converter.  One of the things it handles nicely is the various ready lines--both on the command bus and on the write bus associated with it.  Look at the o_axi_awvalid and o_axi_wvalid lines as examples--the first is the command valid, the second is the write bus valid.  As for the core, a new value is accepted by the core anytime (i_wb_stb)&&(!o_wb_stall) is true, so that should explain those lines within that logic.
  3. Be aware that it takes quite a bit of time for the MIG to start up.  I think this is why the MIG likes to output a reset signal--so it can hold your logic in reset until the MIG has started.  (We're talking about a ms here.  First MIG has to hold the DDR reset down for 200us, then it must clock it with the clock enable line held low for 500us, etc.--nothing you need to worry about, save that this must take place.)
  4. As if starting the MIG weren't painful enough, you also need to be aware that the MIG itself will go out to lunch every now and then (roughly every 7.8us) so that it can refresh its memory.  During that period of time, the memory will be unavailable.  By my calculation, that's about a 57 clock penalty--but who knows how Xilinx actually implemented their MIG?
  5. I cannot comment on how accurately Xilinx's simulator simulates a DDR3 memory.  I just don't know.
  6. While I have code that will simulate a DDR3 SDRAM, it doesn't have the interface you are working with and the interface it does have  ... doesn't yet work on any Xilinx chips.  (sorry--it's just another work in progress)
  7. As for checking in realtime, and on the hardware itself instead of via simulation--this is really what you want to do.  I highly recommend doing this.  All of my projects have included some kind of checking in real time into them, so I can see what is going on within the project.  I've debugged interactions with the ICAPE interface, QSPI flash, DDR3 memory, and now I'm working on the Arty's network card--all using this sort of approach.  I'd highly recommend it to you.  Can I say that again? 
  8. I'm sure others here on this forum can describe to you how to use Xilinx's ChipScope for that purpose.  I personally have never used it.  It might be simpler than what I'm about to discuss and propose to you.
  9. To see what's going on within the hardware, I use what I call a "wishbone scope".  The scope records until a trigger plus a programmable number of clocks.  So, for example, you might want to look for what happens when you assert the enable line and start recording there.  (Set the programmable delay to the size of the scope's memory.)  Alternatively, you might wish to stop recording when an error condition takes place (set the delay to zero)--or later to pan through your logic from a start condition to ... however much later.  That said, doing this requires a lot of ... preliminary stuff to work.  You will need a means of communicating with your board and reading/writing to the wishbone bus that the scope is parked on.  (It only takes a 1-bit address line, so even if you don't park it on a wishbone bus properly, the interface is fairly simple--but you'll still need to communicate with it from something external.)
  10. You can see some of the projects I have where I do this on GitHub.  There's a project using a XuLA2-LX25 board, one using a CMod-S6, and I'm now working on one using the Arty platform (this one uses the MIG, but via the AXI4 interface).  If you wish to cut/copy/paste, I'll warn you: none of these projects are simple, and the Arty one is still a work in progress.  My basic design works as follows: A host computer communicates via a standard protocol with the board.  The protocol was built so that I might use it no matter what the board's interface, even over PCIe if necessary.  I typically build a basic host program, wbregs, just to read and write addresses on the board (like the scope configuration address) from the command line.  When that gets old, I build a C++ file to do what I need--such as reading from the scope.  From the RTL side, check out all the RTL files beginning with wbu--the top one is wbubus.v.  The interface is generic enough to be run from a UART (the Arty), a JTAG/User command (the XuLA2-LX25), or even Digilent's parallel DEPP interface.  Of course, my fear with even mentioning these is that they could easily overwhelm you like I did earlier in this thread.  (I would be overwhelmed personally ...)  At the same time, copying from such a project might be one way you could get started quickly--so I'll let you be the judge.
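Point 1 above (only advance the address once the last command was accepted) might look something like the sketch below. It is written under several assumptions that you should check against your own generated core: the entity name `read_cmd_gen` is made up, the 29-bit address width is a guess for the Genesys2, the address step of 8 assumes a burst length of 8 with addresses counted in memory-width words, and the command code "001" is the read command as I understand ug586.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical read-command generator for the MIG user interface.
-- Entity name, ADDR_WIDTH, and the address step are assumptions.
entity read_cmd_gen is
  generic (
    ADDR_WIDTH : positive := 29   -- assumed; match your MIG core
  );
  port (
    ui_clk          : in  std_logic;
    ui_clk_sync_rst : in  std_logic;
    app_rdy         : in  std_logic;
    app_en          : out std_logic;
    app_cmd         : out std_logic_vector(2 downto 0);
    app_addr        : out std_logic_vector(ADDR_WIDTH-1 downto 0)
  );
end entity read_cmd_gen;

architecture rtl of read_cmd_gen is
  signal addr : unsigned(ADDR_WIDTH-1 downto 0) := (others => '0');
begin
  -- Keep the request asserted whenever we are out of reset; the core
  -- holds app_rdy low until it is able to accept commands.
  app_en   <= not ui_clk_sync_rst;
  app_cmd  <= "001";                    -- read command (per ug586, as I read it)
  app_addr <= std_logic_vector(addr);

  process (ui_clk)
  begin
    if rising_edge(ui_clk) then
      if ui_clk_sync_rst = '1' then
        addr <= (others => '0');
      elsif app_rdy = '1' then
        -- app_en is high here, so this command was accepted on this
        -- clock; only now is it safe to move on to the next address.
        addr <= addr + 8;               -- one burst of 8 words (assumed)
      end if;
      -- When app_rdy is low the address is simply held, so the same
      -- command is presented again on the next clock.
    end if;
  end process;
end architecture rtl;
```

The difference from the posted process is the `elsif app_rdy = '1'` guard: without it the address advances every clock, and any command presented while app_rdy is low is silently skipped.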

Let me know if this helps, and we can go from there,

Dan

  • 0

Hi Rappysaha,

I have not been able to find an example or demo that uses the DDR3 and does not use the MIG. Here is a link to a VHDL SDRAM controller, which is the closest thing I could find. Hope this helps!

cheers,

Jon

  • 0

@D@n

Hi D@n,

I tried to follow your code, but as I am new, it is not easy for me to follow the whole thing. Besides, I think I only want to use the IP, not the AXI4 interface, as I am not using any soft-core processor anywhere in my project. DDR3 is only one part of my project, and I want to control it for simple data transfers. So if you could give any suggestion about how to start with this MIG IP, that would be very helpful. I have already gone through the user guide and the example. I have also uploaded the code I am using to get my output. Any suggestion will be very helpful. Thank you.

 

Hi @jpeyron

Actually, Digilent has some example code for other interfaces such as VGA, and those examples are very easy to follow. But in the case of DDR3, I think the examples are not so clear. I did follow the site that you referred to, but I need something more specific if possible. Anyway, thank you.

 

MIg write and read.txt

  • 0

@rappysaha,

I was afraid that would happen.  Okay, let's work with the user interface.

Do you have Xilinx's ug586, "7 Series FPGAs Memory Interface Solutions: User Guide"?  I'll admit it's not very comprehensible, but ... it's what we have to work with.

Looking at the guide, and your code, let me offer some pointers:

  1. Be careful of setting your state at the very first line of your process.  This isn't computer code, where the state is set on the first clock and then set to something else.  Still, having said that, it looks like your state machinery would still work for what you want--no corrections are necessary.
  2. Xilinx's AXI documentation talks a lot about the ready signal and the enable signal, and about particular race conditions between them.  They recommend setting the enable signal before checking the ready signal, lest some race condition occur.  In your code, you wait for the ready signal on the write line before setting the enable signal.
  3. On page 156 of Xilinx's document, they show three potential write timing relationships compared to the command relationship.  I would recommend you use #1 or #2, rather than #3, because of this warning they give.
  4. Be aware of the condition whereby the app_rdy signal is high, but wdf_rdy is not or vice versa.  With your code as written, you might find yourself issuing a whole bunch of commands, but with no data to go with them 'cause their data fifo wasn't ready.
  5. You will need to assert the memory burst "END" signal (app_wdf_end) when you hit the last byte in a 32-byte group.  This could be the first data byte you send, if its address bits end in 2'b11.  The group is defined not by how much data you wish to send in total but (at least as I understand it) by how much data will cross the interface.
  6. You haven't set the 'mask' bits (app_wdf_mask) anywhere in your code.  You'll probably want to make sure those are explicitly set to zero.  These bits allow write commands to affect only certain bytes in the memory.  If I understand correctly, any bit where the mask is '1' corresponds to a byte that is not written, whereas a bit where the mask is '0' corresponds to a byte that is written.
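To make points 2 through 6 a little more concrete, here is a minimal sketch of a single write through the user interface. Treat it as an illustration under stated assumptions rather than a drop-in design: the entity name `single_write`, the `start`/`done` ports, and the generic defaults are made up; it assumes the core runs in 4:1 mode so that one user-interface data word carries a whole burst (which is why app_wdf_end simply follows app_wdf_wren); and it presents the write data before the command, which is one of the orderings ug586 allows as I understand it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical one-shot writer for the MIG user interface.
-- Names and generic defaults are assumptions; match them to your core.
entity single_write is
  generic (
    ADDR_WIDTH : positive := 29;   -- assumed
    DATA_WIDTH : positive := 128   -- assumed; must be a multiple of 8
  );
  port (
    ui_clk          : in  std_logic;
    ui_clk_sync_rst : in  std_logic;
    start           : in  std_logic;   -- pulse to request one write
    done            : out std_logic;   -- one-clock pulse when accepted
    app_rdy         : in  std_logic;
    app_wdf_rdy     : in  std_logic;
    app_en          : out std_logic;
    app_cmd         : out std_logic_vector(2 downto 0);
    app_addr        : out std_logic_vector(ADDR_WIDTH-1 downto 0);
    app_wdf_data    : out std_logic_vector(DATA_WIDTH-1 downto 0);
    app_wdf_wren    : out std_logic;
    app_wdf_end     : out std_logic;
    app_wdf_mask    : out std_logic_vector(DATA_WIDTH/8-1 downto 0)
  );
end entity single_write;

architecture rtl of single_write is
  type state_t is (IDLE, SEND_DATA, SEND_CMD);
  signal state : state_t := IDLE;
begin
  app_cmd      <= "000";              -- write command
  app_addr     <= (others => '0');    -- fixed address for this sketch
  app_wdf_data <= (others => '1');    -- every byte is X"FF"
  app_wdf_mask <= (others => '0');    -- mask of 0: write every byte

  -- The enables are asserted first and held until the matching ready
  -- is seen; they do not wait for ready before being raised.
  app_wdf_wren <= '1' when state = SEND_DATA else '0';
  app_wdf_end  <= '1' when state = SEND_DATA else '0';
  app_en       <= '1' when state = SEND_CMD  else '0';

  process (ui_clk)
  begin
    if rising_edge(ui_clk) then
      done <= '0';
      if ui_clk_sync_rst = '1' then
        state <= IDLE;
      else
        case state is
          when IDLE =>
            if start = '1' then
              state <= SEND_DATA;
            end if;
          when SEND_DATA =>
            -- Data word accepted into the write FIFO on this clock.
            if app_wdf_rdy = '1' then
              state <= SEND_CMD;
            end if;
          when SEND_CMD =>
            -- Command accepted on this clock.
            if app_rdy = '1' then
              state <= IDLE;
              done  <= '1';
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```

Sending the data first and the command second costs an extra clock per write, but it means a command is never issued without its data already sitting in the write FIFO, which is the situation point 4 warns about.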

These are just some observations.  They come with no guarantee that, should you follow them, your code would work.  :P

Dan

  • 0

Hi @D@n,

Thank you for your helpful suggestions. I really needed them. The full read and write procedure may be large, so I have cut it down to a small part.

As a first step, I want to confirm that when I drive app_cmd, app_addr and app_en, app_rdy goes high in response, as in the attached figure (3). I have attached my code as well. I ran a behavioral simulation using the force clock option, but app_rdy never goes high (fig. 4). However, when I load the code onto the board, LED (2) turns on, because I put the following logic in my code:

process (ui_clk)
begin
    if rising_edge(ui_clk) then
        app_cmd  <= "000";
        app_addr <= app_addr + '1';
        app_en   <= '1';
        if (app_rdy = '1') then
            led <= X"02";
        end if;
    end if;
end process;

Is there any way I can check it in real time? If you can provide any material (for real-time simulation), it will be very helpful. Any suggestion will be appreciated.

Rappy

 

MIg write and read.txt

3.PNG

4.PNG
