Gau_Veldt last won the day on January 8 2018

Gau_Veldt had the most liked content!

About Gau_Veldt

Profile Information

  • Interests
    Computers, AI, Games, hardware and software Coding, FPGA, retro computing, consoles

  1. From the C64 Kernal disassembly:

     .,FCE7 20 02 FD JSR $FD02      scan for autostart ROM at $8000
     .,FCEA D0 03    BNE $FCEF      if not there continue startup
     .,FCEC 6C 00 80 JMP ($8000)    else call ROM start code

     .,FD02 A2 05    LDX #$05       five characters to test
     .,FD04 BD 0F FD LDA $FD0F,X    get test character
     .,FD07 DD 03 80 CMP $8003,X    compare with byte in ROM space
     .,FD0A D0 03    BNE $FD0F      exit if no match
     .,FD0C CA       DEX            decrement index
     .,FD0D D0 F5    BNE $FD04      loop if not all done
     .,FD0F 60       RTS

     *** autostart ROM signature
     .:FD10 C3 C2 CD 38 30          'CBM80'

     .,FE47 48       PHA            save A
     .,FE48 8A       TXA            copy X
     .,FE49 48       PHA            save X
     .,FE4A 98       TYA            copy Y
     .,FE4B 48       PHA            save Y
     .,FE4C A9 7F    LDA #$7F       disable all interrupts
     .,FE4E 8D 0D DD STA $DD0D      save VIA 2 ICR
     .,FE51 AC 0D DD LDY $DD0D      save VIA 2 ICR
     .,FE54 30 1C    BMI $FE72
     .,FE56 20 02 FD JSR $FD02      scan for autostart ROM at $8000
     .,FE59 D0 03    BNE $FE5E      branch if no autostart ROM
     .,FE5B 6C 02 80 JMP ($8002)    else do autostart ROM break entry

     So again we have the cartridge ROM being physically mapped into the bus at $8000. The Kernal then looks for the magic word 'CBM80' (high bit set on C, B and M) near the start of the ROM, at $8004-$8008. At $8000 is the pointer to where the cartridge ROM starts running at reset. At $8002 is the pointer to where the cartridge runs upon NMI (RUN/STOP+RESTORE).
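The check at $FD02 can be mirrored in a few lines of Python. This is only a minimal sketch for illustration, not part of the Kernal or of any tool mentioned in these posts; the function names `has_autostart_signature` and `cartridge_vectors`, and the `memory` argument (assumed to be a 64K bytes-like view of the C64 address space), are hypothetical.

```python
# Signature bytes stored at $FD10-$FD14 in the Kernal: 'CBM80' with the
# high bit set on C, B and M.
SIGNATURE = bytes([0xC3, 0xC2, 0xCD, 0x38, 0x30])


def has_autostart_signature(memory):
    """Mimic the loop at $FD02: compare $8004-$8008 with the signature.

    `memory` is a hypothetical 64K bytes-like image indexed by bus address.
    """
    # X counts 5 down to 1, so the Kernal reads $FD0F+X and $8003+X.
    for x in range(5, 0, -1):
        if memory[0x8003 + x] != SIGNATURE[x - 1]:
            return False
    return True


def cartridge_vectors(memory):
    """Return (cold start, NMI) addresses from $8000/$8001 and $8002/$8003."""
    cold = memory[0x8000] | (memory[0x8001] << 8)  # little-endian pointer
    nmi = memory[0x8002] | (memory[0x8003] << 8)
    return cold, nmi
```

A caller would first test `has_autostart_signature(memory)` and only then trust the two vectors, which is the same order the Kernal uses.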
  2. As an aside when I went looking for C64 FPGA snippets or significant parts (VIC, SID, etc) of it I hit a lot of nasty paywalls (and/or licensewalls) hence why I did it myself AND made the sources freely available on GitHub. Edit: Oh yeah, and lots of abandoned stuff or out of date (and/or whose latest version is behind a paywall/licensewall).
  3. This is where it helps to have the disassembly of the KERNAL ROMs. Anyway: the cartridge uses the GAME or EXROM lines (by pulling them low) and the ROM then appears in the C64 address space from $8000. The start address of the code to boot in the cartridge ROM is stored at $8000-$8001 and the address of the code to run on NMI is stored at $8002-$8003. Finally, the magic signature "CBM80" must appear at $8004-$8008 in PETSCII. I should also point out that the ROM's address lines are relative to the start of the ROM, not the C64's bus address (so the address at the cartridge ROM's address pins will be $0000 when the C64's address bus shows $8000). GAME and EXROM tell the C64's banking logic when to activate the cartridge ROM's chip select (CS) input.
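As a rough illustration of that layout, the sketch below builds a blank ROM image with the two vectors and the signature in place, and converts a bus address into the offset seen at the ROM's own address pins. The 8K size, the helper names and the signature bytes C3 C2 CD 38 30 (taken from the Kernal's comparison table, where C, B and M have the high bit set) are assumptions of this example rather than anything taken from a real cartridge.

```python
ROM_BASE = 0x8000   # where the C64 sees the cartridge ROM
ROM_SIZE = 0x2000   # 8K ROM, assumed for this sketch
SIGNATURE = bytes([0xC3, 0xC2, 0xCD, 0x38, 0x30])  # 'CBM80'


def bus_to_rom_offset(bus_address):
    """Translate a C64 bus address into the address at the ROM's pins."""
    if not ROM_BASE <= bus_address < ROM_BASE + ROM_SIZE:
        raise ValueError("address is outside the cartridge window")
    return bus_address - ROM_BASE


def make_cartridge_image(cold_start, nmi_handler, code=b""):
    """Build a ROM image: vectors at 0-3, signature at 4-8, code from 9."""
    if len(code) > ROM_SIZE - 9:
        raise ValueError("code does not fit in the ROM")
    image = bytearray(ROM_SIZE)
    image[0:2] = cold_start.to_bytes(2, "little")   # seen at $8000-$8001
    image[2:4] = nmi_handler.to_bytes(2, "little")  # seen at $8002-$8003
    image[4:9] = SIGNATURE                          # seen at $8004-$8008
    image[9:9 + len(code)] = code
    return bytes(image)
```

For example, `make_cartridge_image(0x8009, 0x8009, b"\x60")` yields an image whose boot and NMI vectors both point at a lone RTS placed right after the signature.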
  4. I actually did this project to learn FPGA development, and I have learned a few things that I'll apply to future projects. Here are the highlights:
     1. Bidirectional buses don't work at all inside an FPGA design. Buffers are very limited resources on most FPGA topologies. This means you split the bus and keep it separate for input versus output (eg: have dataOut 7:0 and dataIn 7:0 buses for data). This one's a real pain because pretty much all old school computer logic uses some form of bidirectional bus. You can interface an external signal from something like SPI or I2C by declaring it as inout (external here means external hardware only; if it's buried in a source file include or IP component somewhere and compiled in, it's not external), but you can't do this for signals between entities of your own design sources (otherwise you get the synth error about signals with multiple drivers). I got bitten so many times by this gotcha on the C64: the bidirectional CIA I/Os, the 6510 data bus and bidirectional I/O port, etc.
     2. Asynchronous reset lines are bad news so don't use them. Synchronous (clocked) resets are fine. States will be set to design-specified initial values whenever the bitstream is loaded, so a cold-start reset isn't even really all that necessary (though you do want some way to hold processes back long enough for things like clock tile MMCMs and PLLs to be properly running first).
     As for your CPU implementation, I already have my suspicions about where to start searching, beginning with that 0 in the memory free report. That 0 ties to the floating point logic in the BASIC ROM. The FP code, in particular, requires BIT, ASL, ROL, ROR and LSR to work properly in all their addressing modes (BIT is a frequent flyer for bugged implementations since its actual operation and side effects [beyond the flag-modifying-only AND] are somewhat exotic [eg: don't forget about bit 7->N and bit 6->V]).
     I used a few very useful documents when I did my 6510 design: the instruction explanations available on and another site which has a grid of all the opcodes and their addressing modes, with the grid rows and columns representing the hex nybbles of the instruction opcode. I can't give you the links since I'm away visiting my stepmom until near the end of the month and as such don't have the computer with those bookmarks in front of me. Oh, and you'll get a working C64 even if you ignore decimal mode (nothing in the KERNAL or BASIC ROMs uses decimal mode whatsoever, and the only decimal mode related instruction present in the ROMs is CLD, which just ensures it's shut off). There's also a complete C64 ROM disassembly kicking around on the web. It's also very useful.
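Since BIT is singled out above as an easy instruction to get wrong, here is a small reference model of its flag behaviour that an HDL implementation can be checked against. It is a sketch only: the function name and the idea of passing the status register as a plain integer are choices made for this example, while the flag masks are the standard 6502 status bits.

```python
FLAG_N = 0x80  # negative
FLAG_V = 0x40  # overflow
FLAG_Z = 0x02  # zero


def bit_instruction(accumulator, operand, status):
    """Return the new status byte after BIT; the accumulator is unchanged."""
    status &= ~(FLAG_N | FLAG_V | FLAG_Z) & 0xFF
    # Z reflects the AND of A and the operand, but the result is discarded.
    if (accumulator & operand) == 0:
        status |= FLAG_Z
    # N and V are copied straight from bits 7 and 6 of the operand,
    # regardless of what is in the accumulator.
    status |= operand & (FLAG_N | FLAG_V)
    return status
```

The common bug is deriving N and V from the AND result instead of from the memory operand; with A = $0F and an operand of $C0 the model above sets N, V and Z all at once, which an AND-based version cannot do.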
  5. PPS: If by software simulator you mean logic simulation for the design sources, then that's already available in Vivado in design-level, synth-level or implementation-level simulation contexts. It's just that they are really, really slow, especially the synth and implementation sims. Expect a good few hours just for the second-or-two RAM test the C64 Kernal starts with (which you have to wade through, because none of the interesting stuff where you see bugs, such as interrupts going wrong in the CPU sources, will happen until after that test runs and the Kernal turns on interrupts). Just one VIC video frame is a go-have-a-coffee 2-3 minute sim step. Oh, and I don't duplicate all of the C64's quirks. For example, there are no badlines in my VIC: the master clock is at 80 MHz (meaning the RAM/ROM will run at 80 MHz) and my VIC uses 76 out of every 80 of those cycles, giving it more than enough cycles to fetch all its needed video data (in fact most of it even before the VGA horizontal sync and back porch have passed). The remaining four give my Z64's 6510 a full unimpeded 1 MHz rate (the four clock cycles given to the CPU allow each of its PH1+, PH1-, PH2+ and PH2- clock stages to happen within a single 1 MHz period).
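The 76/4 split of the master clock can be pictured with a small slot table. This is a sketch under stated assumptions: the post only says that 76 of every 80 master cycles go to the VIC and four go to the CPU phases, so placing the CPU phases in the last four slots, and the function names, are guesses made for illustration.

```python
MASTER_HZ = 80_000_000      # master clock from the post
SLOTS_PER_PERIOD = 80       # master cycles in one 1 MHz CPU period
VIC_SLOTS = 76              # cycles per period handed to the VIC
CPU_PHASES = ("PH1+", "PH1-", "PH2+", "PH2-")


def slot_owner(master_cycle):
    """Say who owns the memory bus on a given master clock cycle.

    Assumption: VIC slots come first and the CPU phases occupy the
    final four slots of each period.
    """
    slot = master_cycle % SLOTS_PER_PERIOD
    if slot < VIC_SLOTS:
        return "VIC"
    return CPU_PHASES[slot - VIC_SLOTS]


def cpu_rate_hz():
    """Each period contains one full set of CPU phases."""
    return MASTER_HZ // SLOTS_PER_PERIOD
```

Because every period contains exactly one pass through the four CPU phases, the CPU runs at a steady 1 MHz no matter what the VIC fetches, which is why no badline stalls are needed in this arrangement.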
  6. PS: I used that Python script to generate compile-in ROM VHDL library includes out of the binary images for the Chargen, Kernal and BASIC ROMs. I don't put the ROM function sources into the GitHub repo for obvious reasons (cough DMCA takedowns cough), but the Python program to convert a ROM binary dump you already have into a VHDL library source is there. It would just need to be copied to another Python file and tweaked slightly to make cartridge ROM source files.
  7. Just a quick ramble on those three points:
     1. Software emulators for the C64 are already quite plentiful, VICE being among the better ones.
     2. I've got the majority of the design complete for a running C64. There are some missing bits I still need to add (the SID volume envelopes, ring and sync mod, VIC sprites and bitmap modes). I ran into some issues with emulating IEC so there's still no disk/floppy image support. EOI and ATN overrides were causing me grief. I took a pause at this point and decided it might be time to get a Raspberry Pi going with a breadboard that speaks SPI over one of the Zybo's PMODs for realtime debugging of signals using the Pi (Vivado fails to compile the debug IP every time I've tried to add it so that's a no go). It also means the Pi could act as the IEC device and transfer files over to my Z64. I'll work more on this once I return from my visit with my stepmom in a week or so.
     3. Cartridges are pretty basic to get going. There are a few wrinkles if the cartridges use the EXROM or GAME signals (you need the proper logic present in the memory decode signals to honor them) but otherwise cartridges are simple memory images that could be loaded into the BRAM on the 7010 and accessed as ROM. You can set cartridge data to load when the bitstreams are sent to the FPGA, but I find the file format used by Vivado rather cumbersome for this purpose, so I use a Python script that instead creates a function with a huge case statement that yields the ROM values (input: A[12:0], output: D[7:0] for an 8K ROM). Then it's just a matter of adding these into the source tree as a library and accessing that function from a clocked process block in the main design sources (if rising_edge(someClock) then do_something_with_byte(rom_function(address)); end if;) to have it synthesize as a ROM.
     On a final note, I do have my existing hardware sources up on GitHub so you could investigate those in the meantime.
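To make the "function with a huge case statement" idea concrete, here is a minimal sketch of such a generator. It is not the script from the GitHub repo: the function name `rom_to_vhdl`, the default name `cart_rom`, the `_pkg` package suffix and the choice to return x"FF" for unlisted addresses are all assumptions made for this example.

```python
def rom_to_vhdl(data, name="cart_rom", addr_bits=13):
    """Return VHDL package source exposing `data` as a lookup function."""
    if len(data) > 1 << addr_bits:
        raise ValueError("ROM image is larger than the address space")
    signature = f"  function {name}(addr : unsigned({addr_bits - 1} downto 0))"
    lines = [
        "library ieee;",
        "use ieee.std_logic_1164.all;",
        "use ieee.numeric_std.all;",
        "",
        f"package {name}_pkg is",
        signature,
        "    return std_logic_vector;",
        f"end package {name}_pkg;",
        "",
        f"package body {name}_pkg is",
        signature,
        "    return std_logic_vector is",
        "  begin",
        "    case to_integer(addr) is",
    ]
    # One case branch per ROM byte, e.g.  when 0 => return x"A9";
    for address, value in enumerate(data):
        lines.append(f'      when {address} => return x"{value:02X}";')
    lines += [
        '      when others => return x"FF";',  # unlisted addresses (assumed)
        "    end case;",
        "  end function;",
        f"end package body {name}_pkg;",
    ]
    return "\n".join(lines) + "\n"
```

Typical use would be something like `open("cart_rom_pkg.vhd", "w").write(rom_to_vhdl(open("cart.bin", "rb").read()))`, with both file names being placeholders; the generated function is then called from a clocked process as described above.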
  8. Loberman basically already indicated what you'd need in HDL. I'll just add that most boards have a downloadable constraint file (at least this was the case for the Zybo). Copy it into your project's constraints folder (right click on the constraints in the sources pane, then add new constraint files, making sure to check the "Copy files to project" option), then open the constraints file (I'm using Vivado btw) and uncomment the lines for the desired pins (in this case one of the PMOD ports). Be sure to check in the hardware manuals that the PMOD port chosen is appropriate to the application (eg: "slow" ports with protection resistors are better for experimenting on a breadboard). For extra credit: create an HDL 64-bit counter clocked by the 10 MHz (or faster) signal and bridge it (meaning a separate 64-bit register latches this counter's output into a different clock domain), then create an AXI interface to make it accessible to the PS (Linux). Then modify the /dev/random kernel driver to use the counter for entropy whenever input from USB peripherals or the network occurs.
  9. What's going to happen here is that the synthesizer is going to generate your design and optimize out any signal assignments that cannot be reached (or that are reached but overridden by a later assignment) on a particular path through the process. All possible paths that modify a particular signal are then strung into a MUX determining that signal's value. There will be one such MUX for each different signal the process assigns to (and in this case each MUX will also be coupled to a register clocked by someClock). It's not like a programming language where a series of assignments will be run in order. So a process like:

     someProc: process(someClock) is
     begin
       if (rising_edge(someClock)) then
         mySignal <= oneSignal;
         mySignal <= anotherSignal;
         mySignal <= oneMoreSignal;
       end if;
     end process someProc;

     will synthesize (elaborate) to a register called mySignal whose input connects to oneMoreSignal and whose clock input is connected to someClock. oneSignal and anotherSignal won't even appear in the elaboration of someProc.
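For readers coming from software, the last-assignment-wins behaviour can be imitated with a toy model of one clock edge. This is a deliberately simplified sketch, not how a synthesizer or simulator is implemented; the function name and the representation of signals as a dictionary are assumptions of the example.

```python
def clocked_process(signals, assignments):
    """Model one rising edge of a clocked process.

    `signals` maps signal names to their values before the edge and
    `assignments` is the ordered list of (target, source) pairs from
    the process body.
    """
    pending = {}
    for target, source in assignments:
        # Every read sees the value from before the edge; a later
        # assignment to the same target replaces the earlier one.
        pending[target] = signals[source]
    updated = dict(signals)
    updated.update(pending)  # all registers change together
    return updated
```

Feeding it the three assignments from someProc leaves mySignal holding the old value of oneMoreSignal, and a pair such as `[("a", "b"), ("b", "a")]` swaps the two values because both reads happen before either register updates.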
  10. Just to put it in perspective, 2^256 is a base ten number with 78 digits (around 1.16*10^77). Let's divide this by 1000 THz for an imaginary design running at 1 PHz (petahertz, which has 15 zeros) that can do the entire elliptic curve, ripemd160 and comparison to your target hash in one clock cycle (this pure fantasy is to simplify things by starting from a whopper hyperbole of a best possible design that cannot be achieved in practice). Now we're at around 1.16*10^62 seconds. 60 seconds is one minute, 3600 seconds is an hour, 86400 a day and about 3.156*10^7 a year (around 31.6 million). If we divide this into 1.16*10^62 we get around 3.67*10^54 years on an imaginary device that does a complete crack attempt for one 256-bit guess on every tick of a 1 petahertz clock. If you had 1000 of these devices you'd ideally only cut the exponent by 3 (10^51), and a million devices (10^6) only lowers the exponent by 6 to get 10^48. Ten billion (one of these ultra fast rigs for every man, woman and child on the planet) only lowers the exponent by 10 to get 10^44. 3.67*10^44 YEARS. By then our sun will no longer be lit, so you likely won't be around to see the bitcoins when the key is cracked, even with ten billion of these hypothetical 1 PHz devices that cannot be built with any of today's existing technology. A 64-bit number, which is the keyspace you actually need to attack, is far more doable on paper: 2^64 is about 1.845*10^19 guesses, which is 5.845*10^11 years at one guess per second but only about 18,400 seconds (a little over five hours) on a single 1 PHz fantasy rig. A minimal FPGA board at around $100 runs at about 500 MHz, so you'd need two million of them (around $200 million) just to match one 1 PHz fantasy device, and that still assumes the unachievable one-complete-attempt-per-clock design; every extra clock cycle a real design needs per attempt multiplies the time (or the number of boards) by the same factor. Conclusion: It's not really practical to attempt a crack.
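The estimate is easy to recompute for other clock rates or device counts. The sketch below assumes, exactly as the post does for the sake of argument, one complete guess per clock cycle per device and an exhaustive sweep of the whole keyspace; the function names and the 365.25-day year are choices made for this example.

```python
SECONDS_PER_YEAR = 365.25 * 86400  # about 3.156e7


def seconds_to_search(bits, guesses_per_second, devices=1):
    """Time to sweep a full keyspace of `bits` bits, in seconds."""
    return 2 ** bits / (guesses_per_second * devices)


def years_to_search(bits, guesses_per_second, devices=1):
    """Same sweep time expressed in years."""
    return seconds_to_search(bits, guesses_per_second, devices) / SECONDS_PER_YEAR


if __name__ == "__main__":
    petahertz = 1e15
    for devices in (1, 1_000, 1_000_000, 10_000_000_000):
        years = years_to_search(256, petahertz, devices)
        print(f"{devices:>14,} devices: {years:.3e} years")
```

Each factor of ten in device count removes just one from the exponent, which is the point of the 256-bit argument; calling the same functions with other values of `bits` or `guesses_per_second` shows how strongly the result depends on the size of the keyspace.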
  11. I've driven PMOD lines directly from HDL as part of a PWM driver to vary the brightness of LEDs connected (via resistors) to the PMOD lines, jumpered onto a breadboard from one female end of a 12-to-2x6 PMOD splitter cable (the other end I reserved for plugging in a keyboard via a PMOD PS/2 adapter). It works fine to uncomment the appropriate PMOD port lines in the XDC file (and add those pin names into the entity declaration, as outputs in my case). In VHDL I'd solve something like your problem by first using the clock wizard to make the initial 10 MHz clock, then uncommenting the desired PMOD declaration (on the Zybo there are six: JA through JF; I typically use JE since it has protection resistors and unpaired inputs, making it more suitable for breadboarding in conjunction with the PL), and then adding a single signal assignment in the architecture section to drive the desired PMOD signal with the output signal from the clock. If requested I could place the PWM LEDs-via-PMOD VHDL source (only a single VHDL file, not counting the XDC since these are board specific and I don't include them) on GitHub. PS: I'll add here that it is also possible to rename the signal lines in the XDC to match the names used by an HDL project (making it easier to use a generalized project on different boards) rather than searching and replacing signal names in the project source files to match the XDC. This is likely the less error prone way and more friendly for end users of projects intended to work on a variety of boards (making it easy to upgrade to a Zybo2 or Zedboard from an original Zybo, perhaps). PPS: In the context of the PMOD documentation, "GPIO", from an HDL perspective, is going to mean that the PL generally accesses the PMOD lines directly (notwithstanding particular IPs that may already do so).
  12. Short answer: If there is a Scratch server that will interact between the Discovery 2 and Scratch, GET IT! It means you could plug any sort of hardware that generates readings (like a temperature sensor or an IR sensor) into the Discovery and have it send that information to the Scratch project, meaning a character would be able to react to a signal or waveform sensed at the Discovery (say, having a sprite's face go red with increasing temperature). This gets even more interactive if the kids make their own hardware that sends data into the Discovery (think robots). Longer answer: I'm going to take it you mean the Discovery 2 "Maker Bundle" (that has the breadboard and accessories and stuff in it). This would have similarities to CanaKit's Raspberry Pi 3 "Ultimate Bundle" that includes a GPIO breakout and breadboard, jumpers and some LEDs and stuff. I'm going to be cobbling some wires up on the breadboard to exchange data between the Pi and the standard (connector JE, resistor-protected, unpaired) PMOD port. I've done some PWM stuff on the Zybo side at the PL to get warmed up and ensure my PMOD port jumpers are all in order. The Pi will basically be doing something similar to what the Discovery 2 does: monitoring signals transferred over the PMOD and dumping them to a visual log (ie: waveform charts) for debugging FPGA designs. I've tried using the Vivado debug core but it fails to compile every time I try to add its IP, so I've given up on that for now and will use the Pi instead. It does mean I need a sort of mini-debug FSM coded in HDL to speak SPI over the PMOD so the Pi can ask for signal lines and the PL (as slave) will send data for the currently chosen signal(s).
     When making retro computers in HDL there may be a rather high number of signals to monitor and it may not be known which are needed. Milliseconds take hours in simulation, and the simulator does not always accurately reflect the situation on the running hardware even when using implementation-level simulation. The real joy for a younger learner with a setup like mine would be activating the Pi's GPIO server for Scratch and interacting between a PL design loaded into the Zybo and a Scratch project. All sorts of possibilities and adventures await.
  13. Future projects may include clones of other machines or consoles (Apple II, the infamous NES, TRS-80, Amiga; there are lots to choose from) and some other projects such as hardware LSTM (long short term memory, used for AI). The challenge for some will be memory, since things like the NES game ROM or the Amiga 500/1000 memory (including OS ROMs) need a lot of space, meaning I'll have to make a design that speaks AXI and uses the PS's DDR memory. The really big ones, like loading a game ROM, also aren't suitable for my Python ROM-to-HDL script or for Vivado's RAM initialization file workflows.
  14. I thought Xilinx used bitstream encryption "for security" and to prevent leaking IP to intruders (read: open source reverse engineers).