Archive for the ‘Assembly’ Category

ARM Assembly First Steps

May 28th, 2008 Ant 2 comments

I’ve been reading about ARM assembly lately. At the moment, the only book I’ve got is this one:

ARM: Assembly Language Programming

It’s not too bad, especially as it’s free and seemingly the only book on ARM assembly still available. (If anyone has any suggestions of good books on the topic, let me know.) It’s easy to read and gives an overview of the basics of writing assembly for the ARM family of processors. It has some quirks and strange omissions, though, and it’s obviously still a work in progress. A lot of internal references to other chapters are incomplete, a lot of pages are repeated, and I’m 63 pages in and there’s no mention of whether the ARM is little- or big-endian. This is rather important when you’re trying to add two 64-bit numbers with a 32-bit CPU (i.e. the example on page 63). It also stumped me for a while with its buggy example programs. The “16-Bit Data Transfer” example on page 57 achieves the transfer (or, more accurately, doesn’t achieve it) by first loading a single byte from memory into R1 (line 8) and writing it back to memory as a word (line 9). It should, I’d have thought, have used the “LDRH” and “STRH” instructions to deal with half-word data instead.
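To see why the word order matters for that 64-bit addition, here is a minimal sketch in C. It assumes each 64-bit number is stored as two 32-bit words with the low word first; the DS runs its ARM cores little-endian, but the `U64Pair` type and `add64` name are mine, not the book's. The low words are added first, and the carry out of that addition feeds into the sum of the high words, which is the job the ADDS/ADC instruction pair does on the ARM.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, low word first
   (an assumed little-endian word order). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} U64Pair;

/* Add the low words, work out whether that addition carried,
   then add the high words plus the carry. */
U64Pair add64(U64Pair a, U64Pair b) {
    U64Pair sum;
    sum.lo = a.lo + b.lo;                /* wraps modulo 2^32 */
    uint32_t carry = (sum.lo < a.lo);    /* 1 if the low addition overflowed */
    sum.hi = a.hi + b.hi + carry;
    return sum;
}
```

Get the word order wrong and the carry propagates in the wrong direction, which is why the omission in the book is so irritating.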

It doesn’t list the clock cycles required for each command, either. One of my 6502 asm books included this information, which I thought superfluous at the time, but which I now realise is essential for writing efficient assembly code. If you work on the assumption that a C compiler will try to choose the most efficient asm instruction for each given situation, anyone trying to better the compiler by dropping into asm himself must know how long each command takes to execute. If he doesn’t know that information, the chances are the compiler will do a better job with a lot less effort.

This brings me to my first asm function for the DS. This will draw a pixel to one of the screens (it should be in 16-bit framebuffer mode; I’m using Woopsi as the test environment, hence the use of Woopsi’s “DrawBg” framebuffer pointer):

void drawPixel(u8 screen, s16 x, s16 y, u16 colour) {
/*
	mov r4, y, LSL #8	; Store y in r4 multiplied by screen width (ie. left shifted 8 places)
	add r4, r4, x		; Add x to get the pixel index
	add r4, DrawBg[screen], r4, LSL #1	; Double the index (2 bytes per pixel) and add address of screen bitmap
 
	mov r5, colour		; Store new colour in r5
 
	strh r5, [r4]		; Store r5 (colour) in address pointed to by r4
*/
	asm volatile("\
		mov r4, %0, LSL #8\n\
		add r4, r4, %1\n\
		add r4, %2, r4, LSL #1\n\
		mov r5, %3\n\
		strh r5, [r4]\n"
		:
		: "r" (y), "r" (x), "r" (DrawBg[screen]), "r" (colour)
		: "r4", "r5", "memory"
	);
}

My current plan is to write a few functions in C and asm, then compare the compiler’s take on it with my own. Hopefully I’ll see where the compiler does a better job. Still getting the hang of inline assembly, too. No idea yet how labels work when inlined, or why GCC insists I use backslashes within a string literal spread over several lines (something to do with the @ symbol, I think), or if there’s a more efficient way of getting at the function parameters (I think they’re automatically loaded into the lower registers, but I don’t know what happens if there are more parameters than there are registers).
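For the comparison, the plain C version of the pixel write might look something like the sketch below. It assumes a 256x192, 16-bit framebuffer, and the `topScreen`, `bottomScreen` and `frameBuffers` arrays are stand-ins I've invented so that it compiles off the DS; on hardware the pointers would come from Woopsi's “DrawBg” array instead.

```c
#include <stdint.h>

#define SCREEN_WIDTH  256
#define SCREEN_HEIGHT 192

/* Stand-in framebuffers so the sketch compiles off the DS;
   on hardware these would be Woopsi's DrawBg pointers. */
static uint16_t topScreen[SCREEN_WIDTH * SCREEN_HEIGHT];
static uint16_t bottomScreen[SCREEN_WIDTH * SCREEN_HEIGHT];
static uint16_t *frameBuffers[2] = { topScreen, bottomScreen };

/* Plain C pixel write. Indexing a uint16_t pointer scales the
   offset by two bytes per pixel, so no explicit doubling is needed. */
void drawPixelC(uint8_t screen, int16_t x, int16_t y, uint16_t colour) {
    frameBuffers[screen][y * SCREEN_WIDTH + x] = colour;
}
```

The multiply by 256 is something I'd expect the compiler to turn into a shift on its own, so this is the version the hand-written asm has to beat.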

None of this relates to Woopsi, though, as I don’t really have any intention of trying to out-code the compiler there. I have managed to make one change to Woopsi today - the Gadget::addGadget() and Gadget::insertGadget() functions now automatically draw the newly-added child gadget. Instead of forcing the developer to manually call draw() for each new gadget, this is done for him. Preventing new gadgets being drawn automatically is just a case of calling “disableDrawing()” on the parent gadget (or another gadget higher up the hierarchy).

Categories: ARM Assembly, Woopsi Tags:

Emulation and Other Thoughts

January 7th, 2008 Ant 11 comments

Things are still going a little slowly here in Zombie Towers. The only thing worse than programming dreams for disturbing your sleep is having programming dreams whilst ill. They’re essentially the same, but instead of waking up with the solution to a particularly thorny problem you wake up exhausted with a head full of nonsensical junk and the slightly worrying feeling that you’ve lost the ability to add simple numbers together. Having had a couple of meaningless physics dreams after reading half of Stephen Hawking’s “Theory of Everything”, I’ve been avoiding any complex thought and have been absorbing anime instead.

I did sneak a small coding project in, though - a CHIP-8 emulator. I was introduced to emulation back in the days of Amiga Format when one of the issues came with a couple of Game Boy emulators on the coverdisk. Despite being an even more abysmal programmer then than I am now, there weren’t many programs around that were completely beyond my understanding. However, emulators were one of them and I was immediately fascinated. How on earth would someone go about writing an emulator? I hadn’t even the vaguest notion of where to start. Understanding and then writing an emulator has been on my “things to do” list ever since.

I got past the “understanding” part of the equation ages ago, but it’s taken until now for me to get around to writing an emulator. I’d intended to write a CHIP-8 emulator in Flash MX, but couldn’t be bothered in the end (I was switching to Java at the time) and someone else got there first.

The CHIP-8 is a curious machine, mainly because it’s not a machine at all. There never was a CHIP-8 CPU. It’s probably one of the earliest examples of a virtual machine. It was devised in the mid-70s by Joseph Weisbecker at RCA as an interpreted language for the COSMAC VIP hobbyist computer, so that games could be written without resorting to raw machine code. Because programs target the virtual hardware rather than the physical hardware, any platform that implements the interpreter can run them. It is an 8-bit CPU featuring 4KB of RAM (of which the bottom 512 bytes is reserved for the interpreter itself, effectively the platform’s BIOS), 16 multi-purpose registers (the last of which doubles as the carry flag), a program counter, stack pointer, index register (which I think is a 70s name for a dedicated address register), 35 opcodes, a 16-key keypad (hard-wired into the CPU via another set of boolean registers) and some weird design decisions that reflect the fact that it is a virtual machine, not a CPU. It has two opcodes to draw sprites, for example - I doubt that anything so high-level and specific exists in a physical general-purpose CPU (though the NES’ not-quite-6502 might have something similar).

My emulator has most of the opcodes implemented, but some of the features aren’t implemented yet - the sprites in particular are tricky, mainly because there’s a dearth of documentation describing how the sprites are supposed to work. There’s no documentation about how fonts are stored, either - I imagine that they’re supposed to be coded into the BIOS memory somewhere, but finding out what the fonts look like or what address they’re supposed to be stored at would seem to involve getting hold of an existing emulator’s sourcecode and digging around in that.

I might get around to finishing it eventually. It’s enough just to have got the bulk of the CPU emulated, and it’s a distinctly bizarre experience to watch code you’ve written execute a binary written by someone else.

Writing an emulator is actually very simple. A CPU can be modelled as a large switch statement. Memory can be emulated by allocating an array, and CPU registers can be simulated in the same way. The emulator program fetches each distinct unit of information (“opcode”, “operation code”, or CPU instruction) from the binary it is executing and feeds it into the CPU switch statement. The switch statement decides which instruction the opcode represents then runs that instruction. A simple CPU could look like this:

void emulate(unsigned short opcode) {
 
	// Extract instruction and data from opcode
	unsigned char instruction = (opcode & 0xFF00) >> 8;
	unsigned char x = (opcode & 0x00F0) >> 4;
	unsigned char y = (opcode & 0x000F);
 
	// Parse opcode
	switch(instruction) {
		case 1:
			// Add register x to register y and store in register x
			add(x, y);
			break;
		case 2:
			// Subtract register x from register y and store in register x
			sub(x, y);
			break;
		case 3:
			// Multiply register x with register y and store in register x
			multiply(x, y);
			break;
	}
}

This simple CPU can perform three actions - addition, subtraction and multiplication. The opcode is a 16-bit value made up of the instruction code to execute (highest byte) and two 4-bit data values (lowest two nibbles). We use bitmasks and bitshifts to extract the instruction and data from the opcode in order to parse it in the switch statement.

The CHIP-8 is more complex, naturally - it has 35 instructions, and there is no set format for the instruction and data portions of the opcode. The instruction can consist of just the highest nibble or it can use the whole highest byte. The data portion of the opcode can consist of a single 4-bit value, two 4-bit values, a byte and a nibble, or a single 12-bit value, depending on which instruction it is paired with.
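One way of coping with the variable layout is to pull every possible field out of the opcode up front and let each instruction pick the ones it needs. The sketch below does that; the field names (`x`, `y`, `n`, `kk`, `nnn`) follow a convention I've seen in CHIP-8 references, and the `Chip8Fields` struct and `decodeChip8` function are my own inventions rather than part of any standard.

```c
#include <stdint.h>

/* Every field a CHIP-8 opcode might carry. Which of them are
   meaningful depends on the instruction being decoded. */
typedef struct {
    uint8_t  op;   /* highest nibble: instruction family */
    uint8_t  x;    /* second nibble: 4-bit value */
    uint8_t  y;    /* third nibble: 4-bit value */
    uint8_t  n;    /* lowest nibble: 4-bit value */
    uint8_t  kk;   /* lowest byte: 8-bit value */
    uint16_t nnn;  /* lowest 12 bits: 12-bit value */
} Chip8Fields;

/* Extract all of the fields with shifts and masks. */
Chip8Fields decodeChip8(uint16_t opcode) {
    Chip8Fields f;
    f.op  = (uint8_t)((opcode >> 12) & 0xF);
    f.x   = (uint8_t)((opcode >> 8) & 0xF);
    f.y   = (uint8_t)((opcode >> 4) & 0xF);
    f.n   = (uint8_t)(opcode & 0xF);
    f.kk  = (uint8_t)(opcode & 0xFF);
    f.nnn = (uint16_t)(opcode & 0x0FFF);
    return f;
}
```

The switch statement would then branch on `op` first, and only look at the lower byte where an instruction family needs it to tell its members apart.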

The basic premise holds, though - there are other ways to achieve the same thing (a jump table instead of a switch statement, for example), but writing a CPU emulator is just a case of getting hold of a list of opcodes, working out how the opcodes are structured, what the opcodes do and how they affect the system’s memory and registers.
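As a rough illustration of the jump table approach, here is the three-instruction CPU from earlier rebuilt around an array of function pointers. The `registers` array, the handler names and `emulateWithTable` are all hypothetical; the subtraction follows the comment in the switch version (register y minus register x, stored in register x).

```c
#include <stdint.h>

#define NUM_REGISTERS    16
#define NUM_INSTRUCTIONS 4

static uint8_t registers[NUM_REGISTERS];

typedef void (*Handler)(uint8_t x, uint8_t y);

/* Instruction 0 is unused in the simple CPU, so it does nothing. */
void opNone(uint8_t x, uint8_t y) { (void)x; (void)y; }
void opAdd(uint8_t x, uint8_t y)  { registers[x] = (uint8_t)(registers[x] + registers[y]); }
void opSub(uint8_t x, uint8_t y)  { registers[x] = (uint8_t)(registers[y] - registers[x]); }
void opMul(uint8_t x, uint8_t y)  { registers[x] = (uint8_t)(registers[x] * registers[y]); }

/* The table is indexed by instruction code. */
static const Handler jumpTable[NUM_INSTRUCTIONS] = { opNone, opAdd, opSub, opMul };

void emulateWithTable(uint16_t opcode) {
    uint8_t instruction = (uint8_t)((opcode & 0xFF00) >> 8);
    uint8_t x = (uint8_t)((opcode & 0x00F0) >> 4);
    uint8_t y = (uint8_t)(opcode & 0x000F);

    /* Ignore instruction codes that fall outside the table. */
    if (instruction < NUM_INSTRUCTIONS) {
        jumpTable[instruction](x, y);
    }
}
```

The decode step is identical to the switch version; only the dispatch changes, from a chain of comparisons to a single indexed call.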

Now that I know how all of this works, it’s easy to see how assemblers and disassemblers work (I might get around to writing my own CHIP-8 assembler if I finish the emulator). It’s easy to see how a Z80 or 6502 emulator would work. The 6502 is particularly easy, assuming you’re only interested in the documented opcodes (there are 151 of them, covering 56 instructions; the other 105 byte values are undocumented) and aren’t worried about making it cycle-exact.

Other than learning all of this gubbins, I’ve been pondering some more Woopsi developments. First of all, I’ve got a plan for the scroll bars. Each scroll bar (horizontal and vertical) will be a gadget made up of three sub-gadgets: up and down (or left and right) buttons and a slider gadget, which is itself made up of a “gutter” and a “grip”.

I’ve also been pondering Jeff’s suggestion that gadgets pass requests around via events. At the moment, the only place this is particularly relevant is in the screen flip and depth sort buttons. Currently, the buttons call their parent’s “flip()” or “swapDepth()” functions. Jeff’s suggestion, to aid subclassers, is that the buttons just trigger flip or depth swap events in their event handlers (the event handler would be set to the parent for these decoration gadgets). It doesn’t seem like a particularly big change to swap from “_parent->flip()” to “_eventHandler->handleEvent(EVENT_FLIP)”, but it is actually a fundamental shift in how the system hangs together. At the moment, the system is very rigid. Each gadget orders the other gadgets to do something. The flip buttons order their screens to flip from one display to the other. Switching to event-driven interaction will mean that gadgets request operations instead. Gadgets will no longer say, “You must flip to the top display.” They will instead ask each other, “I say, old chap, do you mind awfully if you flip displays?” The whole system becomes much more fluid and, indeed, flexible.
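The difference is easier to see in code. This is a sketch of the idea in C rather than Woopsi's actual C++ API: the `Gadget` struct, `raiseEvent` and `screenHandleEvent` are all made up for illustration, with only `EVENT_FLIP` and the notion of an event handler taken from the discussion above.

```c
#include <stddef.h>

typedef enum { EVENT_FLIP, EVENT_DEPTH_SWAP } EventType;

typedef struct Gadget Gadget;
struct Gadget {
    Gadget *eventHandler;                           /* gadget that receives our events */
    void (*handleEvent)(Gadget *self, EventType e); /* how this gadget reacts to events */
    int display;                                    /* 0 = bottom, 1 = top (screens only) */
};

/* A screen decides for itself what a flip request means. */
void screenHandleEvent(Gadget *self, EventType e) {
    if (e == EVENT_FLIP) {
        self->display = !self->display;
    }
}

/* A button no longer calls its parent's flip() directly; it just
   passes the request to whichever gadget is handling its events. */
void raiseEvent(const Gadget *source, EventType e) {
    Gadget *target = source->eventHandler;
    if (target != NULL && target->handleEvent != NULL) {
        target->handleEvent(target, e);
    }
}
```

A subclasser who wants different behaviour only has to swap the handler; the button itself never needs to know what a flip involves.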

Thinking about this has been like looking at a 2D representation of a cube - one moment it looks like the cube protrudes, the next it seems like the cube is inset. I’ve swung from thinking that it’s a good idea to rejecting it and back again, but I’ve finally decided that it’s a good idea. Why not make the system more flexible? Other than an extra function call, what are the downsides? There are several good reasons for the change, but no really good reasons not to implement it.

Categories: Algorithms, Assembly, Development, Woopsi Tags:

Downtime

June 4th, 2007 Ant No comments

Not much going on here. The weather’s far too nice to sit around programming, and my project at work is finally drawing to a close (after nearly two years), so I’m getting bogged down in last-minute requests for changes that should really have been made earlier (i.e. two years ago).

One thing I have managed to do is get E-UAE set up on my MacBook. It is noticeably slower than even WinUAE in Parallels, but it does at least handle the mouse properly (not in AMOS, unfortunately). Drop the sound quality down to 8-bit and switch on OpenGL rendering and it’s not bad at all. This inspired me to get around to installing a real 68K assembler.

There are far too many assemblers available for the Amiga. I’d heard of DevPac, but there’s also Asm-One, Asm-Pro, PhxAss, SEKA, ArgAsm, and each one seems to come in multiple versions, often developed simultaneously by different people. Bit like DASM, I suppose, which seems to have slightly different versions available for different 6502 machines.

Anyway, I’ve given Asm-Pro a go, and it looks pretty good. Once I’d worked out how to get out of the weird commandline and into the editor, anyway. It’s a macro assembler, so it comes with macros to handle things like loops, IF blocks, etc. That’s something I wasn’t too keen on with the 6502 assemblers I looked at - if you’re going to use macros, why not just use a higher-level language? (I’ll probably end up writing out the op codes in hex myself at this rate.) The 68K chip is so much more complicated that I’m more inclined to use the macros. Factor in the massive difference in operating system complexity and custom chips and there’s really no choice - use macros or go mad.

So far, there are two hugely obvious differences between the chips. The 68K has an instruction set several times larger than the no-frills 6502. It’s also capable of handling 32-bit words, so adding two 16-bit numbers together doesn’t involve a dozen operations, the carry flag and a sacrificial goat.

Categories: 68000 Assembly Tags:

Dynamic Memory Allocation Part 2

May 18th, 2007 Ant No comments

A bit of research yesterday turned up this site:

The Official Unofficial C= Hacking Homepage

Seems to be a collection of 15 year old docs about C64 programming. The second issue is most pertinent here, as it contains - hurrah! - a description of a dynamic memory allocation technique. It’s designed for the C128, and is intended to support system RAM, expansion RAM, and some other sort of memory the name of which I can’t remember, but all I need to know is how it works.

In brief, the technique works like this (modified slightly to remove the RAM expansion stuff). We create a linked list of free memory blocks, each with a 4-byte header. The header contains the 16-bit (2 byte) address of the next free memory block, and the 16-bit (2 byte) length of the current free memory block. When the routine starts, we have a single header followed by lots of empty RAM. To allocate memory:

  • Jump to the first memory block’s header
  • Check the length of the block - is it large enough to contain our data?
  • If yes:
    • Jump x places from the end of the memory block (where x is the length of our data)
    • Insert our data into that position and return a pointer to the start of our data
    • Reduce the length of the memory block (the header value) by x
  • If no:
    • Jump to the next address (stored as the first two bytes in the block’s header), which moves us to the next free memory block
    • Repeat until memory allocated

The original article suggests that each chunk of memory should be aligned to the next 8-byte boundary; this would make each chunk a multiple of 8 bytes long. I’d have to think about that some more before I decided that it was definitely the block size to use.

When we de-allocate memory, we try to join the freed block up to either a chunk immediately to the left or right (or both), so that we don’t have chunks of contiguous free memory separated by useless headers.
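The whole scheme can be sketched in C with a byte array standing in for the C64's RAM. Several things here are my assumptions rather than details from the article: the names, the 1K pool, the use of $FFFF to mark the end of the list, rounding sizes up to 4 bytes (a cut-down version of the 8-byte alignment, so a freed chunk can always hold a header), and making the caller pass the size back when freeing, since allocated chunks carry no header.

```c
#include <stdint.h>

#define POOL_SIZE 1024      /* bytes of RAM managed by the allocator */
#define HEADER    4         /* 2-byte next address + 2-byte length */
#define NIL       0xFFFF    /* marks the end of the free list */

static uint8_t pool[POOL_SIZE];

/* Read and write 16-bit values low byte first, as the 6502 would. */
static uint16_t rd16(uint16_t at) {
    return (uint16_t)(pool[at] | (pool[at + 1] << 8));
}
static void wr16(uint16_t at, uint16_t value) {
    pool[at] = (uint8_t)(value & 0xFF);
    pool[at + 1] = (uint8_t)(value >> 8);
}

/* Round a size up to a multiple of 4 so that any freed chunk is
   big enough to hold a header of its own. */
static uint16_t roundUp(uint16_t size) {
    return (uint16_t)((size + 3u) & ~3u);
}

/* Start with a single header at offset 0 followed by empty RAM. */
void poolInit(void) {
    wr16(0, NIL);
    wr16(2, POOL_SIZE - HEADER);
}

/* Walk the free list; carve the chunk from the END of the first
   block that is large enough. Returns an offset, or NIL on failure. */
uint16_t poolAlloc(uint16_t size) {
    size = roundUp(size);
    if (size == 0) return NIL;
    for (uint16_t block = 0; block != NIL; block = rd16(block)) {
        uint16_t length = rd16((uint16_t)(block + 2));
        if (length >= size) {
            wr16((uint16_t)(block + 2), (uint16_t)(length - size));
            return (uint16_t)(block + HEADER + length - size);
        }
    }
    return NIL;
}

/* Return a chunk to the list, joining it to the free block on its
   left and/or right where they touch. */
void poolFree(uint16_t addr, uint16_t size) {
    uint16_t prev = 0, next, block;
    size = roundUp(size);
    /* Find the last free block that starts before addr. The block
       at offset 0 is never unlinked, so one always exists. */
    while (rd16(prev) != NIL && rd16(prev) < addr) prev = rd16(prev);
    next = rd16(prev);
    if (prev + HEADER + rd16((uint16_t)(prev + 2)) == addr) {
        /* Touches the left neighbour: just make that block longer. */
        wr16((uint16_t)(prev + 2), (uint16_t)(rd16((uint16_t)(prev + 2)) + size));
        block = prev;
    } else {
        /* Isolated: write a new header into the freed chunk. */
        wr16(addr, next);
        wr16((uint16_t)(addr + 2), (uint16_t)(size - HEADER));
        wr16(prev, addr);
        block = addr;
    }
    if (next != NIL && block + HEADER + rd16((uint16_t)(block + 2)) == next) {
        /* Touches the right neighbour: absorb it, header and all. */
        wr16((uint16_t)(block + 2),
             (uint16_t)(rd16((uint16_t)(block + 2)) + HEADER + rd16((uint16_t)(next + 2))));
        wr16(block, rd16(next));
    }
}
```

Because allocation only ever shrinks a free block's length, the first header never disappears from the list, which keeps the search for a left-hand neighbour simple.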

Compared with my FAT system, this has pros and cons. On the positive side, huge chunks of memory can be allocated with very little waste. We could allocate a block of 16K and use only 4 bytes of memory to manage that block, whereas the FAT system would need 2K to manage the block. It’s probably faster than the FAT system, too. On the negative side, allocating a single byte would eat up at least 5 bytes of RAM, which is incredibly wasteful.

This got me thinking. What if you could group variables together and allocate them all in one go? If you were writing Pong, for example, you’d know that each player has an X value (1 byte), a Y value (1 byte), a velocity (1 signed byte) and a score (1 byte stored as a binary coded decimal). That’s 4 bytes. Allocating them individually in the linked list system would use 20 bytes (more if we align along 8-byte boundaries). However, if we decided that we’d structure it so that we knew the sequence of bytes, we could allocate it as one chunk, then use indexed addressing to get at the correct value. To get at the X value, we just use:

LDA POINTER

To get at the Y value:

LDA POINTER+1

Etc. We can store this in just 8 bytes - 4 header bytes plus 4 data bytes. I think I must be on the right track again - I’ve just re-invented the struct.
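For comparison, this is roughly what the same layout looks like as an actual C struct. The `Player` type and `peekField` helper are my own illustration; the point is just that each field name is shorthand for a fixed offset from the start of the chunk.

```c
#include <stddef.h>
#include <stdint.h>

/* The Pong player data laid out in a fixed sequence of bytes. */
typedef struct {
    uint8_t x;         /* offset 0 */
    uint8_t y;         /* offset 1 */
    int8_t  velocity;  /* offset 2, signed */
    uint8_t score;     /* offset 3, binary coded decimal */
} Player;

/* Read a byte the way the assembly does: base address plus a
   fixed offset into the chunk. */
uint8_t peekField(const void *base, size_t offset) {
    return ((const uint8_t *)base)[offset];
}
```

One caveat: “LDA POINTER+1” only works when POINTER is a label for the data itself. If the chunk came from the allocator, its address would be sitting in a pair of zero page bytes, and I think the offset would have to go into the Y register for an indirect load instead.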

Categories: 6502 Assembly Tags:

Dynamic Memory Allocation, or Reinventing the Wheel

May 17th, 2007 Ant No comments

In my quest to learn an obsolete flavour of an ancient programming language, the biggest question I’ve found myself asking is what to do if you need to manage memory dynamically. Static memory allocation is easy enough. If I’m writing a game, I know how many enemies will be on-screen, how many players there are, how many levels; in short, I can manually map out the memory usage of the game and leave placeholders in my assembly code to fill with that data as the program is running.

It gets more complicated when you’re trying to write an application whose sole purpose is to store increasing amounts of data, such as a spreadsheet, word processor, paint package, etc. If you need to be able to dynamically allocate and free memory as it is used, how do you go about doing it?

So, I got to pondering. My first thought was that you could decide upon a large block of memory to use as dynamic storage (say the addresses $6000-$9FFF on the C64, which is at the end of the BASIC storage space). You specify that another memory location is used as a lookup table for data in that memory block. We’d use each bit in the lookup table to specify whether or not a whole byte of the memory block is free. So, say our lookup table starts at $4000, and the first value is $FF, that means that the first 8 bytes of our memory block are in use. Allocating memory is a question of scanning along the lookup until we find a clear bit; we then set the bit and use the byte in memory. If we’re trying to allocate more than a single byte, we scan along until we find enough clear bits (ie. empty bytes) to contain our data. When we free memory, we simply clear the relevant bits.
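A minimal sketch of that lookup table in C, with a 64-byte block instead of 16K to keep it readable. The names are hypothetical, and I've assumed bit 0 of each table byte tracks the lowest address in its group of eight.

```c
#include <stdint.h>

#define BLOCK_SIZE 64                 /* bytes of managed memory */
#define TABLE_SIZE (BLOCK_SIZE / 8)   /* one bit per managed byte */

static uint8_t lookup[TABLE_SIZE];

static int isUsed(int i) {
    return (lookup[i >> 3] >> (i & 7)) & 1;
}

static void setUsed(int i, int used) {
    if (used) lookup[i >> 3] |= (uint8_t)(1 << (i & 7));
    else      lookup[i >> 3] &= (uint8_t)~(1 << (i & 7));
}

/* Scan for the first run of `count` clear bits, set them, and
   return the index of the first byte in the run (-1 if none). */
int bitmapAlloc(int count) {
    int run = 0;
    if (count <= 0) return -1;
    for (int i = 0; i < BLOCK_SIZE; i++) {
        run = isUsed(i) ? 0 : run + 1;
        if (run == count) {
            int start = i - count + 1;
            for (int j = start; j <= i; j++) setUsed(j, 1);
            return start;
        }
    }
    return -1;
}

/* Freeing memory is just a matter of clearing the relevant bits. */
void bitmapFree(int start, int count) {
    for (int j = start; j < start + count; j++) setUsed(j, 0);
}
```

Note that, as with the free list, the caller has to remember how many bytes it asked for, because the table records which bytes are in use but not where one allocation ends and the next begins.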

This has two major problems. First of all, our memory block uses up 16K of RAM, which means our lookup table uses 2K of RAM. When you’ve only got 64K to start with, and when much of that is taken up with the operating system (simple though it may be), you don’t really want to waste that much memory. One solution would be to divide the memory into chunks of 2 bytes and just remember whether we’re using a chunk or not. That would reduce the size of our lookup table by half, but it’d double the storage space needed for a single byte to two bytes. We’re not doubling the size of a byte, but we’ve reduced the resolution of our lookup to prevent us from seeing that we’ve only used one of the bytes in that 16-bit block. This solution becomes troublesome if we’re storing predominantly 8-bit data instead of 16-bit data, because we’ve effectively halved the available memory.

It occurred to me that this table is, essentially, nothing more than a FAT system implemented in RAM instead of on disk. The “chunks” are, in FAT terminology, “blocks”. Windows uses something like a 4K block size, so if you save a 2K file you end up with 4K used on the disk. The Amiga had variable block sizes, defaulting to a small size which made it great at storing small files efficiently, but hopeless at retrieving large files. First re-invention of the day! Anyway, let’s call this lookup table the “FAT”.

The second problem is that we’ll quickly end up with fragmented memory, with empty bits interspersing the used data, which will greatly increase the amount of work done by the memory allocation routine.

It’s obvious that we need some sort of defragmentation routine here. The simplest way to do this is to write a subroutine that will scan along the FAT until it hits a clear bit. It then scans along until it finds a set bit, and moves all of the data back to remove the gap. It manipulates both the FAT and the data to ensure they both stay in sync.

This is, basically, what Tetris does when the game detects a full row - remove the full row, then move the rows above it down to fill the gap. And they say you can’t learn anything from videogames!

The problem with this is that any existing pointers will be thrown out - they’ll now point to the wrong location as all of the data has moved. To solve this, we need a second lookup table. The new lookup table will store pointers to the main data block, and the main program uses these pointers to refer to data (in C terminology, we double-dereference the pointer, or in asm terms we treat the address as an indirect address).

So, when we allocate memory, we store a pointer to the data in the pointer table, and pass the pointer table address back to the program. To access the data, the program would need to go to the pointer table address, go to the memory address stored there, and finally get at the desired data.
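Here is a small C sketch of the pointer table idea. To keep it short I've made some simplifying assumptions that aren't in the description above: memory is handed out from the top of the used area rather than via the FAT, the table has one entry per allocation rather than per byte, and the gap is closed immediately on every free instead of in a separate defragmentation pass. All of the names are invented.

```c
#include <stdint.h>
#include <string.h>

#define HEAP_SIZE   64
#define MAX_HANDLES 8
#define UNUSED      0xFFFF

static uint8_t  heap[HEAP_SIZE];
static uint16_t heapTop;                   /* first unallocated byte */
static uint16_t handleAddr[MAX_HANDLES];   /* the pointer table */
static uint16_t handleSize[MAX_HANDLES];

void handlesInit(void) {
    heapTop = 0;
    for (int h = 0; h < MAX_HANDLES; h++) handleAddr[h] = UNUSED;
}

/* Allocate from the top of the heap. The program is given an index
   into the pointer table, not the address of the data. */
int handleAlloc(uint16_t size) {
    if (size == 0 || size > HEAP_SIZE - heapTop) return -1;
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (handleAddr[h] == UNUSED) {
            handleAddr[h] = heapTop;
            handleSize[h] = size;
            heapTop = (uint16_t)(heapTop + size);
            return h;
        }
    }
    return -1;
}

/* The double dereference: table entry first, then the data. */
uint8_t *handleDeref(int h) {
    return &heap[handleAddr[h]];
}

/* Free a chunk, slide everything above it down to close the gap,
   and patch the pointer table so existing handles stay valid. */
void handleFree(int h) {
    uint16_t start = handleAddr[h];
    uint16_t size = handleSize[h];
    memmove(&heap[start], &heap[start + size], (size_t)(heapTop - (start + size)));
    heapTop = (uint16_t)(heapTop - size);
    handleAddr[h] = UNUSED;
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (handleAddr[i] != UNUSED && handleAddr[i] > start) {
            handleAddr[i] = (uint16_t)(handleAddr[i] - size);
        }
    }
}
```

The program must go through `handleDeref` every time it touches the data; if it hangs on to the raw address across a free, it is back to the dangling pointer problem.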

Now, there is a huge problem with this. First of all, the obvious way of storing this (one address for every byte in the memory block) would use twice as much memory as the memory block itself. The memory block is 16K in size, which is 16384 bytes of information. The pointer table would, therefore, need to store 16384 addresses, and as each address is a 16-bit, or 2 byte, value, the pointer table would use 32K of RAM. That’s half of the memory in the entire computer.

At this point I realise that there’s already a function in existence to do all of this - the malloc() function in C must do something similar. So, I look it up. It transpires that one of the more popular ways to manage memory dynamically is to produce a memory map, or “mmap”, which works in a very similar way to the FAT system outlined above (it may even be identical; I’d have to read some more to say definitively). It seems that memory fragmentation is one of the major problems with this method. The guy who wrote the Enlightenment window manager for Linux came up with a system almost identical to my pointer system above to solve this.

So, that’s twice more I’ve re-invented the wheel today. Still, it’s nice to know that I’m working on the right lines.

I’ve done some more thinking, and have a few more ideas for auto-defragmenting dynamic memory allocation. The main problem with all of them is that they’re fairly complex and therefore aren’t really suited to the C64. Still, the fragmented memory map system would be easy to implement. I might try that as my first useful 6502 project.

Categories: 6502 Assembly Tags:

Some C64 Observations

May 15th, 2007 Ant Comments off

Reading up on the C64. Interesting things learned so far:

  • The memory map of the C64 leaves the assembler programmer with 3 bytes of zero page memory to use. I’m sure there are other addresses that could be abused on the assumption that, for example, the RS-232 port won’t be used during a game. But - 3 bytes?!
  • BASIC code begins at $0800 and ends at $9FFF, which explains why the C64 code I’ve seen so far positions the code at $0800.
  • The C64 Programmer’s Reference Guide looks like it contains everything I need to get started (except for the C64 asm header code, and some way of producing a file that’ll run in an emulator).
  • The C64’s 6510 CPU lets you bank in and out extra RAM if you don’t need the BASIC ROM or the character set, just by setting a bit at a particular address. Clever!
  • The C64 has a set of built-in functions that control all of the I/O you’d need to perform. For example, printing to the screen is just a matter of loading a hex value representing an ASCII code into the accumulator, then jumping to the memory address for the print subroutine. The memory address is in something called a “jump table” (a list of pointers to the subroutines themselves, I’d imagine) near the end of the 64K address space. The jump table apparently forms the C64’s kernal.

I’ve also noticed that “Introduction to Assembly Language for the Commodore 64” by Stan Krute is selling for $194.94 (roughly £100 if you use the official exchange rate, or £194.94 if you use the Apple exchange rate) on the US Amazon site. As far as I can tell, it is second-hand, and it’s not made of gold. Those crazy Amazon guys!

Categories: 6502 Assembly Tags:

In Another Assembler…

May 13th, 2007 Ant No comments

Here’s the same thing designed to compile with DASM:

        PROCESSOR 6502
        .ORG $0200      ;Locate asm at $0200

CODE    LDA #<LIST      ;Load low byte of LIST into acc
        STA $30         ;Store low byte at address $30
        LDA #>LIST      ;Load high byte of LIST into acc
        STA $31         ;Store high byte at address $31
        JSR LOAD        ;Jump to LOAD subroutine
        BRK             ;Stop running

LOAD    LDY #$1         ;Load 1 into Y register
        LDA ($30),Y     ;Load byte at (address held in $30/$31) + Y into accumulator
        RTS             ;Return from subroutine

LIST    DC.B #$20,#$04  ;Define a list of bytes

Great, so I can now compile 6502 programs from an OSX terminal instead of loading up Parallels. The only problem is I can’t run them in OSX. Gahh.

Categories: 6502 Assembly Tags:

Progress So Far

May 13th, 2007 Ant No comments

Things I’ve learnt about the 6502 so far:

  • Memory in the range $0000-$00FF is called “zero page” memory. This can be accessed much faster than any other memory because the addresses are all 8-bit. This means that the 8-bit CPU can process the address in one go, instead of needing to process the second byte in a 16-bit address.
  • Memory in the range $0100-$01FF is used as the stack (this is why most of the programs I’ve seen so far locate themselves at address $0200 - it puts them past both the valuable zero page RAM and the unpredictable system stack).
  • Memory in the range $FFFA-$FFFF holds the three interrupt vectors (NMI, reset and IRQ/BRK).
  • The CPU supports several different addressing types (about 13, I think), all of which are far too tedious to detail here.

One thing I haven’t found out is how to get the memory address of a label. Say we have the following code:

        .ORG $0200      ;Locate asm at $0200

CODE    JSR LOAD        ;Jump to LOAD subroutine
        BRK             ;Stop running

LOAD    LDY #$1         ;Load 1 into Y register
        LDA ($30),Y     ;Load data in 1st byte after address in $30 into acc
        RTS             ;Return from subroutine

LIST    .DB #$20        ;Define a list of bytes
        .DB #$04

At the label “CODE”, before the subroutine jump, I want to load the address of the list into memory address $30. That way, my “LOAD” subroutine is completely generic. I can define as many lists as I like (each with different labels, natch), and I can call the “LOAD” subroutine on any of them as long as I load the list’s start address into $30. I’d expect to be able to do this with the commands:

         LDA LIST       ;Load the address of list into the accumulator
         STA $30        ;Store accumulator into address $30
         LDA LIST+1     ;Load other byte of address into acc
         STA $31        ;Store accumulator into address $31

The bytes are probably the wrong way around as the little-endian-ness hasn’t quite sunk in yet. Anyway, that doesn’t work - what happens here is that LIST is treated as an absolute memory address (a pointer), and the first LDA command loads the first list item into the accumulator (in C terms, it automatically dereferences the pointer and gives me the data). There doesn’t seem to be any way to get at the actual address of the list. So how am I going to make my generic list subroutine? I have no idea. I imagine it involves more reading.

(A few minutes later)

Oh, no - it’s actually quite easy. At least, it is in the “6502 Simulator” Windows program I’m using to code with. The code should look like this:

         LDA #<LIST     ;Load low byte of LIST into acc
         STA $30        ;Store low byte at address $30
         LDA #>LIST     ;Load high byte of LIST into acc
         STA $31        ;Store high byte at address $31

Prefixing a label with a hash turns it into an immediate number instead of an absolute memory address (ie. we can get the memory address instead of treating it as a pointer and retrieving the data pointed to by the label). Using the greater than/less than symbols allows us to extract individual bytes from the 16-bit address.
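In C terms, the whole dance looks something like the sketch below, with a 64K array standing in for the 6502's address space. The function names are mine; `lowByte` and `highByte` play the part of the less than/greater than operators, and `loadIndirectY` imitates what “LDA ($30),Y” does with the two bytes stored in the zero page.

```c
#include <stdint.h>

static uint8_t memory[65536];   /* the 6502's 64K address space */

/* Equivalent of #<LABEL: the low byte of a 16-bit address. */
uint8_t lowByte(uint16_t address) {
    return (uint8_t)(address & 0xFF);
}

/* Equivalent of #>LABEL: the high byte of a 16-bit address. */
uint8_t highByte(uint16_t address) {
    return (uint8_t)(address >> 8);
}

/* Store an address in two consecutive zero page bytes, low byte
   first, because the 6502 is little-endian. */
void storePointer(uint8_t zp, uint16_t address) {
    memory[zp] = lowByte(address);
    memory[(uint8_t)(zp + 1)] = highByte(address);
}

/* Imitate LDA (zp),Y: rebuild the address from the two zero page
   bytes, add Y, and fetch the byte found there. */
uint8_t loadIndirectY(uint8_t zp, uint8_t y) {
    uint16_t base = (uint16_t)(memory[zp] | (memory[(uint8_t)(zp + 1)] << 8));
    return memory[(uint16_t)(base + y)];
}
```

So the low byte does go in $30 and the high byte in $31, which means the byte order in my first attempt was right even if nothing else was.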

Categories: 6502 Assembly Tags:

Argh! My Brain!

May 13th, 2007 Ant No comments

From “6502 Software Design”, by Leo J. Scanlon, 1981:

Zero page indexed addressing is to zero page addressing as absolute indexed addressing is to absolute addressing. With zero page indexed addressing, the effective zero page address of the operand is computed by adding the contents of the X or Y register to the zero page base address contained in the second byte of the instruction.

That’s a sentence that George Eliot herself would have been proud of. It’s a good job I already understand this stuff…
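For anyone who doesn't, the quote boils down to two address calculations, sketched here in C with function names of my own choosing. The one detail worth spelling out is that the zero page version wraps around within the zero page rather than spilling into page one.

```c
#include <stdint.h>

/* Zero page indexed: base + index, kept to 8 bits so the
   effective address never leaves $00-$FF. */
uint16_t zeroPageIndexed(uint8_t base, uint8_t index) {
    return (uint8_t)(base + index);
}

/* Absolute indexed: the same sum using a full 16-bit base address. */
uint16_t absoluteIndexed(uint16_t base, uint8_t index) {
    return (uint16_t)(base + index);
}
```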

Categories: 6502 Assembly Tags:

What are the Chances?

May 12th, 2007 Ant No comments

No 6502 assembly books in Hay-on-Wye. That’ll make learning it so much more difficult. Except, in a freaky coincidence, I went to a car boot in Stratford today and came across a stall selling a whole box of BBC-B programming books, including 4 6502 books. 50p each! What are the chances of that happening?

Categories: 6502 Assembly Tags: