Archive

Archive for the ‘Code Snippets’ Category

Fast Horizontal Line Drawing and Other Things

December 5th, 2007 Ant 2 comments

Based on Jeff’s discovery of DMA_Force, here’s a hardware-accelerated 16-bit horizontal line drawing routine:

void drawHorizLine(u8 screen, s16 x, s16 y, s16 width, u16 colour) {
 
	if (width > 0) {
		// Draw first pixel
		PA_DrawBg[screen][x + (y << 8)] = colour;
 
		if (width - 1 > 0) {
			// Duplicate pixel
			DMA_Force(PA_DrawBg[screen][x + (y << 8)], (PA_DrawBg[screen] + x + (y << 8) + 1), (width - 1), DMA_16NOW);
		}
	}
}

It could be improved upon by pre-calculating the (x + (y << 8)) value, but I’ll leave that as an exercise for the reader (not that I’m lazy or anything).

I’ve integrated this into Woopsi, replacing the existing pixel-plotting line routine and a section of the filled rectangle code. It seems to go a little faster, at least in DeSmuMe under VMWare Fusion.

I’ve fixed the problem with the _decorationCount getting screwed. I started pulling Woopsi apart in order to find the bug - commenting out first the demo code, and then chunks of the GUI code. Eventually I decided that there wasn’t a bug in the code - the “getDecoration()” function was returning the wrong value. Weird. Of course, what I’d done was forget to initialise the _flags.decoration value (despite having a vivid memory of writing that code - maybe I dreamt it). Hopefully I’ve uncommented everything correctly…

Also changed the boolean getter naming convention. Previously, I was using “getXXX()” (as in “getDecoration()” above). Not the best system, really. I think I decided to use it because I was trying to adhere to an explicit separation between setters (called “setXXX()”) and getters (called “getXXX()”). C# has spoiled me and I’m so used to being able to say “object.IsXXX = true” that I felt it important to draw the distinction between putting things into an object and getting them back out again. However, it just made the functions’ meaning less than obvious. All boolean getters are now called “isXXX()” (so “getDecoration()” is now called “isDecoration()”).

Last thing for this post - SourceForge seems to be down, so I can’t update the SVN repository or access the bug list. Foiled again.

Categories: Algorithms, Code Snippets, DS Coding, Woopsi Tags:

PALib Palette Rotation

November 19th, 2007 Ant No comments

As PALib doesn’t include any functions for rotating a palette, here’s a function that will rotate the background palette on a given screen between two palette indexes. It will rotate in either clockwise or anti-clockwise order.

void RotatePalette(u8 screen, u8 start, u8 end, u8 direction) {
	if (direction == 0) {
 
		// Remember first colour
		u16 firstColour = BG_PALETTE[start + ((screen) << 9)];
 
		// Shuffle palette towards front
		for (u8 i = start; i < end; i++) {
			BG_PALETTE[i + ((screen) << 9)] = BG_PALETTE[i + 1 + ((screen) << 9)];
		}
 
		// Wrap end colour
		BG_PALETTE[end + ((screen) << 9)] = firstColour;
	} else {
 
		// Remember last colour
		u16 lastColour = BG_PALETTE[end + ((screen) << 9)];
 
		// Shuffle palette towards end
		for (u8 i = end; i > start; i--) {
			BG_PALETTE[i + ((screen) << 9)] = BG_PALETTE[i - 1 + ((screen) << 9)];
		}
 
		// Wrap start colour
		BG_PALETTE[start + ((screen) << 9)] = lastColour;
	}
}

To rotate the colours from indexes 10 to 20, for example, on screen 1 in clockwise order, use this:

while (1) {
    PaletteRotate(1, 10, 20, 1);
    PA_WaitForVBL();
}

To do the same thing in anti-clockwise order:

while (1) {
    PaletteRotate(1, 10, 20, 0);
    PA_WaitForVBL();
}

BG_PALETTE is an array that stores the colours for each 8-bit screen. In fact, it’s probably just a pointer to the palette memory. Anyway, the first 256 indexes are screen 0’s background palette. Indexes 1024 to 1279 are screen 1’s background palette.

Looking through the PALib code, it seems that the indexes between 256 and 511 are the sprite palette for the lower screen, and the same is true of the 256 indexes from 1280 for the upper screen. However, this system may have been removed at some point, as there appear to be two different implementations of sprite palettes, with this simpler method commented out.

Categories: Algorithms, Code Snippets, DS Coding, PALib Tags:

Rectangles Redux Redux and a Calculator

September 22nd, 2007 Ant 2 comments

Although there isn’t a great deal of noticable difference between this latest demo of Woopsi and the last version, quite a lot of under-the-hood changes have happened.

Starting at the start (good a place as any), I’ve added a new “Textbox” gadget. This gadget just displays a single line of text. It can align the text in one of three ways in both the vertical and horizontal planes (left-aligned, right-aligned, top-aligned, bottom-aligned, and centred in both dimensions). Handy feature to have, so I’ve stripped down the “Button” gadget and made it inherit from the Textbox. Instant text alignment all round.

The Textbox can have its text altered or added to once it has been instantiated. Implementing this was more tricky than I’d expected. If we need to add text to the textbox, we need to concatenate the strings. As I’m using char pointers this means that we need to:

  • Work out how much memory we need to allocate for the concatenated string
  • Allocate the memory
  • Copy the current memory into the new memory
  • Concatenate the new string with the new memory
  • Delete the original memory
  • Update the pointer to address the new memory

Sometimes I wish I was coding in C#, where all of that is handled by a simple “+=”.

The problem I had was that I was passing a string literal into the constructor, and updating the internal pointer to point to it. You obviously can’t delete that, so I had to change the constructor to go through the business above instead (the above checks for a NULL pointer before deleting).

The second change reflects the fact that I’ve been considering how Woopsi will be used by other developers. To this end, I’ve renamed all of the internal drawing functions. They shouldn’t be used by anyone other than gadget developers, and renaming them makes way for the forthcoming drawing functions that will be useful to programmers just using the system as a pre-built library. As part of this change I went through the code and changed the access levels of anything that was marked as “public” that shouldn’t have been.

Event handling has been tricky. I initially went for a MenuDS approach, in which events trigger pointers to functions. It’s really easy to build, but I discovered that it falls apart if you’re dealing with objects. If I want a button to trigger a member function in an object, I can’t - pointers to member functions are different from standard pointers to functions. More importantly, they’re useless for this system - you have to specify the class type pointed to when declaring a pointer to a member function, and as the classes will be written by other developers, this is no use.

I had a look about on t’internet for inspiration, and eventually decided on a class-based event system. Each gadget has a pointer to an instance of an “EventHandler” class, which has three methods - “handleClick()”, “handleRelease()”, and “handleDrag()”. When a gadget is clicked/released/dragged, it triggers the relevant function in the EventHandler. The clever bit is that the EventHandler class is an abstract base class. To create an event handler object, you’d create a class that inherits from the class, implements the pure virtual event handler methods, create an instance of the class and pass it into the gadget. When the event occurs, the gadget fires the method and passes a pointer to itself into the function. This ultimately means that we know which gadget fired the method, and we can get the events to trigger any function we want.

It may be necessary to add more event types as the system gets more complicated. In fact, it might even be handy to have an event data struct that contains the gadget pointer, x and y co-ordinates of the event, etc. It’d let me send more data around and it would be extensible, too. Hmm, sounds like the .NET “EventArg” class. In fact, it’s the same thing…

Just after I posted the last entry I changed the comment system to allow comments from non-registered users. The mysterious Jeff had some initial difficulty with the system, but persevered and I’m glad he did - he posted a tour de force of bit mangling to optimise the filled rectangle routine. I’m particularly impressed with his bit flipping logical AND that produces an even number from an odd. The new version of the code looks like this:

void drawFilledRect(s16 x, s16 y, u16 width, u16 height, u8 colour) {
 
	// Draw top line
	u16 lastX = x + width;
	for (u16 i = x; i < lastX; i++) {
		PA_Put8bitPixel(0, i, y, colour);
	}
 
	// Calculate DMA copy values - we need to deal with even values only
	u16 dmaX = (x + 1) & ~1;
	s32 dmaWidth = width;
	bool drawRight = false;
 
	// Did we convert to an even x value?
	if (dmaX != x) {
		// Reduce the width to compensate
		dmaWidth -= 1;
 
		// Remember that we need to draw an extra pixel
		drawRight = true;
	}
 
	// Pre-emptive bitshift to save calculations
	dmaX = dmaX >> 1;
	dmaWidth = (dmaWidth & 1 ? dmaWidth - 1 : dmaWidth) >> 1;
 
	// Calculate last line to draw
	u16 lastY = y + height;
	lastX = x + width - 1;
 
	// Precalculate line values for loop
	u16* line0 = PA_DrawBg[0] + dmaX + (y << 7);
	u16* linei = line0 + (1 << 7);
 
	// Loop through all lines
	for (u16 i = y + 1; i < lastY; i++) {
 
		// Perform DMA copy
		if (dmaWidth > 0) {
			DMA_Copy(line0, linei, dmaWidth, DMA_16NOW);
		}
 
		// Fill left gap
		if (x & 1) {
			PA_Put8bitPixel(0, x, i, colour);
		}
 
		// Fill right gap
		if ((width & 1) || (drawRight)) {
			PA_Put8bitPixel(0, lastX, i, colour);
		}
 
		// Move to next line
		linei += (1 << 7);
	}
}

Next up is a change to the TextViewer gadget that handles vertically-scrolling text. For some completely inexplicable reason, Woopsi started crashing in No$GBA and the real DS hardware. It worked fine in DeSMuME. After a lot of digging, I realised that the problem was the TextViewer - it would crash if I put a string with more than 63 characters into its constructor. However, it wouldn’t crash if I moved the constructor above the constructor for the second screen.

Sounds like a memory leak. Had an exhaustive dig through the code, and there isn’t any memory leaking in anything called by the constructor. Even stranger, disabling all of the code within the constructor still caused the crash. Strangest of all, DeSMuME still didn’t have any problems with it.

I eventually gave up, replaced the strings with char pointers, and it all works fine. I get the added benefit of internal consistency (since every other gadget uses chars, not strings) and undoubtedly a minor speed boost.

The last new development isn’t a change to Woopsi; it’s a test application built with Woopsi. As I’ve now got screens, windows, buttons, textboxes and event handling, I decided I was at the point at which I should be able to build something useful. The second screen now hosts a very simple, integer-only, 5-digit calculator application. It’s implemented a single class that creates a window, adds gadgets, and wires up the various events.

Some vaguely accurate stats for the calculator:

  • 30 lines of event handling code
  • 30 lines of GUI setup code
  • 25 lines of GUI managing code
  • 170 lines of actual calculator code

Approximately 250 lines of code needed for a simple GUI-based calculator application, almost all of which is “real” code rather than GUI manipulation. I’m sure that most of the real code could be streamlined, too - it’s a bit of a rush job.

The latest demo is here:

Woopsi Demo V0.06

Categories: Code Snippets, DS Coding, Woopsi Tags:

Rectangles Redux

September 20th, 2007 Ant 2 comments

There’s a bug in yesterday’s rectangle code. In certain situations it copies too many pixels on the right of the rectangle, so you end up with a pixel-wide strip of extra pixels tagged onto the side of the rectangle. Fixing it was more complicated than I’d expected. We can only work with even values (an even x co-ordinate and an even width), so we have to find the most efficient way to fix any odd values and remember that we’ve fixed them for later filling in with the standard pixel writer function.

Here’s the fixed code:

void drawFilledRect(s16 x, s16 y, u16 width, u16 height, u8 colour) {
 
	// Draw top line
	for (u16 i = x; i < x + width; i++) {
		PA_Put8bitPixel(0, i, y, colour);
	}
 
	// Calculate DMA copy values - we need to deal with even values only
	u16 dmaX = x & 1 ? x + 1 : x;
	s32 dmaWidth = width;
	bool drawRight = false;
 
	// Did we convert to an even x value?
	if (dmaX != x) {
		// Reduce the width to compensate
		dmaWidth -= 1;
 
		// Remember that we need to draw an extra pixel
		drawRight = true;
	}
 
	// Pre-emptive bitshift to save calculations
	dmaX = dmaX >> 1;
	dmaWidth = (dmaWidth & 1 ? dmaWidth - 1 : dmaWidth) >> 1;
 
 
	// Perform DMA copy
	if (dmaWidth > 0) {
		for (u16 i = y + 1; i < y + height; i++) {
			DMA_Copy((PA_DrawBg[0] + (y << 7) + dmaX), (PA_DrawBg[0] + (i << 7) + dmaX), dmaWidth, DMA_16NOW);
		}
	}
 
	// Fill in any gaps
	if (x & 1) {
		// Draw left-hand side
		for (u16 i = y; i < y + height; i++) {
			PA_Put8bitPixel(0, x, i, colour);
		}
	}
 
	if ((width & 1) || (drawRight)) {
		// Draw right-hand side
		for (u16 i = y; i < y + height; i++) {
			PA_Put8bitPixel(0, x + width - 1, i, colour);
		}
	}
}

It occurred to me that I could duplicate this method in order to draw lightning-fast horizontal lines, which is essential for a rectangle-based GUI system. The basic idea was to draw half of the line normally, then save a significant number of memory writes by using the DMA hardware to copy that half into the memory address of the second half. It would halve the amount of work needed for each line.

After a bit of programming, it became apparent that the sheer amount of code needed to calculate the exact region of memory to copy and the number of pixels that had to be filled in normally later meant that the actual speed increase wouldn’t be significant enough to justify the time it was taking to write. This method would work brilliantly with 16-bit displays, but the complexity of 8-bit data only being copyable in 16-bit chunks makes it far too troublesome for 8-bit displays.

Instead of this, I removed the PALib line routines and wrote simple pixel plotting routines for horizontal/vertical line drawing. Should be a bit faster. I also tidied up the window border drawing code, which was drawing longer lines than it really needed to (the overspill was hidden by the background fill, which gets drawn after the surrounding edges).

Related to line drawing, I’ve fixed the missing corners of the XOR rectangle. This had me puzzled for a while - the lines are all the right length, so why wasn’t the corner being drawn? Simple - the corners were being drawn by both XOR lines that intersect to make the corner, which meant the corner was getting XORed twice. It was getting set back to normal.

The three new features added today are multiple screen support (Amiga-style screens, that is, not the physical DS screens, which wouldn’t make much sense), a screen depth gadget (so that screens can be swapped) and initial work on a glyph set (so that the depth gadget has an image on it). Multiple screen support pretty much wrote itself, so that wasn’t a problem. The depth gadget inherits from the button class, so that more or less wrote itself too. Glyphs are handled simply by using the existing font system (think Wingdings), so no work required there either. Object-orientation is a wonderful thing.

The latest demo is here:

Woopsi V0.05 Demo

Categories: Code Snippets, Woopsi Tags:

Speedy Rectangles

September 20th, 2007 Ant No comments

After a bit of testing, I discovered that my new rectangle drawing routine isn’t as fast as I’d thought. In fact, it’s slower than the original method of plotting individual pixels, because (obviously enough) the code to draw a line between two arbitrary points (ie. the PALib line functions) is more complex than the code to draw a line between two points in the same plane (ie. my original code). More importantly, the PALib line drawing system doesn’t use the two- or four-pixel simultaneous writing mode, because it wouldn’t work - if you’re drawing a line from one row of pixels to the next, you’re not writing to consecutive pixels.

All of this is patently obvious, so I don’t know why I didn’t realise it hours ago.

I had a ponder for a while, and realised that the best way to write data to the screen is, as always, the DMA hardware. All I need to do is plot the first row of the rectangle using the 8-bit pixel routines, and I can use the DMA hardware to duplicate that row down to the end of the rectangle.

The only complexity comes with the fact that two 8-bit pixels are squeezed into each 16-bit slot, and the DMA hardware works in 16-bits. That means that if I’m starting with an odd x value, or have an odd width, the DMA copier will either write over a part of the screen I don’t want it to, or it’ll leave gaps. In that case, I have to ensure that I compensate for odd values in the function, and fill in the gaps after the copy using the pixel plotter.

Eventually, after much fiddling (most of it involving a mental block with DMA_Copy and the nature of pointers), I’ve come up with a filled rectangle routine that in one stroke solves all of the performance issues with Woopsi. Once I get the optimised redraw logic in place it should be even faster.

Here’s the filled rectangle routine:

void drawRect(s16 x, s16 y, u16 width, u16 height, u8 colour) {
 
	// Draw top line
	for (u16 i = x; i < x + width; i++) {
		PA_Put8bitPixel(0, i, y, colour);
	}
 
	// Calculate DMA copy values
	u16 dmaX = x & 1 ? x + 1 : x;
	u16 dmaWidth = width & 1 ? width - 1 : width;
 
	// Perform DMA copy
	for (u16 i = y + 1; i < y + height; i++) {
		DMA_Copy((PA_DrawBg[0] + (y << 7) + (dmaX >> 1)), (PA_DrawBg[0] + (i << 7) + (dmaX >> 1)), (dmaWidth >> 1), DMA_16NOW);
	}
 
	// Fill in any gaps
	if (x & 1) {
		// Draw left-hand side
		for (u16 i = y; i < y + height; i++) {
			PA_Put8bitPixel(0, x, i, colour);
		}
	}
 
	if (width & 1) {
		// Draw right-hand side
		for (u16 i = y; i < y + height; i++) {
			PA_Put8bitPixel(0, x + width, i, colour);
		}
	}
}

Here’s a demo:

Woopsi Demo V0.03

Categories: Code Snippets, DS Coding, PALib, Woopsi Tags:

More WindowSystemDS Stuff

September 18th, 2007 Ant No comments

Did a little bit more coding in the wee small hours last night. First of all, button text is now centred horizontally and vertically within the button, which makes the buttons look a bit better. I’ve added a new option for the gadget outlines, which can now be either bevelled into the screen, out of the screen, or change from out to in when they are clicked.

Finally, I’ve made the distinction between active and inactive windows. This was a little tricky to do, as each new window added to the window stack needs to become the active window. The obvious solution is either to loop through the window stack and de-activate them all before adding the new window, or by storing a pointer to the active window somewhere and updating that. I went for the former option (though the latter may prove more useful; more on that later), but hit a problem. I don’t have an array of windows in my stack; instead, I have a generic list of gadgets. What I needed to do was loop through the stack and only update those gadgets that are instances of the Window class.

In C# this is easy to do, but I’d never needed to do this before in C++ (I abandoned this approach for the enemy collection in DefenderDS when I read about the performance hit, but performance isn’t so much of an issue here). A spot of googling (note the lower-case “g” - ho ho, I’m making their trademark a common-place term) lead to this:

// Create a new window
Window* newWindow = new Window(x, y, width, height, title, this);
 
for (u8 i = 0; i < gadgets.size(); i++) {
 
	// Is this gadget a window?  Compare type name with new window
	if (typeid(*gadgets[i]).name() == typeid(*newWindow).name()) {
 
		// Got a window, so make inactive
		((Window*)gadgets[i])->setActive(false);
	}
}

Essentially the same thing as C#, we just compare the type of the array item with the type of the new window I’m adding to the array, and if they match we know we’ve got a window. Easy.

The next thing on my list is to improve the screen redrawing. At the moment, the entire screen gets redrawn when windows move, which is a bit crap. After much pondering, I’ve come up with this solution. We know which area of the screen needs redrawing whilst a window is moving as we have its co-ordinates and size. We send that to the Screen object (which contains the window stack) and tell it to order its children (ie. all gadgets in the window stack) to see if they need to redraw, based on the invalid rectangle it sends in as parameters to the function. The gadgets perform a quick collision check to see if the rectangle falls within their boundaries and, if so, they redraw themselves. To be even quicker, windows could just erase the invalid rectangle, then tell their children (ie. their gadgets) to redraw based on the invalid rectangle. They add their co-ordinates and size into the list of invalid rectangles that gets sent (probably by a pointer to a vector) to each gadget. This way, only those gadgets that have been corrupted will re-draw.

If we work from the screen upwards through the window stack, from the lowest window to the highest, any redraws at the lowest level will get overwritten by any overlying windows. This means we’ll always end up with the correct screen layout, with all windows and gadgets in the correct depth order.

The nice thing about all this is that I only have to write one function - since the screens, windows and gadgets all inherit from the main Gadget class, the whole system will manage itself. Marvellous!

One problem with this is window borders. As they’re just drawn on to the screen, we’ll have to handle them separately. However, what I could do is just make each section of border into a gadget of its own, which takes away the complexity as redrawing will then be handled in the usual way. I remembered something from my Blitz days here, and had a quick look at “gimmezerozero” windows in the Amiga ROM Kernel manual. The Amiga had two ways of handling window borders - in the first mode, borders could be overwritten, and in the second they were automatically refreshed. It turns out that “gimmezerozero” windows have their borders as separate gadgets. Looks like I’m on the right path, then!

Categories: Code Snippets, DS Coding, Woopsi Tags:

Tidying, Optimisation and PALib Bugs

April 11th, 2007 Ant No comments

(Written 07/04/2007)

More tidying. I’m splitting the ScrollingText class out into a Font class, TextWriter class, and ScrollingText class, so that I can more easily create new classes like a SineScroller class, etc.

As part of this refactoring process, I found another bug in the PALib. Well, I say bug, but it’s really more of a standards problem. The PALib guys have used the printf() conversion “%c” to mean “change colour” rather than “output a single char”. This confused me for a while because, instead of getting single characters splatted on the screen, I was getting random strings of text.

I came up with this new function as a backwardly-compatible fix:

void PA_OutputChar(u8 screen, u16 x, u16 y, char text) {
char output[2];
output[0] = text;
output[1] = '\0';
PA_OutputText(screen, x, y, output);
}

Another thought about using the WiFi system. One of the things I love about recent open-source programs, especially on the Mac, is that they tend to automatically tell you when a new version is available. It should be pretty easy to get the game to fetch an XML file from my web server that contains the latest version of the game; quick check against the internal value will let us know whether or not we’ve got the latest version. I could even make it do other things, like update the scrolling text on the menu screens, list news/other games, etc. I could have a whole section for that. Hmm, maybe I’ll write that after the menu system. It’d need some planning, though - what is it for? What does it do? What use is it? Who will use it?

Had a think about optimisation when working with C++ objects, too - I read something about the amount of work done in calling a function, and now that I’ve done some assembly programming (Z80, m68k, CIL and a lot of reading) it all makes sense. Bearing this in mind, I replaced calls to class properties with local variables in the main font writer loop. Seems to have improved the performance a bit - the flicker in No$GBA (caused by its DMA copier not running at the correct speed) has shifted from the middle of the screen to 3/4 of the way to the right, which means the font writer is now running 1/4 of a horizontal scan faster.