Demo for Game Gear
1.00 | 27 Mar 2009 | Martin | CreditsScroller-GG-1.00.zip (8KB)
There is a forum topic for Credits Scroller
by Martin Konrad (www.smspower.org/martin)
Version 1.0, 29.3.2009
This is a more detailed description of how my Credits Scroller on the Game Gear works.
The Game Gear screen has a size of 20x18 tiles (160x144 pixels). The scrolled text has a size of 18x18 tiles, meaning that the leftmost and the rightmost columns are not used for text scrolling. They could be, at some cost in speed, but they are not in my implementation.
The tilemap layout is the following:
     0  18  36  54  ..  342
     1  19  37  55  ..  343
    ..  ..  ..  ..  ..  ...
    16  34  52  70  ..  358
    17  35  53  71  ..  359
This tilemap has a size of 20x18. The actual tilemap on Game Gear of course has a size of (usually) 32x28, but the outer regions can be ignored. What's important is that there are vertical columns, each consisting of 18 tiles. 18 of these columns are used for the scrolling text.
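The numbering in the layout above runs down each column first. As a minimal sketch (the names `tile_index` and `layout` are made up for this illustration and are not part of the demo), it can be modelled like this:

```python
COLUMNS = 20  # visible Game Gear screen width in tiles
ROWS = 18     # visible Game Gear screen height in tiles


def tile_index(column, row):
    """Tile number at a screen position when tiles are numbered column by column."""
    return column * ROWS + row


def layout():
    """Rebuild the visible 20x18 part of the tilemap as a list of rows."""
    return [[tile_index(c, r) for c in range(COLUMNS)] for r in range(ROWS)]


if __name__ == "__main__":
    for row in layout():
        print(" ".join(f"{tile:3d}" for tile in row))
```

Because the tiles of one column have consecutive numbers, their pattern data also sits in one contiguous block of vram, which is what makes the column-by-column copy possible.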
The tiles 0..359 contain the background picture in the first 3 bitplanes. The 4th bitplane is empty at first (filled with 0). That way the background picture uses the colors 0..7. It could also use the colors 16..23 of the second palette (chosen per tile in the tilemap).
The text is then drawn into bitplane 4 at runtime. If a text pixel is 0, then the background shows through, and if a text pixel is 1, then the pixel uses one of the text colors 8..15 (or 24..31). You can choose yourself how you set the text colors. You could set them all to white, to have solid white text. Or you could, for example, achieve transparency effects by setting colors 8..15 to an altered (brighter or darker) version of colors 0..7. My favourite is the anti-transparency like in the credits of Turrican II on Amiga.
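To make the color selection concrete, here is a small sketch of how the text bit picks between the two halves of the palette. The function names are invented for this example, and the palette helper assumes colors given as (r, g, b) tuples with 4 bits per channel, which is an assumption of the sketch and not something taken from the demo's source:

```python
def color_index(background, text_bit):
    """Combine a 3-bit background value (bitplanes 0..2) with the text bit (bitplane 3)."""
    if not 0 <= background <= 7:
        raise ValueError("background must only use bitplanes 0..2")
    if text_bit not in (0, 1):
        raise ValueError("text_bit must be 0 or 1")
    return background | (text_bit << 3)


def brightened_text_palette(background_palette, amount=4):
    """Build colors 8..15 as brighter copies of colors 0..7.

    Assumption: each color is an (r, g, b) tuple with channel values 0..15.
    """
    return [tuple(min(15, channel + amount) for channel in color)
            for color in background_palette]
```

With a palette built this way, a set text pixel shows the same background color, only brighter, which gives the transparency effect described above.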
The trick is how to quickly draw the text into the 4th bitplane. First, there's a 1-bitplane buffer in normal memory. Its layout is the same as the layout of the tiles in vram, so that it can be copied directly. However, it must be one row higher because of the scrolling.
So both the tiles and the buffer are divided into columns. The drawing/copying is done column by column. The scrolling is done by changing the offset from where the buffer is copied into vram. The size of this buffer is 144*144/8 bytes = 2592 bytes. If I add 1 byte to the offset, then the text scrolls up by one pixel.
One problem is that if I add 1 or a higher value to the offset and copy one whole column, then the copy reads from an out-of-bounds memory area. The fix for this is to do two copies: one from the offset to the last row, and then another one starting from the first row. As soon as the scrolling offset goes past the last row, it is wrapped back to the first row. This would be better shown in a picture.
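In place of a picture, here is a rough model of the two-part copy. It assumes each column occupies exactly 144 bytes of the buffer (18 columns x 144 bytes = 2592 bytes); the names are hypothetical and the real demo may lay out its buffer differently:

```python
COLUMN_BYTES = 144   # 18 tiles x 8 pixel lines, one byte per line
TEXT_COLUMNS = 18


def column_bytes(buffer, column, offset):
    """Return the 144 bytes to send to vram for one column at a scroll offset."""
    start = column * COLUMN_BYTES
    first = buffer[start + offset:start + COLUMN_BYTES]  # from the offset to the last row
    second = buffer[start:start + offset]                # then again from the first row
    return first + second


def advance(offset):
    """Scroll up by one pixel and wrap after the last row."""
    return (offset + 1) % COLUMN_BYTES
```

For example, with an offset of 140 the first copy delivers rows 140..143 and the second copy delivers rows 0..139, so the column always receives exactly 144 bytes.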
The next trick is how to quickly copy the one bitplane from ram to vram. The first 3 bitplanes of the background picture are supposed to stay unchanged at all times. Only the 4th bitplane needs to be written to. A tile in vram looks like this:
    Byte 0:  line 0 of bitplane 0
    Byte 1:  line 0 of bitplane 1
    Byte 2:  line 0 of bitplane 2
    Byte 3:  line 0 of bitplane 3
    Byte 4:  line 1 of bitplane 0
    Byte 5:  line 1 of bitplane 1
    Byte 6:  line 1 of bitplane 2
    Byte 7:  line 1 of bitplane 3
    ...
    Byte 28: line 7 of bitplane 0
    Byte 29: line 7 of bitplane 1
    Byte 30: line 7 of bitplane 2
    Byte 31: line 7 of bitplane 3
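The table follows a simple rule: four bytes per pixel line, one per bitplane. A tiny sketch (with made-up helper names) that reproduces it and lists the bytes belonging to the text bitplane:

```python
BITPLANES = 4       # bytes per pixel line, one for each bitplane
LINES_PER_TILE = 8


def tile_byte_offset(line, bitplane):
    """Offset within a 32-byte tile of the byte holding one line of one bitplane."""
    return line * BITPLANES + bitplane


def text_plane_offsets():
    """Offsets of all bytes of bitplane 3 (the 4th bitplane) within a tile."""
    return [tile_byte_offset(line, 3) for line in range(LINES_PER_TILE)]
```

The offsets of the text bitplane are 3, 7, 11, ... 31, so consecutive text bytes are always 4 bytes apart.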
The 4th bitplane starts at byte 3. So I set the vdp pointer to the start of the tile, plus offset 3. After I've written one byte, I have to write to the next byte of the 4th bitplane. This is byte 7. But I can't easily write into byte 7 after I've written to byte 3. I have two possibilities:
I choose the second method, because it's much faster. I use the following code for this:
    outi            ; Copy byte to VRAM from HL
    add a, d
    out ($bf), a    ; Change low byte of the vdp pointer
So first, I use OUTI to copy a byte. Then I skip over the other bitplanes by changing the vdp pointer. For that I keep the low byte of the vdp pointer in register A. I add 4 to it (register D holds the value 4) and change the low byte of the vdp pointer by writing A to port $BF. The high byte of the vdp pointer doesn't have to change yet. It only needs to change when I write to an address of the form $xxFF, and that's always at the end of a tile (offset 31 of a tile).
Next, I do this multiple times, to cover a whole tile:
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    add a, d
    out ($bf), a
    outi
    ;add a, d
    ;out ($bf), a
These are 8 writes, for 8 bytes (= all 8 lines of a single tile). After the last byte I'm done, and the last change of the vdp pointer is not needed, so I commented it out.
However, to be even faster, I don't write tile by tile. I write a whole column of tiles, with each column consisting of 18 tiles, or 144 byte writes. So I use this little code part, which writes one byte, 144 times. This results in a very large function. Of course you shouldn't copy and paste the small code part, but instead you should use some functionality of your assembler (for example .REPT/.ENDR in WLA).
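The effect of the unrolled sequence on vram can be checked with a rough model. It is only a sketch: vram is a plain bytearray, `low` plays the role of register A, and the auto-increment and the low-byte update are reduced to simple arithmetic. Unlike the original, the model also updates the pointer after the last byte, which is harmless here.

```python
def write_text_plane(vram, tile_address, text_lines):
    """Model the outi / add a,d / out ($bf),a steps for one tile."""
    address = tile_address + 3                # first byte of bitplane 3
    low = address & 0xFF                      # register A in the original code
    for value in text_lines:                  # 8 bytes, one per pixel line
        vram[address] = value                 # outi: write one byte to vram
        address += 1                          # the vdp pointer auto-increments
        low = (low + 4) & 0xFF                # add a, d (with d = 4)
        address = (address & 0xFF00) | low    # out ($bf), a: replace the low byte


if __name__ == "__main__":
    vram = bytearray([0x55]) * 0x4000         # 0x55 stands in for background data
    write_text_plane(vram, 0x2000, bytes(range(1, 9)))
    print([hex(i) for i in range(0x4000) if vram[i] != 0x55])
```

Only every 4th byte, starting at offset 3, is touched, so the three background bitplanes stay intact.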
After the end of every tile, the high byte of the vdp pointer could change, because the address of the last byte of a tile, at offset 31 of a tile, could be of the form $xxFF. So you might expect that you have to update the high byte yourself. But actually you don't :) The trick is: after writing to an address of the form $xxFF, the high byte is changed automatically because of the auto-incrementing of the vdp pointer after each write. If you had chosen to use bitplane 0 for the text and bitplanes 1..3 for the background picture, then you couldn't use this trick, because then the last byte of a tile would be at offset 28, and that wouldn't coincide with the $xxFF addresses.
This copying of columns is still very slow. I can copy up to 5 columns in the vblank-time (extended by switching the screen off after the normal display ends). I don't copy in the non-vblank-time. Doing that would be even slower, because you would have to wait longer between the vram accesses.
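The difference between the two bitplane choices can be seen in a small address-only model. It is a sketch under simplifying assumptions: the vdp pointer is treated as a 14-bit address that increments after each data write, and a write to port $BF is modelled as replacing only its low byte. The function name is made up for this example.

```python
def write_addresses(column_start, bitplane, count=144):
    """Addresses hit by `count` repetitions of outi / add a,d / out ($bf),a."""
    address = column_start + bitplane
    low = address & 0xFF                      # register A
    hit = []
    for _ in range(count):
        hit.append(address)                   # outi writes here
        address = (address + 1) & 0x3FFF      # auto-increment (may carry into the high byte)
        low = (low + 4) & 0xFF                # add a, d (with d = 4)
        address = (address & 0x3F00) | low    # out ($bf), a
    return hit


if __name__ == "__main__":
    good = write_addresses(0x2000, 3)
    bad = write_addresses(0x2000, 0)
    print(hex(good[63]), hex(good[64]))   # crosses $20FF into the next page
    print(hex(bad[63]), hex(bad[64]))     # falls back to the start of the same page
```

With bitplane 3 the write to $20FF lets the auto-increment carry into the high byte before the low byte is replaced. With bitplane 0 the last write in the page is at $20FC, so no carry happens and the pointer wraps around within the same 256-byte page.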
So in the first frame, I copy the first 5 columns, then in the next frame the next 5 columns, until I've copied all 18 columns. Then I scroll, by changing the scroll offset, and repeat everything. There's a lot of time left in the non-vblank time to update the buffer with the text for scrolling, or other things, like playing music or doing palette fading calculations.
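The resulting update cycle can be sketched as follows. The split of 5 + 5 + 5 + 3 columns is an inference from the numbers above rather than something taken from the demo's source, and the function name is hypothetical:

```python
def frame_schedule(total_columns=18, per_frame=5):
    """Group column numbers into per-frame batches of at most `per_frame` columns."""
    return [list(range(start, min(start + per_frame, total_columns)))
            for start in range(0, total_columns, per_frame)]


if __name__ == "__main__":
    for frame, columns in enumerate(frame_schedule()):
        print(f"frame {frame}: columns {columns}")
```

Under these assumptions one full update takes 4 frames, so the text can move up by one scroll step every 4 frames.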
And that's it for now :)
Let me know if you have questions.