There are a few things we can do with the program to make it a bit nicer.
Defines instead of magic numbers
In programming, there is a general rule that you shouldn’t have “magic numbers” - numbers that do something special, which are just included in the middle of the code. Instead, you should use your programming language to give them a name so people can read your code and see the meaning, not the value:
; SMS defines
.define VDPControl $bf
.define VDPData $be
.define VRAMWrite $4000
.define CRAMWrite $c000
.define is a WLA DX directive that defines a name for a value. Now when we use these names (like “VDPControl”) it will act as if we had used the value (“$bf”) instead. So we can use these instead of the numbers, for example:
; Set up VDP registers
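As a small sketch of the idea (an illustration, not the original listing), here is the same port write with and without the name. Both lines assemble to the same bytes; only the readability changes.

```z80
.define VDPControl $bf

    ; Before: a magic number in the middle of the code
    out ($bf),a
    ; After: the name says what the port is for
    out (VDPControl),a
```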
There were a few tasks we did several times - for example, setting the VDP address, and copying data to the VDP. We can make functions to do this and use the functions instead of duplicating the code.
On the Z80, functions are mainly implemented using the call and ret instructions. These make use of the stack. Let’s explain the stack first.
The stack - push, pop, call, ret
The usual description of the stack is that it’s like a stack of playing cards, with the magical limitation that we can only take the top card from the stack, or put another one on there. The important thing is, the cards come off in the reverse order they’re put on, so it’s important not to get them mixed up.
For the Z80, the stack is a section of memory containing 16-bit words, not just 8-bit bytes. We can push a register pair onto the stack and the Z80 will store it in that section of memory. We can then pop it back into any register pair, although it usually only makes sense to pop it into the one you took it from. It allows you to do something like this:
; Do something with hl (which contains $5678)
; hl now contains $1234 again
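A minimal sketch of that save-and-restore pattern, assuming hl starts out holding $1234 (the values are only illustrative):

```z80
    ld hl,$1234     ; hl = $1234
    push hl         ; save hl on the stack
      ld hl,$5678   ; re-use hl for something else
      ; Do something with hl (which contains $5678)
    pop hl          ; restore the saved value
    ; hl now contains $1234 again
```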
...in effect, “saving” the contents of that register so you can do something else with it, then restoring it to its previous state.
The other main use for the stack is for functions. There is a Z80 instruction “call” which is exactly like jp, in that it makes execution jump to a certain point instead of continuing on linearly; except that first, it pushes the pc (Program Counter) register, which by now contains the address of the next instruction after the call, onto the stack. Then, some time after jumping to the given address, if it encounters a ret instruction it will pop the stored pc address and start executing code from there, in effect returning to the point it was at before:
Somewhere in the program, usually not in the normal flow of the program:
inc a ; Do something
ret ; return
In the normal flow:
; a now contains $01
; a now contains $02
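Put together, the function and its callers might look like the sketch below. The label IncrementA is a hypothetical name chosen for this illustration, and the starting value of a is assumed to be zero.

```z80
; Somewhere in the program, not in the normal flow:
IncrementA:         ; hypothetical label name
    inc a           ; Do something
    ret             ; Return to the instruction after the call

; In the normal flow:
    ld a,$00
    call IncrementA ; pushes the return address, then jumps
    ; a now contains $01
    call IncrementA
    ; a now contains $02
```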
Again, remember you have to be careful with the order you push/pop, especially when mixed in with calls and returns. This:
ret ; Error!
will not work, because the ret will take the last thing pushed, which is $1234, and execution will continue at $1234! Except in very few circumstances, that’s not something you’ll want to do, because $1234 might be some data, or some completely unrelated code, or even halfway through an instruction!
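Here is a sketch of the broken case next to a balanced version. Both labels are hypothetical names used only for this illustration.

```z80
BrokenFunction:     ; hypothetical label name
    ld hl,$1234
    push hl         ; top of stack is now $1234, above the return address
    ret             ; Error! pops $1234 into pc, not the return address

FixedFunction:      ; hypothetical label name
    ld hl,$1234
    push hl
      ; ... use hl for something else
    pop hl          ; top of stack is the return address again
    ret             ; returns to the caller as intended
```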
So, to conclude, the stack is an area of memory that we can push and pop registers to/from; it’s also used to call functions and ret from them; and we have to be careful to balance our stack usage to avoid things going wrong.
Helper 1: set VDP address
To set the VDP address, to either VRAM or CRAM, we want to output it to the VDP control port in little-endian order (low byte first, then high byte).
; Sets the VDP address
; Parameters: hl = address
; Affects: a
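A sketch of what the full helper might look like, based on the description above. The label SetVDPAddress is a hypothetical name; the text does not give one.

```z80
.define VDPControl $bf

SetVDPAddress:          ; hypothetical label name
; Sets the VDP address
; Parameters: hl = address
; Affects: a
    ld a,l
    out (VDPControl),a  ; low byte first (little-endian)
    ld a,h
    out (VDPControl),a  ; then the high byte
    ret
```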
This is invoked using the code:
ld hl,$0000 | VRAMWrite
Callers need to OR the address with $4000 or $c000 depending on whether they are setting a VRAM write address or a CRAM write address. “VRAMWrite” and “CRAMWrite” were .defined earlier to make it clearer which one is being used, as shown above.
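For illustration, the two kinds of call might look like this. It assumes the helper is labelled SetVDPAddress (a hypothetical name) and is defined elsewhere in the program.

```z80
.define VRAMWrite $4000
.define CRAMWrite $c000

    ld hl,$0000 | VRAMWrite ; VRAM address $0000 in write mode = $4000
    call SetVDPAddress      ; hypothetical helper label
    ld hl,$0000 | CRAMWrite ; CRAM address $0000 in write mode = $c000
    call SetVDPAddress
```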
Notice the comments clearly state what the function does, what parameters it takes, and what registers it affects. That way, people using it can be careful not to leave anything important in those registers. An alternative would be to push and pop any registers used to avoid losing their values:
; Sets the VDP address
; Parameters: hl = address
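A sketch of such a protected version, assuming the same port define as before. The label is a hypothetical name. Note that push af saves the flags along with a, since the Z80 can only push register pairs.

```z80
.define VDPControl $bf

SetVDPAddressSafe:        ; hypothetical label name
; Sets the VDP address
; Parameters: hl = address
    push af               ; save a (and the flags)
      ld a,l
      out (VDPControl),a
      ld a,h
      out (VDPControl),a
    pop af                ; restore a (and the flags)
    ret
```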
I’ve used indentation to help me be sure that my push and pop are balanced, and to show what’s protected by them. However, if the calling code doesn’t care about register a, this protection is unnecessary and will slow the program down.
Helper 2: copy data to VDP
; Copies data to the VDP
; Parameters: hl = data address, bc = data length
; Affects: a, hl, bc
-: ld a,(hl) ; Get data byte
inc hl ; Point to next letter
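Filled out, the helper might look like the sketch below. The label CopyToVDP is a hypothetical name, and the loop counter test (ld a,b / or c) is the usual Z80 idiom for checking a 16-bit counter, assumed here because dec bc does not change the flags.

```z80
.define VDPData $be

CopyToVDP:              ; hypothetical label name
; Copies data to the VDP
; Parameters: hl = data address, bc = data length
; Affects: a, hl, bc
-:  ld a,(hl)           ; Get data byte
    out (VDPData),a     ; Write it to the VDP
    inc hl              ; Point to next byte
    dec bc              ; One fewer byte left (flags are not affected)
    ld a,b
    or c                ; a = b OR c, which is zero only when bc = 0
    jr nz,-             ; Loop until the counter reaches zero
    ret
```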
This is exactly what we had before, except that we have bundled it into a function and used an anonymous label and jr, both of which I will explain in a moment.
This function can be invoked like this:
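A sketch of a call site, assuming the two helpers are labelled SetVDPAddress and CopyToVDP (hypothetical names) and that the font data sits between the FontData and FontDataEnd labels:

```z80
.define VRAMWrite $4000

    ld hl,$0000 | VRAMWrite     ; where to write in VRAM
    call SetVDPAddress          ; hypothetical helper label
    ld hl,FontData              ; hl = data address
    ld bc,FontDataEnd-FontData  ; bc = data length
    call CopyToVDP              ; hypothetical helper label
```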
In the original version, we had many labels which were only really used for looping. We had to give each one a different name, so WLA DX could tell them apart; and once you have a hundred loops in your program, thinking of new names gets difficult. Since they aren’t particularly important points in the code, we don’t need names that last throughout the entire program; we want to use temporary names. One way of doing this in WLA DX is to use anonymous labels. These fall into three categories:
|Type of label||Looks like||Used for|
|Forwards||One or more “+” signs||Places you want to jump forwards to|
|Backwards||One or more “-” signs||Places you want to jump backwards to|
|Both-ways||Two underscores: “__”||A place you want to jump forwards or backwards to, using “_f” to jump forwards and “_b” to jump backwards|
The special thing about anonymous labels is that we can re-use them. If some code wants to jump to label “-”, WLA DX will find the nearest version of that label before the jump, and use that. So for our loops we can just use “-” instead of a full label.
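The three kinds of anonymous label can be sketched in one short fragment. The register values are arbitrary and only there to make the loops terminate.

```z80
    ld b,3
-:  dec b           ; backwards label
    jr nz,-         ; jumps back to the nearest "-" above

    ld a,b          ; a = 0 here
    or a            ; set the z flag from a
    jr z,+          ; jumps forwards to the nearest "+" below
    inc a           ; skipped when a was zero
+:  ld b,3          ; forwards label

-:  dec b           ; "-" re-used for a second, unrelated loop
    jr nz,-         ; refers to this second "-", not the first one

    ld b,3
__: dec b           ; both-ways label
    jr nz,_b        ; "_b" jumps backwards to the nearest "__"
```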
Jump Relative - jr
Before, we only used the jp instruction to perform a jump. jr works (almost) exactly the same, except it is a relative jump. This means that in the final code, it is stored as a number of bytes to jump forwards or backwards, whereas jp is stored as the actual address to jump to. This has advantages and disadvantages:
- It is one byte smaller
- It can be faster to execute (because it is smaller)
- But it can be slower to execute (because the address has to be calculated)
I consider its main advantage to be to tell you (when reading the code) which jumps are to something far away (i.e. to something that is distant from the previous code, like a different section of code) and which are local (within the section of code). So I always use jr for loops, for example, to help show that it is a jump as part of the current code block.
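To make the trade-off concrete, here is the same loop written both ways. The sizes and timings in the comments are the standard Z80 figures, not measurements from this program.

```z80
    ld b,8
-:  dec b
    jr nz,-         ; 2 bytes: opcode + signed offset (-128 to +127 bytes)
                    ; 12 T-states when taken, 7 when not taken

    ld b,8
-:  dec b
    jp nz,-         ; 3 bytes: opcode + 16-bit absolute address
                    ; 10 T-states whether taken or not
```

The limited range is part of the “(almost)”: a jr can only reach targets within about 128 bytes, and it only supports the z, nz, c and nc conditions, so distant or less common conditional jumps still need jp.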
Writing the text in our file
Before, the tilemap data to draw the text was just a blob of data in the ROM. Wouldn’t it be nicer to store it as the text? It would make it a lot easier to know what it said when looking at the file, and much easier to change too. To do that we need several things to happen.
Convert ASCII text to tile numbers - .asciitable, .asc
Using the .asciitable directive, we can tell WLA DX how to convert ASCII text. Our font includes everything from space (at tile number 0) to ‘~’ (at tile number $5e), in the normal ASCII order (except for a few special characters like ‘£’). We can tell WLA DX about this like so:
map " " to "~" = 0
Then we can use the .asc directive to store text, and WLA DX will convert it so the bytes match the tile numbers:
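A sketch of the two directives together. The Message label is a hypothetical name, and the .asciitable block is shown with its closing .enda.

```z80
.asciitable
map " " to "~" = 0      ; space -> 0, "!" -> 1, ... "~" -> $5e
.enda

Message:                ; hypothetical label name
.asc "Hello world!"     ; stored as tile numbers, e.g. "H" ($48) -> $28
```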
Using a sentinel (terminating) value to signal the end
Previously, we used labels to count the size of the tilemap data. There is another way, which is to make sure there’s a special byte at the end of the text, which does not correspond to any letter. When this is encountered, the code can know that it is time to stop. This is unsuitable for general data, where any byte is valid, but suitable for text, where not every byte corresponds to a character.
Since the font uses tile numbers 0 to $5e, I will use value $ff as my “terminator”:
.asc "Hello world!"
.db $ff
Output full tilemap data - tilemap format, xor a optimisation
The tilemap data does not just consist of one-byte tile numbers. For a start, it is possible to have more than 256 tiles. Additionally, there are “flag” bits, making one entry into 16 bits:
|Bit||15-13||12||11||10||9||8-0|
|Usage||Unused||High priority||Use sprite palette||Flip vertically||Flip horizontally||Tile number|
Since all our tiles are below number 256, and we don’t want to use any of the advanced functionality, we just want the high byte (the flags and the top bit of the tile number) to be 0 and the low byte to be the tile number.
When writing tile data to the VDP, we write it in little-endian order as before. That means we will write the tile number first, and then a zero.
So the program flow becomes:
1. Read byte
2. Is it $ff? If so, exit
3. Output it to the VDP
4. Output zero to the VDP
5. Loop back to 1
Here’s the code:
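The steps above can be sketched as follows. This is a reconstruction from the description, not the original listing; the Message label is a hypothetical name for the start of the text, and the VDP address is assumed to have been set to the tilemap already.

```z80
.define VDPData $be

    ld hl,Message       ; hypothetical label on the .asc text
-:  ld a,(hl)           ; 1. Read byte
    cp $ff              ; 2. Is it $ff?
    jr z,+              ;    If so, exit
    out (VDPData),a     ; 3. Output it to the VDP (low byte: tile number)
    xor a               ;    a = 0
    out (VDPData),a     ; 4. Output zero to the VDP (high byte: flags)
    inc hl              ;    Move on to the next tile number
    jr -                ; 5. Loop back to 1
+:                      ; Execution continues here once $ff is found
```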
Notice that to exit, we use the “+” anonymous label; to loop, we use the “-” anonymous label (and jr).
To check if the value is $ff, we use the cp compare instruction. This sets some flags based on the comparison between register a and the parameter to the instruction (which could be a register or a literal number). Here is a simplified version of the flag effect:
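|Flag||Set if||Relevant conditions|
|z (zero)||a is equal to the parameter||z, nz|
|c (carry)||a is less than the parameter (unsigned)||c, nc|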
Internally, it works by performing a subtraction, recording the flag effect, but throwing away the result. We mentioned the flags before; we are again using the z (zero) flag. If we subtract $ff from a value and the answer is zero, then the value must be $ff. Therefore our conditional jump will be taken, and the program code will continue with whatever comes after the “+” label.
If it’s not, then it continues on to output the value to the VDP data port. Because the cp instruction does not keep the result of the subtraction, register a still contains the value that was read from ROM.
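A short sketch of cp in isolation, using arbitrary values, to show that a is left untouched and only the flags change:

```z80
    ld a,$41
    cp $ff          ; computes a - $ff, sets the flags, discards the result
    jr z,+          ; not taken: the result is not zero, so z is reset
    ; a still contains $41 here
    cp $41          ; computes a - $41, which is zero
    jr z,+          ; taken: z is set because a equals $41
    inc a           ; skipped
+:  ; a still contains $41
```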
Next we want to output a zero. We could do ld a,0, but it is faster, and takes up less ROM space, to do xor a.
This instruction performs an XOR between register a and the register or number given as a parameter, and stores the result in register a. XOR gives a binary 0 for each bit which is the same in both register a and the parameter, and a binary 1 where they differ. Since the parameter is also register a, it is evaluating a XOR a, which will always give a result of 0. Or, in short, xor a is a fast and small way to set a to zero.
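Side by side, with the standard Z80 sizes and timings in the comments:

```z80
    ld a,0          ; 2 bytes, 7 T-states; flags are left unchanged
    xor a           ; 1 byte, 4 T-states; also sets z and resets carry
```

Because xor a changes the flags, ld a,0 is still the right choice in the rare case where the flags from an earlier instruction need to survive.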
Finally, we increment hl to move on to the next tile number and repeat until $ff is found.
External data files - .include, .incbin
You may have noticed that in the original Hello World, more than half of the file was taken up by the font data. We don’t want to read and modify this data anyway (it’s pre-generated automatically, not created by hand), so we ought to move it to an external file. There are two main ways to include external data:
.include acts as if the mentioned file’s contents had been copied and pasted into the current file. It’s a lot like #include in C/C++. The file must therefore be text that WLA DX understands.
.incbin includes the mentioned file as raw data. Each byte of the file will be transferred as one byte to the resulting ROM. We’ll use this for our font data. I converted the data to raw binary data and saved it as “font.bin”, then changed the code to look like this:
.incbin "font.bin" fsize FontDataSize
The “fsize” parameter tells WLA DX that I want it to set up a symbol called “FontDataSize” that corresponds to the size of the file. I can then use this instead of “FontDataEnd-FontData” any time I want the size of the data in bytes.
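A sketch showing both directives and the size symbol in use. The file name “defines.inc” and the CopyToVDP label are hypothetical; in a real program the data and the code that copies it would normally sit in different places.

```z80
.include "defines.inc"      ; hypothetical text file of WLA DX source

FontData:
.incbin "font.bin" fsize FontDataSize

    ld hl,FontData          ; hl = data address
    ld bc,FontDataSize      ; bc = data length, instead of FontDataEnd-FontData
    call CopyToVDP          ; hypothetical helper label
```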
When run, the program looks exactly the same as the first version, but it has all the changes mentioned above.