There are a few things we can do with the program to make it a bit nicer.

Defines instead of magic numbers

In programming, there is a general rule that you shouldn’t have “magic numbers” - numbers that do something special, which are just included in the middle of the code. Instead, you should use your programming language to give them a name so people can read your code and see the meaning, not the value:

;==============================================================
; SMS defines
;==============================================================
.define VDPControl $bf
.define VDPData $be
.define VRAMWrite $4000
.define CRAMWrite $c000

.define is a WLA DX directive that defines a name for a value. Now when we use these names (like “VDPControl”) it will act as if we had used the value (“$bf”) instead. So we can use these instead of the numbers, for example:

    ;==============================================================
    ; Set up VDP registers
    ;==============================================================
    ld hl,VDPInitData
    ld b,VDPInitDataEnd-VDPInitData
    ld c,VDPControl
    otir

Helper functions

There were a few tasks we did several times - for example, setting the VDP address, and copying data to the VDP. We can make functions to do this and use the functions instead of duplicating the code.

On the Z80, functions are mainly implemented using the call and ret instructions. These make use of the stack. Let’s explain the stack first.

The stack - push, pop, call, ret

The usual description of the stack is that it’s like a stack of playing cards, with the magical limitation that we can only take the top card from the stack, or put another one on there. The important thing is, the cards come off in the reverse order they’re put on, so it’s important not to get them mixed up.

For the Z80, the stack is a section of memory containing 16-bit words, not just 8-bit bytes. We can push a register pair onto the stack and the Z80 will store it in that section of memory. We can then pop it back into any register pair, although it usually only makes sense to pop it into the one you took it from. It allows you to do something like this:

ld hl,$1234
push hl
     ld hl,$5678
     ; Do something with hl (which contains $5678)
     ; ...
pop hl
; hl now contains $1234 again

...in effect, “saving” the contents of that register so you can do something else with it, then restoring it to its previous state. The other main use for the stack is for functions. There is a Z80 instruction “call” which is exactly like jp, in that it makes execution jump to a certain point instead of continuing on linearly; except that first, it pushes the pc (Program Counter) register, which by then contains the address of the next instruction after the call, onto the stack. Later, when a ret instruction is encountered, it pops that stored address back into pc, so execution resumes just after the call, in effect returning to the point it was at before:

Somewhere in the program, usually not in the normal flow of the program:

MyFunction:
    inc a    ; Do something
    ret      ; return

In the normal flow:

    ld a,$00
    call MyFunction
    ; a now contains $01
    call MyFunction
    ; a now contains $02

Again, remember you have to be careful with the order you push/pop, especially when mixed in with calls and returns. This:

call MyFunction
MyFunction:
    ld hl,$1234
    push hl
    ret        ; Error!

will not work, because the ret will take the last thing pushed, which is $1234, and execution will continue at $1234! Except in very few circumstances, that’s not something you’ll want to do, because $1234 might be some data, or some completely unrelated code, or even halfway through an instruction!

So, to conclude, the stack is an area of memory that we can push and pop registers to/from; it’s also used to call functions and ret from them; and we have to be careful to balance our stack usage to avoid things going wrong.
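
As a minimal sketch of balanced stack usage inside a function, the routine below saves two register pairs and restores them in reverse order before returning. The name IncTwicePreserving and the work it does are made up for illustration; only the push/pop pattern matters.

```asm
IncTwicePreserving:      ; hypothetical function, not part of the tutorial program
    push hl              ; stack, top first: hl, return address
    push bc              ; stack, top first: bc, hl, return address
        ld hl,$1234      ; hl and bc can be overwritten freely in here
        ld bc,$5678
        inc a
        inc a
    pop bc               ; last pushed, first popped
    pop hl
    ret                  ; the return address is on top of the stack again
```

Swapping the two pops would still return correctly, but hl and bc would come back with each other’s values.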

Helper 1: set VDP address

To set the VDP address, to either VRAM or CRAM, we want to output it to the VDP control port, in little-endian order (low byte first).

SetVDPAddress:
; Sets the VDP address
; Parameters: hl = address
; Affects: a
    ld a,l
    out (VDPControl),a
    ld a,h
    out (VDPControl),a
    ret

This is invoked using the code:

    ; 1. Set VRAM write address to $0000
    ld hl,$0000 | VRAMWrite
    call SetVDPAddress

Callers need to OR the address with $4000 or $c000 depending on whether they are setting a VRAM write address or CRAM write address. “VRAMWrite” and “CRAMWrite” were .defined earlier, to help make it clearer which one was being used, as shown above.
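
To make the OR concrete, here is a sketch of two calls with the resulting values worked out in the comments. It assumes the .defines and SetVDPAddress shown above; the VRAM address $3800 is only an example value.

```asm
    ; VRAM write to address $3800: $3800 | $4000 = $7800
    ld hl,$3800 | VRAMWrite
    call SetVDPAddress   ; outputs $00, then $78

    ; CRAM write to address $0000: $0000 | $c000 = $c000
    ld hl,$0000 | CRAMWrite
    call SetVDPAddress   ; outputs $00, then $c0
```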

Notice the comments clearly state what the function does, what parameters it takes, and what registers it affects. That way, people using it can be careful not to leave anything important in those registers. An alternative would be to push/pop any registers used to avoid losing their values:

SetVDPAddress:
; Sets the VDP address
; Parameters: hl = address
    push af
        ld a,l
        out (VDPControl),a
        ld a,h
        out (VDPControl),a
    pop af
    ret

I’ve used indentation to help me be sure that my push and pop are balanced, and to show what’s protected by them. However, if the calling code doesn’t care about register a, this protection is unnecessary and will slow the program down.

Helper 2: copy data to VDP

CopyToVDP:
; Copies data to the VDP
; Parameters: hl = data address, bc = data length
; Affects: a, hl, bc
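-:  ld a,(hl)    ; Get data byte
    out (VDPData),a
    inc hl       ; Point to next byte
    dec bc       ; Count down (dec bc does not change the flags)
    ld a,b       ; so check for zero by hand:
    or c         ; b OR c is zero only when bc = 0
    jr nz,-      ; Repeat until bc reaches zero
    ret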

This is exactly what we had before, except that we have bundled it into a function and used an anonymous label and jr, both of which I will explain in a moment.

This function can be invoked like this:

    ld hl,PaletteData
    ld bc,PaletteDataEnd-PaletteData
    call CopyToVDP
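
Putting the two helpers together, loading the palette might look like the sketch below. It assumes the PaletteData and PaletteDataEnd labels from the earlier version of the program, and that the palette should start at CRAM address $0000.

```asm
    ; Point the VDP at the start of CRAM
    ld hl,$0000 | CRAMWrite
    call SetVDPAddress
    ; Copy the palette bytes to it
    ld hl,PaletteData
    ld bc,PaletteDataEnd-PaletteData
    call CopyToVDP
```

One thing to watch: the loop tests bc only after decrementing it, so a length of 0 would copy 65536 bytes rather than none. Callers should make sure the length is at least 1.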

Anonymous labels

In the original version, we had many labels which were only really used for looping. We had to give each one a different name, so WLA DX could tell them apart; and once you have a hundred loops in your program, thinking of new names gets difficult. Since they aren’t particularly important points in the code, we don’t need names that last throughout the entire program; we want to use temporary names. One way of doing this in WLA DX is to use anonymous labels. These fall into three categories:

Type of label | Looks like | Used for
Forwards | One or more “+” signs | Places you want to jump forwards to
Backwards | One or more “-” signs | Places you want to jump backwards to
Both-ways | Two underscores: “__” | A place you want to jump forwards or backwards to, using “_f” to jump forwards and “_b” to jump backwards

The special thing about anonymous labels is that we can re-use them. If some code wants to jump to label “-”, WLA DX will find the nearest version of that label before the jump, and use that. So for our loops we can just use “-” instead of a full label.
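
As a sketch of that re-use, the fragment below has two counting loops that both use “-”, followed by a forward skip using “+”. The counts and the work done are arbitrary, and it sticks to jp, which we already know.

```asm
    ld b,10
-:  dec b
    jp nz,-      ; jumps to the "-" two lines up

    ld b,20
-:  dec b
    jp nz,-      ; jumps to the second "-", the nearest one before this jump

    ld a,c
    or a         ; sets the z flag if a (and so c) is zero
    jp z,+       ; jumps forwards to the nearest "+" after this jump
    dec c        ; skipped when c was already zero
+:  ld b,30      ; execution continues here either way
```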

Jump Relative - jr

Before, we only used the jp instruction to perform a jump. jr works (almost) exactly the same, except it is a relative jump. This means that in the final code, it is stored as a number of bytes to jump forwards or backwards, whereas jp is stored as the actual address to jump to. This has advantages and disadvantages:

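  • It is one byte smaller (two bytes instead of three)
  • It is faster when a conditional jump is not taken (7 cycles, against 10 for jp)
  • But it is slower when the jump is taken (12 cycles against 10, because the address has to be calculated)
  • It can only reach targets within about 128 bytes of the jump, and it only supports the z, nz, c and nc conditions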

I consider its main advantage to be to tell you (when reading the code) which jumps are to something far away (i.e. to something that is distant from the previous code, like a different section of code) and which are local (within the section of code). So I always use jr for loops, for example, to help show that it is a jump as part of the current code block.
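
To see the size difference, here is a sketch of a short loop with the assembled bytes noted in the comments. The label MainLoop is hypothetical and stands for some distant part of the program.

```asm
    ld b,$10     ; 2 bytes: $06 $10
-:  dec b        ; 1 byte:  $05
    jr nz,-      ; 2 bytes: $20 $fd (offset -3, counted from the byte after the jr)
    jp MainLoop  ; 3 bytes: $c3, then the address of MainLoop, low byte first
```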

Writing the text in our file

Before, the tilemap data to draw the text was just a blob of data in the ROM. Wouldn’t it be nicer to store it as the text? It would make it a lot easier to know what it said when looking at the file, and much easier to change too. To do that we need several things to happen.

Convert ASCII text to tile numbers - .asciitable, .asc

Using the .asciitable directive, we can tell WLA DX how to convert ASCII text. Our font includes everything from space (at tile number 0) to ‘~’ (at tile number $5e), in the normal ASCII order (except for a few special characters like ‘£’). We can tell WLA DX about this like so:

.asciitable
map " " to "~" = 0
.enda

Then we can use the .asc directive to store text, and WLA DX will convert it so the bytes match the tile numbers:

.asc "Hello world!"
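
As a worked example, with space mapped to 0 each character’s tile number is its ASCII code minus $20. Assuming the .asciitable above, the last two lines in this sketch should produce the same three bytes:

```asm
.asciitable
map " " to "~" = 0
.enda

.asc "Hi!"          ; 'H' = $48 - $20, 'i' = $69 - $20, '!' = $21 - $20
.db $28,$49,$01     ; the same bytes written out by hand
```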

Using a sentinel (terminating) value to signal the end

Previously, we used labels to count the size of the tilemap data. There is another way, which is to make sure there’s a special byte at the end of the text, which does not correspond to any letter. When this is encountered, the code can know that it is time to stop. This is unsuitable for general data, where any byte is valid, but suitable for text, where not every byte corresponds to a character.

Since the font uses tile numbers 0 to $5e, I will use value $ff as my “terminator”:

Message:
.asc "Hello world!"
.db $ff

Output full tilemap data - tilemap format, cp, flags, xor a optimisation

The tilemap data does not just consist of one-byte tile numbers. For a start, it is possible to have more than 256 tiles. Additionally, there are “flag” bits, so each entry is 16 bits wide:

Bit   | 15-13  | 12            | 11                 | 10              | 9                 | 8-0
Usage | Unused | High priority | Use sprite palette | Flip vertically | Flip horizontally | Tile number

Since all our tiles are below number 256, and we don’t want to use any of the advanced functionality, we just want the high eight bits to be 0 and the low eight bits to be the tile number.

When writing tilemap data to the VDP, we write it in little-endian order as before. That means we will write the tile number first, and then a zero.
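
As a sketch of how the flag bits would be used, the fragment below writes one tilemap entry for tile $41 flipped horizontally. The TileFlipH name is made up for this example; its value $0200 is bit 9 of the entry. It assumes the VDP address has already been set.

```asm
.define TileFlipH $0200       ; hypothetical name: bit 9, flip horizontally

    ld hl,$0041 | TileFlipH   ; entry = $0241
    ld a,l
    out (VDPData),a           ; low byte first: $41
    ld a,h
    out (VDPData),a           ; then high byte: $02
```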

So the program flow becomes:

  1. Read byte
  2. Is it $ff? If so, exit
  3. Output it to the VDP
  4. Output zero to the VDP
  5. Loop back to 1

Here’s the code:

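-:  ld a,(hl)        ; Read byte
    cp $ff           ; Is it the terminator?
    jr z,+           ; If so, exit
    out (VDPData),a  ; Output the tile number
    xor a
    out (VDPData),a  ; Output zero
    inc hl           ; Move on to the next byte
    jr -             ; Loop
+: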

Notice that to exit, we use the “+” anonymous label; to loop, we use the “-” anonymous label (and jr).

To check if the value is $ff, we use the cp compare instruction. This sets some flags based on the comparison between register a and the parameter to the instruction (which could be a register or a literal number). Here is a simplified version of the flag effect:

Flag | Set if    | Relevant conditions
z    | a = value | z, nz
c    | a < value | c, nc

Internally, it works by performing a subtraction, recording the flag effect, but throwing away the result. We mentioned the flags before; we are again using the z (zero) flag. If we subtract $ff from a value and the answer is zero, then the value must be $ff. Therefore our conditional jump will be taken, and the program code will continue with whatever comes after the + label.

If it’s not, then it continues on to output the value to the VDP data port. Because the cp instruction does not keep the result of the subtraction, register a still contains the value that was read from ROM.
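
The c flag works the same way for “less than” tests. A small sketch with arbitrary values:

```asm
    ld a,$05
    cp $0a       ; works out $05 - $0a and throws the answer away
    jr c,+       ; taken: a < $0a, so the c flag is set
    ld a,$0a     ; only reached when a >= $0a, capping it at $0a
+:               ; a is still $05 here
```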

Next we want to output a zero. We could do

    ld a,$00

but it is faster, and takes up less ROM space, to do

    xor a

This instruction performs an XOR between register a and the register or number given as a parameter, and stores the result in register a. XOR gives a binary 0 for each bit which is the same in both register a and the parameter, and a binary 1 where they differ. Since the parameter is also register a, it is evaluating a XOR a, which always gives a result of 0. Or, in short, xor a is a fast and small way to set a to zero. Unlike ld, it also changes the flags, but that doesn’t matter here.
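
For comparison, here are the two ways of zeroing a side by side, plus an xor with a different operand. The sizes and timings in the comments are the standard Z80 figures; the binary values are arbitrary.

```asm
    ld a,$00         ; 2 bytes ($3e $00), 7 cycles, flags unchanged
    xor a            ; 1 byte  ($af),     4 cycles, z flag set, c flag cleared

    ld a,%11001100
    xor %10101010    ; a = %01100110: a 1 wherever the two values differ
```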

Finally, we increment hl to move on to the next tile number and repeat until $ff is found.
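
Following the helper pattern from earlier, the loop could be wrapped in a function. This is only a sketch: the name WriteText is made up, ret z (a conditional return, taken only when the z flag is set) replaces the jump to “+”, and $3800 is assumed to be where the tilemap sits in VRAM.

```asm
WriteText:
; Writes $ff-terminated text to the tilemap
; Parameters: hl = address of text
; Affects: a, hl
-:  ld a,(hl)
    cp $ff
    ret z            ; terminator found: return to the caller
    out (VDPData),a  ; tile number
    xor a
    out (VDPData),a  ; high byte of the entry, all zero
    inc hl
    jr -

    ; In the normal flow (hypothetical usage):
    ld hl,$3800 | VRAMWrite
    call SetVDPAddress
    ld hl,Message
    call WriteText
```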

External data files - .incbin

You may have noticed that in the original Hello World, more than half of the file was taken up by the font data. We don’t want to read and modify this data anyway (it’s pre-generated automatically, not created by hand), so we ought to move it to an external file. There are two main ways to include external data:

.include "filename"

This acts as if the mentioned file’s contents had been copied and pasted into the current file. It’s a lot like #include in C/C++. The file must therefore be text that WLA DX understands.

.incbin "filename"

This includes the mentioned file as raw data. Each byte of the file will be transferred as one byte to the resulting ROM. We’ll use this for our font data. I converted the data to raw binary data and saved it as “font.bin”, then changed the code to look like this:

FontData:
.incbin "font.bin" fsize FontDataSize

The “fsize” parameter tells WLA DX that I want it to set up a symbol called “FontDataSize” that corresponds to the size of the file. I can then use this instead of “FontDataEnd-FontData” any time I want the size of the data in bytes.
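
With that symbol defined, loading the font with the helpers from earlier might look like this sketch. It assumes the tiles are to be loaded starting at VRAM address $0000.

```asm
    ld hl,$0000 | VRAMWrite
    call SetVDPAddress
    ld hl,FontData
    ld bc,FontDataSize   ; size of font.bin in bytes, set up by fsize
    call CopyToVDP
```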

Final version

helloworld-enhanced.zip

This looks exactly the same as the first version, but has all the changes mentioned above.

