.org $0066
;==============================================================
; Pause button handler
;==============================================================
; Do nothing
retn
Remember how I turned off interrupts just now? Well, there are some interrupts
you can't turn off, officially called Non-Maskable Interrupts (NMIs). On the SMS
they're not too bad; they're just used for the pause button. Whenever the pause
button is pressed, the Z80 will stop whatever it's doing and jump to offset
$0066. It will then execute whatever's there until it gets to a retn instruction
(return from NMI), when it will go back to what it was doing before the button
was pressed. If we don't want to do any special handling of the pause button, we
should therefore put this instruction straight away at $0066. The .org $0066
directive tells WLA DX that this bit has to be at $0066.
1.3.6 Main program start - stack pointer
;==============================================================
; Main program
;==============================================================
main:
ld sp, $dff0
You may remember that sp is a special register called the Stack Pointer, and
that I didn't explain the stack. I'm still not going to because we don't need to
use it yet; but suffice to say that the stack takes up some of the available
RAM, and we have to tell it which RAM to use. We tell it here to take some
memory ending at offset $dff0. The amount it takes varies, but by telling it to
be at the end of RAM, we can use memory at the start of RAM and be confident of
not overlapping with it.
As a reminder, the SMS has 8KB of RAM, located from hex address $c000 to $dfff,
and mirrored at $e000 to $ffff. We don't set sp to $dfff (the very
end) for an important reason which I will come to later.
We load the register with a value using the ld <register>,<value> instruction.
You can read it as "load stack pointer with $dff0".
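To make the pattern a bit more concrete, here's a small sketch of a few other
loads in the same form. The values are arbitrary ones picked for illustration;
none of this is part of the Hello World program.

```asm
ld a,$12      ; load a (an 8-bit register) with the byte $12
ld hl,$1234   ; load hl (a 16-bit register pair) with the word $1234
ld b,a        ; the value can also come from another register: copy a into b
```

The destination always comes first and the source second.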
1.3.7 Setting up the VDP registers - block transfer
Now we get a technical bit. The VDP (Video Display Processor) is the graphics
chip in the SMS. It has a set of registers and some RAM inside it which we
control through two ports, $be and $bf. Charles MacDonald's "Sega Master System
VDP documentation" is a very good (advanced) document on how it works, but it's
a lot to go into for now. Suffice to say that lower down in the program I've
put a block of data we can use for setting these registers to suitable initial
values:
; VDP initialisation data
VdpData:
.db $04,$80,$84,$81,$ff,$82,$ff,$85,$ff,$86,$ff,$87,$00,$88,$00,$89,$ff,$8a
VdpDataEnd:
.db is a WLA DX directive which instructs it to just put the data you write in
the ROM with no modification. There is a group of them actually, depending on
what you want to define and how; .db should be followed by a comma-separated
list of values which are evaluated to bytes and stored. .dw stores words (with
the correct byte ordering). Check the WLA DX documentation for more information
on the more advanced ones.
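As a quick sketch of the difference between the two, here is the same pair of
hex digits written both ways. The labels are made up for this example and are
not in the program.

```asm
ExampleBytes:
.db $12,$34   ; stored in the ROM as $12 $34, exactly as written
ExampleWord:
.dw $1234     ; stored in the ROM as $34 $12, because the Z80 wants the low byte first
```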
I have to output this data to the VDP. I'm going to do this using one of the
Z80's block transfer instructions. These take values stored in certain
registers and transfer (copy or output) a block of data according to those
values. Here's the code:
;==============================================================
; Set up VDP registers
;==============================================================
ld hl,VdpData
ld b,VdpDataEnd-VdpData
ld c,$bf
otir
otir means "output the b bytes of data starting at the memory location stored in
hl to the port specified in c". That's great - we can figure out all of those,
as you can see. There are other block transfer commands, most notably ldir which
can be used for copying from one memory location to another.
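For comparison, here's a rough sketch of how ldir is set up. It isn't needed for
Hello World; I'm just reusing the VdpData block as something to copy, and the
destination of $c000 (the start of RAM) is picked purely for illustration.

```asm
ld hl,VdpData             ; source: where to copy from
ld de,$c000               ; destination: where to copy to (assumed free RAM)
ld bc,VdpDataEnd-VdpData  ; counter: number of bytes to copy
ldir                      ; copy bc bytes from the address in hl to the address in de
```

Notice that ldir uses the whole of bc as its counter, whereas otir only uses b.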
1.3.8 Clearing VRAM - VRAM write access, looping, conditional jumps
We don't know what's in the VDP RAM and if we don't clean it up, it will make
our screen ugly. (In actual fact it will contain the SEGA logo from the BIOS on
a real system.) So let's do that, by setting every byte of it to zero.
;==============================================================
; Clear VRAM
;==============================================================
; 1. Set VRAM write address to 0 by outputting $4000 ORed with $0000
ld a,$00
out ($bf),a
ld a,$40
out ($bf),a
; 2. Output 16KB of zeroes
ld bc, $4000 ; Counter for 16KB of VRAM
ClearVRAMLoop:
ld a,$00 ; Value to write
out ($be),a ; Output to VRAM address, which is auto-incremented after each write
dec bc
ld a,b
or c
jp nz,ClearVRAMLoop
Wow, look at that! That's quite a piece of code there, quite daunting really.
But it's not that bad, honestly - you'll laugh at something like that in no time.
Let's see what's there.
First, we have to communicate with the VDP and tell it that we want to write to
VRAM. (VRAM is what we'll call the RAM inside the VDP.) To do this we have to
tell it the address we want to write to, and tell it we want to write. Because
there's 16KB of VRAM, we'll need 14 bits (2^14 = 16384 = 16KB) for the address.
The remaining top two bits, which make it up to a 16-bit (2-byte) number, are
used to signal what our intentions are. To get the final number we can use an OR
calculation - the number $4000 only contains the bits required to tell the VDP
we want to write to VRAM, and if we OR it with the address we'll get the final
number to send to the VDP. In our case, we want to start at address $0000 so the
final number is $4000. (Try it on a calculator which supports hexadecimal.)
We have to output this to port $bf, the VDP control port as I told you a
while back. I also told you about the byte ordering - we split $4000 and send
it in the order $00 $40. And that's what part 1 above is doing.
Why can't we write "out ($bf),$4000"? Because the Z80 doesn't know how to. There
are restrictions on how you can handle data. In this case, we can only output
one byte at a time, and the data has to come from register a. (There are other
possible ways to do it but this is the easiest.)
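Since $0000 makes the OR look a bit pointless, here's the same thing sketched
for a non-zero address. $1234 is an arbitrary address chosen for illustration:
$4000 ORed with $1234 gives $5234, which is sent low byte first.

```asm
; Set VRAM write address to $1234 by outputting $4000 ORed with $1234 = $5234
ld a,$34      ; low byte first
out ($bf),a
ld a,$52      ; then the high byte: $12 ORed with $40
out ($bf),a
```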
Now we've set the VDP ready to receive data. We send it data by outputting to
port $be, the VDP data port. When it gets it, it will write it to VRAM and
then (rather handily) move to the next byte of VRAM, so we can just send it a
stream of data bytes and it will write them consecutively to VRAM. So we need
to send 16384 zeroes. The way to do this is to start at $4000 (=16384), then
output a zero and decrease our counter. Then we'll repeat this until our
counter is zero. That's what part 2 is doing.
First, we store $4000 in register pair bc. Then we come across another Z80
instruction we wish we had. We want to go in a loop, decreasing bc by one each
time and checking if it's zero - but while there is an instruction to decrease a
register pair by one (decrement it), it doesn't have a built-in check if the
answer is zero; it leaves the flags untouched. So we will check it ourselves,
using the fast or instruction. This combines the current byte in a with another
byte, giving a result with a binary 1 where either of the two inputs had a 1. So
it can only give a result of zero if both inputs were zero, and because it sets
the z flag when its result is zero, it allows us to do a conditional jump (see
below) based on the result. So, we can output something bc times using a loop
like the one shown above. Here's a version with comments describing what's
happening more verbosely:
; 2. Output 16KB of zeroes
ld bc, $4000 ; Counter for 16KB of VRAM
ClearVRAMLoop:
ld a,$00 ; Value to write
out ($be),a ; Output to VRAM address, which is auto-incremented after each write
dec bc ; decrement counter
ld a,b ; get high byte
or c ; combine with low byte
jp nz,ClearVRAMLoop ; loop until the result is zero
Here's where I explain conditional jumps. There is a short list of "conditions"
you can attach to a jump (and a few other instructions). They depend on the
result of an earlier calculation, which sets some bits in the f (flag) register
according to its result. You have to check the Z80 reference manual carefully to
find which instructions affect which bits of the flag register. In our case, we
have "Z is set if the result is zero; reset otherwise" listed under the 8-bit
or s instruction. nz evaluates to true if the z bit is not set, and false
otherwise - in other words, it's true if the result was not zero. Many
instructions do not affect the flags at all so it is possible to do a
conditional jump that's determined by an instruction some way before it.
Available conditions are:
| nz | Not Zero |
| z | Zero |
| nc | No Carry (overflow) |
| c | Carry |
| po | Parity Odd |
| pe | Parity Even |
| p | Positive |
| m | Minus (negative) |
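As a small sketch of why checking the reference matters: the 8-bit dec b does
update the z flag, so for a counter that fits in a single byte the conditional
jump can follow the decrement directly. The label and the count of 3 are made up
for illustration, and the sketch assumes the VRAM write address has already been
set.

```asm
ld b,3              ; 8-bit counter
ld a,$00            ; Value to write
ShortLoop:
out ($be),a         ; Output one zero byte (out does not change the flags)
dec b               ; 8-bit dec sets z when b reaches zero
jp nz,ShortLoop     ; so we can jump on its result straight away
```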
So, this section has set all of VRAM to zero. Now we've got a blank space to
start putting our data.
1.3.9 Load palette - CRAM and SMS palette data
Now we want to define our palette. SMS and GG graphics are defined by a 4-bit
(16 colour) palette. Each pixel is one of 16 colours in the palette*, and which
colours correspond to which palette index is entirely up to us. We can define
each of the 16 to be one of 64 possible colours (4096 for the Game Gear, which
I'm not going to go into). So that's what we're going to do. We're only
actually going to use two colours, black for the background and white for the
foreground, so we only need to define those two.
The colours are defined by the low 6 bits of a byte stored in a little bit more
RAM in the VDP, called the colour RAM (CRAM). One byte's 8 bits define the
colour as follows:
| Bit: | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| % | Unused | Unused | Blue | Blue | Green | Green | Red | Red |
Notice how I number the bits from 7 at the left (Most Significant Bit) to 0 at
the right (Least Significant Bit). If you're not familiar with binary numbers
then I suggest you look it up on Google and learn a bit more. Note also that %
is used to represent a binary number.
Each colour component therefore has two bits to define it, which means it can
range from 0 (%00) to 3 (%11). So, for full intensity red I want R=3, G=0 and
B=0, which is %00000011. (I put zeroes in the unused bits.) White is %00111111,
yellow is %00001111, and so on. (If you're not familiar with how colours are
made up of RGB components, look that up too.) If these binary numbers are
written as hexadecimal, they'll range from $00 (black) to $3f (white). If you
run one of the various colour test demos (available here) you'll find all 64
colours shown, often with their corresponding numbers, for easy reference!
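Here's a short sketch of a few colours written out in that format, so you can
see how the bit pairs line up. The label is made up for this example; the
program itself only needs black and white.

```asm
ExampleColours:
.db %00000000 ; black:  B=0 G=0 R=0 = $00
.db %00000011 ; red:    B=0 G=0 R=3 = $03
.db %00001100 ; green:  B=0 G=3 R=0 = $0c
.db %00110000 ; blue:   B=3 G=0 R=0 = $30
.db %00001111 ; yellow: B=0 G=3 R=3 = $0f
.db %00111111 ; white:  B=3 G=3 R=3 = $3f
```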

Note that different emulators represent colours differently. Here, eSMS produces
a white that's quite dark. Also, Bock chose to convert the numbers to decimal,
just to confuse matters. You'll need to convert between decimal and hex a lot
anyway, so why not practise now? Check that 63 = $3f.
Anyway, I took the values for black and white and stored them in my data section
lower down in the source file:
PaletteData:
.db $00,$3f ; Black, White
PaletteDataEnd:
Then I want to write this (small) data block to CRAM, which is done very
similarly to VRAM. The difference is, instead of the magic number $4000 which
signals a write to VRAM, I use $c000 for a write to CRAM. The address is $0000
for me to write to the first palette index. Then I use otir to output my data,
very similarly to what was done previously:
;==============================================================
; Load palette
;==============================================================
; 1. Set VRAM write address to CRAM (palette) address 0 (for palette index 0)
; by outputting $c000 ORed with $0000
ld a,$00
out ($bf),a
ld a,$c0
out ($bf),a
; 2. Output colour data
ld hl,PaletteData
ld b,(PaletteDataEnd-PaletteData)
ld c,$be
otir
I could have just written the two bytes manually, but when the project gets more
advanced and I want 32 colours and several palettes I'll be glad of this
automation.
* There are actually two 16-colour palettes, one for sprites or tiles and one
for tiles only. Any one tile can only use one of the palettes and so is still
limited to 16 colours. For simplicity I'll ignore the sprite palette.
1.3.10 Loading the font - SMS tile format, pointers
Now we've got our palette going - and, in fact, if we missed out the rest of the
code, it would run and we could see the palette loaded in an emulator's palette
window. Now we want to define some graphics to go with it.
The VDP works with tiles. Each tile is an 8×8 pixel square and we have
about 450 of them available. The screen is then built up by referring each
square of the screen to one of the defined tiles. So, we have to define tiles to
be able to display anything.
As I said before, each pixel of a tile can be one of the 16 colours in the
palette. So, that's 4 bits per pixel × 64 pixels per tile = 256 bits = 32
bytes per tile. The way the data is stored is a bit hard to explain, so let's
take an example. Let's take the capital A from our font, and let the background
be colour 0 (which we have set as black with our palette) and the foreground
colour 1 (white). Then we will take one row of pixels from the top:
As you can see, I've converted each pixel to its binary representation - %0000
for 0 and %0001 for 1. Then I take the LSB (Least Significant Bit, bit 0, the
rightmost bit) from each pixel - which is the top row of numbers in the
diagram - and get %00111100 = $3c. I repeat this for each row of the diagram,
which represent successively more significant bits, until I have four bytes
which describe the top row of the tile: $3c $00 $00 $00. If I repeat this for
each line of the tile I will end up with my 32 bytes in the format used by the
VDP.
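Written out as data, the four bytes for that top row look like the first line of
this sketch. The second line is a made-up variation to show what changes when
more than one bit of the colour index is used: the same shape drawn in colour 3
(%0011) instead of colour 1.

```asm
; Top row of the A in colour 1 on colour 0: only bit 0 of each pixel is ever set
.db %00111100,%00000000,%00000000,%00000000 ; = $3c $00 $00 $00
; Same shape in colour 3 on colour 0: bits 0 and 1 are set for the foreground pixels
.db %00111100,%00111100,%00000000,%00000000 ; = $3c $3c $00 $00
```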
For 1 bit graphics (either colour 0 or colour 1) you will notice that the last
three bytes for each row are zero, and the first is just the row's bits stuck
together. We can use this to our advantage - knowing this, we can just store the
image as a 1bpp (bit-per-pixel) image and make sure that after each byte we
write, we put three zeroes. This also has the advantage that we can write the
image data in a way that makes it still possible to see the image:
; Character 0x41 A
.DB %00111100 ; Hex 3Ch
.DB %01100110 ; Hex 66h
.DB %01100110 ; Hex 66h
.DB %01111110 ; Hex 7Eh
.DB %01100110 ; Hex 66h
.DB %01100110 ; Hex 66h
.DB %01100110 ; Hex 66h
.DB %00000000 ; Hex 00h
The 16KB of VRAM is split into three areas - the tile definitions, where our
~450 unique tiles are stored; the name table, where we define which tile each
part of the screen shows; and the sprite table, where we control the sprites.
We can arrange these three things any way we want, but apart from a few cases,
there is one way that makes the most efficient use of the available space:
| $0000-$37ff | Tiles - 448 @ 32 bytes per tile |
| $3800-$3eff | Screen - 32 x 28 locations @ 2 bytes each |
| $3f00-$3fff | Sprites - 64 @ 4 bytes each |
That's (partly) what my block of VDP initialisation data defined before.
So, as before we set the VDP ready to receive data, this time at address $0000
because that's where the first tile will be:
;==============================================================
; Load tiles (font)
;==============================================================
; 1. Set VRAM write address to tile index 0
; by outputting $4000 ORed with $0000
ld a,$00
out ($bf),a
ld a,$40
out ($bf),a
Then we want to loop through as many bytes as the font data takes up. We'll use
the same method we used before when we wanted to loop through $4000 bytes to
clear the VRAM. The difference is, now each time we want to also progress
through the font data, and after each data byte write three zeroes. We'll use
hl to "point" to the data - that means we put the memory address in hl, and use
that as the address each time, adding one to it each time round the loop.
You'll use pointers like this a lot in assembler.
So, here we go:
; 2. Output tile data
ld hl,FontData ; Location of tile data
ld bc,FontDataEnd-FontData ; Counter for number of bytes to write
WriteTilesLoop:
; Output data byte then three zeroes, because our tile data is 1 bit
; and must be increased to 4 bit
ld a,(hl) ; Get data byte
out ($be),a
ld a,$00
out ($be),a
out ($be),a
out ($be),a
inc hl ; Add one to hl so it points to the next data byte
dec bc ; Decrement the 16-bit counter
ld a,b ; Get its high byte
or c ; Combine with its low byte
jp nz,WriteTilesLoop ; Loop until the counter is zero
You'll notice the instruction ld a,(hl). The brackets mean "what's pointed to
by", so in full it's "load a with what's pointed to by hl". Also notice the
inc hl instruction - this is INCrement, the opposite to DECrement, and it adds
one to the register given. Here, I'm using the 16-bit version with register pair
hl but the syntax is the same for the 8-bit version with a single register.
Now, when the emulator gets to this point, we will see the tiles appear in the
tile displayer... but there's still nothing on the screen.
1.3.11 Writing to the name table - name table format, ASCII conversion, terminators
We have to write to the name table to tell it which tiles to display where, to
show our message. The name table is a list of words describing the screen
background over its entirety - which is 32 × 28 locations. Each entry describes
which tile that screen location should be filled with (a number from 0 to 447
with the layout above) and, with the extra high bits left over, some extra
attributes such as flipping, priority and which palette to use. We'll not use
those attribute bits yet though. The first entry describes the top-left
position, then entries describe the row to its right before moving down to the
next row.
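Because the entries simply run left to right and then top to bottom, the address
of any screen location can be worked out by hand. Here's a sketch for an
arbitrary position (row 2, column 5, both counted from 0), assuming the name
table is at $3800 as in the layout above.

```asm
; Name table entry for row 2, column 5:
;   entry   = 2*32 + 5 = 69
;   offset  = 69 entries * 2 bytes each = 138 = $8a
;   address = $3800 + $8a = $388a, which ORed with $4000 is $788a
ld a,$8a      ; low byte first
out ($bf),a
ld a,$78      ; then high byte
out ($bf),a
```

Anything written to port $be after this would start at that screen location and
carry on to the right.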
Looking at the tiles we've loaded in the emulator, we can see that the tile for
space is stored at index $00, '0' is at $10, 'A' is $21, and so on. Well, just
to be very handy, I chose a set of tiles which matches the ordering of letters
according to the ASCII standard. However, it does not match ASCII exactly,
because ASCII includes 32 control characters at the start which are useless to
me. In ASCII, space is $20, '0' is $30, 'A' is $41, and so on - each one
exactly $20 more than it is in my tiles. To convert from the ASCII code to the
tile index I have to subtract $20.
Why is this important? Because if I want to store "Hello world!" I need to know
which tile index each letter corresponds to. I could do it by hand:
Message:
.db $28,$45,$4c,$4c,$4f,$00,$57,$4f,$52,$4c,$44,$01
But that's really hard to read; what if I make a mistake; and what if I want to
change it later? Well, WLA DX is clever and allows me to enter ASCII directly,
like this:
Message:
.db "Hello world!"
It will then convert the text to ASCII and store that as data in the ROM file.
Then the program has to convert from ASCII to tile indices (by subtracting $20)
for each letter and now it's really easy to check it's right and change it if
wanted.
There's one more thing. In a typical program you might want to write more than
one thing - perhaps "Welcome to Hello World XP" at the start, then "Press any
button to start", then of course the lengthy credits sequence. It's a pain to
have to keep track of not only the location of each text string, but also its
length, so we'll borrow something from the world of PCs (and no doubt a
gazillion other computers) - we'll make it so there's a terminator byte.
A terminator byte can't be bargained with. It can't be reasoned with. It
doesn't feel pity, or remorse, or fear. And it absolutely will not stop, ever,
until you are dead. Erm... no, actually it's a byte included at the end of a
stream of data which cannot possibly be valid data, and signals that the data
is finished. In our case, we'll use $00 because that's what is very often used
on PCs, especially in the C language, giving a "null-terminated string":
Message:
.db "Hello world!",0
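So, here's our outline of code:
- Set the VRAM write address to the start of the name table
- Read a byte of the message
- If it is zero, stop
- Convert it from ASCII to a tile index
- Write to name table
- Move on to the next byte and repeat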
And here's the source to do it:
;==============================================================
; Write text to name table
;==============================================================
; 1. Set VRAM write address to name table index 0
; by outputting $4000 ORed with $3800+0
ld a,$00
out ($bf),a
ld a,$38|$40
out ($bf),a
; 2. Loop through text data, converting from ASCII and writing tile indices
; to name table. Stop when a zero byte is found.
ld hl,Message
WriteTextLoop:
ld a,(hl) ; Get data byte
cp $00 ; Is it zero?
jp z,EndWriteTextLoop ; If so, exit loop
sub $20 ; In ASCII, space=$20 and the rest follow on.
; We want space=$00 so we subtract $20.
out ($be),a
ld a,$00
out ($be),a
inc hl ; Point to next letter
jp WriteTextLoop
EndWriteTextLoop:
First of all, we have to set the VRAM write address. As mentioned before, the
name table is stored at $3800 in VRAM, and we want to write to the start of it,
so we have to OR $3800 with $4000 (to tell it we want to write to that address)
and output it as before. I've decided to let WLA DX do the ORing for me, just
in case I mess it up, by writing "$38|$40" for the high byte of the (word)
address. "|" means "OR" and WLA DX will calculate the answer while compiling the
code, so the effect is the same as if I'd written $78, but I think it's clearer
this way.
Then I've set hl to point to the address of my message (which I labelled with
"Message:"); then, similarly to before, I load what it points to.
cp means compare. It compares a to the value given and sets the flag register
accordingly, without modifying a. (Internally, it is doing a subtraction.) The
relevant flags will then be:
| Flag | Set if | Relevant conditions |
| z | a=value | z (nz if a is not equal to value) |
| c | a<value | c (nc if a>=value) |
This is a simplified interpretation of the flags, but suffices in most
cases.
So, if a=$00 the code will jump to the EndWriteTextLoop: label (i.e. the end);
otherwise, it gets to the sub instruction. You might guess that means subtract -
it subtracts the value given from a.
Notice that for the "Write to name table" step I have to convert it from a byte
to a word, but that's easy because all I have to do is set the high byte of the
word to zero. Again, I must swap them around when outputting. hl is incremented
and the process repeats.
What would happen if I forgot to put the terminating zero byte by adding
",0" at the end of the .db line? The code would keep
on processing whatever followed the text. By chance, it's the palette data and
that happens to start with a zero so there is no difference - but in another
case you might well end up with junk data there.
1.3.12 Turning on the screen - VDP register ($8)1
Now you'd think we'd finished - but we haven't. The whole time, we have had the
screen turned off. Why? Because when it's turned on, we have to be careful
about how we access VRAM, to avoid graphical corruption when run on a real
system (emulators generally don't get this corruption, though). By turning it
off, we can access VRAM any way we like and it will cause no problems.
The turning on and off of the screen is done through one of the VDP
registers which you noticed me gloss over before. The VDP has several
registers which control certain aspects of its operation; some are related to
legacy (SG-1000 type) video modes, some to the normal (SMS type) video mode and
some to both. The turning on and off of the screen is done with register 1.
We access VDP registers by writing words to port $bf again; but this time, the
data format is a bit different. The magical word signifying that it's a
register write is $8000. The register we want to write to is given by the high
byte of the word; so, for register 1 we have $0100. Finally, the data to be
written to that register is given by the low byte of the word - we don't output
data to port $be at all!
So, in practice, we will actually have to do something like this:
ld a,$xy
out ($bf),a
ld a,$8z
out ($bf),a
where $xy is the data and z is the register number. So, in some ways, it is
easier (if inaccurate) to think that you're putting $xy in register $8z, hence
me calling it "register ($8)1" at the top of this section.
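To tie this back to the VdpData block from earlier, here's a sketch of what one
of its byte pairs, $ff,$82, amounts to if written out by hand instead of being
sent with otir.

```asm
; Write $ff to VDP register 2
ld a,$ff      ; the data byte ($xy)
out ($bf),a
ld a,$82      ; $80 ORed with the register number, 2
out ($bf),a
```

Each pair in VdpData follows the same pattern: data byte first, then $80 plus
the register number.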
Anyway, let's see what register $81 controls.
| Bit | Function | If set | If reset |
| 7 | VRAM size | VRAM is 16KB | VRAM is 8KB |
| 6 | Enable display | Display on | Display off |
| 5 | Vblank interrupts | Interrupt generated on VBlank | VBlank gives no interrupts |
| 4 | 28 row display | Screen shows 28 rows (eg. Codemasters games) | Screen is normal size |
| 3 | 30 row display | Screen shows 30 rows | Screen is normal size |
| 2 | Unused | No effect | No effect |
| 1 | Doubled sprites | Each sprite defined will also show the next tile under it | Normal sprites |
| 0 | Zoomed sprites | Sprites are stretched to 16x16 pixels | Normal sprites |
Charles MacDonald's "Sega Master System VDP documentation" describes all of the
registers in very good detail. Anyway, there is far too much information here
for us to remember every time, so it makes sense for us to add comments to make
it clear what we're doing. I think you'll agree that we want to set bits 7 and
6 (bit 7 should always be set; and we want the screen on now) and the rest
aren't much use to us (the bigger screen displays look appealing but introduce
more difficulties). So:
; Turn screen on
ld a,%11000100
;      |||| |`- Zoomed sprites -> 16x16 pixels
;      |||| `-- Doubled sprites -> 2 tiles per sprite, 8x16
;      |||`---- 30 row/240 line mode
;      ||`----- 28 row/224 line mode
;      |`------ VBlank interrupts
;      `------- Enable display
out ($bf),a
ld a,$81
out ($bf),a
You see how I've written the data byte in binary form, and labelled it? I
suggest you do that every time you access such a register (others aren't split
into many parts so you don't need to).
1.3.13 Time to stop - infinite loops
OK, now our code is almost finished. We've done everything we wanted to do; but
there's one more thing we have to do. The Z80 will execute all the code we've
written, and then when it gets to the end it will keep on going and going
forever, never stopping. We don't want that, we want it to stop; so what we'll
do is put it in an infinite loop. Normally, infinite loops are a bad thing
because they stop your program ever continuing; you'll probably create a
few by accident and have to figure out why they're happening and fix the bug
causing them. But here, we want one. We want the processor to keep doing the
same thing (nothing) over and over again forever, which we will achieve by
making a jump point to itself:
; Infinite loop to stop program
Loop:
jp Loop
When the Z80 gets to the instruction jp Loop it will jump to the label Loop:
immediately. When it gets there, it will find the instruction jp Loop and will
jump to the label Loop: immediately. When it gets there, it will find the
instruction jp Loop and will jump to the label Loop: immediately... and so on
forever.
Now we've added that infinite loop, we know the program will never get past it;
so here is a safe place to put data. Why does it matter where you put data?
Because you have to make sure that the data is never accidentally interpreted as
code. The Z80 can't tell if what it's looking at is sensible program code or
data, it assumes everything is program code. So you have to make sure that the
place you insert data is outside the program code and that execution will never
accidentally get to your data. For a simple program like this one (with no
"functions", just one code block) we put it after the program. We could equally
have put it before the "main:" label, and at the start of the
program execution would have jumped straight past it to that label.
I can put the data in any order I like because it doesn't matter - it's not
necessary to put it in the order it's used. In larger projects you may choose
to order the data logically to make it easier to navigate, and maybe split the
data up according to what it's for.
Anyway, you may have noticed that the program is now finished. Press F11 to run
it again.
1.4 Enhancing our program
1.4.1 Linking to external files
You might have noticed that nearly 90% of the source to "Hello World!" is taken
up by the font; and that font is already defined for us by Mike G, so we don't
need to edit it. So why not put it in its own file, and just tell WLA DX that it
should insert that file at that point?
FontData:
.include "BBC Micro font.inc"
FontDataEnd:
Copy everything between the FontData: and FontDataEnd: labels to a new file and
save it as "BBC Micro font.inc" in the same folder as the "Hello World.asm"
source file; then delete everything you copied and replace it with the directive
shown above. The file will then be "virtually" inserted at that point while
compiling, exactly like C's #include preprocessor command. So whatever's in the
linked file must be correct code.
There are a few more useful commands for including data from external files:
| .include "name.inc" | Include source file "name.inc" |
| .incbin "data.bin" | Include binary file "data.bin", as if .includeing a version converted to source .db directives |
| .incdir "c:\path\" | Change the directory where .included and .incbined files are assumed to be |
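As a sketch of how .incbin would slot into the same pattern as the font include,
suppose the font had been saved as raw bytes rather than as source. The filename
"font.bin" is made up for this example; no such file comes with the tutorial.

```asm
FontData:
.incbin "font.bin" ; hypothetical file holding the raw font bytes
FontDataEnd:
```

The labels either side still let the code work out the data's length with
FontDataEnd-FontData, just as before.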
1.4.2 Functions - call, ret, push, pop: the stack finally explained
You might have noticed that even in such a simple program as "Hello World!", we
found ourselves doing the same thing over and over again:
; 1. Set VRAM write address to 0 by outputting $4000 ORed with $0000
ld a,$00
out ($bf),a
ld a,$40
out ($bf),a
Wouldn't it be easier if we could write some code to do that more easily?
Something like
ld VRAM write address,$3800 ; Not a real instruction :(
Well, we can't do it quite like that, but we can do something almost the same.
But first we need to learn about the stack.
The usual description is that it's like a stack of playing cards, with the
magical limitation that we can only take the top card from the stack, or put
another one on there. The important thing is, the cards come off in the reverse
order they're put on, so it's important not to get them mixed up.
For the Z80, the stack is a section of memory containing words, not bytes.
We can push a register pair onto the stack and the Z80 will store
it in that section of memory. We can then pop it back into any register
pair, although it usually only makes sense to pop it into the one you took it
from. It allows you to do something like this:
ld hl,$1234
push hl
ld hl,$5678
; Do something with hl (which contains $5678)
; ...
pop hl
; hl now contains $1234 again
...in effect, "saving" the contents of that register so you can do something
else with it, then restoring it to its previous state. The other main use for
the stack is for functions. There is a Z80 instruction "call" which is exactly
like jp, in that it makes execution jump to a certain point instead of
continuing on linearly; except that first, it pushes the pc register pair, which
by now contains the address of the next instruction after the call, onto the
stack. Then, some time after jumping to the given address, if it encounters a
ret instruction it will pop the stored pc address and start executing code from
there, in effect returning to the point it was at before:
Somewhere in the program, usually not in the normal flow of the program:
MyFunction:
inc a ; Do something
ret ; return
In the normal flow:
ld a,$00
call MyFunction
; a now contains $01
call MyFunction
; a now contains $02
Again, remember you have to be careful with the order you push/pop, especially
when mixed in with calls and returns. This:
call MyFunction
MyFunction:
ld hl,$1234
push hl
ret ; Error!
will not work, because the ret will take the last thing pushed, which is $1234,
and execution will continue at $1234! Except in very few circumstances, that's
not something you'll want to do, because $1234 might be some data, or some
completely unrelated code, or even halfway through an instruction!
Anyway, let's get on with our useful function, now we know what to be careful
about. Let's make it possible to specify an address in register pair
hl which will then be ORed with $4000 and output to port $bf,
thereby setting the VRAM address and making it ready to write:
VRAMToHL:
ld a,l
out ($bf),a
ld a,h
or $40
out ($bf),a
ret
This takes the value in hl, outputs the low byte (stored in l), ORs the high
byte (h) with $40, then outputs that. We have to transfer the data from h or l
into a to be able to output it, and incidentally also to be able to OR it. We
don't bother ORing the low byte with $00 - if you don't know why, try it out on
a calculator with different values for low byte.
We can now do something like this to set the VRAM write address to $0000:
ld hl,$0000
call VRAMToHL
Isn't that easier? But there's a problem. What will be the value in a after this
code runs?
ld a,$27
ld hl,$0000
call VRAMToHL
You'd want it to still be $27; but actually it will be $40, because in the
VRAMToHL function we used a, overwriting what was in there before. This is a
case where we want to "save" the register using push and pop. Register a gets
paired with the flag register f to form pair af. Here's our amended function:
VRAMToHL:
    push af
        ld a,l
        out ($bf),a
        ld a,h
        or $40
        out ($bf),a
    pop af
    ret
Notice how I use indentation to make it clear which code block I'm saving af for
the duration of. In general, you should push every register you will change the
contents of at the start of the function, and pop them all in reverse order at
the end; unless you intend to "return a value" in one of the registers, in which
case you should not save that one. Pushing and popping has other uses,
particularly in more advanced code when you run out of registers you can use.
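Here's a rough sketch of that rule with two register pairs, so the reverse order
is visible. ClearBytes is a made-up function, not part of the program: it
assumes the VRAM write address has already been set and that bc holds the number
of zero bytes to write (which must not be zero, or the loop would wrap round and
write 65536 of them).

```asm
ClearBytes:
    push af                 ; saved first...
    push bc
ClearBytesLoop:
        ld a,$00            ; Value to write
        out ($be),a
        dec bc              ; decrement counter
        ld a,b              ; get high byte
        or c                ; combine with low byte
        jp nz,ClearBytesLoop
    pop bc                  ; last one pushed is the first one popped
    pop af                  ; ...restored last
    ret
```

After calling it, both a and bc hold whatever they held before, even though the
function used them both.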
1.5 Exercises
- Modify "Hello World!" so it displays more text - try to fill the whole
screen.
- Change one of the times where the VRAM write address is set to use the new
function, and check if it still works by compiling (F9) and running (F11). When
it does, change all of them in the same way.
- Make it so the text displayed is stored in an external file. Does this
external file need to contain any .db directives?
- (Compulsory exercise) Send me an email here with any questions or problems
you've come across, and any suggestions for the next instalment.