View topic - Sprite loading, flipping, and expansion routines

  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Sprite loading, flipping, and expansion routines
Posted: Thu Dec 05, 2019 8:09 pm
Last edited by PinoBatch on Sat Jul 31, 2021 12:03 am; edited 1 time in total
I began programming for the NES in about 2000 (with dedication increasing in late 2008), the Game Boy Advance in 2002, and the Game Boy in April 2018. (I can give details once I'm no longer "new here.") Now I'm trying my hand at SMS. The Z80 CPU is mostly similar to the Game Boy CPU, but the VDP looks like a fish swimming in reverse.

- Tiles are 4bpp rather than 2bpp
- Background tiles can be flipped, but sprites cannot
- Priority is set per background tile, not per sprite
- Background horizontal scrolling runs in the opposite direction
- No mid-screen vertical scroll changes

To work around the ROM size hit of the first two of these, and knowing that I use horizontal flipping far more often than vertical, I devised subroutines to flip sprites horizontally as I load them. Each of them takes about 5 scanlines per 8x8-pixel tile. I've tested them in a framework based on Maxim's "How to Program" tutorial.

Can you spot any poor practices?

;;
; Loads 4bpp tile data to VRAM with optional bitplane transformation.
; 4bpp version: 137 cycles/sliver
; A 16x32 sprite cel like Mario is 64 slivers or 8768 cycles
; A scanline is 228 cycles, so this is 39 lines
; @param HL source
; @param B sliver count (width*height/8, or data size/4)
; @param D high byte of the address of the 256-byte-aligned
; transformation table (identity, bit reverse, or scale)
load_4bpp_cel:
  ld e, [hl]        ; 7
  inc hl            ; 6
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  ld e, [hl]        ; 7
  inc hl            ; 6
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  ld e, [hl]        ; 7
  inc hl            ; 6
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  ld e, [hl]        ; 7
  inc hl            ; 6
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  djnz load_4bpp_cel; 13
  ret
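
The figures in the header comment can be re-derived from the per-instruction costs annotated in the listing. The following is a minimal Python sketch of that arithmetic, not part of the ROM build; the names `CYCLES`, `cycles_per_sliver_4bpp`, and `cel_cost` are made up for illustration, and the 228 cycles per scanline is the figure quoted in the comment above.

```python
# Cycle costs copied from the comments in the listing above.
CYCLES = {
    "ld e,(hl)": 7,
    "inc hl": 6,
    "ld a,(de)": 7,
    "out (n),a": 11,
    "djnz taken": 13,
}
CYCLES_PER_LINE = 228  # scanline length quoted in the header comment


def cycles_per_sliver_4bpp():
    """Four table-lookup-and-write groups plus one taken DJNZ."""
    per_byte = (CYCLES["ld e,(hl)"] + CYCLES["inc hl"]
                + CYCLES["ld a,(de)"] + CYCLES["out (n),a"])
    return 4 * per_byte + CYCLES["djnz taken"]


def cel_cost(width, height, per_sliver):
    """Return (slivers, cycles, scanlines rounded up) for a cel in pixels."""
    slivers = width * height // 8
    cycles = slivers * per_sliver
    lines = -(-cycles // CYCLES_PER_LINE)  # ceiling division
    return slivers, cycles, lines


print(cycles_per_sliver_4bpp())                       # 137
print(cel_cost(16, 32, cycles_per_sliver_4bpp()))     # (64, 8768, 39)
```

The sketch charges a taken DJNZ on every pass, as the listing's comments do, so it overstates the final pass by a few cycles.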

;;
; 2bpp to 4bpp expansion and optional flipping: 129 cycles/sliver
; A 16x32 sprite cel is 8256 cycles, or 37 lines
; @param HL source
; @param B sliver count (width*height/8, or source data size/2)
; @param D high byte of the address of the 256-byte-aligned
; transformation table (identity, bit reverse, or scale)
; @param IX subpalette choice. $0000: use colors 0, 1, 2, 3;
; $00FF: 0, 5, 6, 7; $FF00: 0, 9, 10, 11; $FFFF: 0, 13, 14, 15
load_2bpp_cel:
  ld e, [hl]        ; 7
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  inc hl            ; 6
  ld c, a           ; 4
  ld e, [hl]        ; 7
  ; peak register pressure is here:
  ; HL: src ptr; DE: next flip byte; C: plane 0; B: count;
  ; A: must be open to retrieve plane 1
  ; Thus we need IX for 2bpp to 4bpp expansion
  ld a, [de]        ; 7
  out (VDPDATA), a  ; 11
  or c              ; 4

  ld c, a           ; 4
  and ixl           ; 8
  out (VDPDATA), a  ; 11
  inc hl            ; 6  - increment HL here to space out VDPDATA writes
  ld a, c           ; 4
  and ixh           ; 8
  out (VDPDATA), a  ; 11
  djnz load_2bpp_cel; 13
  ret
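
To check the subpalette masks listed for IX, one pass through load_2bpp_cel can be modeled off-target. This is a minimal Python sketch under stated assumptions, not the author's tooling: `expand_sliver` and `pixel_colors` are hypothetical helper names, and the decoder assumes four consecutive bitplane bytes per tile row with the leftmost pixel in bit 7.

```python
IDENTITY = list(range(256))
# Bit-reversal table, built independently of the assembler expression.
HFLIP = [int(format(i, "08b")[::-1], 2) for i in range(256)]


def expand_sliver(src0, src1, table, ix):
    """Model one loop pass: two 2bpp source bytes in, four VRAM bytes out."""
    ixh, ixl = (ix >> 8) & 0xFF, ix & 0xFF
    plane0 = table[src0]          # ld e,[hl] / ld a,[de] / out
    plane1 = table[src1]          # second lookup and out
    opaque = plane0 | plane1      # or c: set wherever the pixel is nonzero
    return [plane0, plane1, opaque & ixl, opaque & ixh]


def pixel_colors(sliver):
    """Decode four bitplane bytes into eight color indices, left to right."""
    colors = []
    for x in range(8):
        shift = 7 - x
        color = 0
        for plane, byte in enumerate(sliver):
            color |= ((byte >> shift) & 1) << plane
        colors.append(color)
    return colors


# Source row whose pixels are 0,1,2,3,0,1,2,3 from left to right.
SRC0, SRC1 = 0b01010101, 0b00110011
for ix in (0x0000, 0x00FF, 0xFF00, 0xFFFF):
    print(hex(ix), pixel_colors(expand_sliver(SRC0, SRC1, IDENTITY, ix)))
print("flipped", pixel_colors(expand_sliver(SRC0, SRC1, HFLIP, 0x0000)))
```

In this model color 0 stays 0 for every IX value because the mask is applied to plane 0 OR plane 1, which is clear wherever the source pixel is transparent.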

(I'm in the habit of using square brackets for indirect addressing through register pairs because a widely used Game Boy assembler requires them.)

The code uses two lookup tables: one for identity (no flipping) and one for bit reversal (horizontal flipping). The same principle would allow Neo Geo-style horizontal shrinking, which I've incidentally done in a tech demo on the NES. I might investigate that once I also implement skipping source rows for vertical shrinking.

.section "idtable" align 256 free
identity_table:
  .repeat 256 index I
    .db I
  .endr
hflip_table:
  .repeat 256 index I
    .db ((I&$80)>>7)|((I&$40)>>5)|((I&$20)>>3)|((I&$10)>>1)|((I&$08)<<1)|((I&$04)<<3)|((I&$02)<<5)|((I&$01)<<7)
  .endr
.ends
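
The long .db expression can be sanity-checked outside the assembler. Below is a minimal Python sketch that transcribes it (with `$` hex written as `0x`) and compares it with a plain bit reversal. The names `hflip_entry`, `reverse_bits`, and `table_lookup`, and the table addresses `0x1200` and `0x1300`, are hypothetical and only illustrate why the section is aligned to 256 bytes.

```python
def hflip_entry(i):
    """Transcription of the .db expression for table index i."""
    return (((i & 0x80) >> 7) | ((i & 0x40) >> 5)
            | ((i & 0x20) >> 3) | ((i & 0x10) >> 1)
            | ((i & 0x08) << 1) | ((i & 0x04) << 3)
            | ((i & 0x02) << 5) | ((i & 0x01) << 7))


def reverse_bits(i):
    """Reference implementation: reverse the 8-bit binary string."""
    return int(format(i, "08b")[::-1], 2)


IDENTITY = list(range(256))
HFLIP = [hflip_entry(i) for i in range(256)]

# Hypothetical memory image: both tables start on a 256-byte boundary,
# so the address of entry E is simply (D << 8) | E, as in ld a,[de].
memory = bytearray(0x10000)
memory[0x1200:0x1300] = bytes(IDENTITY)
memory[0x1300:0x1400] = bytes(HFLIP)


def table_lookup(d, e):
    """Model of ld a,[de] with D selecting the table and E the source byte."""
    return memory[(d << 8) | e]


print(all(HFLIP[i] == reverse_bits(i) for i in range(256)))  # True
print(hex(table_lookup(0x13, 0x01)))                         # 0x80
```

Because each table is exactly 256 bytes, aligning the section also aligns hflip_table, which is what lets the loaders switch tables by changing D alone.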

sms-hello-with-obj.zip (100.43 KB)

  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14691
  • Location: London
Posted: Thu Dec 05, 2019 8:37 pm
The unrolled loop could be done with a .repeat.
I didn’t do the maths, but maybe your loader could use outi to combine the load, out, and increment. However, outi also modifies b, so b would need to hold the byte count, and that could easily exceed 8 bits.
Some assemblers don’t support ixh as it’s technically undocumented; this is an issue for interfacing with SDCC.
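
To put numbers on the 8-bit concern, here is a minimal Python sketch of the byte counts involved. The helper names `byte_count_4bpp` and `fits_in_b` are hypothetical, and the sketch assumes that an initial B of 0 counts as 256, as it does for DJNZ and OTIR.

```python
def byte_count_4bpp(width, height):
    """Bytes written to VRAM for a 4bpp cel: 4 bytes per 8-pixel sliver."""
    return (width * height // 8) * 4


def fits_in_b(count):
    """True if an 8-bit counter can cover the transfer (B = 0 means 256)."""
    return 1 <= count <= 256


for size in ((8, 8), (16, 24), (16, 32), (24, 32), (32, 32)):
    count = byte_count_4bpp(*size)
    print(size, count, fits_in_b(count))
```

Under these assumptions a 16x32 cel is exactly 256 bytes, the largest transfer a single 8-bit byte counter can cover, while the sliver count used by the posted routines would reach 256 only at 1024 bytes.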
  • Joined: 05 Sep 2013
  • Posts: 3763
  • Location: Stockholm, Sweden
Posted: Fri Dec 06, 2019 11:52 am
Hi! I remember you from the GBAdev.org forum times - welcome here!
As for your code, I'd say it's pretty good; it shows you're not a first-timer. (I would say that most of the time you load tiles to VRAM when the screen is off or during vblank, so there are no speed constraints. But if your code can work during vdraw without needing to be slowed down on purpose, that's nice!)
  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Posted: Mon Dec 09, 2019 7:52 pm
I've made a slightly less trivial example. The player movement code is written in Game Boy ASM, and it translated to Z80 almost verbatim. It currently reloads the character's 16x24-pixel (6-tile) cel every vblank; a real game would reload cels only for those actors whose cels have changed.
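
A rough budget shows why reloading only changed cels matters. The Python sketch below is an estimate under stated assumptions, not a measurement of the attached ROM: it assumes a 262-line NTSC frame with 192 active lines, reuses the 228 cycles per line and the 137 and 129 cycles per sliver quoted in the first post, and ignores everything else the vblank handler has to do. The names are hypothetical.

```python
CYCLES_PER_LINE = 228
LINES_PER_FRAME = 262   # assumption: NTSC timing
ACTIVE_LINES = 192      # assumption: standard 192-line display

BLANK_CYCLES = (LINES_PER_FRAME - ACTIVE_LINES) * CYCLES_PER_LINE


def cel_cycles(width, height, cycles_per_sliver):
    """Cycles to load one cel at the given per-sliver cost."""
    return (width * height // 8) * cycles_per_sliver


def cels_per_vblank(width, height, cycles_per_sliver):
    """Whole cels that fit in the blanking period if nothing else runs."""
    return BLANK_CYCLES // cel_cycles(width, height, cycles_per_sliver)


print(BLANK_CYCLES)                   # 15960
print(cel_cycles(16, 24, 129))        # 6192 with load_2bpp_cel
print(cel_cycles(16, 24, 137))        # 6576 with load_4bpp_cel
print(cels_per_vblank(16, 24, 129))   # 2
print(cels_per_vblank(16, 24, 137))   # 2
```

Under these assumptions only about two 16x24 cels fit in one blanking period with either routine.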

Does this ROM behave as expected on authentic hardware? It runs in BlastEm, but all I have hardware-wise are an EverDrive and a Genesis 3 VA1 that hasn't been modified to wire up the signals used by the Power Base Converter (PBC). (I bought a Genesis 2 on eBay, but its power button proved unstable, and I've sent it back.)
sms-move-podge.zip (103.06 KB)

  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14691
  • Location: London
Posted: Mon Dec 09, 2019 8:22 pm
Emulicious is probably the best emulator for checking whether the ROM breaks any timing constraints.