Is this VRAM<->VRAM copy routine safe?

  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Posted: Fri Apr 14, 2017 7:51 pm
Hi all,

I am trying to track down a bug, and I am wondering whether this simple subroutine, which effectively performs an ldir within VRAM, is safe:


; a faster method that copies more than one byte at a time should be sought

INCLUDE "config_private.inc"

SECTION code_clib
SECTION code_crt_common

PUBLIC asm_sms_memcpy_vram_to_vram

asm_sms_memcpy_vram_to_vram:

   ; memcpy within vram
   ;
   ; enter : hl = void *src in vram
   ;         de = void *dst in vram
   ;         bc = unsigned int n > 0
   ;
   ; exit  : hl = void *src, &byte after last read
   ;         de = void *dst, &byte after last written
   ;         bc = 0
   ;
   ; uses  : af, bc, de, hl, af'
   
loop:

   ; must yield opportunities for an interrupt to occur

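   di                         ; no interrupts while the VDP address is set up

   ld a,l
   out (__IO_VDP_COMMAND),a   ; read address, low byte
   ld a,h
   out (__IO_VDP_COMMAND),a   ; read address, high byte (assumes h < $40)

   in a,(__IO_VDP_DATA)       ; fetch the source byte
   ex af,af'                  ; park it in a'

   ld a,e
   out (__IO_VDP_COMMAND),a   ; write address, low byte
   ld a,d
   or $40                     ; set bit 6: VRAM write command
   out (__IO_VDP_COMMAND),a   ; write address, high byte

   ei

   ex af,af'                  ; recover the source byte
   out (__IO_VDP_DATA),a      ; store it at the destination

   inc de

   cpi                        ; hl++, bc--, P/V set while bc != 0
   jp pe, loop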
   
   ret


I'm specifically looking at how close (in time) the data port accesses are to the command port writes. Is it OK for VDP data port accesses to follow VDP command port writes this closely?
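
To put numbers on the spacing in question, here is a rough Python sketch (not part of the original routine) that adds up the standard Z80 T-state counts for the loop above and reports the gap between the starts of successive VDP port instructions. The `min_gap` threshold is purely a caller-supplied assumption: the thread does not state what spacing the VDP needs, and interrupt latency is ignored.

```python
# (mnemonic, T-states, VDP port touched or None), in loop order.
LOOP = [
    ("di", 4, None),
    ("ld a,l", 4, None),
    ("out (cmd),a", 11, "cmd"),
    ("ld a,h", 4, None),
    ("out (cmd),a", 11, "cmd"),
    ("in a,(data)", 11, "data"),
    ("ex af,af'", 4, None),
    ("ld a,e", 4, None),
    ("out (cmd),a", 11, "cmd"),
    ("ld a,d", 4, None),
    ("or $40", 7, None),
    ("out (cmd),a", 11, "cmd"),
    ("ei", 4, None),
    ("ex af,af'", 4, None),
    ("out (data),a", 11, "data"),
    ("inc de", 6, None),
    ("cpi", 16, None),
    ("jp pe,loop", 10, None),
]


def port_gaps(loop):
    """Return (from_port, to_port, t_states) for each pair of successive
    VDP port instructions, wrapping around to the next loop iteration."""
    total = sum(t for _, t, _ in loop)
    starts = []
    clock = 0
    for _, t_states, port in loop:
        if port is not None:
            starts.append((clock, port))
        clock += t_states
    gaps = []
    for i, (start, port) in enumerate(starts):
        next_start, next_port = starts[(i + 1) % len(starts)]
        gaps.append((port, next_port, (next_start - start) % total))
    return gaps


def too_close(gaps, min_gap):
    """Filter the gaps shorter than min_gap (an assumed, caller-chosen limit)."""
    return [gap for gap in gaps if gap[2] < min_gap]


if __name__ == "__main__":
    for src, dst, t_states in port_gaps(LOOP):
        print(f"{src:>4} -> {dst:<4} {t_states:3d} T-states")
```

For this loop the gaps come out as 15, 11, 19, 22, 19 and 51 T-states (137 per byte copied). The tightest one is the data port read that immediately follows the second read-address write.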
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14726
  • Location: London
Posted: Sat Apr 15, 2017 6:53 am
This one goes even faster: https://github.com/maxim-zhao/bmp2tilecompressors/blob/master/decompressors/aPLi... and I seem to remember it's fine on hardware.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Posted: Sat Apr 15, 2017 3:09 pm
Ah, cheers for that. It also keeps the ldir semantics by updating de and hl.
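
As a reference point for what "ldir semantics" means here, this is a small Python model (an illustration only, not code from either routine) of a forward, byte-at-a-time copy inside a 16 KiB VRAM image. The name `vram_ldir` and the address wrap at `VRAM_SIZE` are assumptions made for the sketch.

```python
VRAM_SIZE = 0x4000  # 16 KiB of VRAM on the Master System VDP


def vram_ldir(vram, hl, de, bc):
    """Copy bc bytes from vram[hl] to vram[de], lowest address first.

    Returns the updated (hl, de, bc): hl and de point one past the last
    byte read and written, and bc is 0, matching the routine's exit
    conditions. bc must be > 0, as in the original contract.
    """
    if bc <= 0:
        raise ValueError("n must be > 0")
    while bc:
        vram[de % VRAM_SIZE] = vram[hl % VRAM_SIZE]
        hl = (hl + 1) & 0xFFFF
        de = (de + 1) & 0xFFFF
        bc -= 1
    return hl, de, bc


if __name__ == "__main__":
    vram = bytearray(VRAM_SIZE)
    vram[0:2] = b"AB"
    # Source and destination overlap: the two-byte pattern is repeated.
    print(vram_ldir(vram, 0, 2, 6), bytes(vram[0:8]))
```

Because each byte is written before the next one is read, an overlapping copy with the destination just ahead of the source repeats a pattern, which is the behaviour an LZ77-style decompressor relies on.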

That's also a newer version of aplib decompress than the one I have here.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14726
  • Location: London
Posted: Sun Apr 16, 2017 8:23 am
It's optimized for speed at the expense of code size. It's still pretty slow though.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Posted: Sun Apr 16, 2017 11:46 pm
I updated my vram->vram copy to be more like yours.

There is a small trick there to do a 16-bit loop using djnz.
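
The thread does not show the trick itself. One common form, which may or may not be the one used in the linked routine, splits the 16-bit count so that `b` drives an inner `djnz` loop and `c` counts the outer passes. The Python model below simulates that with 8-bit registers; `split_counter` and `run_loop` are hypothetical names for this sketch.

```python
def split_counter(n):
    """Split a 16-bit count n (1..65535) into 8-bit (b, c) loop counters."""
    if not 1 <= n <= 0xFFFF:
        raise ValueError("n must be in 1..65535")
    b = n & 0xFF               # inner counter; 0 means 256 passes for djnz
    c = n >> 8                 # number of full 256-byte blocks
    if b != 0:
        c = (c + 1) & 0xFF     # one extra outer pass for the partial block
    return b, c


def run_loop(n):
    """Simulate 'body / djnz loop / dec c / jr nz,loop' and count the bodies."""
    b, c = split_counter(n)
    iterations = 0
    while True:
        iterations += 1        # the loop body runs once per pass
        b = (b - 1) & 0xFF     # djnz: decrement b ...
        if b != 0:
            continue           # ... and jump back while b is non-zero
        c = (c - 1) & 0xFF     # dec c
        if c == 0:             # jr nz falls through once c reaches zero
            return iterations


if __name__ == "__main__":
    for n in (1, 255, 256, 257, 65535):
        print(n, split_counter(n), run_loop(n))
```

The inner loop then costs one `djnz` per byte instead of a 16-bit decrement and test, at the price of a few setup instructions.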

Maxim wrote
It's optimized for speed at the expense of code size. It's still pretty slow though.


I've written a new decompress-to-VRAM routine using zx7, which is small, relatively fast and an optimal LZ77 compressor:

dzx7_standard_vram

I haven't tested it or compared speed to aplib yet but it's the main compressor we use for everything. I suspect the vram->vram copy dominates speed anyway.

Some more reading material.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14726
  • Location: London
Posted: Mon Apr 17, 2017 6:38 am
I already made a BMP2Tile plugin for ZX7, I'm not sure why I didn't commit the decompressor yet. The performance seemed not to compare well to ApLib.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Posted: Mon Apr 17, 2017 2:53 pm
Maxim wrote
I already made a BMP2Tile plugin for ZX7, I'm not sure why I didn't commit the decompressor yet. The performance seemed not to compare well to ApLib.


I ran both compressors on some data I had handy:

Sizes in bytes (the compressed files carry .zx7 and .ap suffixes):

File                                  raw      zx7   appack

8x8 1-bit Fonts
  font_8x8_bbc_system.bin             768      428      426
  font_8x8_clairsys.bin               768      429      430
  font_8x8_clairsys_bold.bin          768      453      449
  font_8x8_zx_system.bin              768      453      440

4x8 1-bit Fonts
  font_4x8_default.bin                768      360      357

1-Bit Music
  bitm_journey.bin                  2,629    1,512    1,473
  bitm_triceropop.bin               1,513      906      885

Semi-sampled sound effects from Mario Bros for AY-3-8910
  ay_effects.bin                    5,412    4,303    4,099

Midi music from Mario Bros for AY-3-8910
  midi_pb_title.bin                   510      281      274

Pietro Bros graphics tiles (zx spectrum nirvana game)
  pietro.btile.bin                 10,368    3,829    3,781


Total:


raw = 34,886 bytes
zx7 = 19,632 bytes (56.3% of raw)
appack = 19,036 bytes (54.6% of raw)
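
For anyone wanting to redo the arithmetic, here is a short Python sketch that sums the per-file sizes itemised in this post and expresses each compressed total as a percentage of raw. The `SIZES` table is transcribed from the listings above; the helper names are made up for the example.

```python
# file name -> (raw, zx7, appack) sizes in bytes, as itemised in the post
SIZES = {
    "font_8x8_bbc_system.bin": (768, 428, 426),
    "font_8x8_clairsys.bin": (768, 429, 430),
    "font_8x8_clairsys_bold.bin": (768, 453, 449),
    "font_8x8_zx_system.bin": (768, 453, 440),
    "font_4x8_default.bin": (768, 360, 357),
    "bitm_journey.bin": (2629, 1512, 1473),
    "bitm_triceropop.bin": (1513, 906, 885),
    "ay_effects.bin": (5412, 4303, 4099),
    "midi_pb_title.bin": (510, 281, 274),
    "pietro.btile.bin": (10368, 3829, 3781),
}


def totals(sizes):
    """Return (raw, zx7, appack) totals over every file in the table."""
    raw = sum(entry[0] for entry in sizes.values())
    zx7 = sum(entry[1] for entry in sizes.values())
    appack = sum(entry[2] for entry in sizes.values())
    return raw, zx7, appack


def percent_of(part, whole):
    """Size of part as a percentage of whole, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    raw, zx7, appack = totals(SIZES)
    print(f"raw    = {raw:,} bytes")
    print(f"zx7    = {zx7:,} bytes ({percent_of(zx7, raw)}%)")
    print(f"appack = {appack:,} bytes ({percent_of(appack, raw)}%)")
```

Summing only the ten files itemised here gives 24,272 bytes raw, 12,954 bytes for zx7 (53.4%) and 12,614 bytes for appack (52.0%). That is less than the posted totals, so those presumably include data that is not listed in the post.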


I wish I had a more complete set handy to compare with, but I don't.

zx7 can also prime its dictionary with a prefix during decompression, which can lead to better compression ratios.
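
To illustrate the idea of priming the dictionary, here is a toy LZ77 decoder in Python. It does not use the zx7 bitstream or its API: the token format (`("lit", byte)` and `("copy", offset, length)`) and the function name are invented for this sketch. The prefix seeds the history window so that early matches can point back into it.

```python
def lz77_decode(tokens, prefix=b""):
    """Decode a toy LZ77 token stream.

    tokens: iterable of ("lit", byte) or ("copy", offset, length).
    prefix: bytes placed in the history window before decoding starts;
            matches may reference it, but it is not part of the result.
    """
    out = bytearray(prefix)
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        elif token[0] == "copy":
            _, offset, length = token
            if not 1 <= offset <= len(out):
                raise ValueError("offset reaches before the start of history")
            for _ in range(length):
                # Byte-by-byte so that overlapping matches repeat correctly.
                out.append(out[-offset])
        else:
            raise ValueError("unknown token type")
    return bytes(out[len(prefix):])


if __name__ == "__main__":
    stream = [("copy", 5, 5), ("lit", ord("!"))]
    print(lz77_decode(stream, prefix=b"HELLO"))
```

With the prefix `b"HELLO"` the very first token can already be a match. Without a prefix the same stream is invalid, because there is no history to copy from, and the same text would have to be sent as literals.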

We use it as the main compressor because of its small size, lack of any static RAM requirement, and ability to do overlapped decompression (important on RAM-based machines).