Author | Message |
---|---|
|
Is this VRAM<->VRAM copy routine safe?
Posted: Fri Apr 14, 2017 7:51 pm
|
Hi all,
I am trying to track down a bug, and I am wondering whether this simple subroutine, which effectively performs an ldir within VRAM, is safe:

```
; a faster method that copies more than one byte at a time should be sought

INCLUDE "config_private.inc"

SECTION code_clib
SECTION code_crt_common

PUBLIC asm_sms_memcpy_vram_to_vram

asm_sms_memcpy_vram_to_vram:

   ; memcpy within vram
   ;
   ; enter : hl = void *src in vram
   ;         de = void *dst in vram
   ;         bc = unsigned int n > 0
   ;
   ; exit  : hl = void *src, &byte after last read
   ;         de = void *dst, &byte after last written
   ;         bc = 0
   ;
   ; uses  : af, bc, de, hl, af'

loop:

   ; must yield opportunities for an interrupt to occur

   di

   ld a,l                        ; set vram read address = hl
   out (__IO_VDP_COMMAND),a
   ld a,h
   out (__IO_VDP_COMMAND),a

   in a,(__IO_VDP_DATA)          ; read the source byte
   ex af,af'                     ; keep it in a'

   ld a,e                        ; set vram write address = de
   out (__IO_VDP_COMMAND),a
   ld a,d
   or $40
   out (__IO_VDP_COMMAND),a

   ei

   ex af,af'
   out (__IO_VDP_DATA),a         ; write the byte to the destination

   inc de

   cpi                           ; hl++, bc--, p/v set while bc != 0
   jp pe, loop

   ret
```

I'm specifically looking at how close (in time) the data port access is to the command port writes. Is it OK to have a VDP data port access this close to the VDP command port accesses? |
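To put numbers on "how close", the spacing between port accesses in the loop can be tallied from the standard Z80 instruction timings. The Python sketch below does only that arithmetic; it does not emulate the VDP. The names `LOOP`, `MIN_GAP` and `port_gaps` are made up for this illustration, and the 26 T-state threshold is an assumption rather than a figure from this thread, so substitute whatever minimum your VDP documentation gives.

```python
# T-state costs of the instructions in the loop (standard Z80 timings).
# Each entry: (mnemonic, T-states, port accessed or None).
LOOP = [
    ("di",                        4, None),
    ("ld a,l",                    4, None),
    ("out (__IO_VDP_COMMAND),a", 11, "command"),
    ("ld a,h",                    4, None),
    ("out (__IO_VDP_COMMAND),a", 11, "command"),
    ("in a,(__IO_VDP_DATA)",     11, "data"),
    ("ex af,af'",                 4, None),
    ("ld a,e",                    4, None),
    ("out (__IO_VDP_COMMAND),a", 11, "command"),
    ("ld a,d",                    4, None),
    ("or $40",                    7, None),
    ("out (__IO_VDP_COMMAND),a", 11, "command"),
    ("ei",                        4, None),
    ("ex af,af'",                 4, None),
    ("out (__IO_VDP_DATA),a",    11, "data"),
    ("inc de",                    6, None),
    ("cpi",                      16, None),
    ("jp pe, loop",              10, None),
]

# ASSUMPTION: minimum spacing to compare against; not taken from the thread.
MIN_GAP = 26


def port_gaps(loop):
    """Return (from_index, to_index, gap) for each pair of consecutive port
    accesses, wrapping around the loop once. The gap is counted from the end
    of one port instruction to the end of the next; 'out (n),a' and
    'in a,(n)' both perform their I/O cycle last, so this matches the
    spacing of the I/O cycles themselves."""
    ports = [i for i, (_, _, port) in enumerate(loop) if port]
    gaps = []
    for k, start in enumerate(ports):
        end = ports[(k + 1) % len(ports)]
        i, gap = start, 0
        while True:
            i = (i + 1) % len(loop)
            gap += loop[i][1]
            if i == end:
                break
        gaps.append((start, end, gap))
    return gaps


def report(loop, min_gap=MIN_GAP):
    """Print every gap and mark the ones below the assumed minimum."""
    for start, end, gap in port_gaps(loop):
        flag = "  <-- below assumed minimum" if gap < min_gap else ""
        print(f"{loop[start][0]:26} -> {loop[end][0]:26} {gap:3} T{flag}")
    print("T-states per byte copied:", sum(t for _, t, _ in loop))


if __name__ == "__main__":
    report(LOOP)
```

By this count the tightest pair is the second read-address write followed immediately by the data port read, so that is the spot to check against real hardware.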
|
|
Posted: Sat Apr 15, 2017 6:53 am |
This one goes even faster: https://github.com/maxim-zhao/bmp2tilecompressors/blob/master/decompressors/aPLi... and I seem to remember it's fine on hardware. | |
|
Posted: Sat Apr 15, 2017 3:09 pm |
Ah, cheers for that. It also keeps the ldir semantics by updating de and hl.
That's also a newer version of aplib decompress than the one I have here. |
|
|
Posted: Sun Apr 16, 2017 8:23 am |
It's optimized for speed at the expense of code size. It's still pretty slow though. | |
|
Posted: Sun Apr 16, 2017 11:46 pm |
I updated my vram->vram copy to be more like yours.
There is a small trick there to do a 16-bit loop using djnz.
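For anyone unfamiliar with the djnz trick, one common form splits the 16-bit count in bc into an inner counter for djnz and an outer 8-bit counter. The sketch below models that form in Python so the iteration count can be checked; it is an illustration of the general technique, not necessarily the exact code in either routine, and `split_count` and `run_loop` are hypothetical names.

```python
def split_count(n):
    """Model of the register shuffle:
           ld a,c / dec bc / inc b / ld c,b / ld b,a
    Returns (b, c): b is the inner djnz counter (low byte of n),
    c is the outer counter. A value of 0 in either means 256 passes."""
    assert 1 <= n <= 0xFFFF
    a = n & 0xFF                      # ld a,c   low byte of the count
    bc = (n - 1) & 0xFFFF             # dec bc
    outer = ((bc >> 8) + 1) & 0xFF    # inc b
    return a, outer                   # ld c,b / ld b,a


def run_loop(n):
    """Model of:
           loop: <body> / djnz loop / dec c / jp nz, loop
    and return how many times the body executed."""
    b, c = split_count(n)
    iterations = 0
    while True:
        iterations += 1               # loop body (copy one byte)
        b = (b - 1) & 0xFF            # djnz loop
        if b != 0:
            continue
        c = (c - 1) & 0xFF            # dec c
        if c == 0:                    # jp nz, loop falls through
            break
    return iterations


if __name__ == "__main__":
    for n in (1, 255, 256, 257, 0x1234, 0xFFFF):
        print(n, run_loop(n))
```

The point of the shuffle is that the per-byte loop overhead becomes a single djnz, with the dec c / jp nz pair paid only once every 256 bytes.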
I've written a new decompressor to VRAM using zx7, which is small, relatively fast and optimal for LZ77: dzx7_standard_vram. I haven't tested it or compared its speed to aplib yet, but zx7 is the main compressor we use for everything. I suspect the vram->vram copy dominates speed anyway. Some more reading material. |
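As background on why the ldir semantic matters here: an LZ77-style decompressor emits matches by copying bytes it has already output, and when the match length exceeds the offset the source and destination overlap. Only a forward, byte-at-a-time copy reproduces the intended repeat. The Python sketch below shows this on an ordinary buffer; `copy_match` is a hypothetical helper, and a real decoder writing to VRAM would perform the same copy through the VDP ports.

```python
def copy_match(out, offset, length):
    """Append `length` bytes to `out`, reading from `offset` bytes back,
    one byte at a time in ascending order (the same order ldir uses).
    Bytes appended early in the copy may be read again later in it."""
    src = len(out) - offset
    for i in range(length):
        out.append(out[src + i])
    return out


if __name__ == "__main__":
    buf = bytearray(b"abc")
    copy_match(buf, offset=3, length=7)   # length > offset: overlapping copy
    print(buf.decode())
```

A block copy that reads the whole source first would fail here, because most of the source bytes do not exist until the copy itself has written them.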
|
|
Posted: Mon Apr 17, 2017 6:38 am |
I already made a BMP2Tile plugin for ZX7; I'm not sure why I haven't committed the decompressor yet. Its performance did not seem to compare well to aPLib. | |
|
Posted: Mon Apr 17, 2017 2:53 pm |
I ran some compressions on some stuff I had handy. Sizes are in bytes; the zx7 and appack outputs carry a .zx7 and .ap suffix respectively.

| Data set | File | raw | zx7 | appack |
|---|---|---|---|---|
| 8x8 1-bit fonts | font_8x8_bbc_system.bin | 768 | 428 | 426 |
| 8x8 1-bit fonts | font_8x8_clairsys.bin | 768 | 429 | 430 |
| 8x8 1-bit fonts | font_8x8_clairsys_bold.bin | 768 | 453 | 449 |
| 8x8 1-bit fonts | font_8x8_zx_system.bin | 768 | 453 | 440 |
| 4x8 1-bit fonts | font_4x8_default.bin | 768 | 360 | 357 |
| 1-bit music | bitm_journey.bin | 2,629 | 1,512 | 1,473 |
| 1-bit music | bitm_triceropop.bin | 1,513 | 906 | 885 |
| Semi-sampled sound effects from Mario Bros for AY-3-8910 | ay_effects.bin | 5,412 | 4,303 | 4,099 |
| MIDI music from Mario Bros for AY-3-8910 | midi_pb_title.bin | 510 | 281 | 274 |
| Pietro Bros graphics tiles (ZX Spectrum Nirvana game) | pietro.btile.bin | 10,368 | 3,829 | 3,781 |

Total: raw = 34,886 bytes; zx7 = 19,632 bytes (56.3% of raw); appack = 19,036 bytes (54.6% of raw).

I wish I had a more complete set handy to compare with, but I don't. zx7 can also prime its dictionary by using a prefix during decompression, which can lead to better compression ratios. The reasons we use it as the main compressor are its small size, the absence of any static RAM requirement and its ability to do overlapped decompression (important on RAM machines). |
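The percentages follow directly from the posted totals. A small Python sketch of the arithmetic, using only the three totals quoted above (`TOTALS` and `ratio_percent` are names chosen for this illustration):

```python
# Totals as posted in the thread, in bytes.
TOTALS = {"raw": 34886, "zx7": 19632, "appack": 19036}


def ratio_percent(compressed, raw):
    """Compressed size as a percentage of the raw size."""
    return 100.0 * compressed / raw


if __name__ == "__main__":
    for name in ("zx7", "appack"):
        pct = ratio_percent(TOTALS[name], TOTALS["raw"])
        print(f"{name}: {TOTALS[name]} bytes, {pct:.1f}% of raw")
    # Absolute difference between the two compressors on this data.
    print("appack saves", TOTALS["zx7"] - TOTALS["appack"], "bytes more than zx7")
```

On these totals the two compressors end up within a couple of percentage points of each other, which is why decompressor size and RAM use can reasonably decide the choice.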
|