Sega Master System / Mark III / Game Gear
SG-1000 / SC-3000 / SF-7000 / OMV

View topic - Graphics corpus for compression research

  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Graphics corpus for compression research
Post Posted: Wed Feb 05, 2020 2:09 am
Say I want to test theories about graphics compression, such as adapting UFTC or the like to SMS. Is there a good corpus of 4bpp sprite sheets to test with?
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14740
  • Location: London
Post Posted: Wed Feb 05, 2020 7:37 am
I guess tile sets are more appropriate than sprite sheets in the strict sense. We have a full dump of the assets from Alex Kidd in Miracle World here: https://www.smspower.org/Development/AlexKiddInMiracleWorld-SMS and I have a partial equivalent for Phantasy Star as part of the retranslation project, only covering the backgrounds.

For my experiments I was somewhat hoping to make a testing framework that would include cycle counting the decompressor, confirming the correctness of the output, and indeed having a corpus of images; none of these targets has been achieved yet.
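A minimal sketch of the "confirming the correctness of the output" part of such a framework. It does not count Z80 cycles, and the byte-pair RLE codec is only a stand-in so the harness has something to run; `check_corpus`, `rle_encode`, and `rle_decode` are hypothetical names, not part of any existing tool.

```python
def rle_encode(data: bytes) -> bytes:
    """Stand-in codec: emit (run length, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Inverse of rle_encode: expand each (run length, value) pair."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        out += bytes([packed[i + 1]]) * packed[i]
    return bytes(out)


def check_corpus(corpus, encode, decode):
    """Round-trip every corpus entry and return {name: packed size / raw size}.

    corpus maps a name to raw tile bytes. Raises AssertionError if any
    entry does not decode back to its original bytes.
    """
    ratios = {}
    for name, raw in corpus.items():
        packed = encode(raw)
        if decode(packed) != raw:
            raise AssertionError(f"round trip failed for {name}")
        ratios[name] = len(packed) / len(raw) if raw else 1.0
    return ratios


if __name__ == "__main__":
    # Two synthetic 32-byte "tiles": one blank, one with no repeated bytes.
    demo = {"flat": bytes(32), "noise": bytes(range(32))}
    print(check_corpus(demo, rle_encode, rle_decode))
```

Any other encoder/decoder pair with the same bytes-in, bytes-out shape could be passed to `check_corpus` in place of the RLE stand-in.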
  • Joined: 29 Mar 2012
  • Posts: 886
  • Location: Spain
Post Posted: Wed Feb 05, 2020 8:45 am
It's not very big, but this one may be useful:
https://github.com/kusfo/mastersystembrawler/blob/master/gfx-source/player_spritesheet.png
  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Post Posted: Wed Feb 05, 2020 6:33 pm
Had I said "tile sets" then people might have assumed I was talking about backgrounds and only backgrounds. I acknowledge that it would be valuable to include background tile sets in a corpus for several reasons:

  1. Lack of sprite tile flipping on SMS/GG encourages streaming tiles into VRAM instead of decompressing them and parking them there the way you'd do on NES and GB. You'd have to decompress a cel to work RAM and then copy or flip-copy it to VRAM.
  2. Background tiles may have different statistics: less detail in general and less area of flat color 0.
  3. The last three NES projects I've worked on (Haunted: Halloween '85, Haunted: Halloween '86, and the forthcoming Full Quiet) have had (I'd estimate) five to ten times more background tile data than sprite tile data.
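Points 1 and 2 can be sketched in a few lines of Python, as a rough illustration rather than a finished tool. The sketch assumes the usual SMS Mode 4 tile layout (32 bytes per tile, 8 rows of 4 bitplane bytes, bit 7 as the leftmost pixel); the function names are made up for this example. A horizontal flip-copy amounts to reversing the bits of each plane byte, and the share of color 0 is one simple statistic for comparing sprite and background tile sets.

```python
# Sketch only. Assumed layout: SMS Mode 4 tiles, 32 bytes each,
# 8 rows x 4 bitplane bytes per row, bit 7 = leftmost pixel.
TILE_BYTES = 32

# Lookup table mapping every byte value to its bit-reversed value.
BIT_REVERSE = bytes(int(f"{b:08b}"[::-1], 2) for b in range(256))


def hflip_tile(tile: bytes) -> bytes:
    """Mirror a tile left-to-right by bit-reversing every plane byte."""
    if len(tile) != TILE_BYTES:
        raise ValueError("expected a 32-byte tile")
    return tile.translate(BIT_REVERSE)


def vflip_tile(tile: bytes) -> bytes:
    """Mirror a tile top-to-bottom by reversing the order of its 4-byte rows."""
    if len(tile) != TILE_BYTES:
        raise ValueError("expected a 32-byte tile")
    rows = [tile[i:i + 4] for i in range(0, TILE_BYTES, 4)]
    return b"".join(reversed(rows))


def tile_pixels(tile: bytes) -> list:
    """Decode one tile into an 8x8 grid of color indices (0..15)."""
    grid = []
    for y in range(8):
        planes = tile[y * 4:y * 4 + 4]
        row = []
        for x in range(8):
            shift = 7 - x  # bit 7 holds the leftmost pixel
            row.append(sum(((planes[p] >> shift) & 1) << p for p in range(4)))
        grid.append(row)
    return grid


def color0_fraction(data: bytes) -> float:
    """Fraction of pixels that are color 0 across all whole tiles in data."""
    n_tiles = len(data) // TILE_BYTES
    if n_tiles == 0:
        return 0.0
    zero = 0
    for t in range(n_tiles):
        tile = data[t * TILE_BYTES:(t + 1) * TILE_BYTES]
        zero += sum(row.count(0) for row in tile_pixels(tile))
    return zero / (n_tiles * 64)
```

On real hardware the flip would be done by the Z80 during the copy to VRAM, typically with a 256-byte lookup table like `BIT_REVERSE`; the Python version is only meant for preparing and measuring a corpus offline.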


One reservation that I've had about building a corpus out of assets from proprietary games, such as Alex Kidd or Sonic Chaos, is that posting their tile sets publicly is copyright infringement. Should Sega come under new management that becomes as protective of its copyrights as Disney and [a video game company in Redmond that isn't Microsoft] have been, I don't want this to cost me my GitHub account for being a "repeat infringer." So I'd prefer SMS, MD, SNES, or GBA games whose 4bpp assets are already lawfully released under a license allowing verbatim distribution and excerpting.

I'll start a new topic about unit testing.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14740
  • Location: London
Post Posted: Wed Feb 05, 2020 7:05 pm
That does make it much harder, because it may be difficult to produce a representative corpus. The Alex Kidd set shows a wide range of tile counts per chunk; although that matters little for RLE, the smaller chunks do give LZ less to work with. Homebrew may have a different distribution depending on the preferences of the developer.

Some games do use compressed assets for sprites, as the majority do not need to be streamed. But I'd estimate that the majority of "advanced" games do stream tiles for animation.

My previous tests have often focused on title screens and large tile sets, and found LZ to do well.

You could provide the corpus in the form of decompressors and lists of data offsets which can generate the corpus from the entirely legally obtained ROM images, leaving the act of infringement up to the user.
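As a rough sketch of that idea, a corpus "recipe" could be a ROM checksum plus a list of offsets, with the decoders supplied separately. Everything named here (`extract_chunks`, the chunk tuple layout, the `"raw"` codec) is an assumption for illustration; real games would need game-specific decompressors, and real offsets are not shown.

```python
import hashlib


def extract_chunks(rom: bytes, expected_sha256: str, chunks, decoders):
    """Rebuild corpus entries from a ROM image the user supplies.

    chunks is a list of (name, offset, length, codec) tuples and decoders
    maps each codec name to a bytes -> bytes function. The ROM is checked
    against expected_sha256 first so the offsets are only applied to the
    exact dump they were written for.
    """
    actual = hashlib.sha256(rom).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError("ROM checksum does not match the recipe")
    out = {}
    for name, offset, length, codec in chunks:
        if offset < 0 or length < 0 or offset + length > len(rom):
            raise ValueError(f"chunk {name!r} lies outside the ROM")
        out[name] = decoders[codec](rom[offset:offset + length])
    return out


# Identity decoder for data stored uncompressed in the ROM.
DECODERS = {"raw": bytes}

if __name__ == "__main__":
    # Placeholder "ROM" so the sketch runs without any game data.
    fake_rom = bytes(range(256))
    recipe = [("example_tiles", 16, 32, "raw")]  # hypothetical offsets
    digest = hashlib.sha256(fake_rom).hexdigest()
    corpus = extract_chunks(fake_rom, digest, recipe, DECODERS)
    print({name: len(data) for name, data in corpus.items()})
```

Only the recipe and the decoders would be distributed; the ROM bytes never leave the user's machine.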
  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Post Posted: Wed Feb 05, 2020 10:33 pm
Finding tile sets on OpenGameArt that appear representative of SMS graphics is one approach we could try. Ketsuban in the gbdev Discord server suggested "Tuxemon", a background tile set for RPG exteriors, posted by Buch to OpenGameArt under the CC BY-SA 3.0 license. It looks like SNES/GBA-class detail.