View topic - Ms. Pac-Man (SMS) graphical overhaul hack mini-project

  • Joined: 12 Jul 2022
  • Posts: 2
  • Location: LINUX, MA
Ms. Pac-Man (SMS) graphical overhaul hack mini-project
Post Posted: Tue Jul 12, 2022 10:09 pm
Ahoy, I'm working on a graphical hack for Ms. Pac-Man on the SMS. Everyone in the Pac-Man fan community agrees that the visuals are incredibly cursed and I would like to remedy that! So far, I have replaced the standard sprites of Ms. Pac-Man and Pac-Man, the ghosts, the score bonus visuals, and the regular fruits. I still need to work on the Pac-Booster variants of Ms. Pac-Man and Pac-Man, as well as the extended fruits.

However, I've got a problem. Here's a screenshot of how things look in-game:

Notice how the UI graphics are unchanged? This is because they are stored compressed in the ROM... and I have no clue where to start looking for them so I can decompress them, rework them, then recompress them. I would also love to be able to change the weird offsetting of sprite animation slices so I can actually use the entire 16x16 pixel area without things getting off-centered and jagged (it's why the ghosts are smaller!), but I don't think that can be fixed without disassembling and modifying the game code. My main goal here, though, is finding the addresses in the ROM where the compressed graphics are stored, getting them decompressed, then getting my new graphics recompressed and inserted. There's plenty of free space in the ROM: duplicated sprite graphics were used as padding at the end, so if the new compressed data is larger than the original I can clear those out. Then it's just a matter of finding where the pointers are in the ROM to edit them.

Any and all information is deeply appreciated! :) Thank you!
mspacsms_earlyscreen.png (5.63 KB)

  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14686
  • Location: London
Post Posted: Tue Jul 12, 2022 10:17 pm
Emulicious can help you find where the data came from but the compression is probably proprietary and we don’t have many tools to cover European games. Either way, you’re going to need to disassemble, and write a compressor.
  • Joined: 12 Jul 2022
  • Posts: 2
  • Location: LINUX, MA
Post Posted: Wed Jul 13, 2022 12:05 am
Maxim wrote
Emulicious can help you find where the data came from but the compression is probably proprietary and we don’t have many tools to cover European games. Either way, you’re going to need to disassemble, and write a compressor.
After a lot of pain from OpenJDK being a jerkface, I got Emulicious going and found where the compressed data is located, as well as the routine in the ROM for decompressing it. For the record, the routine starts at address 01:7749 (or 01:76E3?) and the level tiles/UI are at 07:9AA6.
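To turn bank:address pairs like those into flat offsets for a hex editor, something like the following should work. This is a minimal sketch that assumes the game uses the standard Sega mapper with 16 KB banks; `rom_offset` and `BANK_SIZE` are names made up for this example.

```python
BANK_SIZE = 0x4000  # 16 KB per bank (assumption: standard Sega mapper)


def rom_offset(bank: int, address: int) -> int:
    """Convert a bank:address pair to a flat offset into the ROM file."""
    # Only the low 14 bits of the Z80 address select a byte within the bank.
    return bank * BANK_SIZE + (address % BANK_SIZE)


print(hex(rom_offset(0x01, 0x7749)))  # decompression routine
print(hex(rom_offset(0x07, 0x9AA6)))  # level tiles/UI
```

Under that assumption the routine sits at file offset $7749 and the level tiles/UI at $1DAA6.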
Shame my assembly skills are rooted in m68k... :|
  • Joined: 05 Sep 2013
  • Posts: 3759
  • Location: Stockholm, Sweden
Post Posted: Wed Jul 13, 2022 9:55 am
A better approach could be to remove their decompressor and compressed data and replace them with a different decompressor (one that has source available) and data compressed for it... the PSGaiden compressor, for instance.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14686
  • Location: London
Post Posted: Wed Jul 20, 2022 8:10 pm
The decompressor seems to start at $76e3, I'm working on a disassembly now...

Edit: $76e3 is an entry point that looks art up in a table at $14000, with 8 bytes per asset and big-endian words, which is super weird. There's almost identical code later that loads art from the same table, but maybe in a different situation. It seems to support four different decompression functions, but the table only uses modes 1 and 3. More to come...
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14686
  • Location: London
Post Posted: Thu Jul 21, 2022 8:31 am
Last edited by Maxim on Thu Jul 21, 2022 1:14 pm; edited 1 time in total
For each asset, the header contains 8 bytes:

0, 1: mapped address, big-endian
2, 3: length in bytes for first part, big-endian
4, 5: length in bytes for second part, big-endian. Usually 0
6: page number of first part
7: compression method

The first and second parts allow an asset to be split across a bank (page) boundary. The code decompresses the first number of bytes from the address and page number given, then the second number of bytes from address $8000 in the next page.
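As a rough illustration of the header layout, here is a sketch of reading one table entry in Python. The names `AssetHeader`, `parse_header` and `source_ranges` are invented for this example, and the flat-offset calculation assumes 16 KB pages as in the standard Sega mapper.

```python
import struct
from typing import List, NamedTuple, Tuple


class AssetHeader(NamedTuple):
    address: int        # bytes 0, 1: mapped address, big-endian
    first_length: int   # bytes 2, 3: length of first part, big-endian
    second_length: int  # bytes 4, 5: length of second part, usually 0
    page: int           # byte 6: page number of first part
    mode: int           # byte 7: compression method


def parse_header(table: bytes, index: int) -> AssetHeader:
    """Read the 8-byte entry number `index` from the asset table."""
    return AssetHeader(*struct.unpack_from(">HHHBB", table, index * 8))


def source_ranges(h: AssetHeader) -> List[Tuple[int, int]]:
    """Return (flat ROM offset, length) for each part (assumes 16 KB pages)."""
    ranges = [(h.page * 0x4000 + (h.address & 0x3FFF), h.first_length)]
    if h.second_length:
        # The second part starts at mapped address $8000 in the next page,
        # i.e. at the very start of that page in the ROM file.
        ranges.append(((h.page + 1) * 0x4000, h.second_length))
    return ranges


# Made-up entry for illustration only, not taken from the ROM.
entry = bytes([0xBF, 0x00, 0x01, 0x00, 0x00, 0x20, 0x05, 0x01])
header = parse_header(entry, 0)
print(header, [(hex(o), n) for o, n in source_ranges(header)])
```

For the real table you would pass the bytes read from ROM offset $14000 onwards.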

Mode 0 works as follows:

* The first byte tells you what the RLE marker is. This may be any value.
* Subsequent data is either:
  * the RLE marker, in which case it is followed by a 1-byte value and a 1-byte counter. Emit the value the given number of times.
  * or some other value, in which case it is raw data.
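The mode 0 rules above can be sketched as a few lines of Python. This is not a transcription of the game's routine: `decompress_mode0` is a made-up name, the input is assumed to be exactly the compressed bytes of one asset, and a counter of 0 is assumed to emit nothing (the real Z80 loop may behave differently).

```python
def decompress_mode0(data: bytes) -> bytes:
    """Decode marker-based RLE: [marker] then raw bytes or marker,value,count."""
    if not data:
        return b""
    marker = data[0]  # first byte defines the RLE marker for this asset
    out = bytearray()
    i = 1
    while i < len(data):
        byte = data[i]
        if byte == marker:
            if i + 2 >= len(data):
                raise ValueError("truncated RLE sequence")
            value, count = data[i + 1], data[i + 2]
            out.extend([value] * count)  # assumption: count 0 emits nothing
            i += 3
        else:
            out.append(byte)  # anything else is raw data
            i += 1
    return bytes(out)


print(decompress_mode0(bytes([0xFF, 1, 2, 0xFF, 7, 3, 4])).hex())  # 010207070704
```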

Mode 1 is the same as mode 0, but the data is emitted to a 32-byte buffer. Once the buffer is full, its contents are emitted to the VDP, interleaved by 8. This means it emits bytes 0, 8, 16, 24, 1, 9, 17, 25, 2, 10, 18, ... which is almost, but not quite, the right sort of interleaving for tile data to improve RLE performance.
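Reading "interleaved by 8" as a stride-8 walk over the buffer, the reordering and its inverse (which a compressor would apply to each 32-byte tile before RLE) might look like this. Both function names are made up for the sketch.

```python
def buffer_to_vdp(buffer: bytes) -> bytes:
    """Reorder a full 32-byte buffer into the order it is sent to the VDP."""
    if len(buffer) != 32:
        raise ValueError("mode 1 works on 32-byte buffers")
    # Emit buffer[0], [8], [16], [24], then [1], [9], [17], [25], and so on.
    return bytes(buffer[start + step]
                 for start in range(8)
                 for step in range(0, 32, 8))


def vdp_to_buffer(tile: bytes) -> bytes:
    """Inverse reordering: arrange VDP-order tile bytes as the buffer expects."""
    if len(tile) != 32:
        raise ValueError("a tile is 32 bytes")
    buffer = bytearray(32)
    for k, byte in enumerate(tile):
        buffer[(k % 4) * 8 + k // 4] = byte
    return bytes(buffer)


print(list(buffer_to_vdp(bytes(range(32))))[:12])
# [0, 8, 16, 24, 1, 9, 17, 25, 2, 10, 18, 26]
```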

Mode 2 is much simpler: it is just raw VDP data emitted directly.

Mode 3 is like mode 1 except that instead of a 32-byte buffer, it has a 3-byte buffer; and when it emits data, it does byte 0, low nibble of byte 2, byte 1, high nibble of byte 2. This is a way to pack two 12-bit values into three bytes; it presumably uses this for tilemaps, which are 13 bits per entry.
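The mode 3 nibble packing can be sketched as a pair of helpers, one for each direction. The names `unpack_mode3` and `pack_mode3` are made up, and packing simply rejects entries whose high byte does not fit in a nibble.

```python
def unpack_mode3(group: bytes) -> bytes:
    """Expand a 3-byte group into the 4 bytes emitted to the VDP."""
    b0, b1, b2 = group
    # byte 0, low nibble of byte 2, byte 1, high nibble of byte 2
    return bytes([b0, b2 & 0x0F, b1, b2 >> 4])


def pack_mode3(entries: bytes) -> bytes:
    """Pack two tilemap entries (4 bytes, low byte first) into 3 bytes."""
    low0, high0, low1, high1 = entries
    if high0 > 0x0F or high1 > 0x0F:
        raise ValueError("only 12-bit entries can be packed")
    return bytes([low0, low1, high0 | (high1 << 4)])


print(unpack_mode3(bytes([0x34, 0x78, 0x51])).hex())  # 34017805
```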

As mentioned above, the data seems to only use modes 1 (for tiles) and 3 (for tilemaps).

It should be pretty simple to make a compressor for bmp2tile to support these, but the header would need to be corrected manually. The use of a data-defined RLE marker is interesting, I guess you would pick the least used byte. Any appearance of that byte in the data would have to be “escaped” as an RLE sequence of length 1. Therefore you could also use any byte which always appears in a sequence of 3 or more to maximise compression. Consider that bmp2tile has an “exe” plugin that allows you to invoke things like a Python script or any other language you might prefer to use.
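A compressor along those lines could be sketched like this, for example as a script called from the "exe" plugin. It picks the least used byte as the marker and escapes any literal occurrence of it as an RLE sequence. `compress_mode0` is a made-up name, runs are capped at 255 on the assumption that the 1-byte counter holds the literal count, and for mode 1 or mode 3 the data would need reordering or packing before this step.

```python
from collections import Counter


def compress_mode0(raw: bytes) -> bytes:
    """Encode data as [marker] + raw bytes / (marker, value, count) runs."""
    counts = Counter(raw)
    # Least used byte value; ties go to the lowest value.
    marker = min(range(256), key=lambda b: counts[b])
    out = bytearray([marker])
    i = 0
    while i < len(raw):
        value = raw[i]
        run = 1
        while i + run < len(raw) and raw[i + run] == value and run < 255:
            run += 1
        if value == marker or run > 3:
            # Marker bytes must always be escaped; other runs only pay off
            # once they are longer than the 3-byte RLE sequence.
            out += bytes([marker, value, run])
        else:
            out += bytes([value]) * run
        i += run
    return bytes(out)


print(compress_mode0(bytes([5, 5, 5, 5, 5, 1, 2])).hex())  # 000005050102
```

The length fields in the asset header would still need updating by hand to match the new compressed size.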
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14686
  • Location: London
Post Posted: Thu Jul 21, 2022 12:57 pm
Oh, and:

Quote
duplicated sprite graphics were used as padding at the end

Very many games from this era were built in such a way that the unused areas of the ROM are not cleared to a "blank" value, instead they contain either arbitrary contents of RAM from the computer building the ROM (e.g. sometimes we see some of the source code from the assembler's RAM) or older versions of the ROM, or even unrelated ROMs (where the ROM is built by writing only the used bytes over some image in a ROM emulator). This can make it hard to reason about what's used and what isn't.

Ms. Pac-Man keeps its compressed graphics data from $14000 to $1dabf; the area beyond that seems unused (as you say, it seems to have copies of sprite art embedded), so there's plenty of spare space, especially if you can be bothered to re-pack it.