Forums

Sorry if I missed something obvious, but I found almost no information on how games store/compress their level data and I was curious if someone has experience in regards to that topic.

I currently try to write an engine for a game, which includes horizontal + vertical scrolling. I made a little sample level, which is 3 screens wide and high (768x576 pixel or 96x72 characters) and if I want to save the complete tilemap of the level, I basically burn through a whole 16KB bank (it actually boils down to 13824 bytes). Now I am wondering what is the best way to compress the data without loosing a lot of computational resource (which I need for other stuff).

What I found on the internet:
- Sonic uses RLE compression (http://info.sonicretro.org/SCHG:Sonic_the_Hedgehog_(8-bit)#RLE_Compression).
- In addition they use block mappings for mapping one block index to a map of 4x4 tiles
- Based on the symbol table shared by Bock for WB3, it seems to use some sort of RLE as well (http://www.smspower.org/forums/14446-WonderBoyIIITechnicalQuestions). However looking at the maps of the game, they all seem very repetitive, so I am not sure if RLE is the only technique used here.

RLE seems like a rather simple and effective format. However the problem I have with it (or I simply can't think of an easier solution) is that the tile data isn't column aligned anymore. When I want to draw a new column (=horizontal scrolling) I have to iterate each line which can take a long time (for a wide level)...

Let's say a level is 10 screens long and 2 high (1 screen = 8x6 blocks) then I need to iterate over 7 block-lines (because the camera is not block aligned) and up to 80 block-columns (even though this might be the worst case scenario). That makes up to 560 iterations of whatever code has to be executed. Assuming one iterations needs ~50 cycles to compare blocks, increment counters, etc. it would take ~25k (~42% of a frame) just to find the right blocks (not decoding them and putting them into the screenmap of the VDP), which is simply to long for my needs.

So I was thinking, that maybe I would use _only_ the block mapping without RLE. This would be worse compression, but actually the worstcase (no repeating blocks) would have the same compression, it is easier to code and the most important advantage: The block-columns would be aligned. So for a horizontal scroll, you could basically calculate the interesting blocks. This would decrease "finding the correct blocks for a horizontal scroll" to probably <1% of a frame in the above example (again not including the actual decoding and writing to vram) .

Two other possibilities I considered were these:
- I could use RLE + block mappings, but align it every screen (=only compress within 8x6 block sections). This would be basically a compromise between data compression and computation, as you could calculate the screen the interesting blocks are in and only search within that screen. So the worstcase for the example above would be something like 7*8 iterations = 56 iterations -> 4,2% of a frame.
- Instead of encoding the level per line, one could encode it per column. This way the compressed data would be column aligned, but probably the compression would suck for complex backgrounds

- Do you know of other techniques?
- What is your experience in regard to the topic?
- Do you see any flaws with my thinking?

I think you should also consider how you will scroll trough the map. Say that you're always going right and you can't go left: you could store compressed 'slices' of a given number of columns. This way your big map can be compressed effectively and you'll be decompressing just a rather small portion of it into RAM/VRAM.
Actually the same is possible also if you need to go left, you just need to add an header before each 'slice' which tell you how much you need to go back to find the beginning of the previous slice.
For huge maps this might not be viable, as is doesn't break the map vertically, of course.

1. Metatiles are key. 16×16 pixels gives you 4x reduction in space, plus the metatiles can reuse tile anyway.
2. The metatile map is often decompressed to RAM, for example Sonic uses 4KB for it. One benefit of this is that you can modify it at runtime, too.
3. As you noticed, horizontal scrolling then requires traversing up to 13 metatiles (for 16×16), but you can optimise the lookups since it is such a bottleneck.

Psycho Fox does something a bit unusual, it has tall rectangle metatiles which optimise for horizontal scrolling (which is slower to write to the VDP) at the cost of more CPU work on vertical scrolling.

The metatile size for a shooter might be 32×32, Micro Machines uses 96×96 to get relatively huge maps.

RLE can be substituted for something stronger if you don't need to decompress on the fly.

I have a multiscroll engine with 16x16 tiles. Without compression, you obtain 1/8 reduction, as metatiles does not have mirroring flag :) i found it fast enough even doing 16 calculations per row and 14 per column. Although you cannot change tiles (does not make a copy) you can check collisions fast as it is not compressed. I think bigger (32x32) metatiles will produce simpler maps, or you will have more metatiles, which , having to save 16 references by metatile, lead to additional space needs. So i think you need a balance here.

Also, if you are thinking in a platformer, check if your tiles have to be in high part, as is easy to use up to 256 sprites with big (>16x16) animated characters.

With system explained the 768x576 stage is saved in 1.728 bytes :)

I'm still deciding between on-the-fly vs decompressed RAM tilemaps myself. On-the-fly opens up room for larger levels and perhaps even 2-bytes per block, with the second byte containing extra bits for the block ref to allow more than 256, plus the number of RLE runs it is repeated. There could also be room for flags (h/v flip block).

icy wrote

When I want to draw a new column (=horizontal scrolling) I have to iterate each line which can take a long time (for a wide level)...

I haven't actually tried this, but one option I've considered is having a runner in RAM for each visible row (affectionately dubbed 'row runners'), with a pointer to their current position in the compressed data and the number of runs their current block has left. When you request the next block in a row it decrements the runs remaining and returns the block, or if there are no runs left it increments the pointer and returns the next one. This would be repeated for each row but should be efficient enough for incremental reads.

For left scroll, the row runner would have to be able to also work its way backwards through runs. It would also need to know to first backtrack x number of blocks to get to the left of the screen and go backwards from there, then when the camera scrolls right again to go forward x blocks again. It would be more efficient if the runner allowed adds and subtracts rather than single inc/dec.

Vertical scrolling is a little more expensive. The row runner going off screen would need be detected and reinitialized to point to the first block of the newly visible row, then skip ahead x number of blocks to get to the relevant column. To cater for wide levels I wonder whether it would suffice for each row data to simply store a pointer to its mid-way point to provide a shortcut for the runner if the skip is beyond this.

Author	Message
icy Joined: 07 Jun 2017 Posts: 4	What Level/Map Compression Techniques are there? Posted: Thu Jun 15, 2017 11:03 am
icy Joined: 07 Jun 2017 Posts: 4	Sorry if I missed something obvious, but I found almost no information on how games store/compress their level data and I was curious if someone has experience in regards to that topic. I currently try to write an engine for a game, which includes horizontal + vertical scrolling. I made a little sample level, which is 3 screens wide and high (768x576 pixel or 96x72 characters) and if I want to save the complete tilemap of the level, I basically burn through a whole 16KB bank (it actually boils down to 13824 bytes). Now I am wondering what is the best way to compress the data without loosing a lot of computational resource (which I need for other stuff). What I found on the internet: - Sonic uses RLE compression (http://info.sonicretro.org/SCHG:Sonic_the_Hedgehog_(8-bit)#RLE_Compression). - In addition they use block mappings for mapping one block index to a map of 4x4 tiles - Based on the symbol table shared by Bock for WB3, it seems to use some sort of RLE as well (http://www.smspower.org/forums/14446-WonderBoyIIITechnicalQuestions). However looking at the maps of the game, they all seem very repetitive, so I am not sure if RLE is the only technique used here. RLE seems like a rather simple and effective format. However the problem I have with it (or I simply can't think of an easier solution) is that the tile data isn't column aligned anymore. When I want to draw a new column (=horizontal scrolling) I have to iterate each line which can take a long time (for a wide level)... Let's say a level is 10 screens long and 2 high (1 screen = 8x6 blocks) then I need to iterate over 7 block-lines (because the camera is not block aligned) and up to 80 block-columns (even though this might be the worst case scenario). That makes up to 560 iterations of whatever code has to be executed. Assuming one iterations needs ~50 cycles to compare blocks, increment counters, etc. it would take ~25k (~42% of a frame) just to find the right blocks (not decoding them and putting them into the screenmap of the VDP), which is simply to long for my needs. So I was thinking, that maybe I would use _only_ the block mapping without RLE. This would be worse compression, but actually the worstcase (no repeating blocks) would have the same compression, it is easier to code and the most important advantage: The block-columns would be aligned. So for a horizontal scroll, you could basically calculate the interesting blocks. This would decrease "finding the correct blocks for a horizontal scroll" to probably <1% of a frame in the above example (again not including the actual decoding and writing to vram) . Two other possibilities I considered were these: - I could use RLE + block mappings, but align it every screen (=only compress within 8x6 block sections). This would be basically a compromise between data compression and computation, as you could calculate the screen the interesting blocks are in and only search within that screen. So the worstcase for the example above would be something like 7*8 iterations = 56 iterations -> 4,2% of a frame. - Instead of encoding the level per line, one could encode it per column. This way the compressed data would be column aligned, but probably the compression would suck for complex backgrounds - Do you know of other techniques? - What is your experience in regard to the topic? - Do you see any flaws with my thinking?

sverx Joined: 05 Sep 2013 Posts: 3811 Location: Stockholm, Sweden	Posted: Thu Jun 15, 2017 12:00 pm
	I think you should also consider how you will scroll trough the map. Say that you're always going right and you can't go left: you could store compressed 'slices' of a given number of columns. This way your big map can be compressed effectively and you'll be decompressing just a rather small portion of it into RAM/VRAM. Actually the same is possible also if you need to go left, you just need to add an header before each 'slice' which tell you how much you need to go back to find the beginning of the previous slice. For huge maps this might not be viable, as is doesn't break the map vertically, of course.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14732 Location: London	Posted: Thu Jun 15, 2017 1:28 pm
	1. Metatiles are key. 16×16 pixels gives you 4x reduction in space, plus the metatiles can reuse tile anyway. 2. The metatile map is often decompressed to RAM, for example Sonic uses 4KB for it. One benefit of this is that you can modify it at runtime, too. 3. As you noticed, horizontal scrolling then requires traversing up to 13 metatiles (for 16×16), but you can optimise the lookups since it is such a bottleneck. Psycho Fox does something a bit unusual, it has tall rectangle metatiles which optimise for horizontal scrolling (which is slower to write to the VDP) at the cost of more CPU work on vertical scrolling. The metatile size for a shooter might be 32×32, Micro Machines uses 96×96 to get relatively huge maps. RLE can be substituted for something stronger if you don't need to decompress on the fly.

eruiz00 Joined: 28 Jan 2017 Posts: 556 Location: Málaga, Spain	Posted: Thu Jun 15, 2017 5:10 pm
	I have a multiscroll engine with 16x16 tiles. Without compression, you obtain 1/8 reduction, as metatiles does not have mirroring flag :) i found it fast enough even doing 16 calculations per row and 14 per column. Although you cannot change tiles (does not make a copy) you can check collisions fast as it is not compressed. I think bigger (32x32) metatiles will produce simpler maps, or you will have more metatiles, which , having to save 16 references by metatile, lead to additional space needs. So i think you need a balance here. Also, if you are thinking in a platformer, check if your tiles have to be in high part, as is easy to use up to 256 sprites with big (>16x16) animated characters. With system explained the 768x576 stage is saved in 1.728 bytes :)

eljay Joined: 17 Jun 2017 Posts: 19	On-the-fly vs RAM decompressed tilemaps Posted: Sat Jun 17, 2017 9:51 am
eljay Joined: 17 Jun 2017 Posts: 19	I'm still deciding between on-the-fly vs decompressed RAM tilemaps myself. On-the-fly opens up room for larger levels and perhaps even 2-bytes per block, with the second byte containing extra bits for the block ref to allow more than 256, plus the number of RLE runs it is repeated. There could also be room for flags (h/v flip block). icy wrote When I want to draw a new column (=horizontal scrolling) I have to iterate each line which can take a long time (for a wide level)... I haven't actually tried this, but one option I've considered is having a runner in RAM for each visible row (affectionately dubbed 'row runners'), with a pointer to their current position in the compressed data and the number of runs their current block has left. When you request the next block in a row it decrements the runs remaining and returns the block, or if there are no runs left it increments the pointer and returns the next one. This would be repeated for each row but should be efficient enough for incremental reads. For left scroll, the row runner would have to be able to also work its way backwards through runs. It would also need to know to first backtrack x number of blocks to get to the left of the screen and go backwards from there, then when the camera scrolls right again to go forward x blocks again. It would be more efficient if the runner allowed adds and subtracts rather than single inc/dec. Vertical scrolling is a little more expensive. The row runner going off screen would need be detected and reinitialized to point to the first block of the newly visible row, then skip ahead x number of blocks to get to the relevant column. To cater for wide levels I wonder whether it would suffice for each row data to simply store a pointer to its mid-way point to provide a shortcut for the runner if the skip is beyond this.

Forums

View topic - What Level/Map Compression Techniques are there?