 Joined: 16 May 2002
 Posts: 1247
 Location: italy

Theoretical programming question: Sudoku compression
Posted: Wed May 05, 2021 12:19 am

No, I don't plan to write a Sudoku game for the Master System, I wouldn't even know where to start. But it looks like this kind of programming question is a welcome exercise around here, and since I had this idea, I might as well brainstorm it with you.
So, imagine that you are indeed writing a Sudoku game for the Master System, and you want to include as many boards as possible within the ROM: what is the most efficient way to do that? Is it possible to obtain a good compression ratio without writing insanely complex code?
Let's explore our options. Someone might find it useful in the future.
Here is a Sudoku I created: 2 ¦ 8¦
¦ 3 ¦ 5
4¦ ¦8
++
5 ¦4 3¦
29 ¦ 5 ¦ 61
¦1 2¦ 7
++
1¦ ¦9
3 ¦ 7 ¦
¦6 ¦ 4
Option 1: uncompressed.
This will serve as a benchmark. The most intuitive way to store a Sudoku is to use a byte for each cell. It is wasteful, and it obviously needs 81 bytes (assuming that a blank cell can be stored as a 0).
This is probably how it can be represented in RAM while in game, with a separate secondary matrix to distinguish between readonly and writeable cells. Note that the writeability of the cells doesn't need to be saved within the data, because it's obvious that all the stored numbers will be readonly while solving that board.
Option 2: packed BCD.
Since the digits can only range between 0 and 9, one can store a digit per nybble. This isn't much harder than before, and it almost halves the needed space (40 bytes and a half doesn't really make sense, so it gets rounded to 41 bytes).
Option 3: powers of two.
Here is where things get interesting.
It is well known that every tenth power of two is near a third power of ten, e.g. 2^10 = 1024 ~= 1000 = 10^3 (this fact, amongst other things, gave birth to the stupid "kibibyte / mebibyte / etc" nonsense).
Therefore, one can imagine that you could store a row of Sudoku data as a 30bit integer, and a whole sudoku as an insane 270bit integer. Except that such a thing would be a hell to handle. One might make things a bit easier by storing each row as a 32bit integer, but it would still be quite painful. Space required: 34 bytes for the gigantic 270bit variant, and 36 bytes for the 9×32bit variant. Better than BCD, but at what computational cost?
Option 4: RLE of blank cells.
In my opinion, this is the best idea I could come up with. Since it's unlikely to have more than 15 blank cells between two "numbered" cells (imagine the Sudoku as a linear 81cells array), one might store two different informations in the two nybbles: the first 4 bits can encode the number of blank cells until the next number, and the second nybble can encode the actual number. A terminator byte (or rather, a terminator nybble) would be required, but that would be easy, as any "invalid" digit would do (e.g. $0F, since 15 isn't a valid Sudoku digit). Likewise, another "invalid" second nybble can be used in those rare cases with more than 15 blank cells between two numbers, à la variablelenght data used in MIDI files. For example, $FE $5N might mean that there are $F+$5 = 20 (decimal) blank cells before digit N.
The huge drawback of this method is that you can't predict the space needed for your data (one byte per number + one terminator + occasional "more than 15" markers), meaning that seeking towards later boards would be hard.
For reference, the encoded data for the Sudoku above would be:12 38 73 35 24 38 25 24 13 32 09 25 26 01 31 12 27 21 39 23 37 76 34 1F 24 bytes, best of all methods, and farily easy to decode, too, but with an annoying seeking problem. Perhaps a 1byte "header" with the length can be used instead of the terminator byte, but one would still need to jump from header to header and read them all to reach later boards. Unless you make it mandatory to solve a specific board at a time and keep a 16bit (or even 24bit) bookmark somewhere in RAM (or SRAM) with the location of the next board in ROM, and only that.
Option 5: you tell me!
So, what would you other programmers do to solve this task? As I said, I don't plan to make a Sudoku game, but someone else might want to in the future, so this research could still be useful.
Yes I'm bored.

 Joined: 05 Dec 2019
 Posts: 37

Posted: Wed May 05, 2021 3:06 am

A 9x9cell bitmap where givens are 1 and empty cells are 0 takes 81 bits. If known in advance to be symmetric, it takes only 41 bits. Thus each puzzle could begin with a 6 or 11byte header, where the first byte tells whether the puzzle is symmetric and whether the center cell is a given.
Each given is a symbol with 9 possibilities. A sequence of 5 such symbols thus has 59,049 possibilities, which fit in a 16bit word. To extract them, divide by 9 several times and use the remainders.
This puzzle is 180 degree rotationally symmetric and has 23 givens. This gives 6 bytes for the blank cell bitmap and 10 for five groups of five givens.
Total: 16 bytes

 Joined: 16 May 2002
 Posts: 1247
 Location: italy

Posted: Wed May 05, 2021 4:21 am

Ha, that's a very clever solution, awesome!
Just like my fourth idea, though, it makes seeking for a specific puzzle harder because of the unpredictability of the data. I wonder if one might want to waste a couple of bytes and always store e.g. six groups of givens since your data is packed in such an efficient way anyway, it would be a tradeoff to allow for more seeking flexibility.
You also have a point about symmetry, yeah, if I had a project like this, I'd arbitrarily restrict the archive to symmetric puzzles because I like them more, and, as above, it would help with the predictability of the data length.

 Site Admin
 Joined: 19 Oct 1999
 Posts: 13777
 Location: London

Posted: Wed May 05, 2021 6:35 am

A game I’m working with recently stores its script as lengthprefixed data, costing one extra byte per entry. Seeking is still O(n) so there is also a small table storing the address of every 256th item. Overall, seek time is on the order of a few frames, not really noticeable.
Using multiplication and division is slow on Z80, but again not really a concern as puzzle loading time isn’t likely to be an issue if it’s faster than a second or so.
I’m not a Sudoku expert, but I imagine that more difficult puzzles tend to be very sparse, so optimising for that case may be a good idea  so encoding the “gap lengths”, perhaps as a run of nybbles, and the multiplicationpacked values as above, might work well there; but for simpler puzzles, it may be more efficient to instead store the populated squares as a bitmap.
The ultimate compression might be to implement a puzzle generation algorithm. You might then algorithmically generate every puzzle as a function of its “index”, with that serving as both a random seed and a difficulty level, at a cost of 0 bytes per puzzle  but probably a lot of time to generate.

 Joined: 26 Feb 2021
 Posts: 39

Posted: Mon May 10, 2021 12:56 pm

There are so many ways to skin a cat.
In your example you could store the nonempty positions in 10 bytes + 1 bit:
010 001 000
000 010 001
001 000 100
100 101 000
110 010 011
000 101 001
001 000 100
100 010 000
000 100 010
Then you store on 4 bits each nonempty value starting arbitrarily at the top left and reading from left to right, top to bottom:
2 = 0010, 8 = 1000, etc.
In your example 12,5 bytes are needed, considering your have 7 spare bits from the previous table, that whole grid would fit in (10 + 1 + 12) 23 bytes.
That wouldn't need a complicated algorithm.
