Look-up tables and code for dialogue strings.

  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Mon Aug 21, 2023 5:19 pm
The other day I was thinking about how to write code that would take data formatted as an alternating pattern of VDP address and tile index and use it to write dialogue to the screen.

.dw $7804,$0034,$7806,$0027,...
Example data block: each VRAM tilemap address is pre-ORed with the VDP write command bits to reduce processing, and is followed by the tile index for a letter or punctuation mark.


I believe my idea to be sound on paper, but I need some way to pull up the appropriate lines of dialogue, and my thought was to use a look-up table. The problem is I'm not exactly sure how I would write one based on my current idea.

Let's say that I have a short string of dialogue that says "Nothing here." and I set that dialogue's index to $00 out of $2F strings of dialogue. How would I take that index and use it in a lookup table to point to the correct data block?
  • Joined: 23 Aug 2009
  • Posts: 213
  • Location: Seattle, WA
Posted: Mon Aug 21, 2023 5:54 pm
You will want a table that maps dialogue ID to the location of the corresponding string. Or if you are embedding more detail than just the text, the table maps a dialogue ID to the struct that has all of the relevant info for the string. I can't remember if you're using C or WLA DX but each has its own peculiarities for defining structs. Your function should be something analogous to get_dialogue_struct_by_ID().

I'm honestly not sure you want to go through the hoops of pre-cooked VRAM locs, but if you do, I'd suggest a macro that takes ROW, COL and calculates it for you.
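
A minimal sketch of such a macro in WLA-DX, assuming the tilemap sits at the default $3800 and that addresses are stored with the $4000 write bit already set (which matches the $78xx values in the first post). The macro name, the define and the argument order are made up for illustration.

```asm
; Assumption: tilemap at $3800, OR-ed with the $4000 "VRAM write" command bit.
.define TILEMAP_WRITE_BASE $7800

; Hypothetical macro: \1 = column (0-31), \2 = row (0-27), \3 = tile index.
; Emits one "address, tile" pair in the same format as the example data.
.macro TilemapEntry
    .dw TILEMAP_WRITE_BASE + (\2 * 64) + (\1 * 2), \3
.endm

; Column 5, row 1, tile $0014 works out to: .dw $784A, $0014
    TilemapEntry 5, 1, $0014
```

Each row is 64 bytes because a tilemap row holds 32 entries of 2 bytes each.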
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Mon Aug 21, 2023 6:09 pm
SavagePencil wrote
You will want a table that maps dialogue ID to the location of the corresponding string. Or if you are embedding more detail than just the text, the table maps a dialogue ID to the struct that has all of the relevant info for the string. I can't remember if you're using C or WLA DX but each has its own peculiarities for defining structs. Your function should be something analogous to get_dialogue_struct_by_ID().

I'm honestly not sure you want to go through the hoops of pre-cooked VRAM locs, but if you do, I'd suggest a macro that takes ROW, COL and calculates it for you.


I am open to more efficient ideas, I just wasn't sure what would work for this purpose. Also, I am using WLA-DX.

So from what I can tell, if I have a data block for each dialogue string, named something like "DialogueString_Nothing:" and "DialogueString_Well:", my table should be something like
lut_dialoguestrings:
.dw DialogueString_Nothing, DialogueString_Well, ...
in order of their index, correct?
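
For what it's worth, a sketch of how an index could be turned into a pointer with a table like that. The routine name and register choices are assumptions, not anything taken from the original game, and the two string labels are assumed to be defined elsewhere.

```asm
lut_dialoguestrings:
.dw DialogueString_Nothing, DialogueString_Well   ; one 16-bit pointer per string, in index order

; Hypothetical routine.
; in:  A  = dialogue index ($00, $01, ...)
; out: HL = address of that string's data block
; clobbers: A, DE
GetDialoguePointer:
    ld l,a
    ld h,0
    add hl,hl                   ; index * 2, since each table entry is 2 bytes
    ld de,lut_dialoguestrings
    add hl,de                   ; HL = address of the table entry
    ld a,(hl)                   ; low byte of the pointer
    inc hl
    ld h,(hl)                   ; high byte of the pointer
    ld l,a
    ret
```

With index $00 this returns DialogueString_Nothing, and with $01 it returns DialogueString_Well.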
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Mon Aug 21, 2023 6:57 pm
That'd do it. Using dialogue IDs is a bit cumbersome, though; why not use labels like "DialogueString_Well" at the point you need them, instead of looking them up in a table? As you're likely to have more than 256 strings in total, 16-bit IDs end up using as much space as 16-bit pointers.
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Mon Aug 21, 2023 7:37 pm
Maxim wrote
That'd do it. Using dialogue IDs is a bit cumbersome, though; why not use labels like "DialogueString_Well" at the point you need them, instead of looking them up in a table? As you're likely to have more than 256 strings in total, 16-bit IDs end up using as much space as 16-bit pointers.


That might work, but I'm not too sure. There are many different NPCs in the game that the player can talk to, and the game needs to see who the player is talking to and correctly match up the dialogue to the NPC.

Looking at the original game code, there is a "TalkToObject" routine that takes the NPC/object's index and spits out the ID for their dialogue, which is then fed to the "DrawDialogueBox" and "DrawDialogueString" subroutines. Interesting to note that the game actually does convert XY coords to the NES PPU name table address, so it might be possible I could mimic that somewhat.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Posted: Tue Aug 22, 2023 2:06 pm
Sure you can store XY pairs or even the address in PNT in two bytes, but why would you do that? I think you surely need to store the strings but you will likely have to calculate the addresses for the output at run time anyway...
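
Calculating the address at run time is cheap. A sketch, assuming the tilemap is at the default $3800 and that the result should already carry the $4000 write bit (so the base is $7800); the routine name and the register assignments are made up.

```asm
; Hypothetical routine.
; in:  B = row (0-27), C = column (0-31)
; out: HL = $7800 + row*64 + column*2, ready to be sent to the VDP control port
; clobbers: A, DE
TilemapAddressFromRowCol:
    ld h,0
    ld l,b
    add hl,hl           ; row * 2
    add hl,hl           ; row * 4
    add hl,hl           ; row * 8
    add hl,hl           ; row * 16
    add hl,hl           ; row * 32
    add hl,hl           ; row * 64 (one row = 32 entries * 2 bytes)
    ld a,c
    add a,a             ; column * 2 (at most 62, so it fits in one byte)
    ld e,a
    ld d,$78            ; high byte of the $7800 base
    add hl,de           ; HL = row*64 + $7800 + column*2
    ret
```

For row 1, column 5 this gives $0040 + $780A = $784A.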
  • Joined: 23 Aug 2009
  • Posts: 213
  • Location: Seattle, WA
Posted: Tue Aug 22, 2023 4:26 pm
Without a lookup table per se, but going on what Maxim said, you can do something like this:


; Our structure
.STRUCT sDialogueEntry
    Column DB
    Row DB
    pString DW
    StringLen DB
.ENDST

.DSTRUCT Dialogue_NPC_Steve_WelcomeToTown INSTANCEOF sDialogueEntry VALUES:
    Column: .DB 4
    Row: .DB 17
    pString: .DW String_WelcomeToTown
    StringLen: .DB len_String_WelcomeToTown
.ENDST

GetDialogueFromNPC:
    ld hl, Dialogue_NPC_Steve_WelcomeToTown
    ret
  • Joined: 25 Feb 2023
  • Posts: 99
Probably not the best way, but it works...
Posted: Fri Aug 25, 2023 1:21 am
I went around in circles overcomplicating my code until I realized I was doing exactly that, but now I'm looking at this code and I know there's a way to condense this.

Before this subroutine is called, this code is executed

   ld hl,IntroText
   ld bc,IntroTextEnd-IntroText
   call DrawComplexString


DrawComplexString:
-: ld a,(hl)           ;load low byte of address into A
   out (VDPControl),a  ;set address
   inc hl              ;point to high byte of address
   dec bc
   ld a,(hl)           ;load high byte of address into A
   out (VDPControl),a  ;set address
   inc hl              ;point to low byte of tile index
   dec bc
   ld a,(hl)           ;load low byte of tile index into A
   out (VDPData),a     ;set tile
   inc hl              ;point to high byte of tile index
   dec bc
   ld a,(hl)           ;load high byte of tile index into A
   out (VDPData),a     ;set tile
   inc hl              ;point to low byte of next address
   dec bc
   ld a,b              ;then loop until done
   or c
   jr nz,-
   ret


IntroText:
.dw $784A, $0014, $784C, $0022, $784E, $001F, $7852, $0031, $7854, $0029, $7856, $002C, $7858, $0026, $785A, $001E, $785E, $0023, $7860, $002D, $7864, $0030, $7866, $001F, $7868, $0023, $786A, $0026, $786C, $001F, $786E, $001E, $7872, $0023
.dw $7874, $0028, $78C8, $001E, $78CA, $001B, $78CC, $002C, $78CE, $0025, $78D0, $0028, $78D2, $001F, $78D4, $002D, $78D6, $002D, $78D8, $0041, $78DC, $0014, $78DE, $0022, $78E0, $001F, $78E4, $0031, $78E6, $0023, $78E8, $0028, $78EA, $001E
.dw $78EE, $002D, $78F0, $002E, $78F2, $0029, $78F4, $002A, $78F6, $002D, $78F8, $0040, $7952, $002E, $7954, $0022, $7956, $001F, $795A, $002D, $795C, $001F, $795E, $001B, $7962, $0023, $7964, $002D, $7968, $0031, $796A, $0023, $796C, $0026
.dw $796E, $001E, $7970, $0040, $79C4, $001B, $79C6, $0028, $79C8, $001E, $79CC, $002E, $79CE, $0022, $79D0, $001F, $79D4, $001F, $79D6, $001B, $79D8, $002C, $79DA, $002E, $79DC, $0022, $79E0, $001C, $79E2, $001F, $79E4, $0021, $79E6, $0023
.dw $79E8, $0028, $79EA, $002D, $79EE, $002E, $79F0, $0029, $79F4, $002C, $79F6, $0029, $79F8, $002E, $79FA, $0041, $7A52, $0014, $7A54, $0022, $7A56, $001F, $7A5A, $002A, $7A5C, $001F, $7A5E, $0029, $7A60, $002A, $7A62, $0026, $7A64, $001F
.dw $7A68, $0031, $7A6A, $001B, $7A6C, $0023, $7A6E, $002E, $7A70, $0040, $7AC4, $002E, $7AC6, $0022, $7AC8, $001F, $7ACA, $0023, $7ACC, $002C, $7AD0, $0029, $7AD2, $0028, $7AD4, $0026, $7AD6, $0033, $7ADA, $0022, $7ADC, $0029, $7ADE, $002A
.dw $7AE0, $001F, $7AE2, $0040, $7AE6, $001B, $7AEA, $002A, $7AEC, $002C, $7AEE, $0029, $7AF0, $002A, $7AF2, $0022, $7AF4, $001F, $7AF6, $001D, $7AF8, $0033, $7AFA, $0043, $7AFC, $0043, $7B82, $003F, $7B84, $0017, $7B86, $0022, $7B88, $001F
.dw $7B8A, $0028, $7B8E, $002E, $7B90, $0022, $7B92, $001F, $7B96, $0031, $7B98, $0029, $7B9A, $002C, $7B9C, $0026, $7B9E, $001E, $7BA2, $0023, $7BA4, $002D, $7BA8, $0023, $7BAA, $0028, $7BAE, $001E, $7BB0, $001B, $7BB2, $002C, $7BB4, $0025
.dw $7BB6, $0028, $7BB8, $001F, $7BBA, $002D, $7BBC, $002D, $7C06, $0006, $7C08, $0029, $7C0A, $002F, $7C0C, $002C, $7C10, $0017, $7C12, $001B, $7C14, $002C, $7C16, $002C, $7C18, $0023, $7C1A, $0029, $7C1C, $002C, $7C1E, $002D, $7C22, $0031
.dw $7C24, $0023, $7C26, $0026, $7C28, $0026, $7C2C, $001D, $7C2E, $0029, $7C30, $0027, $7C32, $001F, $7C34, $0043, $7C36, $0043, $7C38, $003F, $7C86, $0001, $7C88, $0020, $7C8A, $002E, $7C8C, $001F, $7C8E, $002C, $7C92, $001B, $7C96, $0026
.dw $7C98, $0029, $7C9A, $0028, $7C9C, $0021, $7CA0, $0024, $7CA2, $0029, $7CA4, $002F, $7CA6, $002C, $7CA8, $0028, $7CAA, $001F, $7CAC, $0033, $7CAE, $0040, $7CB2, $0020, $7CB4, $0029, $7CB6, $002F, $7CB8, $002C, $7D0A, $0033, $7D0C, $0029
.dw $7D0E, $002F, $7D10, $0028, $7D12, $0021, $7D16, $0031, $7D18, $001B, $7D1A, $002C, $7D1C, $002C, $7D1E, $0023, $7D20, $0029, $7D22, $002C, $7D24, $002D, $7D28, $001B, $7D2A, $002C, $7D2C, $002C, $7D2E, $0023, $7D30, $0030, $7D32, $001F
.dw $7D34, $0040, $7D8C, $001F, $7D8E, $001B, $7D90, $001D, $7D92, $0022, $7D96, $0022, $7D98, $0029, $7D9A, $0026, $7D9C, $001E, $7D9E, $0023, $7DA0, $0028, $7DA2, $0021, $7DA6, $001B, $7DA8, $0028, $7DAC, $000F, $7DAE, $0012, $7DB0, $0002
.dw $7DB2, $0041
IntroTextEnd:


Everything here functions, but when I try to condense DrawComplexString using loops instead of repeating the same couple of lines of code, I can't quite make it work. My idea was to use a loop counter in RAM to count up from 0 to 2 before moving on to the next step, but I find myself struggling to understand what I can do with RAM and how to increment the value there or compare it to the value I want.
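
A RAM counter is not really needed here. One modest way to shorten the routine (a sketch, not the only option) is to have BC count 4-byte entries instead of bytes, so a single dec bc per entry is enough. The dec bc is placed between the two data-port writes so that their spacing stays the same as in the routine above, on the assumption that it may run while the display is active. The port values $bf and $be are the usual Master System VDP ports.

```asm
.define VDPControl $bf
.define VDPData    $be

; in: HL = first entry, BC = number of 4-byte entries (must not be zero)
DrawComplexString:
-:  ld a,(hl)           ; VRAM address, low byte
    out (VDPControl),a
    inc hl
    ld a,(hl)           ; VRAM address, high byte (with write bits)
    out (VDPControl),a
    inc hl
    ld a,(hl)           ; tile index, low byte
    out (VDPData),a
    inc hl
    dec bc              ; one entry consumed; also pads the gap between data writes
    ld a,(hl)           ; tile index, high byte
    out (VDPData),a
    inc hl
    ld a,b              ; loop until BC reaches zero
    or c
    jr nz,-
    ret
```

The caller would then load BC with the number of entries, i.e. the byte length divided by 4, instead of IntroTextEnd-IntroText.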
  • Joined: 23 Aug 2009
  • Posts: 213
  • Location: Seattle, WA
Posted: Fri Aug 25, 2023 1:32 am
1. What happens when you single step through this with your debugger?
2. If you make this only one or two strings, does it work?
3. Are you doing this in VBLANK? Any chance you are running out of time?
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Fri Aug 25, 2023 1:39 am
SavagePencil wrote
1. What happens when you single step through this with your debugger?
2. If you make this only one or two strings, does it work?
3. Are you doing this in VBLANK? Any chance you are running out of time?


Everything I wrote here works perfectly, and I'm not doing this during VBlank, so I'm good there. My main issue is really just wanting to make the DrawComplexString code shorter by looping each half twice before starting back at the beginning again.

So I want the
ld a,(hl)
out (VDPControl),a
inc hl
dec bc

chunk to execute two times, then have the
ld a,(hl)
out (VDPData),a
inc hl
dec bc

chunk execute twice as well before it runs through the last bit of code and loops back to the first chunk.
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Fri Aug 25, 2023 4:01 am
CerezaSaturn64 wrote
SavagePencil wrote
1. What happens when you single step through this with your debugger?
2. If you make this only one or two strings, does it work?
3. Are you doing this in VBLANK? Any chance you are running out of time?


Everything I wrote here works perfectly, and I'm not doing this during VBlank, so I'm good there. My main issue is really just wanting to make the DrawComplexString code shorter by looping each half twice before starting back at the beginning again.

So I want the
ld a,(hl)
out (VDPControl),a
inc hl
dec bc

chunk to execute two times, then have the
ld a,(hl)
out (VDPData),a
inc hl
dec bc

chunk execute twice as well before it runs through the last bit of code and loops back to the first chunk.


Yeah I don't think using loops will actually optimize this code lol, never mind.
  • Joined: 04 Jul 2010
  • Posts: 542
  • Location: Angers, France
Posted: Fri Aug 25, 2023 6:43 am
When you write to the VDP data port (via out ($be),a or out (c),r), the VDP address is auto-incremented, so there is no need to set the VDP address again until you have to jump somewhere else.

You should use a stringmap (with a tbl file) and a terminator flag for the end of a line (a specific byte you check for that says "hey, go to the next line" or "stop").
Moving to the next line is generally simple: just add 64 to the address you set previously.

To reduce the data size, you can use some compression. RLE gives good results with maps (text is also a map).
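
A sketch of how those points could fit together, assuming one byte per character (the tile index), a zero attribute byte for every character, and two control values that are made up here: STR_NEWLINE and STR_END. The routine names are also hypothetical.

```asm
.define VDPControl  $bf
.define VDPData     $be
.define STR_NEWLINE $fe     ; hypothetical control byte: move down one row
.define STR_END     $ff     ; hypothetical control byte: end of string

; in: HL = string data, DE = VDP write address of the first character (e.g. $784A)
; clobbers: A, BC
PrintString:
    call SetVRAMAddress
-:  ld a,(hl)
    inc hl
    cp STR_END
    ret z
    cp STR_NEWLINE
    jr z,+
    out (VDPData),a         ; tile index; the VDP address auto-increments
    push hl                 ; padding so the two data writes are not too close
    pop hl                  ; together during active display
    xor a                   ; attribute byte = 0
    out (VDPData),a
    jr -
+:  ex de,hl                ; HL = start of current row, DE = string pointer
    ld bc,64                ; one tilemap row = 32 entries * 2 bytes
    add hl,bc
    ex de,hl                ; DE = start of next row, HL = string pointer
    call SetVRAMAddress
    jr -

; in: DE = VDP address with the write bits already set
SetVRAMAddress:
    ld a,e
    out (VDPControl),a
    ld a,d
    out (VDPControl),a
    ret
```

The address is only sent once per row; every character after that costs one byte of data instead of four. The push/pop padding could be dropped if the routine only ever runs in VBlank or with the display off.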
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Fri Aug 25, 2023 7:06 am
ichigobankai wrote
You should use a stringmap (with a tbl file) and a terminator flag for the end of a line (a specific byte you check for that says "hey, go to the next line" or "stop").
Moving to the next line is generally simple: just add 64 to the address you set previously.

To reduce the data size, you can use some compression. RLE gives good results with maps (text is also a map).


Could you possibly provide an example of what you're talking about?

Also, my data is specifically formatted to be the tilemap address to write to (like $784A), then the tile to use there (The letter "T", which is $0014). Simply letting the VDP auto-increment wouldn't work for the way I've set things up.
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Fri Aug 25, 2023 7:14 am
As Ichigo points out, the way you are storing text could be made much much more efficient. Even without RLE you could encode some control characters for null termination but also line feeds, carriage returns, etc. ASCII anyone? :)

But sounds like you're focused on trying to reduce the size of your algorithm as written. Since you're just dealing with data structures of fixed size of four bytes could you make use of the index registers and instructions like `ld a,(ix+d)` to cut down on manual index increments?
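
A sketch of what the indexed version might look like, assuming the same 4-byte entry format and with BC counting entries rather than bytes; the routine name is made up. It needs fewer instructions per entry, but each ld a,(ix+d) takes 3 bytes and 19 cycles, versus 2 bytes and 13 cycles for ld a,(hl) plus inc hl, so it is neither smaller nor faster.

```asm
.define VDPControl $bf
.define VDPData    $be

; in: IX = first entry, BC = number of 4-byte entries (must not be zero)
; clobbers: A, DE
DrawComplexStringIX:
    ld de,4                 ; size of one entry
-:  ld a,(ix+0)             ; VRAM address, low byte
    out (VDPControl),a
    ld a,(ix+1)             ; VRAM address, high byte (with write bits)
    out (VDPControl),a
    ld a,(ix+2)             ; tile index, low byte
    out (VDPData),a
    ld a,(ix+3)             ; tile index, high byte
    out (VDPData),a
    add ix,de               ; step to the next entry
    dec bc
    ld a,b
    or c
    jr nz,-
    ret
```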
  • Joined: 04 Jul 2010
  • Posts: 542
  • Location: Angers, France
Posted: Fri Aug 25, 2023 7:17 am
Last edited by ichigobankai on Fri Aug 25, 2023 7:43 am; edited 2 times in total
Check the wla documentation :
https://wla-dx.readthedocs.io/en/latest/#stringmaptable-script-script-tbl

The VDP auto-increment applies to the address, not to the data.
Tilemap entries are 2 bytes (value, then attrib), so when you set the VDP address to $784A:
Write 1st byte (adr += 1)
Write 2nd byte (adr += 1)
; Here the address is $784C
....


Just read your next value to write with
ld a,(hl)
inc hl

between each
out ($be),a

A very basic piece of code, written in a notepad clone on my smartphone:

    ld b,6                          ; number of lines
    ld hl,mytable_with_elements     ; per line: VDP address first, then the letters

next_line:
    push bc
    ld b,17                         ; letters per line
    ld a,(hl)
    inc hl
    out ($bf),a                     ; VDP address, low byte
    ld a,(hl)
    inc hl
    out ($bf),a                     ; VDP address, high byte (with write bits)

letter:
    ld a,(hl)
    inc hl
    out ($be),a                     ; value
    ld a,(hl)
    inc hl
    out ($be),a                     ; attrib
    djnz letter

    pop bc
    djnz next_line


If you need to waste time between out ($be) writes, you can use push ix/pop ix, nops, ... The code above is written for VBlank (it is a little bit too fast for the active display).
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Fri Aug 25, 2023 7:33 am
willbritton wrote
As Ichigo points out, the way you are storing text could be made much much more efficient. Even without RLE you could encode some control characters for null termination but also line feeds, carriage returns, etc. ASCII anyone? :)

But sounds like you're focused on trying to reduce the size of your algorithm as written. Since you're just dealing with data structures of fixed size of four bytes could you make use of the index registers and instructions like `ld a,(ix+d)` to cut down on manual index increments?


I'm open to making everything more compressed and efficient, I just have no clue how. I'm completely unsure of how to handle the various compression options bmp2tile presents to me, and I was already flying by the seat of my pants with the data formatting I came up with.

I still don't believe I can use .asciitable given that I don't have or want some of the characters between the ones that I'm using and my set of letters also includes symbols for swords, armor, helmets, and the like.

Taking a look at the original Final Fantasy's code for this on NES, it basically drags the PPU nametable address across the screen with each letter like a typewriter, and each character has its own code, which might signal that it's a regular letter, a DTE letter, a line break, a double line break, etc.

I also had no idea I could use the ix or iy registers that way. I rarely ever see them get used in example code or when I open official games with a debugger.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Fri Aug 25, 2023 8:31 am
Stringmaptable could work well for you to use text and symbols to encode your script. Having the text be readable will be super helpful for anyone reading the code - including yourself. You can use any characters or emoji (if you are careful with your use of UTF-8) to allow you to put symbols in text. You can also map strings to bytes, e.g. so "<line>" becomes a single byte.

ASCIITable can also work well if you don't mind using ASCII characters to represent symbols. You only have to map the ones you are using.
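
A rough sketch of what this could look like, going by the WLA DX documentation linked earlier in the thread. The file name, the table name and the <line>/<end> tags with their byte values are made up, while the three letter values come from the intro data above (T = $14, h = $22, e = $1F). Treat the exact directive syntax as something to check against the docs for your WLA DX version.

```asm
; Contents of a hypothetical script.tbl (hex value on the left, text on the right):
;   14=T
;   22=h
;   1F=e
;   FE=<line>
;   FF=<end>

.stringmaptable script "script.tbl"

IntroLine1:
.stringmap script "The<end>"    ; intended to assemble to the bytes $14,$22,$1F,$FF
```

Only the characters that actually appear in the script need an entry in the table.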
  • Joined: 14 Oct 2008
  • Posts: 513
Posted: Fri Aug 25, 2023 2:23 pm
That seems like a LOT of ROM space spent on data that could be easily recalculated in code. Could just store the initial address data to a RAM variable, then after each character increment that variable by 2.
If it is for speed gain, I'm not sure how I see that increasing speed by a meaningful amount, especially for a text display routine. It won't print any faster than what can be read.
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Fri Aug 25, 2023 2:41 pm
Last edited by willbritton on Fri Aug 25, 2023 4:29 pm; edited 1 time in total
@Cereza I had a bit of a play with an example using `.STRINGMAPTABLE` and a print routine handling some simple control codes to do newlines and various tabs / indents. It produces the same screenmap as your original code when I test it out:

https://gist.github.com/willbritton/c8f427edb8d3bd70ceb8592ab524db52/c7f265589908d2869d1b0cf94cb79d1b957adacb

Mostly just knocked up on the fly, please don't assume anything here is bullet proof code ;)

Possibly I'm not quite using stringtable in exactly the way it was intended here but anyway, a bit of a demo of what might be possible.

One irritant was that it's apparently not possible to use backslash escaped characters in the table file, so I couldn't do `\0` or `\n`, despite the latter apparently being used in the example in the WLA-DX docs 🤷‍♀️
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Fri Aug 25, 2023 4:05 pm
I think you’re using it exactly correctly. I think backslashes used to work, they are intended to not be interpreted at all - but I think some changes to WLA DX to make it process them inside strings has broken that part. I tend to use fake pseudo-html <tags> for control codes myself.
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Fri Aug 25, 2023 4:17 pm
Maxim wrote
I think you’re using it exactly correctly.

Well there's a first time for everything!

Maxim wrote
I tend to use fake pseudo-html <tags> for control codes myself.

Ah of course, that's a much better idea than trying to dig through emojis for one that makes sense and that VS Code won't lose its sh*t over 😂
I am struggling to reprogram my brain into understanding that it's okay to use more than one character to represent a single character!

EDIT: Alternative with Maxim-style pseudo-tags here.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Fri Aug 25, 2023 6:39 pm
I’m also a big fan of using emoji, especially in cases where they are rendered on a square grid (depending on your OS and editor, unfortunately) to define grid-oriented data. In this case, there’s a desire to have swords 🗡️ and shields 🛡️ and helmet 🪖 glyphs… and then emoji fails to have anything for armour, I guess 🦺 maybe?

Under the covers it’s just mapping sequences of one or more bytes in a string (emoji in UTF-8 being quite a few bytes) to output sequences of one or more bytes (you can have multi-byte outputs by putting more hex characters on the left of the =), and erroring if it can’t make a match. It always prefers the longest match. The TBL format seems to be often used in the world of translations, which is partly why I used it - I already had one for the stuff I was working on.
  • Joined: 25 Feb 2023
  • Posts: 99
Thank You, Guys!
Posted: Sat Aug 26, 2023 10:59 pm
Sorry for not getting back to this sooner, a huge storm knocked my internet out for a few days and I've been real busy with going back to uni.

Took a look at the stringmap examples and it looks MUCH EASIER than what I was trying to do.

Does this also work with compressed tiles? Right now the fonticons.bin file is just raw uncompressed binary, and I'm curious as to what compression options might be best for text tiles while also working with this stringmap thing.
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Sun Aug 27, 2023 6:44 am
CerezaSaturn64 wrote
Sorry for not getting back to this sooner, a huge storm knocked my internet out for a few days and I've been real busy with going back to uni.

Hope everything is back to normal after the storm!

CerezaSaturn64 wrote
Does this also work with compressed tiles?

Compression only allows you to use fewer bytes when storing tiles in your ROM - you need to decompress the tiles in order to transfer them into VRAM so the tiles will be the same in VRAM whether or not you store them compressed in ROM. So basically yes, this will still work.

The process of encoding is pretty easy working from decompressed tiles - I just loaded your original tiles into VRAM, opened up the Tile Viewer in Emulicious, and literally typed out the characters that I saw into `default.tbl`, in the order they appeared on screen. The control characters are "virtual" so don't need to really know about your tiles at all.

You might need different tables for different parts of the game I suppose, so that might be an added complexity, but for the most part I think you'd want one table + set of string definitions per translation language.

EDIT: worth also noting that storing the strings in this case already gives you around a 6x compression over storing the raw screen map as before. You only store one byte for each tile instead of two, you use one byte for empty lines (an extra <br> although if you made a double space crlf code you would save even that) and all whitespace compresses down to a single byte with the tab codes, so it's something like 250 bytes vs. 1.5kB for a whole visible screen map of 24 rows.

However, you could also apply more compression to your strings if you wanted and you wouldn't need anything particularly complicated. Consider that a lot of your text will probably be repeated words. So something like "warrior" could be encoded as a single "dictionary" code byte, instead of 7 individual characters. Then you would modify your `text_print` routine to lookup any dictionary words from a list and print them out in full. A basic form of dictionary encoding.
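
A sketch of that idea, under some loudly stated assumptions: bytes $80-$fe are treated as dictionary word numbers, $ff ends a string, every dictionary word is itself a terminated string of plain tile bytes, and the VRAM write address has already been set. All the names here are hypothetical and this is not the `text_print` routine from the gist. The tile values for "warrior" are taken from the intro data earlier in the thread.

```asm
.define VDPData   $be
.define DICT_BASE $80       ; hypothetical: first dictionary code
.define STR_END   $ff       ; hypothetical: string terminator

; in: HL = string data; VRAM write address already set
; clobbers: A, DE
PrintWithDictionary:
-:  ld a,(hl)
    inc hl
    cp STR_END
    ret z
    cp DICT_BASE
    jr nc,+                 ; $80 and above: dictionary word
    call PutTile            ; otherwise: an ordinary tile index
    jr -
+:  push hl                 ; remember where we are in the main string
    sub DICT_BASE           ; word number, starting from 0
    ld l,a
    ld h,0
    add hl,hl               ; * 2, since table entries are 16-bit pointers
    ld de,DictionaryTable
    add hl,de
    ld a,(hl)
    inc hl
    ld h,(hl)
    ld l,a                  ; HL = address of the word's tile bytes
    call PrintWithDictionary ; print the word; it returns at the word's STR_END
    pop hl
    jr -

PutTile:
    out (VDPData),a         ; tile index
    push hl                 ; padding between the two data writes
    pop hl
    xor a                   ; attribute byte = 0
    out (VDPData),a
    ret

DictionaryTable:
.dw Word_warrior            ; code $80

Word_warrior:
.db $31,$1B,$2C,$2C,$23,$29,$2C,STR_END    ; w a r r i o r
```

A string such as "great warrior" would then store the single byte $80 in place of the seven bytes for the word.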

In a game like this, I wouldn't be surprised if your string data is much bigger overall than your tile data so ultimately compression applied there could reap greater rewards.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Sun Aug 27, 2023 7:45 am
Last edited by Maxim on Sun Aug 27, 2023 7:54 am; edited 1 time in total
It’s worth remembering however that unless you are aiming to make physical cartridges with a size limit, compression is not necessary until you run into other limits like the 1MB size limit for older Everdrives and clones. It only acts to slow things down.

At the same time, it’s also used extensively in the Phantasy Star retranslation, as part of the goal to fit in the original 512KB ROM size - there is dictionary coding for common words and also conditional Huffman coding for the data itself.
  • Joined: 25 Feb 2023
  • Posts: 99
Posted: Sun Aug 27, 2023 7:52 am
Maxim wrote
It’s worth remembering however that unless you are aiming to make physical cartridges with a size limit, compression is not necessary until you run into other limits like the 1MB size limit for older Everdrives and clones. It only acts to slow things down. At the same time, it’s also used extensively in the Phantasy Star retranslation, as part of the goal to fit in the original 512KB ROM size - there is dictionary coding for common words and also conditional Huffman coding for the data itself.


If I remember correctly, the original game is only around 130-140KB in size, and the original is jam packed with inefficient code and bizarre choices. I was extremely generous with my estimated size limit of 256KB to accurately replicate the original game on the Master System, as I knew I'd need to accommodate for sprites not flipping and other hardware quirks. There is a good chance that I could easily manage with uncompressed text tiles and simply compress standard map tiles for the overworld, towns, and dungeons.

I am, of course, quite far away from a finished product so I have plenty of time to assess and reassess what I need to do to fit in my limits.
  • Joined: 23 Jan 2010
  • Posts: 439
Posted: Sun Aug 27, 2023 9:38 am
CerezaSaturn64 wrote
Maxim wrote
It’s worth remembering however that unless you are aiming to make physical cartridges with a size limit, compression is not necessary until you run into other limits like the 1MB size limit for older Everdrives and clones. It only acts to slow things down. At the same time, it’s also used extensively in the Phantasy Star retranslation, as part of the goal to fit in the original 512KB ROM size - there is dictionary coding for common words and also conditional Huffman coding for the data itself.


If I remember correctly, the original game is only around 130-140KB in size, and the original is jam packed with inefficient code and bizarre choices. I was extremely generous with my estimated size limit of 256KB to accurately replicate the original game on the Master System, as I knew I'd need to accommodate for sprites not flipping and other hardware quirks. There is a good chance that I could easily manage with uncompressed text tiles and simply compress standard map tiles for the overworld, towns, and dungeons.

I am, of course, quite far away from a finished product so I have plenty of time to assess and reassess what I need to do to fit in my limits.


FF1 has 256 KB of ROM and 8 KB of CHR RAM (which can be compressed), plus 8 KB of SRAM for saves.
I like @willbritton's idea of a "dictionary" for words. I think I get the principle of a "reference". Something like "warrior" = 1: "What's your name, 1?" - "Wow, you are a great 1!"
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Sun Aug 27, 2023 10:10 am
segarule wrote
Something like "warrior" = 1: "What's your name, 1?" - "Wow, you are a great 1!"

Yes, although with .STRINGMAP in WLA-DX you can still write the string as "What's your name warrior?", and WLA-DX will replace the "warrior" part with the byte, e.g. 0x01, when it assembles the binary (actually it's probably technically in the linking stage), so you don't need to worry about it when writing out your strings.

In fact, I could have done the same thing with the tabs now that I think about it — instead of inventing a special "4️⃣" or "<4>" character, I could have just defined a symbol of four literal spaces in the string table, and replaced it with 0xf4 just the same.

I guess a good way to estimate how much ROM you'll need for strings at least is to find all the text in the original ROM and count how many characters it is, excluding things like non-significant whitespace. That should give you something close to a worst case scenario of unoptimised string storage requirements. That way if you do want to constrain yourself in terms of ROM size you can easily decide whether or not you need to think about compressing the strings or not.
  • Joined: 23 Jan 2010
  • Posts: 439
Posted: Sun Aug 27, 2023 12:14 pm
I was thinking of something like this:
"warrior" in ASCII is 77 61 72 72 69 6f 72 in hex. If we had something like an "alphabet", 72 would only need to be used once. In the alphabet list, "warrior" could be
77 61 and a single 72 reference, instead of spelling out "warrior" every time. Some characters could also be replaced by others, for example 1 and l as on a typewriter.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Sun Aug 27, 2023 12:39 pm
Ruining the font to save tiles is only useful if you have VRAM space issues :) it probably saves nothing on the text encoding.

For the majority of an RPG, the text is drawn into a box and you may need to substitute names so it needs to handle word wrapping and pausing at runtime. Concentrate on that - and having the data be simple to edit - and special-case the times when you want to draw somewhere else as they’ll be rarer and likely more deterministic, eg the intro screens can just have a few screen locations and text pointers and it’s only a few dozen bytes.
  • Joined: 23 Jan 2010
  • Posts: 439
Posted: Sun Aug 27, 2023 12:57 pm
Maxim wrote
Ruining the font to save tiles is only useful if you have VRAM space issues :) it probably saves nothing on the text encoding.

For the majority of an RPG, the text is drawn into a box and you may need to substitute names so it needs to handle word wrapping and pausing at runtime. Concentrate on that - and having the data be simple to edit - and special-case the times when you want to draw somewhere else as they’ll be rarer and likely more deterministic, eg the intro screens can just have a few screen locations and text pointers and it’s only a few dozen bytes.

You are correct once again, @Maxim. What matters in the end is that we can help @CerezaSaturn64.
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Posted: Mon Nov 20, 2023 10:13 am
willbritton wrote

One irritant was that it's apparently not possible to use backslash escaped characters in the table file, so I couldn't do `\0` or `\n`, despite the latter apparently being used in the example in the WLA-DX docs
🤷‍♀️


Think this may now be fixed in v10.6

Quote

[ALL] .STRINGMAP and .STRINGMAPTABLE handle now special characters like "\n" properly.

  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Posted: Mon Nov 20, 2023 12:17 pm
I reported that :)