View topic - PSGlib 'official' topic [was: "Music engines"]

  • Joined: 01 Feb 2014
  • Posts: 517
Post Posted: Fri Aug 29, 2014 7:28 am
sverx wrote
OK, so I finally put this optimization into my converter.
No more CPU 'spikes' now (as it won't ever have to push more than 11 bytes to the PSG chip each frame) and the maximum time it takes now is 13 scanlines [15 if the file is using compression] :)

Great!

One question, though: did you change something fundamental in your tools? I re-converted and re-compressed all my sound files with the current versions of vgm2psg and psgcomp, and while the longer music files achieved an even better compression rate than before, all the short sound effects seemed to come out a few bytes bigger than before. It's not really an issue, but I was wondering what was going on.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Aug 29, 2014 7:55 am
The PSG file coming from the vgm2psg converter should be a few bytes smaller if anything, never larger, since it's now removing unnecessary writes.
The way I create the file (not the format, BTW) has changed, and this affects the compression, as I noticed too. Sometimes the compressor shaves off a few more bytes, sometimes a few less (I usually compress the tunes only, as my SFX are just a few bytes).
All this was needed to keep the frame processing routines from eating too much CPU unnecessarily... it was something that was out of control, since the original VGM could potentially lead to a lot of writes performed one after the other, and the converter was just translating these into the PSG file with no check whatsoever.
Also, when creating the PSG file I chose to write all the PSG commands grouped by channel number instead of grouped by 'type', as they usually come out of VGMTool.
I mean that the commands were issued in this order:
- Set the freq for chan 0
- Set the freq for chan 1
- Set the freq for chan 2
- Set the noise mode/cycle for chan 3
- Set the volume for chan 0
- Set the volume for chan 1
- Set the volume for chan 2
- Set the volume for chan 3

while now I switched to something like this:
- Set the freq for chan 0
- Set the volume for chan 0
- Set the freq for chan 1
- Set the volume for chan 1

... and so on.
I believe any difference is hardly noticeable, but I felt it was conceptually better to change all that needs to be changed for each channel before switching to the next channel.
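
To make the reordering concrete, here is a minimal Python sketch (not the actual vgm2psg code; `collapse_frame` is a made-up name). It assumes raw SN76489 write bytes for a single frame, with every update starting with a latch byte, keeps only the last value written to each register, and emits the result channel by channel:

```python
def collapse_frame(writes):
    """Collapse one frame of SN76489 writes to one update per register,
    emitted channel by channel (tone/noise first, then volume)."""
    low = {}        # (channel, is_volume) -> last low nibble written this frame
    high = {}       # tone channel -> last upper 6 bits written this frame
    latched = None  # register selected by the most recent latch byte
    for b in writes:
        if b & 0x80:                          # latch byte: 1 cc t dddd
            latched = ((b >> 5) & 3, bool(b & 0x10))
            low[latched] = b & 0x0F
        elif latched is None:
            raise ValueError("data byte before any latch in this frame")
        elif latched[1] or latched[0] == 3:
            low[latched] = b & 0x0F           # volume and noise only keep low bits
        else:
            high[latched[0]] = b & 0x3F       # tone channels 0-2: upper 6 bits
    out = []
    for ch in range(4):
        if (ch, False) in low:
            out.append(0x80 | (ch << 5) | low[(ch, False)])
            if ch in high:
                out.append(high[ch])
        if (ch, True) in low:
            out.append(0x90 | (ch << 5) | low[(ch, True)])
    return out


# Writes grouped by 'type', with channel 0 rewritten later in the same frame.
frame = [0x87, 0x14, 0xA5, 0x10, 0xC3, 0x08, 0xE6,
         0x94, 0xB4, 0xD4, 0xF5,
         0x86, 0x9F]
print([hex(b) for b in collapse_frame(frame)])
```

Each tone channel ends up with at most a latch byte, a data byte and a volume byte, and the noise channel with two bytes, so a frame collapsed this way never exceeds 3*2 + 1 + 4 = 11 bytes, which is consistent with the figure quoted at the top.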
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Fri Aug 29, 2014 12:50 pm
Within the VGM format, all the writes are considered to happen instantaneously so the order is unimportant. On a real system, the chip is producing audio for the fraction of a second before the next write happens, so the order might matter - but practically, it doesn't affect the sound much.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Aug 29, 2014 2:04 pm
Maxim wrote
Within the VGM format, all the writes are considered to happen instantaneously so the order is unimportant.

That's the reason I wondered why those double writes to the same register weren't removed by the optimizers... :|
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Fri Aug 29, 2014 3:21 pm
I dug out the code (https://github.com/maxim-zhao/vgmtool) and I think it's just logic errors, it's dumping state once it finds it is at the loop point, rather than at the first pause after the loop point. I'm looking forward to VGMTool 3 :)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Sat Aug 30, 2014 1:09 pm
Kagesan wrote
[...] all the short sound effects seemed to come out of a few bytes bigger than before.


Oh wait, now I realize why it's so. I broke the channel 'filter' for SFX with the last update of the converter. I bet the files are from 6 to 9 bytes longer now... and they will disturb the background music playing on PSG channel 0 and 1 too!
Sorry, I'm going to fix it ASAP next week.
[man, that's bleeding edge technology LOL ;)]
  • Joined: 01 Feb 2014
  • Posts: 517
Post Posted: Sat Aug 30, 2014 3:17 pm
sverx wrote
I bet the files are from 6 to 9 bytes longer now...

Yes, that sounds about right.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Sep 01, 2014 9:32 am
Conversion tool fixed in the repository. I made some test conversions and it looks like the SFX conversion works correctly again. Let me know if something else looks/sounds wrong.
Sorry for that, pal!
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Thu Nov 06, 2014 2:36 pm
Speaking of alternatives, I just found this.
It's very similar to what I've done, but it seems to achieve much better music data compression by splitting channel frequencies, volumes and sync info into 12 separate streams... and it has SFX priority too.
The replay library is for ColecoVision and TI-99, but I guess it could eventually be ported to SMS too.
(Given that all VGMs are compressed into a single file, probably you won't be able to page out what's not needed, but that's relevant only if it gets very big... in that case probably it's better to compress the VGMs in groups...)
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Thu Nov 06, 2014 4:20 pm
Nice, although I'd be tempted to write the player in assembly due to a misplaced distrust of compilers. It's a bit lossy, but that might not be a bad thing as it does it in order to save space. I guess it uses more CPU than yours, and could get smaller files at the cost of even more CPU by not storing all the zeroes mentioned in the docs.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Thu Nov 06, 2014 8:21 pm
Maxim wrote
I guess it uses more CPU than yours


I hope so! ;)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Jul 10, 2015 9:04 am
I just tagged current release as v1.0.0 as I feel the lib is mature enough :)

Now I'm working with Calindro to completely break it :D LOL ;)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Wed Aug 05, 2015 1:24 pm
Library updated.
Thanks to Calindro for brilliantly solving the clashing between PSG channels #2 and #3 (noise) :)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Thu Sep 10, 2015 11:54 am
I just wanted to mention these days I found there's an interesting alternative technique for generating VGMs (to be converted to PSG files to be used with PSGlib)

The alternative is: track a 4 channel XM/MOD file with a normal tracker (MilkyTracker, for example) and convert the tune to MML format using xm2mml. The result can be converted to VGM using XPMCK export to VGM functionality. Instruments can have volume envelopes, and the result isn't so bad overall :)

Anyway the suggested way still remains using DefleMask/Mod2PSG2 to track your tunes, export each tune to VGM format and then convert and compress using PSGlib's vgm2psg and psgcomp tools.
  • Joined: 23 Mar 2013
  • Posts: 572
  • Location: Copenhagen, Denmark
Optimization>compression workflow
Post Posted: Sun Jan 03, 2016 9:26 pm
The PSGlib readme warmly recommends that Maxim's VGMTool is used to optimize the vgm file prior to the conversion. When I do that - I click the optimize button - the vgm is compressed into vgz. And then vgm2psg complains, because it wants an uncompressed vgm?!
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 8:57 am
I forgot to place a note about that: when you optimize it, the file also gets compressed. You then have to switch to VGMTool's last tab on the right 'more functions' and press the 'decompress VGZ' button.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Mon Jan 04, 2016 9:04 am
Vgm2psg should support compression... Just swap FILE* for gzFile and likewise for the file access functions. The hard part is compiling zlib, and that's not very hard.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 10:02 am
Wow, I didn't know zlib supported streaming decompression from a gzip file, that's cool! I'm reading details about that right now, and I'll add support for compressed VGMs soon.
  • Joined: 23 Mar 2013
  • Posts: 572
  • Location: Copenhagen, Denmark
Post Posted: Mon Jan 04, 2016 12:19 pm
sverx wrote
I forgot to place a note about that: when you optimize it, the file also gets compressed. You then have to switch to VGMTool's last tab on the right 'more functions' and press the 'decompress VGZ' button.
Thanks - got it!
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 1:28 pm
Maxim wrote
Just swap FILE* for gzFile and likewise for the file access functions.


Weird, now it seems to work with vgz compressed files but it no longer works with uncompressed VGMs :|

zlib manual wrote
gzopen can be used to read a file which is not in gzip format; in this case gzread will directly read from the file without decompression. When reading, this will be detected automatically by looking for the magic two-byte gzip header.

I wonder if this means that I can't use gzseek / gzgetc / gzungetc to read an uncompressed file...
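
The automatic detection the manual describes boils down to peeking at the first two bytes. A rough Python equivalent of that idea (this is not zlib's code and not what vgm2psg does; `load_vgm` is a made-up name and it works on data already read into memory):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the two-byte gzip header mentioned in the manual


def load_vgm(raw):
    """Return uncompressed VGM data from the raw bytes of a .vgm or .vgz file."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)   # VGZ: gzip-compressed VGM
    return raw                        # plain VGM: pass through untouched


plain = b"Vgm \x00\x01\x02\x03"
print(load_vgm(plain) == load_vgm(gzip.compress(plain)))
```

An uncompressed VGM starts with the "Vgm " identifier, so it can't be mistaken for a gzip stream by this check.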
  • Joined: 14 Apr 2013
  • Posts: 516
Post Posted: Mon Jan 04, 2016 1:33 pm
hang-on wrote
The PSGlib readme warmly recommends that Maxim's VGMTool is used to optimize the vgm file prior to the conversion. When I do that - I click the optimize button - the vgm is compressed into vgz. And then vgm2psg complains, because it wants an uncompressed vgm?!

You could abuse KiddEd for this task. It allows importing uncompressed/compressed vgm files and exporting to psg format.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Mon Jan 04, 2016 1:56 pm
The gzip file functions should just pass through to the C ones when the file doesn't have a gzip header, that includes gzgetc etc - but gzungetc only allows one character of pushback, are you checking your return codes?
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 2:18 pm
Maxim wrote
The gzip file functions should just pass through to the C ones when the file doesn't have a gzip header, that includes gzgetc etc - but gzungetc only allows one character of pushback, are you checking your return codes?


I'm quite puzzled.
If I use the zlib 1.2.8 I compiled myself (or one I found on the Internet) all I get is:
C:\MinGW\Projects\vgm2psg>vgm2psg.exe lztest.vgm lztest.psg
*** Sverx's VGM to PSG converter ***
Warning: unknown frame rate, assuming NTSC (60Hz)
Info: no loop point defined
Fatal: found unknown char 0xffffffff

C:\MinGW\Projects\vgm2psg>vgm2psg.exe lztest.vgz lztest.psg
*** Sverx's VGM to PSG converter ***
Info: NTSC (60Hz) VGM detected
Info: loop point at 0x0000004d
Warning: GameGear stereo info discarded
Warning: pause length isn't perfectly frame sync'd
Warning: GameGear stereo info discarded
Info: conversion complete


but if I instead use an old zlib 1.2.3 I also found on the Internet, then the behavior is the expected one (but I'd prefer not to use that!)
Should I make some configuration before making the lib myself?

edit: I suspect it's a bug hidden somewhere in zlib. I can make that work using zlib 1.2.5 and zlib 1.2.6:

C:\MinGW\Projects\vgm2psg>vgm2psg.exe lztest.vgz lztest.psg
*** Sverx's VGM to PSG converter ***
Info: NTSC (60Hz) VGM detected
Info: loop point at 0x0000004d
Warning: GameGear stereo info discarded
Warning: pause length isn't perfectly frame sync'd
Warning: GameGear stereo info discarded
Info: conversion complete

C:\MinGW\Projects\vgm2psg>vgm2psg.exe lztest.vgm lztest.psg
*** Sverx's VGM to PSG converter ***
Info: NTSC (60Hz) VGM detected
Info: loop point at 0x0000004d
Warning: GameGear stereo info discarded
Warning: pause length isn't perfectly frame sync'd
Warning: GameGear stereo info discarded
Info: conversion complete


but NOT with 1.2.7 and 1.2.8 :|
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 3:24 pm
Ok, just rolled up an update. Geez, I didn't plan to spend so much time on that! Now the input file can be either a VGM or a VGZ. I compiled that with zlib 1.2.6 (later versions won't work) and wrote to the zlib authors about the misbehavior. Will recompile later, eventually.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Mon Jan 04, 2016 4:39 pm
Seems to be https://github.com/madler/zlib/issues/95 which includes a workaround. It seems you can static link without the issue, I recommend that (no dlls).
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Jan 04, 2016 6:19 pm
it seems related, as I'm of course using MinGW. But I'm already statically linking the lib...

edit:can be this. I'll run a test.
  • Joined: 15 Sep 2009
  • Posts: 375
Post Posted: Mon Jan 04, 2016 7:29 pm
The zlib issue seems to be a bug with MinGW, which seems to have an incomplete implementation of a 64-bit seek function.
I made a DLL build of zlib 1.2.8 using MSVC 6 not long ago that fixes the issue. (I also made a few minor improvements for seeking, but just building with Visual Studio fixed the issue already.)
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Mon Jan 04, 2016 10:05 pm
Aargh, I hadn't even conceived that you might be using MinGW in this day and age. I guess it's still the only really free option, which is a shame. Visual Studio has free-as-in-beer options now, though.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Jan 05, 2016 10:02 am
@Maxim: Yes, I'm on MinGW.
@ValleyBell: thanks for the hint, I ended up using that other workaround.

(update pushed)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue May 31, 2016 12:25 pm
I just rolled a small update: two functions added:

PSGSilenceChannels
PSGRestoreVolumes

which can be used when you need to suspend audio completely, as in 'pause' mode, when you won't process audio frames (no calls to PSGFrame and PSGSFXFrame performed) ... and then restore the previous PSG state when unpausing.
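
Conceptually it's just "remember the volumes, mute, write them back". A tiny Python model of that idea (only an illustration, not the library's Z80 code; the class and method names are invented here, and `write` stands in for whatever sends a byte to the PSG port):

```python
class PsgVolumes:
    """Keep track of the last attenuation written to each PSG channel."""

    def __init__(self, write):
        self.write = write           # callable that sends one byte to the PSG
        self.volumes = [0x0F] * 4    # 0x0F = fully attenuated (silent)

    def set_volume(self, ch, attenuation):
        self.volumes[ch] = attenuation & 0x0F
        self.write(0x90 | (ch << 5) | self.volumes[ch])

    def silence(self):
        # Mute all four channels without forgetting the stored volumes.
        for ch in range(4):
            self.write(0x90 | (ch << 5) | 0x0F)

    def restore(self):
        # Write the remembered volumes back, e.g. when leaving pause mode.
        for ch in range(4):
            self.write(0x90 | (ch << 5) | self.volumes[ch])


sent = []
psg = PsgVolumes(sent.append)
psg.set_volume(0, 0x4)
psg.silence()
psg.restore()
print([hex(b) for b in sent])
```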

I hope I didn't break anything else :)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Jun 21, 2016 3:03 pm
I'm working on a better vgm2psg converter, as I noticed that using vibrato/glissando in tunes makes a lot of PSG tone changes on the channels, but in fact it's often just changing the value of the lower 4 bits.

Observe for instance in this fragment how many times "Latch/Data: Tone Ch 0" changes the value while the following "Data" just stays the same:

0x00000065: 50 87       SN76496:   Latch/Data: Tone Ch 0 -> 0x007
0x00000067: 50 14       SN76496:   Data: 14
0x00000069: 50 94       SN76496:   Latch/Data: Volume Ch 0 -> 0x4 = 73%
0x0000006B: 50 F5       SN76496:   Latch/Data: Volume Ch 3 -> 0x5 = 66%
0x0000006D: 63          Wait:   882 samples (1/50 s)   (total   3528 (00:00.08))
0x0000006E: 50 86       SN76496:   Latch/Data: Tone Ch 0 -> 0x006
0x00000070: 50 14       SN76496:   Data: 14
0x00000072: 50 E6       SN76496:   Noise Type: 6 - White, Low (1731Hz)
0x00000074: 50 F7       SN76496:   Latch/Data: Volume Ch 3 -> 0x7 = 53%
0x00000076: 63          Wait:   882 samples (1/50 s)   (total   4410 (00:00.10))
0x00000077: 50 87       SN76496:   Latch/Data: Tone Ch 0 -> 0x007
0x00000079: 50 14       SN76496:   Data: 14
0x0000007B: 50 E6       SN76496:   Noise Type: 6 - White, Low (1731Hz)
0x0000007D: 50 F9       SN76496:   Latch/Data: Volume Ch 3 -> 0x9 = 40%
0x0000007F: 63          Wait:   882 samples (1/50 s)   (total   5292 (00:00.12))
0x00000080: 50 88       SN76496:   Latch/Data: Tone Ch 0 -> 0x008
0x00000082: 50 14       SN76496:   Data: 14
0x00000084: 50 FB       SN76496:   Latch/Data: Volume Ch 3 -> 0xB = 26%
0x00000086: 63          Wait:   882 samples (1/50 s)   (total   6174 (00:00.14))
0x00000087: 50 8A       SN76496:   Latch/Data: Tone Ch 0 -> 0x00A
0x00000089: 50 14       SN76496:   Data: 14
0x0000008B: 50 FC       SN76496:   Latch/Data: Volume Ch 3 -> 0xC = 20%
0x0000008D: 63          Wait:   882 samples (1/50 s)   (total   7056 (00:00.16))
0x0000008E: 50 8B       SN76496:   Latch/Data: Tone Ch 0 -> 0x00B
0x00000090: 50 14       SN76496:   Data: 14
0x00000092: 50 FD       SN76496:   Latch/Data: Volume Ch 3 -> 0xD = 13%
0x00000094: 63          Wait:   882 samples (1/50 s)   (total   7938 (00:00.18))
0x00000095: 50 8A       SN76496:   Latch/Data: Tone Ch 0 -> 0x00A
0x00000097: 50 14       SN76496:   Data: 14
0x00000099: 50 FF       SN76496:   Latch/Data: Volume Ch 3 -> 0xF = 0%
0x0000009B: 63          Wait:   882 samples (1/50 s)   (total   8820 (00:00.20))
0x0000009C: 50 88       SN76496:   Latch/Data: Tone Ch 0 -> 0x008
0x0000009E: 50 14       SN76496:   Data: 14
0x000000A0: 50 98       SN76496:   Latch/Data: Volume Ch 0 -> 0x8 = 46%
0x000000A2: 63          Wait:   882 samples (1/50 s)   (total   9702 (00:00.22))
0x000000A3: 50 87       SN76496:   Latch/Data: Tone Ch 0 -> 0x007
0x000000A5: 50 14       SN76496:   Data: 14


Thus I'll soon release a newer converter that avoids rewriting the upper part when it's not needed.
In my tests I could save almost 5KB on a vgm that was converted into a 20KB psg :)
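
The idea can be sketched in a few lines of Python (an illustration only, not the converter's real code; `drop_redundant_data` is a hypothetical name). It relies on the fact that a tone latch byte only replaces the lower 4 bits of the period, so the data byte after it can be skipped whenever the upper 6 bits it carries are the ones already sent for that channel. Waits are left out and only raw write bytes, as in the dump above, are handled:

```python
def drop_redundant_data(writes):
    """Remove tone data bytes that repeat the upper 6 bits already sent."""
    upper = {}        # tone channel -> last upper 6 bits sent to the chip
    latched = None    # (channel, is_volume) of the most recent latch byte
    out = []
    for b in writes:
        if b & 0x80:                                  # latch byte: always kept
            latched = ((b >> 5) & 3, bool(b & 0x10))
            out.append(b)
            continue
        is_tone = latched is not None and not latched[1] and latched[0] < 3
        if is_tone and upper.get(latched[0]) == (b & 0x3F):
            continue                                  # upper bits unchanged: skip
        if is_tone:
            upper[latched[0]] = b & 0x3F
        out.append(b)
    return out


# First three frames of the fragment above, waits removed.
writes = [0x87, 0x14, 0x94, 0xF5,
          0x86, 0x14, 0xE6, 0xF7,
          0x87, 0x14, 0xE6, 0xF9]
print([hex(b) for b in drop_redundant_data(writes)])
```

Only the first 0x14 survives here, since every later tone change on channel 0 touches just the lower 4 bits.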
  • Joined: 14 Apr 2013
  • Posts: 516
Post Posted: Tue Jun 21, 2016 7:00 pm
sverx wrote
I'm working on a better vgm2psg converter, as I noticed that using vibrato/glissando in tunes makes a lot of PSG tone changes on the channels, but in fact it's often just changing the value of the lower 4 bits.

[...]
In my tests I could save almost 5KB on a vgm that was converted into a 20KB psg :)

Nice improvement! Did you reduce from 25 KB to 20 KB or from 20 KB to 15 KB? Which vgm is it?
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Jun 21, 2016 7:08 pm
I meant from 20KB to 15KB, the test was on a 38KB VGM from TomyS.
I'll do the same to the tunes I'm using in MARKanoIIId and see how many KB I'm shaving off :)

edit: testing MARKanoIIId tunes (uncompressed PSGs):
1st tune: was 27.5KB - now 24.6KB (the source VGZ is 13.1KB)
2nd tune: was 24.8KB - now 22.3KB (the source VGZ is 10.6KB)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Thu Jun 23, 2016 8:27 am
vgm2psg has just been updated, as the resulting (smaller!) PSG files seem to play just the same as those generated before. Please let me know if you find any difference :)

Speaking of psgcomp (the PSG compressor) instead, I'm wondering if there's some faster way to compress a PSG, compared to what I'm doing now (here the source).
I'm basically searching for every possible substring match, of any length (starting from the longest), from the beginning of the data. This means N*N*N calls to memcmp(), which is the reason why it takes so much time when the input file is just bigger than a single KB, not to mention when it's like 20 KB (and if a match isn't found, we just skip one byte and start over...)

How do compressors search for substring matches efficiently? It seems I can't :|
  • Joined: 08 Dec 2013
  • Posts: 200
Post Posted: Thu Jun 23, 2016 1:45 pm
This is a very naive answer, but how about multi-threading?
Also, sometimes the solution is not to try to do everything at once and instead filter in various stages -- that is, do some work, collect the results, and then begin another filtering process on those results; dividing and conquering the work.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Jun 24, 2016 1:14 pm
I'm not sure how multi threading would help, other than speeding up (a bit) on multicore/multiprocessor PCs, at the cost of having it almost stuck working on the compression. :|

What I really meant is: is there a faster way to look up an array in a longer array and get back exactly where the best match is? I mean something that does

seek(needle,haystack)

that will return where in haystack (the exact position) the longest part of needle is found?
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Fri Jun 24, 2016 1:43 pm
memcmp can be very fast because it can optimise to an SIMD inner loop, and return after the first mismatch; but what you really want is an implementation of memmem(), which is not standard (GCC has it, VC not). Here's something: http://www.codeproject.com/Articles/250566/Fastest-strstr-like-function-in-C which is frankly quite scary.

But that's not what you really want - you really want to find the longest duplicated substrings, subject to a maximum length. You could implement that in a forward-only fashion, looking for past data runs matching the future data, but it would be sub-optimal in some cases. Even selecting the longest runs as I think you are doing is not guaranteed optimal. Anyway, I'd avoid the memcmp (or memmem) entirely and instead do something like:

- for each offset
-- find any matching byte and then walk forward byte-wise until a mismatch is found
-- store the longest run offset and length
- select the longest match and substitute the match data
- amend all the other existing run data
-- shift the offset for items after the substitution
-- discard runs inside the substitution
-- truncate the length of runs starting before the substitution, and running into it
- repeat until all runs are consumed

I may be missing something in here, and the upper bound complexity is still fairly high. You could optimise the case where the maximum run length is found (assume it will be chosen and don't bother checking substrings within it), but that may not matter much.
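
For what it's worth, the first part of that list (scan each earlier offset, walk forward byte-wise, remember the longest run) might look like this in Python. It's only a sketch under my own assumptions: the name `longest_earlier_match` is invented, the earlier run is not allowed to overlap the current position, and the substitution and run-amending steps are not covered at all:

```python
def longest_earlier_match(data, pos, max_len):
    """Return (offset, length) of the longest run before `pos` that matches
    the bytes starting at `pos`; (-1, 0) if nothing matches."""
    best_off, best_len = -1, 0
    limit = min(max_len, len(data) - pos)
    for off in range(pos):
        n = 0
        # Walk forward one byte at a time until a mismatch, the length limit,
        # or the point where the earlier run would reach `pos`.
        while n < limit and off + n < pos and data[off + n] == data[pos + n]:
            n += 1
        if n > best_len:
            best_off, best_len = off, n
    return best_off, best_len


data = b"ABCABCABCABC"
print(longest_earlier_match(data, 6, 32))
```

This is still quadratic per position in the worst case, so it only removes the repeated memcmp() calls over every candidate length, not the scan over offsets.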

Don't be afraid to use C++ (collections make it easier, even if they can be slow) and memory (it's not a concern these days). Or even go to Python, C#, Java etc if it helps.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Jun 24, 2016 6:01 pm
Wow, thanks! I logged on to post that I had just read about the nonstandard memmem() function, and how that could be interesting, even if not entirely fit to my needs but... ok, you beat me once more ;)

I'll evaluate your approach and I'll see if it can be computationally less heavy... as for using other languages, I wonder how they could speed up things... I mean, as long as I'm searching a (small) segment into a (long) buffer, they won't use any different data structures than arrays, or am I wrong?
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Fri Jun 24, 2016 7:58 pm
I'm thinking that seeking if a given "substring" has been already found before is a dictionary search problem, so maybe I could read the input data and build a dictionary and -at the very same time- see if the data is already there, which in that case would detect a match.
Instead of outputting immediately the result of the matching, I should anyway delay that because I want to be sure I'm not 'wasting' a better result, think about compressing some
ABCABCABCABC
I don't want this to become the suboptimal
ABC<0,3<0,3<0,3
but I'm seeking the
ABCABC<0,6
which is surely better.
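
To put numbers on the two encodings, here is a small Python sketch. Everything in it is an assumption made for illustration: `<offset,length>` is modelled as a tuple, offsets are taken to index the compressed stream, references may only point at literal bytes, and a reference is assumed to cost 3 bytes (`REF_COST` is a hypothetical figure, not taken from the PSG format):

```python
REF_COST = 3  # assumed size in bytes of one <offset,length> reference


def expand(tokens):
    """Expand literals and (offset, length) references.

    Returns (expanded_bytes, compressed_size_in_bytes).
    """
    compressed = bytearray()   # the compressed stream as it is being read
    expanded = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            expanded += compressed[offset:offset + length]
            compressed += bytes(REF_COST)   # placeholder for the reference itself
        else:
            compressed += tok
            expanded += tok
    return bytes(expanded), len(compressed)


print(expand([b"ABC", (0, 3), (0, 3), (0, 3)]))   # ABC<0,3<0,3<0,3
print(expand([b"ABCABC", (0, 6)]))                # ABCABC<0,6
```

Under those assumptions both expand to the same 12 bytes, but the first takes 12 bytes of compressed data (no gain at all) and the second only 9.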
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Sat Jun 25, 2016 7:42 am
The thing that makes this harder is that you want to have the back references refer to the compressed data, whereas most LZ compression refers to the uncompressed data. This means you modify the reference data on every iteration so it's more intensive as a result. It also means all the stuff on the internet about LZ compression isn't much help.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Sat Jun 25, 2016 12:04 pm
Maxim wrote
It also means all the stuff on the internet about LZ compression isn't much help.


So true, unfortunately :|
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Jun 28, 2016 9:58 am
I just want to share this: compare the size of each psgc file with the size of the vgz it comes from:

size (bytes)   name
11.854 WI50A.psgc
 9.719 WI50A.vgz

11.802 WI50B.psgc
 9.452 WI50B.vgz

11.900 WI50C.psgc
10.970 WI50C.vgz


... now if only I could make the compressor faster :|

(tunes from TomyS, written with VGM Music Maker, exported to VGMs, then passed thru Maxim's VGMTool optimizer, which also compresses them into VGZs)
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Mon Oct 16, 2017 1:03 pm
Last edited by sverx on Tue Oct 17, 2017 7:59 am; edited 1 time in total
I just updated the psgcomp tool (the PSG compressor): now it's WAY faster (from a few minutes to just some seconds to turn a PSG file into its compressed form).

In my tests, it also often achieves better compression. I'm still not sure why, but the decompression tests I've run show that the file hasn't been damaged.

As I could only test that with a limited set of files, I suggest you back up the older compressor before switching to the new one (and if you want to try re-compressing your PSGs please keep the older compressed files, as they might occasionally be smaller)
  • Joined: 01 Feb 2014
  • Posts: 517
Post Posted: Tue Oct 17, 2017 7:12 am
sverx wrote
In my tests, it also often achieves better compression.

How much better? Just a few bytes or something more substantial?
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Oct 17, 2017 7:58 am
Kagesan wrote
How much better? Just a few bytes or something more substantial?


Hardly anything interesting, at least if you're not really desperate for the last few bytes. In my tests the best improvement I've got is around 1.5% smaller, one hundred and some bytes shaved off an 11 KB file.

But again, the point was to make the compressor faster. I was really sick and tired of how much time it took with the previous version. Now, even if it isn't blazingly fast, I can call it fine.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Tue Oct 17, 2017 4:17 pm
I've dabbled in writing compressors a few times, although yours is unusual in that it refers into the compressed stream so it's a bit harder to find examples. What did you do to change the strategy? I tried a few techniques like longest match first, but it was always slow and memory hungry. A simple greedy forward compressor seems almost as good and very fast. I have seen people mention optimal LZ compression but I can't figure out how that's even possible except by brute force.
  • Joined: 28 Jan 2017
  • Posts: 377
  • Location: Málaga, Spain
Post Posted: Tue Oct 17, 2017 5:13 pm
I have to test this NOW, as I've spent 93 KB on PSGs. If I could get more space it would be great.
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Tue Oct 17, 2017 7:21 pm
@Maxim: I switched from testing every possible length, from longest to shortest, using memcmp(), to comparing single bytes one after another for as long as they match or until the maximum length is reached. It's faster, but I don't get why it compresses better (and not even always!)
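
Side by side, the two strategies for a single pair of positions look roughly like this in Python (a sketch, not the psgcomp source; the function names are invented, slices stand in for memcmp(), and the earlier run at `a` is assumed not to overlap the later one at `b`):

```python
def match_len_by_slices(data, a, b, max_len):
    """Old approach: try every length from longest to shortest."""
    limit = min(max_len, len(data) - b, b - a)
    for n in range(limit, 0, -1):
        if data[a:a + n] == data[b:b + n]:   # one memcmp-like call per length
            return n
    return 0


def match_len_bytewise(data, a, b, max_len):
    """New approach: extend the match one byte at a time until it breaks."""
    limit = min(max_len, len(data) - b, b - a)
    n = 0
    while n < limit and data[a + n] == data[b + n]:
        n += 1
    return n


data = b"ABCABDABCABC"
print(match_len_by_slices(data, 0, 6, 32), match_len_bytewise(data, 0, 6, 32))
```

For one pair of positions both return the same length, the byte-wise version just gets there with a single pass, so this sketch doesn't explain any difference in the compressed size.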

if you want to try to create a compressor for this format, you're welcome!

eruiz00 wrote
I have to test this NOW, as I've spent 93 KB on PSGs. If I could get more space it would be great.


Not sure it's worth it... you're probably going to save 1 KB ;) (please back up your files before starting)
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 13239
  • Location: London
Post Posted: Tue Oct 17, 2017 7:32 pm
I keep meaning to make a version which has the channels in separate streams - more overhead to play but I think the self similarity of each channel will be much better. Some Huffman coding will probably work very well but probably not very fast on Z80.
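
Just to show what "separate streams" would mean at the byte level, a very rough Python sketch (purely hypothetical, nothing like this exists in PSGlib; `split_by_channel` is an invented name). It routes each raw SN76489 write to the channel selected by the last latch byte and ignores timing completely, which a real format would of course have to store per stream:

```python
def split_by_channel(writes):
    """Split raw PSG write bytes into one list per channel (0-3)."""
    streams = {0: [], 1: [], 2: [], 3: []}
    latched_ch = None
    for b in writes:
        if b & 0x80:                     # latch byte: 1 cc t dddd selects channel cc
            latched_ch = (b >> 5) & 3
        elif latched_ch is None:
            raise ValueError("data byte before any latch byte")
        streams[latched_ch].append(b)    # data bytes follow the latched channel
    return streams


print(split_by_channel([0x87, 0x14, 0x94, 0xF5, 0xA5, 0x10]))
```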
  • Joined: 05 Sep 2013
  • Posts: 2687
Post Posted: Wed Oct 18, 2017 8:15 am
I also think it would compress much better, but you need to rewrite PSGlib from scratch - which is just the same as creating a PSGlib alternative.

BTW I still prefer something that wastes more ROM than something that wastes more CPU, as the SMS CPU is just as fast as it was back then, whereas ROM is much more affordable these days (I mean we really don't need to fit in 32/64/128 KB, as was the case back then)