
Ok. So I grabbed some data from Bram Stoker's Dracula.

I only found one voice sample, and it seems to be called from one place. I didn't find any similar data elsewhere in the ROM, so I think this game only has one voice sample, and it is called in one way.

In the SMS version, it is loaded and played at 6041. You can even jump to it in the emu and get the playback. That's how I logged it.

Data is (Physically) 36A9A to 37A89 (length FF0).

There seems to be some concordance between the vgm log and the sound data.

The issue is how to import?

I imported it raw at 3000 Hz and it seems the right length and speed, but the pitch is slightly off. The only indication I have of the tempo is that it loads $20 into b at 6076 and $25 at 6094. b counts down from $20, then hl (which starts at AA9A (logical)) is increased. Then b counts down from $25, reaches 0, and DE (which starts at FF0, the data length) is decreased.

I have no idea how to do the math. Do we add the two (is it 45 or 47 (includes 00?)) and then multiply by some value to get the sample rate? Decreasing either by $10 results in a higher-pitched sample, while increasing it results in a lower one (makes sense). Smaller changes are harder to detect by ear on a casual listen...

In Audacity it sounds muddy, but the right sample rate seems to be ~3200-4000 Hz.

When I compare wave outputs. The vgm has the waveform only in the top section (not sure what that means) while the raw import has it on both sides.

Is there such a thing as a 3 bit or 4 bit header for wave files? Can I attach it to the raw data? How do I prepare it?

Also, does psg wave data (not sure what to call it) generally consist of a lot of 88 8x...7x,9x commands? Not sure how to interpret it...but it kinda seemed to follow the vgm a bit for the parts I checked.

update: here's the entire code for the function that does the wave playback, since it isn't long.

update 2: Also added Voice SGC. Update function doesn't interfere thankfully.

_LABEL_6041_:
ld hl, $AA9A
ld de, $0FF0
ld a, $0D
ld (_RAM_FFFF_), a
di
ld a, $80
out ($7F), a
ld a, $00
out ($7F), a
ld a, $A0
out ($7F), a
xor a
out ($7F), a
ld a, $C0
out ($7F), a
xor a
out ($7F), a
ld a, $FF
out ($7F), a
out ($7F), a
ld a, $9F
out ($7F), a
ld a, $BF
out ($7F), a
ld a, $DF
out ($7F), a
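--:
ld a, (hl)      ; fetch the next packed byte
srl a
srl a
srl a
srl a           ; high nibble becomes the first 4-bit sample
ld b, $20
-:
djnz -          ; delay loop, this sets the playback rate
or $90
out ($7F), a    ; channel 0 volume
or $A0
out ($7F), a    ; channel 1 volume
res 5, a
or $C0
out ($7F), a    ; channel 2 volume
ld a, (hl)
inc hl
and $0F         ; low nibble becomes the second 4-bit sample
ld b, $25
-:
djnz -          ; delay loop for the second sample
or $90
out ($7F), a    ; channel 0 volume
or $A0
out ($7F), a    ; channel 1 volume
res 5, a
or $C0
out ($7F), a    ; channel 2 volume
dec de
ld a, d
or e
jr nz, --       ; back to the outer loop until all $0FF0 bytes are played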
ld a, $9F
out ($7F), a
ei
ret

Dracula PSG 16-Bit Dac.mp3 (31.9 KB)
linear

Bram Stoker's Dracula (E) [!]-01.vgm (86.68 KB)

Dracula PSG wave.zip (2.15 KB)

data (.wav / .psg not allowed)

Dracula-Voice-SGC.zip (5.76 KB)

normal and infinite loop version

The game is outputting the data at a rate proportional to the time it takes to run the code. That means you need to measure the time from one sample to the next. Meka has a "clock" command to help with that. You may end up with a weird sample rate. Check if the time between samples is consistent, use the average if it isn't.

The data is 4-bit, you may need to unpack it (two samples per byte, high nibble first) if Audacity doesn't support it. Also, you may want to try converting it to the logarithmic volume levels of the PSG - it may sound better or worse that way.

It would be fairly easy to convert an entire ROM this way, then just look for non-noisy data visually.

Maxim wrote

The game is outputting the data at a rate proportional to the time it takes to run the code. That means you need to measure the time from one sample to the next. Meka has a "clock" command to help with that. You may end up with a weird sample rate. Check if the time between samples is consistent, use the average if it isn't.

The data is 4-bit, you may need to unpack it (two samples per byte, high nibble first) if Audacity doesn't support it. Also, you may want to try converting it to the logarithmic volume levels of the PSG - it may sound better or worse that way.

It would be fairly easy to convert an entire ROM this way, then just look for non-noisy data visually.

Ok neat. I tried it with the clock command. I don't understand how it works. I set a breakpoint at the playback function, reset the clock and stepped out. I got 4529209 cycles.

That's very different from what I got with another tool z80cycles which counts through cycles (from the SMPS research pack).

It counted 495 + (32-1)* 13 + (37-1)*13 =1366 cycles

I then divided the clock by that value and obtained 2620.45754.
It seems to play 2 samples per byte/run through, but neither the obtained value nor double that are close to what sounds appropriate ~3200. Actually, I have no way of knowing if audacity is obscuring things as I have never worked with this type of data before in such a way.

I have no idea how to do any of the things you mention. Audacity doesn't seem to support 4 bit data so I imported it raw. What do you mean by "unpack" it? Or converting to logarithmic volume levels? All I did was find the data. I have no idea how it actually works

I appreciate the reply, but you just skipped a few levels past me :) I can't tell if you're recommending running it through some program or script or something much simpler, which is nevertheless currently going over my head.

Please, again, in simpler language? I learn best from concrete examples. I am not a good theoretician.

Thanks.

I'd like to write a small tool for this. If you have data like

55 67 84 32

Then this is actually values 5, 5, 6, 7, 8, 4, 3, 2. Each is in the range 0..15. You could map it to normal 16-bit audio either linearly (equally spaced) or logarithmically (to match how the PSG works, each step down is -2dB). Then the sample rate is the last unknown, and hard to guess...
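To make the unpacking concrete, here is a minimal Python sketch. The function names are made up for illustration and this is not the code from any of the tools in this thread; it assumes high nibble first and spreads the 16 levels evenly over signed 16-bit.

```python
def unpack_nibbles(data, high_first=True):
    """Split each byte into two 4-bit samples (values 0..15)."""
    samples = []
    for byte in data:
        hi, lo = byte >> 4, byte & 0x0F
        samples.extend((hi, lo) if high_first else (lo, hi))
    return samples


def to_linear_16bit(samples):
    """Spread 0..15 evenly over the signed 16-bit range -32768..32767."""
    return [s * 65535 // 15 - 32768 for s in samples]


packed = bytes.fromhex("55 67 84 32")
unpacked = unpack_nibbles(packed)
print(unpacked)
print(to_linear_16bit(unpacked))
```

Passing high_first=False swaps the nibble order, which is worth trying if a sample sounds noisy the first way round.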

Set a breakpoint on the first out ($7f) of each block. Hit one, then reset the clock and run to the second. Display the count and repeat (back to the start). There's your two times between samples - ideally the same, or very close. The average is the cycle count per sample; divide the CPU clock (decide NTSC or PAL as you think best) by it to get the sampling rate in Hz.

I might decide to get on board since I like to write converters. Just a question which might be obvious, are we talking signed or unsigned here? 8-bit PCM is usually unsigned (0..255), while 16-bit PCM is usually signed (-32768..32767), so I'm not sure about 4-bit, but what you're saying is making me guess it's unsigned at the very least, if not logarithmic.
edit: yup, definitely unsigned.

sherpa: just so you know, most WAV files are nothing more than raw PCM data with a header, though the WAV container can also be used with compressed ADPCM and other things. It's not really relevant right now but I thought to mention it as a bonus. Theoretically nothing prevents you from writing "4" in the header chunk used for the bits, though I'm not sure how compatible it would be in practice as most audio editors don't expect any values other than 8 and 16 (and maybe 32).

I'm almost done writing something... update soon.

Edit: done. See attached, it does linear and handles both nibble orders. Examples are at a guessed 8kHz, I didn't measure the game timing.

Edit 2: see https://github.com/maxim-zhao/SampleToWav/releases/tag/v0.1

dracula-linear.mp3 (10.13 KB)
Linear example

dracula-log.mp3 (10.13 KB)
Logarithmic example

Well, I tried to warn you :) Now do PWM... (I have no idea how to do that.)

Edit: my code is here:

https://github.com/maxim-zhao/SampleToWav/

Fork away... I used unsigned 16-bit data arbitrarily, signed might be more compatible and would be easy enough to adjust to. Maybe better to let you pick the zero point if that's the case - the example above has a bit of DC offset as a result. Convert a whole ROM to see if you can find hidden samples!

Man I must have just missed you guys. I ended up testing on the entire rom. It only has that one sample. Who knew it would be so easy. I will need to look at your results a bit.

Can I change the playback rate just by editing the header? That would be convenient. I love how I can just open your file. I'm still not sure what the right sample rate is: from the first out ($7F), a after the loop starts to the next iteration is 1208 cycles. Earlier I had it wrong as I was calculating for an NTSC clock, but the ROM is 50 Hz...

I'll have to study your work so I can add headers to these files more easily. I ended up using linear volume 00-240.

How would I use decibel?

From the development wiki:

int volume_table[16]={
32767, 26028, 20675, 16422, 13045, 10362, 8231, 6568,
5193, 4125, 3277, 2603, 2067, 1642, 1304, 0
};

Are those valid decibel values? All of them are much larger than 8 bit values for volume (00-FF). How would we translate this in practice?
Since you guys seem to know what's going on easily, would you mind offering a quick lesson?
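As a rough illustration of where a table like that can come from: the entries look like amplitudes rather than decibel values, each one 2 dB quieter than the one before, with the last forced to silence. The Python sketch below rebuilds such a table from that rule. The function names are hypothetical, and indexing by the raw nibble is an assumption based on the playback routine writing the nibble straight into the PSG volume register.

```python
def psg_volume_table(full_scale=32767, step_db=2.0):
    """Amplitude for each PSG volume register value 0..15 (0 = loudest)."""
    table = [round(full_scale * 10 ** (-step_db * i / 20)) for i in range(15)]
    table.append(0)  # register value 15 is silence
    return table


def to_8bit(amplitude16):
    """Scale a 0..32767 amplitude down to 0..255 by dropping the low 7 bits."""
    return amplitude16 >> 7


table = psg_volume_table()
print(table)
print([to_8bit(v) for v in table])
```

The formula reproduces the quoted table except for one entry: it gives 6538 where the table has 6568. I don't know which of the two is intended. For 8-bit output, dividing by 128 as in to_8bit is one simple choice, since 32767 then lands on 255.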

I'll have to look at the source code to see if I can derive any understanding to my question.

Thanks guys. Pretty sure the sample rate isn't any of the above...Wish I had a clear example.

I'll have to look at these a bit closer though

Thanks a lot!

1208 cycles per two samples = 3579545/1208*2 = 5926 samples per second (using the NTSC clock). Using the PAL clock of 3546895Hz you get 5872Hz. Both are on the low side - but maybe they match the game. 8kHz sounds right to me...
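The same arithmetic as a small Python sketch, in case anyone wants to plug in their own measurements. The function name is made up, and the PAL constant is the value used in this post.

```python
NTSC_CLOCK_HZ = 3579545
PAL_CLOCK_HZ = 3546895  # value used in the post above


def sample_rate(clock_hz, cycles, samples_in_interval=1):
    """Samples per second when `samples_in_interval` samples take `cycles` CPU cycles."""
    return clock_hz * samples_in_interval / cycles


# 1208 cycles measured for one pass of the loop, which plays two samples
print(round(sample_rate(NTSC_CLOCK_HZ, 1208, 2)))
print(round(sample_rate(PAL_CLOCK_HZ, 1208, 2)))
```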

Audio in its raw form is most often expressed as linear PCM. That means for each sample, there's a number telling you where the speaker should be, to move it back and forth fast enough to make the right sound. The values are usually linear, so they are evenly spaced between a minimum and maximum (which might be 0 to 255, or -32768 to +32767, for example).

There's also some common sampling rates people use, for example 44100 (used on CDs) or 8000 (common for older computer audio). That makes it reasonably likely that SMS samples may be at 8kHz.

Next, the difficulty of sample playback on the SMS. It doesn't have any PCM support, but as a (possibly accidental) hack, rapidly changing the volume can produce waveforms in the shape of the volume levels. However these aren't linear - the steps get smaller as the volume gets quieter. The sample data may take this into account - but it often doesn't. Dracula seems not to. The log output is closer to what you hear in the VGM, SGC and real console, but the linear one is more like what they intended.

Finally, there's other ways of storing samples, so these tools won't always work. Some games store PWM data - which is harder to interpret into a WAV or MP3. Some store 8-bit data - or even compressed data - and convert it on the fly.

I redid the tests more carefully. When in the proper mode, the value is ~1110 cycles per byte. I got similar numbers to what I found above. Values aren't perfectly consistent, but they hover around there, averaging ~1110.1-1110.89. Each nibble section takes 575 cycles. I would have thought they would be a bit more uneven due to the different values of b, but it was consistent. I think I got similar values earlier, but I discarded them because I thought they did not make sense.

The rates that match up closest to the original are 3150-3200. I actually like how 3150 lines up better than 3200, but when I did the math (3546893/1110), I got rates in the range of 3195.199 to 3195.399.

I ended up using that value.

I prefer to have the data in its original form as an option. I'm not too clear on Tom's above comment about not making PCM data into wav. Thanks to Tom's work (and the internet), it was easy to make my own.

I would like to output linear and logarithmic conversions. The tools are nice, but I'd like to be able to understand how they work, for my own and others' edification. It's nice when these topics get discussed.

I had the eight bit sample done by the time I read your posts, but I was still not clear on the proper frequency. I think I was looking at the values you were using, and I thought they were way too high. But actually 8000 would likely have been an acceptable (though high pitched) rate for the 8 bit DAC. I got a bit fixated on how the 4 bit version was sounding with what I thought at the time were supposedly "correct" values.

I got the 4 bit version working after I read these posts, and just tested double the rate in the 8 bit and liked what I saw. Only downside is it's linear, which you mentioned might be appropriate here. It would be nice to be able to produce something similar on my own. If anything it helps me use and appreciate others' tools better.

I'm still not clear why vgm>wav files show up on only one side of the line as Tom alluded to before, but my conversions show up on both. It doesn't work right if i import it signed, so it's definitely unsigned...so why are values like 80 80 70 F0 drawn on both sides? Maybe it's some feature in audacity-I don't really understand how that maps to emulate changes in air pressure/sound anyway, but it would be nice to have consistency.

With the 8 bit dac, I could have used tom's conversion, but I ended up just working with what I did manually to get my bearings and practice working with the header which is still a bit confusing, especially as the guides I found use decimal notation with values starting at 1 to describe hex data. I'm not clear on the significance of byte 0x10...is it always 0x10 (16)? I'm assuming byte 0x14 will always be 1 = PCM for the files we work with here. (Do SMS ever do PWM? [I don't even know what that is]). Byte 0x20 makes no sense at all to me, I haven't found any resource that actually explains it.

I'm guessing it's the reason the sample values at 0x18 and 0x1C match. But didn't we say that each byte produces 2 values? So is it supposed to be 1 for 8 bit and 2 for 4 bit? (edit: i changed the value and noticed no changes)=no clue what this is.
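For anyone else puzzling over those header bytes, here is a hedged Python sketch of the standard 44-byte PCM WAV header with the offsets marked. The function name is my own, and it assumes 8-bit or 16-bit samples only, since the block align field counts whole bytes per sample frame and so has no clean value for 4-bit data.

```python
import struct


def wav_header(sample_rate, bits_per_sample, data_size, channels=1):
    """Build a canonical 44-byte header for uncompressed PCM (8 or 16 bit)."""
    block_align = channels * bits_per_sample // 8  # bytes per sample frame
    byte_rate = sample_rate * block_align          # bytes per second
    fmt = struct.pack(
        "<IHHIIHH",
        16,               # 0x10: size of the fmt chunk, 16 for plain PCM
        1,                # 0x14: format tag, 1 = uncompressed PCM
        channels,         # 0x16: channel count
        sample_rate,      # 0x18: samples per second
        byte_rate,        # 0x1C: bytes per second
        block_align,      # 0x20: bytes per sample frame
        bits_per_sample,  # 0x22: bits per sample
    )
    return (b"RIFF" + struct.pack("<I", 36 + data_size) + b"WAVE"
            + b"fmt " + fmt
            + b"data" + struct.pack("<I", data_size))


header = wav_header(sample_rate=6391, bits_per_sample=8, data_size=0x1FE0)
print(header.hex(" "))
```

With 8-bit mono, one sample is one byte, so the values at 0x18 and 0x1C come out equal and 0x20 is 1. For 16-bit mono, 0x20 would be 2 and 0x1C would be twice the sample rate. Many players seem to ignore 0x20, which may be why changing it made no audible difference.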

Lastly, if the average rate of a file, in this case the 4 bit dac is 3195.399 and the 8 bit conversion is made...do we just discard the excess or round it up from 6390 to 6391? No one will notice the difference at that rate, but the purist in me would like to know if there is a common practice.

Lastly, where on the site are these goodies (and those to come) going to be organized?

Dracula PSG 4-Bit Dac.zip (2.2 KB)

Dracula PSG 8-Bit Dac (linear).zip (2.34 KB)

should sample rate be 1 higher due to rounding?

(post blanked by author)

Oh ok. Makes sense. I thought you were mentioning that PCM in general should not be converted to wav for some reason not explained. It makes sense that adding arbitrary extensions would not be helpful overall.

Just realized Maxim's tool converts to 16 bit wave files. I'm going to have to look at the math, but it looks like it might use values similar to what is in the wiki table. I had trouble imagining how to convert them to 8 bit values other than dividing by 256 or 128, whichever was appropriate.

Dracula PSG 16-Bit Dac.mp3 (31.9 KB)

Dracula PSG 16-Bit Dac.zip (2.84 KB)

converted using maxim's tool

(post blanked by author)

Ok that makes a bit more sense.

I think I understand the logarithmic part a little better. It's not that there's some magical formula out there that converts the electrical current value to some volume value, it's that compared to max, say 65535 (FFFF), reducing that by a logarithmic value has a closer counterpart to the spectrum of values, so you reduce the max value FF or FFFF by a relative amount and derive the scale.

I actually had a request. Would you guys be willing to update your apps to handle batch processes? It would speed up my work a lot digging through games which have uncompressed audio. Just realized a lot of games are not included in the [Sampled Audio] tag that should have it.

Thanks for your help

(post blanked by author)

The PSG actually only outputs 0 and 1 values (before volume scaling), and relies on extra components to remove the DC bias (signal on one side of the line). Most emulators just output a balanced signal directly.

When in sample mode, however, the signal is stuck at 1. The balanced signal is now unbalanced. Volume changes affect the height of the line, but it's always above the line. A dynamic DC bias removal stage in code would result in what you expect.

The table mentioned above is for volume scaling in 16-bit signed mode, and it would make no sense to have negative values - it's a list of amplitudes. A full volume tone would go between +32767 and - 32767, but a full volume sample goes between 0 and +32767. That's why voices are so quiet - and why they appear above the line.
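As a small illustration of the "above the line" effect, here is a Python sketch of the crudest possible DC removal: subtracting the average from every sample. This is a static version, not the dynamic stage mentioned above and not what any emulator necessarily does, and the function name is made up.

```python
def remove_dc(samples):
    """Centre a waveform on zero by subtracting its mean value."""
    mean = sum(samples) / len(samples)
    return [round(s - mean) for s in samples]


one_sided = [0, 8231, 32767, 8231, 0, 8231, 32767, 8231]  # never goes below zero
centred = remove_dc(one_sided)
print(centred)
```

After this the waveform swings to both sides of the line, like the raw imports do, without changing how it sounds.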

For my tool, I was not trying to make a PSG emulator, I was trying to capture the full dynamic range of the sample, so I stretched it to the absolute maximum range.

I can adapt my tool to batch processing very easily - multi selecting files would be a simple addition. I'll do it when I get a chance (maybe Monday) or maybe someone will send me a PR before then. Tom, here's your chance to escape VB ;)

Also, I went for a GUI tool rather than a command line for ease of use, but the core conversion function could work either way.

Ok. Makes sense. I've been using the conversion app to make finding voice data easier. A lot of games seem to compress it, so I'll have to see how the emulator uses it.

Alex Kidd has several samples of "Miracle Ball", apparently in different banks. I haven't verified whether it is actually accessed in game. I've been using the logarithmic setting to convert, but that means when I try to reverse lookup the data, I'm looking at information I need to translate, so I'll stick to using the linear setting.

@Maxim Would it be easy to add 8 and 4 bit (native) settings to the output? Working with this a bit more, I see why you went with 16 bit.

Audacity doesn't even let you export 8 bit audio. The only way seems to require we create our own tools, which in this case seems a relatively easy conversion process once the main work has been done. Speeds will still be an issue.

I don't think you can make a 4 bit WAV that works reliably in most software. Some sound hardware can't reliably play anything below 44kHz 16-bit audio anyway. My goal was to make something convertible to MP3 without too much difficulty.

If you can get your editor to display sample numbers - I think Audacity can - then the file offset is the sample count divided by 2, regardless of the bit depth of the WAV.
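A quick Python sketch of that lookup, assuming packed 4-bit data with the high nibble first. The start offset is the Dracula one from the first post and the function names are only for illustration.

```python
DATA_START = 0x36A9A  # physical ROM offset of the Dracula sample (first post)


def rom_offset(sample_index, data_start=DATA_START):
    """ROM offset of the byte holding a given sample (two samples per byte)."""
    return data_start + sample_index // 2


def nibble_position(sample_index):
    """Which half of that byte the sample came from, high nibble first."""
    return "high" if sample_index % 2 == 0 else "low"


last_sample = 2 * 0xFF0 - 1
print(hex(rom_offset(last_sample)), nibble_position(last_sample))
```

The last sample maps back to 37A89, which matches the end of the data range given in the first post.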

Maxim wrote

I don't think you can make a 4 bit WAV that works reliably in most software. Some sound hardware can't reliably play anything below 44kHz 16-bit audio anyway. My goal was to make something convertible to MP3 without too much difficulty.

If you can get your editor to display sample numbers - I think Audacity can - then the file offset is the sample count divided by 2, regardless of the bit depth of the WAV.

I almost got happy, but unfortunately, the sample count is not a reliable indicator of physical space. Interestingly, if I set the sample rate to 4000, 8 bit PCM, Dracula lines up to the exact byte if I change the sample value to hex.

Alex kidd seems to be in 16 bit or at least is represented differently than wave data in dracula.

I'll have to look a bit closer later. Unfortunately, changing the bit or sample rate affects the offset, so preserving the bytes seems the way to go. Unfortunately audacity converts everything to 16bit...~logarithmic? values so it will probably not be the go to tool for this unless i can figure this out.

Audacity is fine for preparing files for playback - like the MP3s I made above. For preservation, just pull out the data and document the format.

Here's an update. You can select multiple files now (outputs to generated filenames like Tom's), and you can select "the whole file". Internally, the code is no longer all in one file, and there's a half-assed attempt at splitting the concerns up: interpreting the data in the ROM, and rendering the intermediate data to samples. You can now use it to extract 8-bit unsigned data - like in Beavis and Butt-Head - and it offers to render that 8-bit data down to logarithmic PSG output (by discarding the low 4 bits). It should be easier to handle more data formats.

I had a look at Alex Kidd: The Lost Stars. The data does seem to be 4-bit linear PCM - and the repeats in higher banks are presumably filling unused space. The "I'm the Miracle Ball" sample is so much clearer as linear PCM!

Edit: attached the real update now

Edit 2: see https://github.com/maxim-zhao/SampleToWav/releases/tag/v0.2

Hey, Maxim. Do you know any reason why the AHHH! sound in alex kidd works in your wave tool? I just realized I've had some issues finding some sounds in game when i import into audacity as an 8 bit sample or otherwise, but it works great when I use your tool.

Does your tool do something special to the samples to make them audible? Some secret function? Decompression? The other samples are audible by importing into audacity but the AHHH! is not.

I've also had the issue with other games not showing any voices when I imported them. I thought it was compression, but when I use your tool it works great. I just used it on the voice data and finally got it to work. Lost an hour trying to figure it out.

Could someone take a look and see if they can figure out why this is not working? My 4 bit dac of dracula worked fine once i put a header on it. Can't figure it out.

Also, what should I call the voice data without header? PCM? Foobar recognizes the extension, but doesn't play it, too bad.

Thanks for your help guys. I'll post a thread once I finish the other voices.

The other "miracle ball" samples are at 13700 and 17700. They run to the end of the bank. I have not found any code referencing it, though it might be used. Anyone remember other places where miracle ball is used besides the beginning?

I can rip it, though without the actual in game access I can't be 100% sure that i have the right points, only that the bytes before these don't match the first sample but they do after it. They are a subset of sample 2 not sure why they are in the rom or if they are used. I'll look a bit more.

Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh! (logarithmic).mp3 (22.65 KB)

Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh! (linear).mp3 (22.65 KB)

Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball-4-bit.mp3 (36.12 KB)
sounds bad...why

Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball (logarithmic).mp3 (36.12 KB)

Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball (linear).mp3 (36.12 KB)

Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh!.zip (5.26 KB)

Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball.zip (1.8 KB)

The repeats of the sample data are just filling unused space. Some of the dev hardware would tend to fill unused space with left over data from previous banks or previous builds, so you have to expect to find unexpected things sometimes.

Your bad sounds are most likely 4 bit data being interpreted as 8 bit, or some other format mismatch.

That could be true. I wonder if the way dracula processed the sound allowed it to be played back appropriately as 8 bit..while alex kidd's method doesn't.

I confirmed (as best I could) that the data at the end of banks 4 and 5 is just junk filler data. I was wondering why they would waste that much physical space when it was expensive back then... but it isn't much (length 75B). That's just a bit smaller than AHHH! though.

@Maxim. I've had problems with your tool and Tom's. I've been using yours for the logarithmic feature, though in the future, I think I might stick to the linear conversions until it starts to make sense. These two games both seem not to benefit much from it. Maybe the higher quality voices will. I'm not able to select multiple inputs as you described with your tool. I think it saves the configuration of the last working path somewhere. I don't see it creating a file, so I assume it must use the registry somehow. Could the new program be using the old form configuration as well? I didn't see a new executable on GitHub, so I assume I would need to compile it, which I don't think I can do (outside knowledge based reasons). I'm guessing it's Visual C or something similar. Could you check to see if you didn't actually upload the same one? The file sizes are the same though the dates are different. I assume added code would result in a new size & date.

Otherwise, instructions on clearing the old config would be appreciated. My version only lets me select one file at a time. It would be convenient to have batch processing once I start creating PCM packs with more than a few voices. Originally, I mainly wanted it to accelerate finding music, but results aren't always so lucky. And proper conversion takes time to verify. Especially as I'm still not too used to working with this format.

Also, any recommendations on a good wav > mp3 converter? maybe command line? I use audacity, but it takes time to open and export each wave. I don't want anything bulky, just something that would allow me to prepare the files to post in the pcm thread. I probably prefer wav myself. At these sizes, they don't take up much space anyway. A sample based vgm is actually ~20-40 times larger. I'm guessing the difference would be exponential.

I will be posting the Lost Stars data in its own thread shortly.

Thanks for the help guys. This would have taken much longer without you.

uhm... why not use SoX?

For the simple reason that I did not know about it :)

Thanks for the tip. There seems to be a bit of a learning curve. I can't convert from wav to mp3 just like that. It seems to require me setting some parameters. But it looks like what I was looking for.

Thanks!

Sorry, I didn't mean to suggest SoX for mp3 conversion, even if it can do that. I meant to suggest it for raw 4 bit data to wav file conversion...

Ok, I might look into that. I had problems trying to do just that in lame, as i don't really understand either the software or the sound standards. I'm not really sure how wav>mp3 compare to one another except one is compressed, or how raw relates to possible wav/mp3 standards.

I'll look a bit more into it.

Thanks

I did check SoX - I used it years ago so it seemed plausible, but the docs don't suggest it works and I found some posts online about it saying it doesn't work. The log output isn't going to be easy with SoX either.

I updated my incorrect attachment above.

One thing to note is that some games seem to use very clipped/compressed samples, perhaps to make them sound louder, which makes them very hard to see on a PCM waveform plot. Switch to a spectrograph and it's easy to see voices show up as bands of overtones amongst a load of noise and excessively symmetric patterns - see the attachments.

Space Harrier spectrograph.PNG (407.5 KB)

Space Harrier waveform.PNG (38.89 KB)

I had a look at Populous, its sample playback is kind of weird. The data is actually a complete embedded .VOC file, including the header, at location 2c000, plus some raw data later on. The code reads out a sample and then uses some tables in ROM from $3800 to emit a sample on each of the three tone channels - the table is 256 bytes per channel, not interleaved, giving raw PSG command bytes.

The "Welcome to Populous" voice seems to be a mistake - the data is linear PCM with a VOC header, as mentioned above, and I have a suspicion that it plays the header as data, hence the clicky sound. The "ha ha" voice seems to be logarithmically pre-adjusted to sound right through the PSG, and doesn't have a header. Quite weird... and without working through the volume tables, very hard to determine quite what effect that has on the samples.

Awesome! I love the new settings! I still can't select more than one file at a time though. Curious. I assume you've been using it, so it works for you.

Using the spectrogram is a good idea. Thanks for the tip. The sailormoon still only shows one sample, even with the new tool. This doesn't make logical sense to me, but oh well.

Interesting info on these other games. I'll have to look into it.

To open multiple files, just select more than one file in the Open dialog. It'll then say "5 files selected" instead of the filename. When you press save, it converts all of them in one go.

I finished investigating Populous, it's a complicated case and seems to have multiple mistakes in the sample handling: http://www.smspower.org/Development/Populous-SMS

Neat. That page has a lot of useful info. I didn't peek at all at this game, but based on all those details, I imagine it might have been a little over my head :) I appreciate your time to do this. Might be nice to add the raw versions of the laugh and an "unofficial" version of the laugh joining parts 1 and 2. For the first sample, it's neat that it loads into video lan player and plays directly without need for the wav conversion.

[Thread split here to Game samples extraction mega-thread]

Author	Message
sherpa Joined: 28 Nov 2014 Posts: 365	Voice data sample conversion (How to?) Posted: Fri Dec 18, 2015 1:41 pm Last edited by sherpa on Sat Dec 19, 2015 11:51 pm; edited 1 time in total

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Fri Dec 18, 2015 8:47 pm
	The game is outputting the data at a rate proportional to the time it takes to run the code. That means you need to measure the time from one sample to the next. Meka has a "clock" command to help with that. You may end up with a weird sample rate. Check if the time between samples is consistent, use the average if it isn't. The data is 4-bit, you may need to unpack it (two samples per byte, high nibble first) if Audacity doesn't support it. Also, you may want to try converting it to the logarithmic volume levels of the PSG - it may sound better or worse that way. It would be fairly easy to convert an entire ROM this way, then just look for non-noisy data visually.

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Fri Dec 18, 2015 10:02 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Maxim wrote: The game is outputting the data at a rate proportional to the time it takes to run the code. That means you need to measure the time from one sample to the next. Meka has a "clock" command to help with that. You may end up with a weird sample rate. Check if the time between samples is consistent, use the average if it isn't. The data is 4-bit, you may need to unpack it (two samples per byte, high nibble first) if Audacity doesn't support it. Also, you may want to try converting it to the logarithmic volume levels of the PSG - it may sound better or worse that way. It would be fairly easy to convert an entire ROM this way, then just look for non-noisy data visually.

Ok neat. I tried it with the clock command. I don't understand how it works. I set a breakpoint at the playback function, reset the clock and stepped out. I got 4529209 cycles. That's very different from what I got with another tool, z80cycles (from the SMPS research pack), which counts through cycles. It counted 495 + (32-1)*13 + (37-1)*13 = 1366 cycles. I then divided the clock by that value and obtained 2620.45754. It seems to play 2 samples per byte/run through, but neither the obtained value nor double that is close to what sounds appropriate, ~3200. Actually, I have no way of knowing if Audacity is obscuring things, as I have never worked with this type of data in such a way before.

I have no idea how to do any of the things you mention. Audacity doesn't seem to support 4 bit data so I imported it raw. What do you mean by "unpack" it? Or converting to logarithmic volume levels? All I did was find the data. I have no idea how it actually works.

I appreciate the reply, but you just skipped a few levels past me :) I can't tell if you're recommending running it through some program or script, or something much simpler which is nevertheless currently going over my head. Please, again, in simpler language? I learn best from concrete examples. I am not a good theoretician. Thanks.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Fri Dec 18, 2015 10:26 pm Last edited by Maxim on Wed Feb 03, 2016 9:39 am; edited 1 time in total
	I'd like to write a small tool for this. If you have data like 55 67 84 32 then this is actually values 5, 5, 6, 7, 8, 4, 3, 2. Each is in the range 0..15. You could map it to normal 16-bit audio either linearly (equally spaced) or logarithmically (to match how the PSG works, each step down is -2dB). Then the sample rate is the last unknown, and hard to guess...

Set a breakpoint on the first out ($7f) of each block. Hit one, then reset the clock and run to the second. Display the count and repeat (back to the start). There's your two times between samples - ideally the same, or very close. The average is the cycle count per sample; divide the CPU clock (decide NTSC or PAL as you think best) by it to get the sampling rate in Hz.
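To make the unpacking and the two mappings concrete, here is a minimal Python sketch. It is not the code of the tool mentioned in this thread; the function names are made up for illustration, the output range (unsigned 16-bit, 0..65535) is one arbitrary choice, and the logarithmic mapping assumes that value 15 is the loudest level and value 0 is silence, which may not match how a given game uses its data.

```python
def unpack_nibbles(data, high_first=True):
    """Split each byte into two 4-bit samples (0..15)."""
    samples = []
    for byte in data:
        hi, lo = byte >> 4, byte & 0x0F
        samples.extend((hi, lo) if high_first else (lo, hi))
    return samples


def to_linear16(value):
    """Map 0..15 to 0..65535 with equally spaced steps."""
    return value * 65535 // 15


def to_log16(value):
    """Map 0..15 to 0..65535 with 2 dB per step (assumed: 15 = loudest, 0 = silent)."""
    if value == 0:
        return 0
    attenuation_db = 2 * (15 - value)
    return round(65535 * 10 ** (-attenuation_db / 20))


raw = bytes([0x55, 0x67, 0x84, 0x32])
samples = unpack_nibbles(raw)
print(samples)
print([to_linear16(s) for s in samples])
print([to_log16(s) for s in samples])
```

The first print shows the eight 4-bit values from the example above; the other two show how far apart the linear and logarithmic interpretations are for the same data.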

Tom Joined: 16 May 2002 Posts: 1356 Location: italy	Posted: Fri Dec 18, 2015 10:51 pm
Tom Joined: 16 May 2002 Posts: 1356 Location: italy	I might decide to get on board since I like to write converters. Just a question which might be obvious, are we talking signed or unsigned here? 8-bit PCM is usually unsigned (0..255), while 16-bit PCM is usually signed (-32768..32767), so I'm not sure about 4-bit, but what you're saying is making me guess it's unsigned at the very least, if not logarithmic. edit: yup, definitely unsigned. sherpa: just so you know, most WAV files are nothing more than raw PCM data with a header, though the WAV container can also be used with compressed ADPCM and other things. It's not really relevant right now but I thought to mention it as a bonus. Theoretically nothing prevents you from writing "4" in the header chunk used for the bits, though I'm not sure how compatible it would be in practice as most audio editors don't expect any values other than 8 and 16 (and maybe 32).

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Fri Dec 18, 2015 11:17 pm Last edited by Maxim on Wed Dec 23, 2015 11:15 pm; edited 2 times in total
	I'm almost done writing something... update soon. Edit: done. See attached, it does linear and handles both nibble orders. Examples are at a guessed 8kHz, I didn't measure the game timing. Edit 2: see https://github.com/maxim-zhao/SampleToWav/releases/tag/v0.1 dracula-linear.mp3 (10.13 KB) Linear example dracula-log.mp3 (10.13 KB) Logarithmic example

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Fri Dec 18, 2015 11:34 pm
	Well, I tried to warn you :) Now do PWM... (I have no idea how to do that.) Edit: my code is here: https://github.com/maxim-zhao/SampleToWav/ Fork away... I used unsigned 16-bit data arbitrarily, signed might be more compatible and would be easy enough to adjust to. Maybe better to let you pick the zero point if that's the case - the example above has a bit of DC offset as a result. Convert a whole ROM to see if you can find hidden samples!

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 12:43 am
sherpa Joined: 28 Nov 2014 Posts: 365	Man I must have just missed you guys. I ended up testing on the entire rom. It only has that one sample. Who knew it would be so easy. I will need to look at your results a bit. Can I change the playback rate just by editing the header? That would be convenient. I love how I can just open your file.

I'm still not sure what the right sample rate is: from the first out ($7F), a after the loop starts to the next iteration is 1208 cycles. Earlier I had it wrong as I was calculating for an NTSC clock, but the rom is 50 Hz... I'll have to study your work so I can add headers to these files more easily.

I ended up using linear volume 00-240. How would I use decibels? From the development wiki:

int volume_table[16]={ 32767, 26028, 20675, 16422, 13045, 10362, 8231, 6568, 5193, 4125, 3277, 2603, 2067, 1642, 1304, 0 };

Are those valid decibel values? All of those are much larger than 8 bit values for volume (00-FF). How would we translate this in practice? Since you guys seem to know what's going on easily, would you mind offering a quick lesson? I'll have to look at the source code to see if I can derive any understanding of my question.

Thanks guys. Pretty sure the sample rate isn't any of the above... Wish I had a clear example. I'll have to look at these a bit closer though. Thanks a lot!

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sat Dec 19, 2015 8:32 am
	1208 cycles per two samples = 3579545/1208*2 = 5926 samples per second (using the NTSC clock). Using the PAL clock of 3546895Hz you get 5872Hz. Both are on the low side - but maybe they match the game. 8kHz sounds right to me...

Audio in its raw form is most often expressed as linear PCM. That means for each sample, there's a number telling you where the speaker should be, to move it back and forth fast enough to make the right sound. The values are usually linear, so they are evenly spaced between a minimum and maximum (which might be 0 to 255, or -32768 to +32767, for example). There are also some common sampling rates people use, for example 44100 (used on CDs) or 8000 (common for older computer audio). That makes it reasonably likely that SMS samples may be at 8kHz.

Next, the difficulty of sample playback on the SMS. It doesn't have any PCM support, but as a (possibly accidental) hack, rapidly changing the volume can produce waveforms in the shape of the volume levels. However these aren't linear - the steps get smaller as the volume gets quieter. The sample data may take this into account - but it often doesn't. Dracula seems not to. The log output is closer to what you hear in the VGM, SGC and real console, but the linear one is more like what they intended.

Finally, there are other ways of storing samples, so these tools won't always work. Some games store PWM data - which is harder to interpret into a WAV or MP3. Some store 8-bit data - or even compressed data - and convert it on the fly.
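The arithmetic at the top of this post is easy to repeat for other measurements. The sketch below only restates that division in Python; the function name and constant names are made up here, and the two clock values are the ones quoted in the post.

```python
NTSC_CLOCK_HZ = 3579545  # CPU clock quoted above for NTSC
PAL_CLOCK_HZ = 3546895   # CPU clock quoted above for PAL


def sample_rate_hz(clock_hz, cycles, samples=1):
    """Samples per second, given how many CPU cycles it takes to output `samples` samples."""
    return clock_hz * samples / cycles


# 1208 cycles were measured for one pass of the loop, which outputs two samples.
print(int(sample_rate_hz(NTSC_CLOCK_HZ, 1208, samples=2)))
print(int(sample_rate_hz(PAL_CLOCK_HZ, 1208, samples=2)))
```

If the two halves of the loop take different times, passing the total for one whole byte with samples=2 gives the average rate, which is what matters for the overall pitch and length.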

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 9:47 am
sherpa Joined: 28 Nov 2014 Posts: 365	I redid the tests more carefully. When in the proper mode, the value is ~1110 cycles per byte. I got similar numbers to what I found above. Values aren't perfectly consistent, but they hover around there at an average of ~1110.89-1110.1. Each nibble section takes 575 cycles. I would have thought they would be a bit more uneven due to the different values of b, but it was consistent. I think I got similar values earlier, but I discarded them because I thought they did not make sense.

The rates that match up closest to the original are 3150-3200. I actually like how 3150 lines up better than 3200, but when I did the math, 3546893/1110, I got rates in the range of 3195.199 to 3195.399. I ended up using that value.

I prefer to have the data in its original form as an option. I'm not too clear on Tom's above comment about not making PCM data into wav. Thanks to Tom's work (and the internet), it was easy to make my own. I would like to output linear and logarithmic conversions. The tools are nice, but I'd like to be able to understand how they work, for my own and others' edification. It's nice when these topics get discussed.

I had the eight bit sample done by the time I read your posts, but I was still not clear on the proper frequency. I think I was looking at the values you were using, and I thought they were way too high. But actually 8000 would likely have been an acceptable (though high pitched) rate for the 8 bit DAC. I got a bit fixated on how the 4 bit version was sounding with what I thought at the time were supposedly "correct" values. I got the 4 bit version working after I read these posts, and just tested double the rate in the 8 bit and liked what I saw. Only downside is it's linear, which you mentioned might be appropriate here. It would be nice to be able to produce something similar on my own. If anything it helps me use and appreciate others' tools better.

I'm still not clear why vgm>wav files show up on only one side of the line, as Tom alluded to before, but my conversions show up on both. It doesn't work right if I import it signed, so it's definitely unsigned... so why are values like 80 80 70 F0 drawn on both sides? Maybe it's some feature in Audacity - I don't really understand how that maps to emulate changes in air pressure/sound anyway, but it would be nice to have consistency.

With the 8 bit dac, I could have used Tom's conversion, but I ended up just working with what I did manually to get my bearings and practice working with the header, which is still a bit confusing, especially as the guides I found use decimal notation with values starting at 1 to describe hex data. I'm not clear on the significance of byte 0x10... is it always 0x10 (16)? I'm assuming byte 0x14 will always be 1 = PCM for the files we work with here. (Do SMS games ever do PWM? [I don't even know what that is]). Byte 0x20 makes no sense at all to me; I haven't found any resource that actually explains it. I'm guessing it's the reason the sample values at 0x18 and 0x1C match. But didn't we say that each byte produces 2 values? So is it supposed to be 1 for 8 bit and 2 for 4 bit? (edit: I changed the value and noticed no changes) = no clue what this is.

Also, if the average rate of a file, in this case the 4 bit dac, is 3195.399 and the 8 bit conversion is made... do we just discard the excess or round it up from 6390 to 6391? No one will notice the difference at that rate, but the purist in me would like to know if there is a common practice.

Lastly, where on the site are these goodies (and those to come) going to be organized?

Dracula PSG 4-Bit Dac.zip (2.2 KB)
Dracula PSG 8-Bit Dac (linear).zip (2.34 KB) should sample rate be 1 higher due to rounding?
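For reference on the header bytes asked about above, here is a small Python sketch that builds the standard 44-byte header of an uncompressed PCM WAV file. The offsets in the comments are the usual positions in that canonical layout: 0x10 holds the size of the format chunk (16 for plain PCM), 0x14 the format tag (1 = PCM), 0x18 the sample rate, 0x1C the byte rate, 0x20 the block align (bytes per sample frame) and 0x22 the bits per sample. The function name and the example numbers are illustrative only. The sample rate field is a 32-bit integer, so a measured rate such as 6390.8 has to be rounded one way or the other.

```python
import struct


def wav_header(sample_rate, num_samples, bits=8, channels=1):
    """Build a canonical 44-byte PCM WAV header (little-endian fields)."""
    block_align = channels * bits // 8       # 0x20: bytes per sample frame
    byte_rate = sample_rate * block_align    # 0x1C: bytes per second
    data_size = num_samples * block_align
    return (
        b"RIFF" + struct.pack("<I", 36 + data_size) + b"WAVE"
        + b"fmt " + struct.pack(
            "<IHHIIHH",
            16,            # 0x10: size of the fmt chunk for plain PCM
            1,             # 0x14: format tag, 1 = uncompressed PCM
            channels,      # 0x16
            sample_rate,   # 0x18
            byte_rate,     # 0x1C
            block_align,   # 0x20
            bits,          # 0x22
        )
        + b"data" + struct.pack("<I", data_size)
    )


# Example: $FF0 bytes of 4-bit data unpacked to 8160 8-bit samples.
header = wav_header(sample_rate=round(2 * 3195.399), num_samples=0xFF0 * 2)
print(len(header), header[:4], header[8:12])
```

For 8-bit mono the block align is 1, so the byte rate equals the sample rate, which is why the values at 0x18 and 0x1C match in such a file. For 16-bit mono the block align is 2 and the byte rate is twice the sample rate.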

Tom Joined: 16 May 2002 Posts: 1356 Location: italy	Posted: Sat Dec 19, 2015 10:10 am Last edited by Tom on Sun Dec 20, 2015 11:34 pm; edited 1 time in total
Tom Joined: 16 May 2002 Posts: 1356 Location: italy	(post blanked by author)

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 10:16 am Last edited by sherpa on Sat Dec 19, 2015 11:53 pm; edited 1 time in total
sherpa Joined: 28 Nov 2014 Posts: 365	Oh ok. Makes sense. I thought you were mentioning that PCM in general should not be converted to wav for some reason not explained. It makes sense that adding arbitrary extensions would not be helpful overall. Just realized Maxim's tool converts to 16 bit wave files. I'm going to have to look at the math, but it looks like it might use values similar to what is in the wiki table. I had trouble imagining how to convert them to 8 bit values other than dividing by 256 or 128, whichever was appropriate.

Dracula PSG 16-Bit Dac.mp3 (31.9 KB)
Dracula PSG 16-Bit Dac.zip (2.84 KB) converted using maxim's tool

Tom Joined: 16 May 2002 Posts: 1356 Location: italy	Posted: Sat Dec 19, 2015 10:32 am Last edited by Tom on Sun Dec 20, 2015 11:34 pm; edited 1 time in total
Tom Joined: 16 May 2002 Posts: 1356 Location: italy	(post blanked by author)

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 12:44 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Ok that makes a bit more sense. I think I understand the logarithmic part a little better. It's not that there's some magical formula out there that converts the electrical current value to some volume value; it's that you take the max value, say 65535 (FFFF), and reduce it by a logarithmic amount for each step, so the scale is derived relative to the max value (FF or FFFF).

I actually had a request. Would you guys be willing to update your apps to handle batch processes? It would speed up my work a lot when digging through games which have uncompressed audio. Just realized a lot of games are not included in the [Sampled Audio] tag that should have it. Thanks for your help.

Tom Joined: 16 May 2002 Posts: 1356 Location: italy	Posted: Sat Dec 19, 2015 1:44 pm Last edited by Tom on Sun Dec 20, 2015 11:35 pm; edited 1 time in total
Tom Joined: 16 May 2002 Posts: 1356 Location: italy	(post blanked by author)

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sat Dec 19, 2015 3:27 pm
	The PSG actually only outputs 0 and 1 values (before volume scaling), and relies on extra components to remove the DC bias (signal on one side of the line). Most emulators just output a balanced signal directly. When in sample mode, however, the signal is stuck at 1. The balanced signal is now unbalanced. Volume changes affect the height of the line, but it's always above the line. A dynamic DC bias removal stage in code would result in what you expect. The table mentioned above is for volume scaling in 16-bit signed mode, and it would make no sense to have negative values - it's a list of amplitudes. A full volume tone would go between +32767 and - 32767, but a full volume sample goes between 0 and +32767. That's why voices are so quiet - and why they appear above the line. For my tool, I was not trying to make a PSG emulator, I was trying to capture the full dynamic range of the sample, so I stretched it to the absolute maximum range. I can adapt my tool to batch processing very easily - multi selecting files would be a simple addition. I'll do it when I get a chance (maybe Monday) or maybe someone will send me a PR before then. Tom, here's your chance to escape VB ;) Also, I went for a GUI tool rather than a command line for ease of use, but the core conversion function could work either way.
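As an illustration of what a "dynamic DC bias removal stage in code" could look like, here is a sketch of a simple one-pole DC blocking filter in Python. It is a generic technique, not taken from any particular emulator or from the tool discussed here; the function name and the coefficient 0.995 are arbitrary choices for the example.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    Slow offsets decay towards zero while faster changes pass through,
    so a waveform sitting entirely above the line ends up centred on it.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A signal that only ever sits between 8000 and 12000 (all above the line).
one_sided = [8000 if (i // 4) % 2 == 0 else 12000 for i in range(4000)]
centred = remove_dc(one_sided)
print(min(centred[-400:]), max(centred[-400:]))
```

After the initial offset has decayed, the tail of the output swings both above and below zero, which is the "both sides of the line" picture an editor shows for ordinary audio.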

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 3:50 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Ok. Makes sense. I've been using the conversion app to make finding voice data easier. A lot of games seem to compress it, so I'll have to see how the emulator uses it. Alex Kidd has several samples of "Miracle Ball", apparently in different banks. I haven't verified whether they are actually accessed in game. I've been using the logarithmic setting to convert, but that means when I try to reverse lookup the data, I'm looking at information I need to translate, so I'll stick to using the linear setting.

@Maxim Would it be easy to add 8 and 4 bit (native) settings to the output? Working with this a bit more, I see why you went with 16 bit. Audacity doesn't even let you export 8 bit audio. The only way seems to require we create our own tools, which in this case seems a relatively easy conversion process once the main work has been done. Speeds will still be an issue.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sat Dec 19, 2015 3:58 pm
	I don't think you can make a 4 bit WAV that works reliably in most software. Some sound hardware can't reliably play anything below 44kHz 16-bit audio anyway. My goal was to make something convertible to MP3 without too much difficulty. If you can get your editor to display sample numbers - I think Audacity can - then the file offset is the sample count divided by 2, regardless of the bit depth of the WAV.

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 4:49 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Maxim wrote: I don't think you can make a 4 bit WAV that works reliably in most software. Some sound hardware can't reliably play anything below 44kHz 16-bit audio anyway. My goal was to make something convertible to MP3 without too much difficulty. If you can get your editor to display sample numbers - I think Audacity can - then the file offset is the sample count divided by 2, regardless of the bit depth of the WAV.

I almost got happy, but unfortunately, the sample count is not a reliable indicator of physical position. Interestingly, if I set the sample rate to 4000, 8 bit pcm, Dracula lines up to the exact byte if I change the sample value to hex. Alex Kidd seems to be in 16 bit, or at least is represented differently than the wave data in Dracula. I'll have to look a bit closer later. Unfortunately, changing the bit depth or sample rate affects the offset, so preserving the bytes seems the way to go. Unfortunately Audacity converts everything to 16 bit... ~logarithmic? values, so it will probably not be the go-to tool for this unless I can figure this out.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sat Dec 19, 2015 6:37 pm
	Audacity is fine for preparing files for playback - like the MP3s I made above. For preservation, just pull out the data and document the format.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sat Dec 19, 2015 8:56 pm Last edited by Maxim on Wed Dec 23, 2015 11:15 pm; edited 2 times in total
	Here's an update. You can select multiple files now (outputs to generated filenames like Tom's), and you can select "the whole file". Internally, the code is no longer all in one file, and there's a half-assed attempt at splitting the concerns up: interpreting the data in the ROM, and rendering the intermediate data to samples. You can now use it to extract 8-bit unsigned data - like in Beavis and Butt-Head - and it offers to render that 8-bit data down to logarithmic PSG output (by discarding the low 4 bits). It should be easier to handle more data formats. I had a look at Alex Kidd: The Lost Stars. The data does seem to be 4-bit linear PCM - and the repeats in higher banks is presumably filling unused space. The "I'm the Miracle Ball" sample is so much clearer as linear PCM! Edit: attached the real update now Edit 2: see https://github.com/maxim-zhao/SampleToWav/releases/tag/v0.2
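The "discard the low 4 bits" step can be sketched in a few lines of Python. This is not the tool's actual code. It reuses the volume table quoted from the development wiki earlier in the thread, where index 0 is the loudest level and index 15 is silence, and it assumes that a larger 8-bit sample value should map to a louder PSG level; the function name is made up for the example.

```python
# Amplitude for each PSG attenuation setting (0 = loudest, 15 = off),
# as quoted from the development wiki earlier in the thread.
VOLUME_TABLE = [32767, 26028, 20675, 16422, 13045, 10362, 8231, 6568,
                5193, 4125, 3277, 2603, 2067, 1642, 1304, 0]


def render_8bit_as_psg(sample):
    """Reduce an unsigned 8-bit sample to 4 bits and look up its PSG amplitude."""
    level = sample >> 4            # keep the high nibble, discard the low 4 bits
    attenuation = 15 - level       # assumed: higher sample value = louder
    return VOLUME_TABLE[attenuation]


print([render_8bit_as_psg(s) for s in (0x00, 0x40, 0x80, 0xC0, 0xFF)])
```

Because the table steps are 2 dB apart, the middle of the 8-bit range lands far below half of full scale, which is the main audible difference between the linear and logarithmic renderings.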

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sat Dec 19, 2015 11:22 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Hey, Maxim. Do you know any reason why the AHHH! sound in Alex Kidd works in your wave tool? I just realized I've had some issues finding some sounds in game when I import into Audacity as an 8 bit sample or otherwise, but it works great when I use your tool. Does your tool do something special to the samples to make them audible? Some secret function? Decompression? The other samples are audible by importing into Audacity but the AHHH! is not. I've also had the issue with other games not showing any voices when I imported them. I thought it was compression, but when I use your tool it works great. I just used it on the voice data and finally got it to work. Lost an hour trying to figure it out. Could someone take a look and see if they can figure out why this is not working? My 4 bit dac of Dracula worked fine once I put a header on it. Can't figure it out.

Also, what should I call the voice data without header? PCM? Foobar recognizes the extension, but doesn't play it, too bad. Thanks for your help guys. I'll post a thread once I finish the other voices.

The other "miracle ball" samples are at 13700 and 17700. They run to the end of the bank. I have not found any code referencing them, though they might be used. Anyone remember other places where miracle ball is used besides the beginning? I can rip it, though without the actual in game access I can't be 100% sure that I have the right points, only that the bytes before these don't match the first sample but they do after it. They are a subset of sample 2; not sure why they are in the rom or if they are used. I'll look a bit more.

Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh! (logarithmic).mp3 (22.65 KB)
Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh! (linear).mp3 (22.65 KB)
Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball-4-bit.mp3 (36.12 KB) sounds bad...why
Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball (logarithmic).mp3 (36.12 KB)
Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball (linear).mp3 (36.12 KB)
Alex Kidd - The Lost Stars (UE) [!] - Voice 01 - Ahhh!.zip (5.26 KB)
Alex Kidd - The Lost Stars (UE) [!] - Voice 02 - Find the Miracle Ball.zip (1.8 KB)

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sun Dec 20, 2015 9:18 am
	The repeats of the sample data are just filling unused space. Some of the dev hardware would tend to fill unused space with left over data from previous banks or previous builds, so you have to expect to find unexpected things sometimes. Your bad sounds are most likely 4 bit data being interpreted as 8 bit, or some other format mismatch.

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sun Dec 20, 2015 10:32 am
sherpa Joined: 28 Nov 2014 Posts: 365	That could be true. I wonder if the way Dracula processed the sound allowed it to be played back appropriately as 8 bit, while Alex Kidd's method doesn't. I confirmed (as best I could) that the data at the end of banks 4 and 5 is just junk filler data. I was wondering why they would waste that much physical space when it was expensive back then... but it isn't much (length 75B). That's just a bit smaller than AHHH! though.

@Maxim. I've had problems with your tool and Tom's. I've been using yours for the logarithmic feature, though in the future, I think I might stick to the linear conversions until it starts to make sense. These two games both seem not to benefit much from it. Maybe the higher quality voices will.

I'm not able to select multiple inputs as you described for your tool. I think it saves the configuration of the last working path somewhere. I don't see it creating a file, so I assume it must use the registry somehow. Could the new program be using the old form configuration as well? I didn't see a new executable on github; I assume I would need to compile it, which I don't think I can do (for reasons other than knowledge). I'm guessing it's Visual C or something similar. Could you check that you didn't actually upload the same one? The file sizes are the same though the dates are different. I assume added code would result in a new size & date. Otherwise, instructions on clearing the old config would be appreciated. My version only lets me select one file at a time. It would be convenient to have batch processing once I start creating PCM packs with more than a few voices. Originally, I mainly wanted it to accelerate finding music, but results aren't always so lucky. And proper conversion takes time to verify, especially as I'm still not too used to working with this format.

Also, any recommendations on a good wav > mp3 converter? Maybe command line? I use Audacity, but it takes time to open and export each wave. I don't want anything bulky, just something that would allow me to prepare the files to post in the pcm thread. I probably prefer wav myself. At these sizes, they don't take up much space anyway. A sample based vgm is actually ~20-40 times larger. I'm guessing the difference would be exponential.

I will be posting the Lost Stars data in its own thread shortly. Thanks for the help guys. This would have taken much longer without you.

sverx Joined: 05 Sep 2013 Posts: 3827 Location: Stockholm, Sweden	Posted: Sun Dec 20, 2015 11:04 am
	uhm... why not use SoX?

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sun Dec 20, 2015 11:36 am
sherpa Joined: 28 Nov 2014 Posts: 365	For the simple reason that I did not know about it :) Thanks for the tip. There seems to be a bit of a learning curve. I can't convert from wav to mp3 just like that. It seems to require me to set some parameters. But it looks like what I was looking for. Thanks!

sverx Joined: 05 Sep 2013 Posts: 3827 Location: Stockholm, Sweden	Posted: Sun Dec 20, 2015 7:44 pm
	Sorry, I didn't mean to suggest SoX for mp3 conversion, even if it can do that. I meant to suggest it for raw 4 bit data to wav file conversion...

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Sun Dec 20, 2015 8:03 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Ok, I might look into that. I had problems trying to do just that in lame, as i don't really understand either the software or the sound standards. I'm not really sure how wav>mp3 compare to one another except one is compressed, or how raw relates to possible wav/mp3 standards. I'll look a bit more into it. Thanks

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sun Dec 20, 2015 8:10 pm Last edited by Maxim on Mon Dec 21, 2015 12:54 am; edited 1 time in total
	I did check SoX - I used it years ago so it seemed plausible, but the docs don't suggest it works and I found some posts online about it saying it doesn't work. The log output isn't going to be easy with SoX either.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sun Dec 20, 2015 11:31 pm
	I updated my incorrect attachment above. One thing to note is that some games seem to use very clipped/compressed samples, perhaps to make them sound louder, which makes them very hard to see on a PCM waveform plot. Switch to a spectrograph and it's easy to see voices show up as bands of overtones amongst a load of noise and excessively symmetric patterns - see the attachments. Space Harrier spectrograph.PNG (407.5 KB) Space Harrier waveform.PNG (38.89 KB)

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Sun Dec 20, 2015 11:58 pm
	I had a look at Populous, its sample playback is kind of weird. The data is actually a complete embedded .VOC file, including the header, at location 2c000, plus some raw data later on. The code reads out a sample and then uses some tables in ROM from $3800 to emit a sample on each of the three tone channels - the table is 256 bytes per channel, not interleaved, giving raw PSG command bytes. The "Welcome to Populous" voice seems to be a mistake - the data is linear PCM with a VOC header, as mentioned above, and I have a suspicion that it plays the header as data, hence the clicky sound. The "ha ha" voice seems to be logarithmically pre-adjusted to sound right through the PSG, and doesn't have a header. Quite weird... and without working through the volume tables, very hard to determine quite what effect that has on the samples.

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Mon Dec 21, 2015 12:28 am
sherpa Joined: 28 Nov 2014 Posts: 365	Awesome! I love the new settings! I still can't select more than one file at a time though. Curious. I assume you've been using it, so it works for you. Using the spectrogram is a good idea. Thanks for the tip. Sailor Moon still only shows one sample, even with the new tool. This doesn't make logical sense to me, but oh well. Interesting info on these other games. I'll have to look into it.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Tue Dec 22, 2015 8:21 pm
	To open multiple files, just select more than one file in the Open dialog. It'll then say "5 files selected" instead of the filename. When you press save, it converts all of them in one go. I finished investigating Populous, it's a complicated case and seems to have multiple mistakes in the sample handling: http://www.smspower.org/Development/Populous-SMS

sherpa Joined: 28 Nov 2014 Posts: 365	Posted: Wed Dec 23, 2015 9:47 pm
sherpa Joined: 28 Nov 2014 Posts: 365	Neat. That page has a lot of useful info. I didn't peek at all at this game, but based on all those details, I imagine it might have been a little over my head :) I appreciate your time to do this. Might be nice to add the raw versions of the laugh and an "unofficial" version of the laugh joining parts 1 and 2. For the first sample, it's neat that it loads into video lan player and plays directly without need for the wav conversion.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14740 Location: London	Posted: Mon Feb 22, 2016 8:06 pm
	[Thread split here to Game samples extraction mega-thread]


View topic - Voice data sample conversion (How to?)