by andete. Original documents available at: https://github.com/andete/ym2413/tree/master/results

<< YM2413 Reverse Engineering Notes 2017-01-26 | YM2413 Reverse Engineering Notes | >>

It's again been over a year since my last progress report. I've mostly been working on figuring out the content of the YM2413 instrument ROM. For most instruments I've found settings that are close (sometimes very close) to the built-in instruments, but never 100% identical. I suspect there are still some details wrong in the model that I'm using. I will (slowly) continue working on this.

But this slow progress is sometimes discouraging. So for some more variety I started to look at the YM2413 rhythm sounds. This is still very early work, but I already found something interesting. (Spoiler: strictly speaking it's nothing new, software YM2413 emulators already got this right, but it's nice to have it confirmed anyway).

1) Rhythm sound measurements.

So far all my measurements on the real YM2413 captured (only) the output of channel 0, which appears on the 'MO' analog output pin. But the rhythm sounds appear on various other channel numbers (thus at different relative moments in time) and on the 'RO' analog output pin.

To be able to measure rhythm sounds I had to make (small) adjustments to the embedded software on the capturing board. Short summary: before, the board itself located channel 0 (in time), sampled only that channel, and sent it back (over USB) to the host. Now I added a mode where the board samples 72x faster, thus capturing the information of all channels (on both the MO and RO pins). The host software then has to post-process this data to isolate the channel(s) it's interested in.
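
To make that post-processing step a bit more concrete, here is a minimal sketch of how the host software could isolate one channel from such a capture. It assumes the raw capture is a flat array with 72 sub-samples per YM2413 sample period; which sub-sample index ('slot') belongs to which channel and pin still has to be located in the capture, so the 'slot' parameter below is purely hypothetical.

    #include <cstdint>
    #include <vector>

    // Keep one sub-sample (the given 'slot') out of every group of 72,
    // i.e. one value per YM2413 sample period.
    std::vector<uint8_t> isolateSlot(const std::vector<uint8_t>& raw, size_t slot)
    {
        std::vector<uint8_t> result;
        for (size_t i = slot; i < raw.size(); i += 72) {
            result.push_back(raw[i]);
        }
        return result;
    }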

Anyway, I was making some initial captures of the 5 rhythm sounds. And by mistake I captured the 'snare drum' sound with a frequency setting of 0x0 instead of the recommended value of 0x550 (I forgot to update my program to write the frequency settings to registers 0x17 and 0x27 instead of registers 0x16 and 0x26). This gave the following result:

Notice that the envelope is visible, but all samples have the same sign. If we zoom in on the tail, where the envelope is fairly flat, we see this:

From what we've reverse-engineered so far this is an unexpected result: if the frequency is zero, the phase should remain constant. But in this result the phase is clearly toggling between two values.

2) Noise generator.

Compared to the melodic channels, the rhythm channels use one extra component for sound generation, namely a random noise generator. Might the above signal be the output of the noise generator?

Looking more closely at 120 consecutive samples of this signal gives the following (I mapped sample value 255 to '0' and all other values to '1'):

          1, 1, 0, 1, 0, 1, 0, 0, 1, 0,
          0, 1, 1, 1, 0, 1, 1, 0, 0, 1,
          0, 0, 1, 1, 0, 1, 0, 1, 1, 1,
          0, 1, 0, 1, 1, 1, 0, 0, 0, 0,
          1, 0, 1, 1, 0, 1, 0, 0, 0, 1,
          1, 0, 0, 1, 1, 1, 0, 0, 1, 0,
          0, 1, 1, 0, 1, 0, 0, 1, 0, 1,
          1, 0, 0, 0, 0, 1, 1, 1, 0, 1,
          1, 0, 0, 1, 0, 1, 0, 0, 0, 1,
          0, 1, 1, 1, 0, 0, 0, 1, 1, 1,
          0, 0, 0, 1, 1, 0, 1, 1, 1, 1,
          1, 0, 1, 0, 0, 1, 0, 0, 0, 1,
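
(For completeness: getting from the raw captured sample values to such a bit sequence is just a threshold, something like the following sketch.)

    #include <cstdint>
    #include <vector>

    // Map sample value 255 to '0' and every other value to '1'.
    std::vector<int> samplesToBits(const std::vector<uint8_t>& samples)
    {
        std::vector<int> bits;
        for (uint8_t v : samples) bits.push_back((v == 255) ? 0 : 1);
        return bits;
    }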

This looks pretty random to me. A very common choice for a random generator is a Linear Feedback Shift Register (LFSR). It's cheap to implement in hardware and it has various nice mathematical properties. It's also used in several other sound chips from the same time period (e.g. AY8910 and SN76489). So it's likely that the YM2413 (and other chips in the Yamaha family) uses it as well.

But how can we know for sure? The Berlekamp-Massey algorithm can help us here. For a given sequence of 0's and 1's, the BM-algorithm constructs the cheapest LFSR that fully reproduces that sequence.

https://en.wikipedia.org/wiki/Berlekamp%E2%80%93Massey_algorithm

If I feed the above sequence of 120 bits into the BM-algorithm I get the following polynomial as result:

x^23 + x^9 + x^8 + x + 1

I'll explain what this means in the next section. If you're interested, you can check my implementation of BM plus some more analysis code here:

// https://en.wikipedia.org/wiki/Berlekamp%E2%80%93Massey_algorithm

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <set>
#include <vector>
using namespace std;


// Berlekamp-Massey over GF(2): return (the coefficients of) the shortest LFSR
// polynomial that reproduces the given bit sequence 's'.
std::vector<int> BerlekampMassey(const std::vector<int>& s)
{
        int N = s.size();
        std::vector<int> c(N, 0);
        std::vector<int> b(N, 0);
        c[0] = true;
        b[0] = true;
        int l = 0;
        int m = -1;

        for (int n = 0; n < N; ++n) {
                bool d = false;
                for (int i = 0; i <= l; ++i) {
                        d ^= c[i] & s[n - i];
                }
                if (d) {
                        auto t = c;
                        for (int i = 0; (i + n - m) < N; ++i) {
                                c[i + n - m] ^= b[i];
                        }
                        if (l <= (n >> 1)) {
                                l = n + 1 - l;
                                m = n;
                                b = t;
                        }
                }
        }
        c.resize(l + 1);
        return c;
}

// Extend sequence 's' with 'num' extra bits, using the LFSR recurrence
// described by 'poly' ('s' must already contain at least the seed bits).
void generate_seq(std::vector<int>& s, const std::vector<int>& poly, size_t num)
{
        auto N = poly.size();
        for (int i = 0; i < num; ++i) {
                auto sn = s.size();
                bool t = false;
                for (int j = 1; j < N; ++j) {
                        t ^= poly[j] & s[sn - j];
                }
                s.push_back(t);
        }
}

// Convert the polynomial into the XOR constant of the corresponding
// Galois-configuration LFSR (see section 3).
unsigned gen_constant(const std::vector<int>& poly)
{
        unsigned result = 0;
        unsigned bit = 1;
        for (int i = 1; i < poly.size(); ++i) {
                if (poly[i]) result |= bit;
                bit <<= 1;
        }
        return result;
}

// Advance the Galois LFSR state 's' by one step, return the output bit.
bool step(unsigned& s, unsigned poly)
{
        bool result = s & 1;
        s >>= 1;
        if (result) s ^= poly;
        return result;
}

// Generate 'num' output bits from the Galois LFSR with XOR constant 'c',
// starting from state 's'.
std::vector<int> gen_seq(unsigned c, unsigned s, unsigned num)
{
        vector<int> result;
        for (int i = 0; i < num; ++i) {
                result.push_back(step(s, c));
        }
        return result;
}

// Print a bit sequence, 10 bits per line.
void print(const vector<int>& v)
{
        for (size_t i = 0; i < v.size(); ++i) {
                cout << v[i] << (((i % 10) == 9) ? '\n' : ' ');
        }
        cout << '\n';
}

// Does sequence 'v' repeat with period 'p'?
bool testPeriod(const vector<int>& v, int p)
{
        int N = v.size();
        assert(p <= N);
        for (int i = 0; i < p; ++i) {
                int j = i;
                while (true) {
                        j += p;
                        if (j >= N) break;
                        if (v[j] != v[i]) return false;
                }
        }
        return true;
}

// Return the smallest period of sequence 'v'.
int calcPeriod(const vector<int>& v)
{
        int N = v.size();
        for (int p = 1; p <= N; ++p) {
                if (testPeriod(v, p)) return p;
        }
        return -1;
}

// Walk the full state space of the Galois LFSR with XOR constant 'poly':
// for each not-yet-visited start state, count how many new states are
// visited before an already-seen state is reached. This shows how the
// 2^order states are partitioned into cycles.
void analyse(unsigned order, unsigned poly)
{
        cout << "Analyse\n";
        unsigned n = 1 << order;
        set<unsigned> seen;
        for (unsigned i = 0; i < n; ++i) {
                if (seen.count(i)) continue;
                int c = 0;
                unsigned t = i;
                while (!seen.count(t)) {
                        seen.insert(t);
                        ++c;
                        step(t, poly);
                }
                cout << i << ": " << c << endl;
        }
        cout << '\n';
}


int main()
{
#if 0
        vector<int> s = { 1,1,1,0,0,1,0,0,1,0,1 };
#elif 0
        vector<int> s = {
                0,1,1,1,1,1,0,0,0,0,
                1,0,1,1,1,0,1,0,1,0,
                1,1,1,1,0,1,0,1,0,1,
                0,1,1,0,0,1,0,1,1,0,
                1,0,1,0,1,0,0,0,1,1,
                0,0,1,1,1,1,1,0,1,0,
                1,1,0,0,1,0,0,1,1,0,
                0,0,1,1,0,1,1,0,0,0,
                0,0,0,1,1,1,0,0,1,0,
                0,1,0,1,1,1,0,0,0,1,
        };
        for (auto& e : s) e = 1-e;
#else
        // The 120-bit sequence listed in section 2.
        vector<int> s = {
                1,1,0,1,0,1,0,0,1,0,
                0,1,1,1,0,1,1,0,0,1,
                0,0,1,1,0,1,0,1,1,1,
                0,1,0,1,1,1,0,0,0,0,
                1,0,1,1,0,1,0,0,0,1,
                1,0,0,1,1,1,0,0,1,0,
                0,1,1,0,1,0,0,1,0,1,
                1,0,0,0,0,1,1,1,0,1,
                1,0,0,1,0,1,0,0,0,1,
                0,1,1,1,0,0,0,1,1,1,
                0,0,0,1,1,0,1,1,1,1,
                1,0,1,0,0,1,0,0,0,1,
        };
#endif
        auto p = BerlekampMassey(s);

        int order = p.size() - 1;
        cout << order << " : ";
        for (auto b : p) {
                cout << b << ' ';
        }
        cout << endl;

        vector<int> test = s; //{ 1,1,1,0,0,0};
        test.resize(p.size() - 1);
        //print(test);
        generate_seq(test, p, s.size() - p.size() + 1);
        //print(test);
        //print(s);
        assert(s == test);

        int c = gen_constant(p);
        cout << "cnst = " << std::hex << c << std::dec << endl;
        analyse(order, c);
        auto test2 = gen_seq(c, 1, 3 * (1 << order));
        //print(test2);
        cout << "period = " << calcPeriod(test2) << endl;

        auto it = std::search(test2.begin(), test2.end(), s.begin(), s.end());
        if (it == test2.end()) {
                cout << "NOT found" << endl;
        } else {
                cout << "found at offset " << std::distance(test2.begin(), it) << endl;
        }

}

If I sample the noise signal at other time offsets and apply BM, I get the same result each time. The same holds if I take only the first 60 samples of the above sequence; moreover, the resulting LFSR then correctly predicts the continuation (the last 60 samples) of that sequence.
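
In code, that prediction check boils down to something like the following sketch. It reuses BerlekampMassey() and generate_seq() from the listing above; checkPrediction() is just a hypothetical helper name, and 'bits' stands for the captured 120-bit sequence.

    // Feed only the first 60 captured bits to BM, then verify that the
    // resulting LFSR also predicts the remaining 60 bits.
    bool checkPrediction(const std::vector<int>& bits)
    {
        std::vector<int> head(bits.begin(), bits.begin() + 60);
        auto poly = BerlekampMassey(head); // same 23rd-order polynomial as before

        // Seed the recurrence with the first 'order' bits and extend it to
        // the full length of the capture.
        std::vector<int> predicted(bits.begin(), bits.begin() + (poly.size() - 1));
        generate_seq(predicted, poly, bits.size() - predicted.size());
        return predicted == bits;
    }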

I repeated this experiment on several different captures and several different sample sequences, and I always get the same result, provided of course that the input sequence is long enough (usually 30-40 samples is sufficient). This is a strong indication that the YM2413 indeed uses an LFSR for its noise generator, and specifically the LFSR with this polynomial.

3) Linear Feedback Shift Register (LFSR)

But what is an LFSR exactly? You can find many details and references on Wikipedia, but below I'll also try to explain it in an easier (less precise) way.

https://en.wikipedia.org/wiki/Linear-feedback_shift_register

Very briefly, the idea is the following: keep a shift register of N bits, XOR a few fixed 'tap' positions of that register together, and shift the result back in. The stream of bits produced this way looks (pseudo-)random.

This is illustrated in the following image:

For now only look at the 'Fibonacci-configuration'. You see a large rectangle that contains 23 smaller cells. Each of these cells stores one bit; together they contain the last 23 output bits of the LFSR. Some of these outputs (bits 1, 8, 9 and 23 in this example) are XOR'ed together and the result is fed back to cell 1. This is the newly generated random bit. When a new output bit is generated, all existing bits shift one position to the right, so the most recent output bit becomes the second most recent, and so on.

Notice how the 'tap-positions' (1, 8, 9, 23) correspond to the exponents in the polynomial that describes this LFSR (x^23 + x^9 + x^8 + x^1 + 1). There's a mathematical reason why polynomials are used to describe LFSRs, but I won't go into that. The '+ 1' at the end of the polynomial does not correspond to a tap-position; it makes sense in the mathematical description (it's related to the input), but we'll ignore that here as well.

This was the 'Fibonacci-configuration' of a LFSR. This description is easy to understand, but it's usually not how LFSRs are actually implemented. In practice, both in hardware and in software, often the 'Galois-configuration' is used instead. This configuration is shown in the lower part of the image.

Both configurations have the same number of shift register bits and the same number of XOR operations. Only the arrangement differs. (Also note that the numbering is swapped left-to-right). In the Galois configuration the XOR operations are located 'in between' the shift register bits. Thus the shifted bits get modified while they are 'in-flight'. It can mathematically be shown that both configurations produce the exact same output sequence (in cell '1'). But the full internal state (including the values of the other bits in the shift register) is NOT the same for both.

Some mathematical properties: an N-bit LFSR has at most 2^N - 1 different non-zero states, so its output is necessarily periodic. The all-zero state is special: it maps onto itself (and can never be reached from a non-zero state), so the register only has to avoid starting out as all zeros. For a well-chosen (so-called primitive) polynomial the period is maximal, i.e. all 2^N - 1 non-zero states are visited before the sequence repeats.

Applied to the YM2413 this means that the 'noise' is not truly random: it is a fully deterministic, periodic bit sequence, produced by a 23-bit LFSR whose state must simply never be all zeros.

I mentioned that 'Galois' is more often used than 'Fibonacci'. Why is that? Mostly because it's cheaper: in the Fibonacci form you have to collect and XOR several scattered tap bits, while in the Galois form the whole update is just a shift plus a single conditional XOR with a constant. Compare these two step functions for our 23-bit LFSR:

    // Input: LFSR state.
    // Output: updated LFSR state, random output is located in bit 0.

    unsigned lfsr_step_fibonacci(unsigned state) {
        bool output = ((state >> 22) ^ (state >> 8) ^ (state >> 7) ^ state) & 1;
        return ((state << 1) | output) & 0x7fffff; // keep only the 23 state bits
    }

    unsigned lfsr_step_galois(unsigned state) {
        bool output = state & 1;
        state >>= 1;
        if (output) state ^= 0x400181;
        return state;
    }
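
For reference, the constant 0x400181 in the Galois version follows directly from the tap positions: each tap x^t of the polynomial sets bit (t - 1) of the constant. That is essentially what gen_constant() in the listing above computes from the BM output; written as a stand-alone (hypothetical) helper it would look like this:

    #include <initializer_list>

    // taps {1, 8, 9, 23}  ->  bits {0, 7, 8, 22}  ->  0x400181
    unsigned galoisConstant(std::initializer_list<int> taps)
    {
        unsigned c = 0;
        for (int t : taps) c |= 1u << (t - 1);
        return c;
    }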

4) Comparison to existing hardware/software.

Let's check how the noise function is implemented in the existing YM2413 software emulators. In the openMSX src/sound/YM2413Burczynski.cc code you can find this code snippet:

    if (noise_rng & 1) noise_rng ^= 0x800302;
    noise_rng >>= 1;

That's equivalent to the 'lfsr_step_galois()' function above; the only difference is whether the XOR operation is applied before or after the shift, which is also why the constant differs by one bit position (0x800302 == 0x400181 << 1). So the software emulation was already using the correct LFSR. It's always nice to confirm such things.
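
If you want to convince yourself of that equivalence, a brute-force check like the following sketch will do. (openmsx_step() is just the snippet above wrapped in a function; it assumes lfsr_step_galois() from the previous section is in scope.)

    #include <cassert>

    unsigned openmsx_step(unsigned rng)
    {
        if (rng & 1) rng ^= 0x800302;
        rng >>= 1;
        return rng;
    }

    int main()
    {
        unsigned a = 1, b = 1; // arbitrary, equal, non-zero start states
        for (int i = 0; i < 1000000; ++i) {
            a = openmsx_step(a);
            b = lfsr_step_galois(b);
            assert(a == b); // both variants step through identical states
        }
    }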

Can we also find this LFSR in the YM2413 die-shot? I already speculated about this in an earlier article. Here's the YM2413 die-shot again, with slightly extended annotations.

In the upper-right corner, near the green colored 'PG' rectangle, I marked in white a shift register with 14 (or 15?) bits. That's not enough for the full LFSR (which needs 23 bits), but it might correspond to the longest 14-bit chunk in the Galois configuration of the LFSR (see above). The single-bit registers surrounded by XOR gates are hard to see in this die shot (at least for me), and the remaining 7-bit shift register might be obscured by all the wiring in this region. Also important: placing the LFSR close to the other circuitry that handles the phase calculations makes a lot of sense. All this gives me more confidence that this white colored shift register is indeed related to the noise generation.

5) Next steps

I'll probably continue with investigating the rhythm sounds. This article only looked at the noise generator, but we still need to figure out how exactly this noise is used to produce the rhythm sounds. After that I'll probably return to the instrument ROM. As always, help would be appreciated.

<< YM2413 Reverse Engineering Notes 2017-01-26 | YM2413 Reverse Engineering Notes | >>



