Skip to main content

Notice

Please note that most of the software linked on this forum is likely to be safe to use. If you are unsure, feel free to ask in the relevant topics, or send a private message to an administrator or moderator. To help curb the problems of false positives, or in the event that you do find actual malware, you can contribute through the article linked here.
Topic: Trouble with naively implementing the MDCT transform (Read 728 times) previous topic - next topic
0 Members and 1 Guest are viewing this topic.

Trouble with naively implementing the MDCT transform

Out of curiosity, and without formal DSP background, I'm trying to construct an implementation of the MDCT, directly from the formulas provided by Wikipedia (to help my understanding, but also to check if the article is accurate. Let's call this an attempt at quality control).

So far, I'm failing spectacularly. The formulas look straightforward enough, but signals round-tripping through my MDCT and iMDCT are heavily distorted.

For the last few days, I tried "rubber duck debugging" by writing a document, which includes my current understanding of things (enclosed PDF), also pointing out things I'm unsure of (e.g., how the overlapping is done and how that influences the window function).

Yes, I'm really going for the non-optimized O(n²) algorithm, but I am aware that there's a FFT-related O(n log(n)) version that's used in practice.

I hope that I'm doing something very wrong in an obviously silly way. Input greatly appreciated!

Re: Trouble with naively implementing the MDCT transform

Reply #1
I don't have any insights into the (i)mdct but I can suggest pre-computing the window coefficients to speed things up a little.

Re: Trouble with naively implementing the MDCT transform

Reply #2
Quote
How are blocks of N input samples arranged in the 2N-input-buffer? Is this indeed just a
sliding window with a width of 2N samples, sliding in steps of N positions?
Yes.

Quote
Is the actual window size indeed 2N (and not, e.g., N, but ever-repeating/periodic)?
Yes.

Quote
Does the window function need to be phase-shifted to accomodate, e.g., the organization of the input-sample-buffer?
No, but it does need to be symmetrical so it satisfies the Princen-Bradley condition. Try replacing your window function with a constant sqrt(0.5) and see if anything changes.

Re: Trouble with naively implementing the MDCT transform

Reply #3
Input greatly appreciated!
After reading the english wikipedia a bit, I believe that the imdct formula should rather be:
X

With this change in your code:
Code: [Select]
- t2 += coeffBuffer[k] * ct;     // coeff-block b
+ t2 += coeffBuffer[k] * cosTerm(N + n, k);     // coeff-block b
...
- outputSamples[n] = (2f / N) * (t1 - t2);
+ outputSamples[n] = (2f / N) * (t1 + t2);
I get this:
X

Re: Trouble with naively implementing the MDCT transform

Reply #4
@Octocontrabass Thanks for elaborating on the questions! The window-function in the demo-code seems to satisfy the symmetry-condition and appears to be working fine, although I'll certainly experiment with various window-shapes.

@danadam Oh, thank you for your experimentation and finding a likely fix for the formula and corresponding code! I can replicate your results and audio data now sounds "fine". I assume that the diamond-shaped patterns in the spectogram is leakage due to the relative small block size of 128 - they're much reduced when, e.g., going for a size of 256.

You're of course also correct to point out that the code can easily be sped up by means of pre-computing. First, I wanted things to work properly, though, before doing optimization.

Re: Trouble with naively implementing the MDCT transform

Reply #5
The window-function in the demo-code seems to satisfy the symmetry-condition
Are you sure? It looks like you should be using "n + 0.5" instead of "n" when you calculate the window.

I assume that the diamond-shaped patterns in the spectogram is leakage due to the relative small block size of 128
If you compute the MDCT and iMDCT with enough precision, the output will be identical to the input. If you see artifacts, either you don't have enough precision or there's still a bug in your code.

Re: Trouble with naively implementing the MDCT transform

Reply #6
The window-function in the demo-code seems to satisfy the symmetry-condition
Are you sure? It looks like you should be using "n + 0.5" instead of "n" when you calculate the window.

Good catch, you're correct of course. (note: Dumping the values into a CSV helps)

I assume that the diamond-shaped patterns in the spectogram is leakage due to the relative small block size of 128
If you compute the MDCT and iMDCT with enough precision, the output will be identical to the input. If you see artifacts, either you don't have enough precision or there's still a bug in your code.

Turns out it's the window-is-off-by-0.5 which is causing the artifacts. The reason why things looked better at N=256, 512 etc. is that relatively speaking, the error of being off by 0.5 diminishes the wider the window is. Things are looking great now, thanks a lot!

 

Re: Trouble with naively implementing the MDCT transform

Reply #7
For future reference and experimentation, here's the corrected source code. It's also a lot faster due to precomputing, now that things work properly (still using the O(n²) base algorithm, though). Placed into the public domain.