Topic: Improved ADPCM encoder (Read 22920 times)

Improved ADPCM encoder

I was recently working on an industrial project that used DVI/IMA ADPCM to store canned audio samples. I was noticing some artifacts in some of the samples and thought that the dynamic noise shaping from WavPack lossy would make a big improvement, and it did. I also implemented a lookahead feature to find the optimum coding based on future samples (ffmpeg has this too, enabled with the -trellis option, although the implementations are different).

Anyway, this all works pretty well together, so I decided to package it up and put it on GitHub in case anyone else might find it useful. ADPCM is not a generally useful format any more, but I understand it's still used in some applications like games, and it's nice for some embedded projects because the decoder requires no multiplies and fits in just a few dozen lines of C code! And it seems that just about everything that handles WAV files will play them.
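
To illustrate the "no multiplies" point, here is a minimal sketch of the standard IMA ADPCM decode step. It is written in Python for brevity rather than C and is not taken from the GitHub project; the names decode_nibble and decode are made up for this example, and starting from predictor 0 and step index 0 is an assumption (real files carry the initial state in each block header).

```python
# Standard IMA ADPCM step-size table (89 entries).
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
# Step-index adjustment, looked up with the low 3 bits of the code.
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_nibble(nibble, predictor, index):
    """Decode one 4-bit code using only shifts, adds and compares."""
    step = STEP_TABLE[index]
    diff = step >> 3
    if nibble & 4:
        diff += step
    if nibble & 2:
        diff += step >> 1
    if nibble & 1:
        diff += step >> 2
    if nibble & 8:  # sign bit
        predictor -= diff
    else:
        predictor += diff
    predictor = max(-32768, min(32767, predictor))  # clamp to 16 bits
    index = max(0, min(88, index + INDEX_TABLE[nibble & 7]))
    return predictor, index


def decode(nibbles, predictor=0, index=0):
    """Decode a sequence of 4-bit codes into 16-bit samples."""
    out = []
    for nibble in nibbles:
        predictor, index = decode_nibble(nibble, predictor, index)
        out.append(predictor)
    return out
```

The whole decoder state is just the predictor and the step index, which is why it ports so easily to small embedded targets.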

The GitHub project includes binaries for Windows and Mac. This version is slightly newer than the one I posted here.

edit: typo

Re: Improved ADPCM encoder

Reply #1
Quote
I also implemented a lookahead feature to find the optimum coding based on future samples (ffmpeg has this too, enabled with the -trellis option, although the implementations are different).
I am also trying to improve (IMA-)ADPCM quality from this angle.
I published my implementation as a GitHub project. It builds and runs on Windows, macOS, and Ubuntu.

The ADPCM code (nibble) sequence can be optimized with dynamic programming (DP).
But the search space is huge (16^N, where N is the number of samples), so I apply beam search and tree pruning.
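
As a rough illustration of the idea (not the actual MOI code), here is a minimal beam search over nibble sequences in Python. The function names, the squared-error cost, the start state of predictor 0 and step index 0, and the default width are all assumptions made for this sketch; it also leaves out the look-ahead depth and the tree pruning.

```python
# Standard IMA ADPCM tables.
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_nibble(nibble, predictor, index):
    """Standard IMA ADPCM decode step for one 4-bit code."""
    step = STEP_TABLE[index]
    diff = step >> 3
    if nibble & 4:
        diff += step
    if nibble & 2:
        diff += step >> 1
    if nibble & 1:
        diff += step >> 2
    predictor = predictor - diff if nibble & 8 else predictor + diff
    predictor = max(-32768, min(32767, predictor))
    index = max(0, min(88, index + INDEX_TABLE[nibble & 7]))
    return predictor, index


def beam_encode(samples, width=4):
    """Return (nibbles, total squared error) found by beam search.

    Hypothetical helper: each candidate is a tuple of
    (accumulated error, predictor, step index, nibbles so far).
    """
    beam = [(0, 0, 0, [])]  # assumed start state: predictor 0, index 0
    for target in samples:
        expanded = []
        for err, pred, idx, path in beam:
            for nibble in range(16):  # try all 16 codes for this sample
                p, i = decode_nibble(nibble, pred, idx)
                expanded.append((err + (target - p) ** 2, p, i, path + [nibble]))
        # Prune: keep only the `width` lowest-error candidates.
        expanded.sort(key=lambda c: c[0])
        beam = expanded[:width]
    best = beam[0]
    return best[3], best[0]
```

With width=1 this degenerates to a greedy encoder; once the width reaches 16^N nothing is pruned and the search becomes exhaustive, which is only practical for very short inputs.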

Re: Improved ADPCM encoder

Reply #2
Aikiriao, thanks for the tip. I compiled it for macOS (ARM64) and found it quite slow despite the 100% CPU usage, but the quality seems good to me on first listen. For 48 kHz mono voice, it requires 192 kbps.
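
The 192 kbps figure follows from the 4 bits per sample that ADPCM uses. A small Python sketch of the arithmetic, with a made-up helper name, ignoring the few bytes of per-block header that IMA ADPCM WAV files add:

```python
def adpcm_bitrate_kbps(sample_rate_hz, channels=1, bits_per_sample=4):
    """Raw ADPCM payload bitrate in kbps (block headers not counted)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


# 48 kHz mono at 4 bits per sample
print(adpcm_bitrate_kbps(48000))
```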

When compiling I received the following warning.
warning: missing field 'description' initializer [-Wmissing-field-initializers]
    { 0, NULL,  }
                ^
1 warning generated.
[100%] Linking C executable moi
[100%] Built target moi

Unfortunately I won't have time to try Bryant's code for about a week, but it's only postponed; I'll do it.

Re: Improved ADPCM encoder

Reply #3
Celona, thanks for checking and trying it.
Quote
When compiling I received the following warning.
warning: missing field 'description' initializer [-Wmissing-field-initializers]
    { 0, NULL,  }
                ^
1 warning generated.
Thanks also for reporting the compiler warning. I fixed it and pushed the change.

Re: Improved ADPCM encoder

Reply #4
Have there been any listening tests to compare these ADPCM encoders with commercial formats, such as CRI's ADX and Nintendo's B*STM? I think Sony is still using an offshoot of ATRAC on PlayStation consoles, too.

Re: Improved ADPCM encoder

Reply #5
Quote
This version is slightly newer than the one I posted here.

The "This version" link is wrong. I compiled and tried Bryant's code and it's very fast: the same 46-minute file I tried earlier is compressed in only 18 s on an M1 Mac.


Re: Improved ADPCM encoder

Reply #7
Quote
Please note that the original topic is almost 7 years old.
Yes, it is an old (and maybe well-known) topic... I'm sorry if I am reinventing the wheel.
I'm working on some encoding speed and quality improvements, and I have added evaluation results compared against libsndfile at: https://github.com/aikiriao/MOI/tree/main/evaluation

Re: Improved ADPCM encoder

Reply #8
Interesting, thanks. Regarding, e.g., your plot

So what are the numbers in each cell? MMSE or relative runtime? What do width and depth mean here? Do I have to multiply them somehow to get to the numbers?

Chris

Re: Improved ADPCM encoder

Reply #9
Quote
So what are the numbers in each cell?
The cells show the relative run time for encoding white_noise.wav. For the evaluation, I generated a 4-second wave (44100 Hz sampling rate) with the script (evaluate_moi.py).

Quote
What do width and depth mean here?
I'm sorry for omitting the description. "Search width" is the beam search width and "Search depth" is the number of look-ahead samples. Larger width/depth values explore more of the search space and generally give a better MSE. Both parameters cost CPU time, so the lower-right side of the graph shows higher CPU usage.

Quote
Do I have to multiply them somehow to get to the numbers?
Yes. For the encoding time ratio, the actual processing time is (cell value) * 4 / 100 (sec). (This does not work for the MSE ratio, because each of those cells shows the MSE relative to libsndfile...)
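
The conversion above can be written as a small Python helper. The function and parameter names are made up for this sketch, and it assumes, as the formula implies, that a cell value is the encoding time as a percentage of the 4-second clip duration.

```python
def encode_seconds(cell_value, clip_seconds=4.0):
    """Convert an encoding-time-ratio cell (percent of clip length) to seconds."""
    return cell_value * clip_seconds / 100.0


# A hypothetical cell value of 50 corresponds to 2 seconds of encoding time.
print(encode_seconds(50))
```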