Topic: Quite OK Audio (QOA)... anyone ?

Quite OK Audio (QOA)... anyone ?

Just discovered this new "fast, lossy audio compression" format that claims:
Quote
QOA is fast. It decodes audio 3x faster than Ogg-Vorbis, while offering better quality and compression (278 kbits/s for 44khz stereo) than ADPCM.

QOA is simple. The reference en-/decoder fits in about 400 lines of C. The file format specification is… not yet released.
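
For anyone wondering where the 278 figure in the quote comes from, here is a rough back-of-the-envelope sketch. The frame layout used below is an assumption on my part, not something stated in the quoted text: 64-bit slices each holding 20 samples, 256 slices per channel per frame, an 8-byte frame header and 16 bytes of predictor state per channel. The function name `qoa_bits_per_second` is made up for illustration.

```c
#include <stdint.h>

/* Assumed frame layout (not taken from the quoted text):
   - each 8-byte slice carries 20 samples of one channel
   - a frame holds 256 slices per channel (5120 samples per channel)
   - each frame has an 8-byte header plus 16 bytes of predictor
     state per channel */
enum {
    SLICE_SAMPLES      = 20,
    SLICES_PER_FRAME   = 256,
    SLICE_BYTES        = 8,
    FRAME_HEADER_BYTES = 8,
    LMS_STATE_BYTES    = 16
};

/* Hypothetical helper: bits per second for a constant-bitrate stream
   made of full frames with the layout assumed above. */
static double qoa_bits_per_second(unsigned channels, unsigned samplerate) {
    double frame_bytes = FRAME_HEADER_BYTES
        + (double)channels * (LMS_STATE_BYTES + SLICES_PER_FRAME * SLICE_BYTES);
    double frame_samples = (double)SLICE_SAMPLES * SLICES_PER_FRAME;
    return frame_bytes * 8.0 * (double)samplerate / frame_samples;
}
```

Under those assumptions, `qoa_bits_per_second(2, 44100)` works out to 284996.25 bits/s, which is about 285 kbit/s in units of 1000 or about 278.3 in units of 1024, so the quoted number seems to use the latter.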

They provide online samples to evaluate it: https://qoaformat.org/samples/

Official blog: https://phoboslab.org/

Official website: https://qoaformat.org/

Official GIT: https://github.com/phoboslab/qoa
Forward Agency NPO

In progress we (always) trust.

Re: Quite OK Audio (QOA)... anyone ?

Reply #1
I'd wonder how this compares to, say, aptX, or the elephant in the room, MP3?
(And FWIW, given that an ancient chip like, say, a dual-core ARMv4 at 100 MHz can decode Ogg Vorbis at multiple times realtime, I'm not entirely sure there's a use case for this if its only benefit is "It's fast".)

Re: Quite OK Audio (QOA)... anyone ?

Reply #2
If anyone wishes to give this a try: https://www.rarewares.org/files/QOA.zip

This contains qoaconv.exe, the encoder, and qoaplay.exe, the player. These are Windows x64 compiles, and the encoder is compiled with .wav input support only.

Command line to encode: qoaconv in.wav out.qoa

and, to play: qoaplay file.qoa

Tested on a couple of tracks and I have to say I have heard a lot worse!! ;)


Re: Quite OK Audio (QOA)... anyone ?

Reply #4
I'd be interested to see someone better at this and more patient than me ABX this.

If this is doing what I assume it is, then when it isn't transparent it should be less annoying than something like MP3 getting it wrong. But how often do modern lossy codecs get it annoyingly wrong in the general vicinity of 256 kbps?


Re: Quite OK Audio (QOA)... anyone ?

Reply #6
Quote
I'd wonder how this compares to, say, aptX, or the elephant in the room, MP3?
(And FWIW, given that an ancient chip like, say, a dual-core ARMv4 at 100 MHz can decode Ogg Vorbis at multiple times realtime, I'm not entirely sure there's a use case for this if its only benefit is "It's fast".)

Intended use cases seem to include audio in games (music and sound effects), where ADPCM formats have been used, and other applications where the computation savings would count, I guess.
It doesn't seem meant to compete with more traditional lossy codecs in applications where only one or a few concurrent streams are used.
https://phoboslab.org/log/2023/02/qoa-time-domain-audio-compression

Re: Quite OK Audio (QOA)... anyone ?

Reply #7
The same guy invented QOI, a simple lossless image codec. In that case he was competitive with PNG on size and much quicker, though a lot of that is thanks to PNG being archaic. QOA is likely uncompetitive with complex audio codecs, but has a fighting chance of being competitive with quick codecs. It'll be interesting to see how they fare creating a lossy codec.

From the source:
Code:
/* The Least Mean Squares Filter is the heart of QOA. It predicts the next
sample based on the previous 4 reconstructed samples. It does so by continuously
adjusting 4 weights based on the residual of the previous prediction.
The next sample is predicted as the sum of (weight[i] * history[i]).
The adjustment of the weights is done with a "Sign-Sign-LMS" that adds or
subtracts the residual to each weight, based on the corresponding sample from
the history. This, surprisingly, is sufficient to get worthwhile predictions.
This is all done with fixed point integers. Hence the right-shifts when updating
the weights and calculating the prediction. */
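
To make that comment concrete, here is a minimal sketch of a sign-sign LMS predictor written from the description above. It is not copied from the reference source: the struct and function names are hypothetical, and the two shift amounts (13 for the prediction, 4 for the weight step) are assumptions chosen for illustration.

```c
#include <stdint.h>

#define LMS_LEN 4

/* Hypothetical state: the last 4 reconstructed samples and 4 weights. */
typedef struct {
    int32_t history[LMS_LEN];
    int32_t weights[LMS_LEN];
} lms_t;

/* Predict the next sample as the sum of weight[i] * history[i].
   The right shift by 13 is an assumed fixed-point scale. Right-shifting
   a negative value is implementation-defined in C, but arithmetic on
   common compilers. */
static int32_t lms_predict(const lms_t *lms) {
    int64_t sum = 0;
    for (int i = 0; i < LMS_LEN; i++) {
        sum += (int64_t)lms->weights[i] * lms->history[i];
    }
    return (int32_t)(sum >> 13);
}

/* Sign-sign update: a scaled residual is added to or subtracted from
   each weight depending only on the sign of the matching history
   sample. Then the history is shifted and the newly reconstructed
   sample appended. The shift by 4 is an assumed step size. */
static void lms_update(lms_t *lms, int32_t sample, int32_t residual) {
    int32_t delta = residual >> 4;
    for (int i = 0; i < LMS_LEN; i++) {
        lms->weights[i] += (lms->history[i] < 0) ? -delta : delta;
    }
    for (int i = 0; i < LMS_LEN - 1; i++) {
        lms->history[i] = lms->history[i + 1];
    }
    lms->history[LMS_LEN - 1] = sample;
}
```

A decoder built this way would call `lms_predict`, add the dequantized residual to get the reconstructed sample, and then call `lms_update` with both. Note that each update depends on the previous one, so the loop over samples is inherently serial within a channel.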

Re: Quite OK Audio (QOA)... anyone ?

Reply #8
The QOA specification was still not frozen last time I checked.

Re: Quite OK Audio (QOA)... anyone ?

Reply #9
Out of curiosity, I compared this against a 32kHz-downsampled, FLAC compliant "FSLAC -2" encoding (using this preliminary 1.3.4 binary), which results in a similar bitrate. The reason for the comparison against FLAC is that, as ktf demonstrated in his lossless codec analysis, the FLAC reference software is extremely fast at low and medium presets as well.

Due to an apparent lack of psychoacoustic noise shaping in QOA (the quantization noise is spectrally almost white) and of high coding efficiency (due to the extremely simple compression algorithm), FSLAC sounds quite a bit better to my ears, and I would assume LossyWAV+FLAC does as well, especially on samples such as "Triangle"; see the FSLAC thread here.

Is there any other feature in QOA that F(S)LAC doesn't offer?

Chris
If I don't reply to your reply, it means I agree with you.

Re: Quite OK Audio (QOA)... anyone ?

Reply #10
Other than up to 255 channels and guarantees about footprint and consistency, no. FLAC is very fast, but QOA is so simple that it should be an order of magnitude faster when optimised, if I/O allows. The reference QOA decoder processes one sample at a time, which can probably be improved on even without SIMD, and there may be other speedups available too.

Re: Quite OK Audio (QOA)... anyone ?

Reply #11
Quote
FLAC is very fast, but QOA is so simple that it should be an order of magnitude faster when optimised, if I/O allows.

I fail to see how this algorithm is much simpler than FLAC's. I haven't looked at this in detail, but having weights being updated each sample is usually something detrimental to SIMD optimizations.
Music: sounds arranged such that they construct feelings.

Re: Quite OK Audio (QOA)... anyone ?

Reply #12
I'm noodling on doing multiple sample decodes at once (not full SIMD, but packing into a uint64_t). The residuals are easy to unpack 4 at a time like that, but I haven't figured out the predictor yet. You may be right that the predictor cannot really be SIMD within a channel. It definitely could be across channels, by decoding a single sample from every channel at once ("subchannels" are interleaved), but that involves more spread-out memory access, which may need twiddling and defeat the purpose, and the benefit is limited as most input is likely 2-channel. There's no stereo decorrelation, which helps. FWIW the weights for a channel fit in a uint64_t, and so does the history (which the ref updates separately from the output, though it looks like the output could be used directly, which may or may not be a benefit).
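
As a rough illustration of the packing idea, here is a sketch of pulling residual indices out of a 64-bit word and of keeping four 16-bit values (weights or history) in one uint64_t. The slice layout is an assumption: a 4-bit scalefactor index in the top bits followed by twenty 3-bit residual indices, most significant first. All function names are hypothetical, and this is plain scalar code, not the multi-sample decode itself.

```c
#include <stdint.h>

#define SLICE_LEN 20

/* Assumed layout, MSB first: 4-bit scalefactor index, then twenty
   3-bit quantized residual indices. */
static unsigned slice_scalefactor(uint64_t slice) {
    return (unsigned)(slice >> 60) & 0xFu;
}

/* Write the 20 residual indices (each 0..7) to out[]. */
static void slice_unpack(uint64_t slice, uint8_t out[SLICE_LEN]) {
    slice <<= 4; /* drop the scalefactor so the first index is on top */
    for (int i = 0; i < SLICE_LEN; i++) {
        out[i] = (uint8_t)(slice >> 61); /* top 3 bits */
        slice <<= 3;
    }
}

/* Pack four 16-bit values into one 64-bit word, lane 0 in the low
   16 bits. */
static uint64_t pack4_i16(const int16_t v[4]) {
    uint64_t w = 0;
    for (int i = 0; i < 4; i++) {
        w |= (uint64_t)(uint16_t)v[i] << (16 * i);
    }
    return w;
}

/* Read lane i back as a signed value. The uint16_t to int16_t
   conversion is implementation-defined before C23 but wraps as
   expected on common compilers. */
static int16_t lane_i16(uint64_t w, int i) {
    return (int16_t)(uint16_t)(w >> (16 * i));
}
```

Unpacking the indices is independent per sample, which is why that part is easy to batch. The packed weights and history still have to be updated after every sample, so packing alone doesn't remove the serial dependency in the predictor.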

What is a lot simpler are the memory accesses: they're fixed, and so is the structure of the data. If we're really lucky a few common channel counts could be auto-vectorised, but I don't have much faith in that. "Order of magnitude" may be pushing it; currently the ref takes half the user time of flac -8 (no MD5) to decode, which admittedly may not be a fair fight.