Topic: Descript Audio Codec (.dac) - 90x smaller than .wav?

Descript Audio Codec (.dac) - 90x smaller than .wav?

I came across this new audio codec: https://github.com/descriptinc/descript-audio-codec

Here are some audio samples encoded with it: https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5

Unfortunately I can't do a listening test at the moment because my headphones aren't very good, but honestly I couldn't hear any difference between the original audio and the audio encoded with this codec (I listened to the music sample on the demonstration page).

What do you think?

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #1
This isn't meant for acoustic audio compression, it's meant to feed simplified data into AI algos, so I'd imagine it sounds pretty good to a neural network and pretty bad to a human ear.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #2
It's definitely meant for acoustic audio compression, and the demos sound pretty good to me.

I notice they make no mention of how long it takes to encode or decode the audio.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #3
This is really impressive.
The provided samples are rather simple and only in mono, though. On these samples I can't quickly tell where to focus to hear a difference.
(Worth noting: the source audio samples have apparently already gone through some sort of lossy compression, but that doesn't necessarily make it easier to mask further losses.)

I'll try to install it; hopefully it doesn't require a huge GPU to work.
a fan of AutoEq + Meier Crossfeed

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #4
Has anyone figured out whether it can only work in hard CBR mode, or is some flexibility possible?
For example, can it use less bandwidth during periods with a relatively simple signal, or periods of complete silence?
a fan of AutoEq + Meier Crossfeed


Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #6
I had a quick play - had to install CUDA 11.7 and a large amount of python stuff to get this working.

If you're on Windows and struggling with the pytorch not compiled for CUDA error, remove all Nvidia CUDA software and Nvidia drivers, and reinstall both using the CUDA 11.7 installer.

Encoded a WAV file (CD Audio, 1h16m51s) - 69 seconds to encode, 145 seconds to decode on a 3080.
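
To put those timings in perspective, here is a minimal sketch that turns them into a "times faster than real time" figure. The helper names are made up for illustration; only the duration and the two timings come from the post above.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an h/m/s duration to whole seconds."""
    return hours * 3600 + minutes * 60 + seconds


def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    return audio_seconds / processing_seconds


audio = to_seconds(1, 16, 51)             # length of the test WAV file
encode_rtf = realtime_factor(audio, 69)   # 69 s to encode on the 3080
decode_rtf = realtime_factor(audio, 145)  # 145 s to decode on the 3080

print(f"audio length: {audio} s")
print(f"encode: {encode_rtf:.1f}x real time")
print(f"decode: {decode_rtf:.1f}x real time")
```

On these numbers, both directions run well above real time on that GPU, with decoding roughly half as fast as encoding.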

Command line:
python -m dac encode in.wav --output .\
rc55.com - nothing going on

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #7
I had a quick play - had to install CUDA 11.7 and a large amount of python stuff to get this working.

If you're on Windows and struggling with the pytorch not compiled for CUDA error, remove all Nvidia CUDA software and Nvidia drivers, and reinstall both using the CUDA 11.7 installer.

Encoded a WAV file (CD Audio, 1h16m51s) - 69 seconds to encode, 145 seconds to decode on a 3080.

Command line:
python -m dac encode in.wav --output .\

I tried to install it with pipx on my Debian system, but the download was taking so long that I aborted after 10 minutes. I was also afraid it would install a lot of stuff that would be difficult to remove afterwards. That's not a real problem, though: I can do a clean installation in an LXC container without messing with the system. I'll give it a try when I have some spare time.

My PC is from 2017 and my GPU is an Nvidia GT 1030, which is 12~15x slower than a 3080. My question is whether decoding will be that much slower too.

There is some royalty-free FLAC music on archive.org: https://archive.org/search?query=royalty+free+flac

I will try to convert some of it to .dac and post the result here.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #8
Note that HydrogenAudio user Kamedo2 posted some audio samples for blind listening tests in the following thread, which (I think, given it's a noncommercial kind-of-research study) you could use as well:

https://hydrogenaud.io/index.php/topic,98003.0.html

Chris
If I don't reply to your reply, it means I agree with you.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #9
I successfully installed this encoder in a Python virtual environment (venv); it consumed 9.8GB of disk space.

I succeeded in converting a .wav file to this format, but I couldn't decode it due to an error in the codec (or maybe my GPU is unsupported).

I use an AMD Ryzen 5 1400 and an Nvidia GT 1030 graphics card here; it took 55 seconds to encode a 6:12 audio file.

Maybe this codec will become more performant in the future.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #10
Sounds amazing at 8kbps.

Definitely the best sounding music compression I've heard at that bitrate.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #11
Amazing at 8kbps... Can it be decoded without that 9,8GB of disk space? ;]

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #12
Amazing at 8kbps... Can it be decoded without that 9,8GB of disk space? ;]
That's what I wanted to ask - does it have some simple decoder, or does it need full power of AI computing to reconstruct it?
Error 404; signature server not available.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #13
Amazing at 8kbps... Can it be decoded without that 9,8GB of disk space? ;]
That's what I wanted to ask - does it have some simple decoder, or does it need full power of AI computing to reconstruct it?

I briefly experimented with using "Auto Py To Exe" to compile the dac.py script to an executable; it produced a 40MB exe with over 4GB of support files in a subfolder (the vast majority is PyTorch).

This does not include the weights file which is ~300MB. The weights file contains the model data used to encode and decode the audio and is a necessity, and the model is not interchangeable. You have to use the same model to decode any encoded file.

I anticipate there is plenty of scope to optimise the software size, but it's leaning heavily on the PyTorch baggage. As operating system support for AI models matures, there might eventually be a standard for using models at the OS level, so that hooking into a model is no different from making sure you have the latest version of DirectX on Windows.

I'm trying to be careful to not violate TOS #8 - the MUSHRA scores on the GitHub should suffice, but I'd just like to comment that the performance is profoundly good. Mods - feel free to redact this last paragraph if I've made a mistake here.


rc55.com - nothing going on

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #14
When I buy decent headphones I will comment on its quality.

I hope that its developers optimize the code for less GPU usage in the future.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #15
... it's made a 40MB exe ...
This does not include the weights file which is ~300MB. The weights file contains the model data used to encode and decode the audio and is a necessity, and the model is not interchangeable. You have to use the same model to decode any encoded file.
Thanks for the analysis! Out of curiosity: could you 7zip (preset Ultra) that 40MB exe and 300MB weight file and let us know what file size comes out? That would be a rough estimate of how much room for reduction there is.

Thanks,

Chris
If I don't reply to your reply, it means I agree with you.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #16
That would be a rough estimate of how much room for reduction there is.

You forgot 4 GB of support files there, Chris :)
Error 404; signature server not available.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #17
No, I didn't; these apparently represent the Python/PyTorch installation itself and could be avoided in software written in, e.g., C or C++.

Chris
If I don't reply to your reply, it means I agree with you.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #18
Python is slower than compiled languages such as C++ or Go: https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python3-go.html

But efforts have been made to speed up Python, such as the Codon compiler: https://github.com/exaloop/codon

I don't know whether Python code would be as slow on a GPU as it is on a CPU, but it would be awesome to have this codec compiled through LLVM.

Codon is not yet able to compile 100% of Python code, but the developers are working on implementing the missing features, such as metaclass support.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #19
Hmm... 9GB to store the decoder, or 9GB to store files with a slightly less efficient compression and lightweight demands on hardware.  Tricky...

AI is notorious for making things up in a believable way.  The output might sound beautiful, but is it true?
It's your privilege to disagree, but that doesn't make you right and me wrong.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #20
90x smaller than wav, huh?  Assuming they mean 16/44.1 PCM wav files, that'd be somewhere in the ballpark of 16 kbps, right?

I'm going to go out on a limb and say it either sounds bad or is completely impractical for most use cases.  Or requires licensing over 9000 patents to implement.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #21
90x smaller than wav, huh?  Assuming they mean 16/44.1 PCM wav files, that'd be somewhere in the ballpark of 16 kbps, right?
Yeah, except: the sample files are mono.
So that means CDDA encoded at 16 kbps as dual mono, without any stereo decorrelation strategy. I have not bothered to look up whether they have any stereo decorrelation algorithm (yet), but obviously that is room for improvement - and also an opportunity to spend more processing power.
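
For anyone who wants to check the arithmetic, here is a small sketch of the "90x smaller" estimate. It assumes the reference is uncompressed 16-bit / 44.1 kHz PCM, as guessed above, and simply divides by 90; the function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 44_100   # CD sample rate
BITS_PER_SAMPLE = 16      # CD bit depth
COMPRESSION_RATIO = 90    # the "90x smaller" claim from the topic title


def pcm_bitrate_kbps(channels: int) -> float:
    """Bitrate of uncompressed PCM in kbps (1 kbps = 1000 bit/s)."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * channels / 1000


def compressed_kbps(channels: int) -> float:
    """PCM bitrate divided by the claimed compression ratio."""
    return pcm_bitrate_kbps(channels) / COMPRESSION_RATIO


for channels, label in ((1, "mono"), (2, "stereo")):
    print(f"{label}: {pcm_bitrate_kbps(channels):.1f} kbps PCM "
          f"-> {compressed_kbps(channels):.2f} kbps at 90x")
```

So the 16 kbps ballpark holds for a stereo source, and a single channel comes out just under 8 kbps, which fits the dual-mono reading.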

I'm going to go out on a limb and say it either sounds bad
Well you can test it ... ? Although the samples are not that interesting ...
or is completely impractical for most use cases.
As of now? Sure.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #22
... it's made a 40MB exe ...
This does not include the weights file which is ~300MB. The weights file contains the model data used to encode and decode the audio and is a necessity, and the model is not interchangeable. You have to use the same model to decode any encoded file.
Thanks for the analysis! Out of curiosity: could you 7zip (preset Ultra) that 40MB exe and 300MB weight file and let us know what file size comes out? That would be a rough estimate of how much room for reduction there is.

Thanks,

Chris

Happy to oblige!
dac.exe 41,865,797 bytes
dac.7z 41,456,443 bytes

weights.pth 306,720,768 bytes
weights.7z 278,740,892 bytes
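
As a rough sketch, the byte counts above translate into relative savings like this. The helper function is made up for illustration; the sizes are copied from the figures just posted.

```python
def reduction_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Size reduction achieved by compression, as a percentage of the original."""
    return 100 * (original_bytes - compressed_bytes) / original_bytes


# (original size, 7z size) in bytes, as reported above
sizes = {
    "dac.exe": (41_865_797, 41_456_443),
    "weights.pth": (306_720_768, 278_740_892),
}

for name, (original, compressed) in sizes.items():
    print(f"{name}: {reduction_percent(original, compressed):.2f}% smaller after 7z")
```

In other words, 7z saves about 1% on the exe and about 9% on the weights, so generic compression alone doesn't shrink either file much.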
rc55.com - nothing going on

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #23
I'm going to go out on a limb and say it either sounds bad
Well you can test it ... ? Although the samples are not that interesting ...
Ever since the "64 kbps WMA sounds as good as 128 kbps mp3" stuff, I don't trust a codec developer to not cherrypick a codec implementation that's subpar for the opposing format or cherrypick samples, or otherwise be somewhat dishonest in things like this.

And it sounds like you currently need an Nvidia GPU to work with this codec?  That's something I don't have to work with.

Re: Descript Audio Codec (.dac) - 90x smaller than .wav?

Reply #24
I'm going to go out on a limb and say it either sounds bad
Well you can test it ... ? Although the samples are not that interesting ...
Ever since the "64 kbps WMA sounds as good as 128 kbps mp3" stuff, I don't trust a codec developer to not cherrypick a codec implementation that's subpar for the opposing format or cherrypick samples, or otherwise be somewhat dishonest in things like this.

And it sounds like you currently need an Nvidia GPU to work with this codec?  That's something I don't have to work with.

You're right to always distrust first-party benchmarks.

The following is vague guessing because I only have vague awareness of the tech, so pinch of salt:

There appears to be both a CPU and a GPU mode, so you can run it on the CPU, but the GPU mode is likely CUDA, which is proprietary Nvidia lock-in tech. It may be possible for AMD GPUs to run the CUDA code (or a reasonably simple port of it to HIP) using ROCm. On the other hand, the repo contains mostly Python (although it does reference CUDA), and the readme refers to torchrun, which is presumably PyTorch. I know PyTorch works on AMD, so maybe it wouldn't take too much to get AMD GPUs working. I have less of a clue about Intel dGPUs: they have oneAPI, which they're trying to push as an interoperable standard, and apparently they can also do PyTorch. If you have AMD/Intel GPUs and want to try, then godspeed; it's likely the way of pain even if it is possible.
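
If you want to see what your own PyTorch install can use before trying the codec, a minimal check could look like the sketch below. It says nothing about how dac itself picks a device; pick_device is a made-up helper, and the only PyTorch call used is torch.cuda.is_available(). As far as I know, ROCm builds of PyTorch report AMD GPUs through the same torch.cuda interface, so "cuda" here does not necessarily mean Nvidia.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a usable GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"PyTorch would run on: {pick_device()}")
```

If this prints "cpu" on a machine with a GPU, the usual suspect is a CPU-only PyTorch build or a driver mismatch.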