Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-02-15 18:40:13

caudec (http://caudec.net/) is a command-line utility for GNU/Linux and OS X that transcodes (converts) audio files from one format (codec) to another. It takes advantage of multi-core CPUs and plentiful RAM by using a ramdisk and running multiple processes concurrently (one per file and per codec). It is Free Software, licensed under the GNU General Public License (version 3). The APEv2 tagger bundled with versions 1.7.1 and later is licensed under the Mozilla Public License, version 2.

Supported input codecs: WAV, AIFF, CAF, FLAC, WavPack, Monkey's Audio, TAK, Apple Lossless.
Supported output codecs: WAV, AIFF, CAF, FLAC, Flake, WavPack, Monkey's Audio, TAK, Apple Lossless, lossyWAV, LAME, Ogg Vorbis, Nero AAC, qaac, Musepack, Opus.
Support for high quality resampling and downmixing / upmixing to stereo, with SoX.
Optimized I/O: input files are copied onto a tmpfs mount sequentially, so as to get the best performance out of the underlying medium (e.g. a hard drive). Transcoding however is done concurrently. Example: file 1 gets copied. When that's done, transcoding of file 1 starts. Meanwhile, file 2 gets copied, etc… Very little time is lost reading the files.
Transcoding to several different codecs at once is possible. In that case, decoding of input files is done only once.
Multiple instances of caudec can be run concurrently while sharing resources.
Metadata is preserved (as much as possible) from one codec to another.
Multiprocess Replaygain scanner (except for Opus and Musepack).
Uses existing, popular command line encoders/decoders.
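
For illustration, here is a minimal sketch of the "copy sequentially, transcode concurrently" idea from the list above, assuming a POSIX shell. It is not caudec's actual code: `RAMDIR`, `transcode_one` and `pipeline` are hypothetical names, and the stand-in "encoder" merely upper-cases text so that the sketch runs without any codec installed.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from caudec.
# RAMDIR should point at a directory on a tmpfs mount; a plain temporary
# directory is used here as a fallback so the sketch runs anywhere.
RAMDIR="${RAMDIR:-$(mktemp -d)}"

# Stand-in for a real encoder call: $1 = staged input, $2 = output file.
transcode_one() {
    tr 'a-z' 'A-Z' < "$1" > "$2"
    rm -f "$1"    # free the staging space as soon as the job is done
}

# pipeline OUTDIR FILE...
pipeline() {
    outdir=$1
    shift
    for src in "$@"; do
        staged="$RAMDIR/$(basename "$src")"
        # Only one copy runs at a time, so the disk sees a single sequential read.
        cp "$src" "$staged"
        # The CPU-bound work runs in the background while the next file is copied.
        transcode_one "$staged" "$outdir/$(basename "$src").out" &
    done
    wait    # block until every background job has finished
}
```

A real tool would presumably also cap the number of background jobs and check that the ramdisk has enough free space before each copy; both are left out here to keep the sketch short.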

Tested under Arch Linux and OS X. Download here (http://caudec.net/downloads/). Please use the bug tracker (http://caudec.net/redirect/bugs) to report any bugs. Feedback is most welcome!

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-02-15 21:45:24

I just released version 1.1.0, which adds support for Musepack.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Dario on 2012-02-15 23:34:18

Excuse my ignorance, but does TAK actually work under Linux?

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-02-15 23:38:01

The encoder/decoder (Takc.exe) works with Wine. Linux users can use it for archiving, while transcoding to some other codec (e.g. lossy) for listening purposes. Caudec supports TAK encoding and decoding if the user has installed both Wine and TAK.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-02-17 07:21:42

It just occurred to me that I left out one of caudec's main selling points: it's fast. It sounds obvious to me, but maybe it isn't so much. I was never a sales person. It might also not be obvious that it works best on somewhat large sets of files (e.g. a whole album with one or two CDs, one file per track).

Encoding ABBA's 2CD The Definitive Collection (148 minutes, 37 tracks) from WAV to FLAC --best, with one process, on a Core i7 @ 2.2 GHz: 46x real time.
Same as above, with 8 processes: 186x

Just for kicks, FLAC -5 (default setting) with 8 processes encodes at 569x, TAK -p2 at 743x.
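
To put those "x real time" figures in perspective, here is a small worked calculation, assuming the album length is exactly 148 minutes (the post gives it only to the minute). The helper names `wall_seconds` and `speedup` are made up for this example.

```shell
#!/bin/sh
# wall_seconds MINUTES FACTOR
# Wall-clock seconds needed to encode MINUTES of audio at FACTOR x real time.
wall_seconds() {
    LC_ALL=C awk -v m="$1" -v f="$2" 'BEGIN { printf "%.1f\n", m * 60 / f }'
}

# speedup BASE FACTOR
# How many times faster FACTOR is than BASE.
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

wall_seconds 148 46     # one process:     193.0
wall_seconds 148 186    # eight processes: 47.7
speedup 46 186          # 4.04
```

So going from one to eight processes cuts the job from a bit over three minutes to under one, a speedup of roughly 4x.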

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-02-23 00:35:56

I just released version 1.3.0 (http://code.google.com/p/caudec/downloads/detail?name=caudec-1.3.0.tar.gz) of caudec, that

adds support for WavPack lossy
adds support for resampling of stereo files
corrects a bug that increased disk space usage on tmpfs
improves prediction of required disk space on tmpfs
adds support for a CAUDECDIR environment variable for setting the temporary dir to your liking

Upgrading is highly recommended, if only for the bug fix. Please report any issues using the issues tracker (http://code.google.com/p/caudec/issues/list).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-02 00:31:35

Hi skamp. I tried your caudec script and it is definitely very fast. I tested it by transcoding from flac to ogg -q 7 an album of flacs and it shaved maybe 40% off the time taken by oggenc or by ffmpeg>wav>oggenc or straight ffmpeg -i $file -acodec libvorbis etc. As far as I can tell all the speed benefit comes from parallel processing (I checked this by processing a single file and finding that in this case caudec is in fact slower than oggenc or a more typical bash script). So I'm wondering what is the point of creating the tmpfs and doing so much copying? Is it just to facilitate dropping files in and out of a queue? I can't see any need to create a memory consuming structure for machines with large amounts of RAM, because transcoding is almost all CPU. So I like your script's speed but I wonder if the same thing couldn't be achieved more simply by using job control to get bash running parallel encoder processes, or maybe I missed something important?

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: ExUser on 2012-06-02 00:40:15

While encoding is a parallel task, reading from a drive is intrinsically sequential. You can't double read speed by reading 2 files at once. In fact, you're likely to harm read speed. By queuing disk operations and running encoding purely in RAM, caudec cuts out the parallel read bottlenecks and runs the process as fast as possible.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-02 08:08:28

I can see the logic, but disk read speeds are very high these days. How can there be a bottleneck when reading 6 or 8 or 10 lossless files of maybe between 20MB and 50MB each, which are going to take a few seconds to decode and encode anyway? Surely that doesn't present any kind of challenge with modern hardware?

I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux so I was keen to try to make up the difference and Skamp's script prompted me to go back to my bash scripts and add some parallel processing. My scripts are simpler stuff: essentially decode+dump metadata function, encoder function, metadata writer function. By letting the core functions of the script run in parallel/background processes (number of cores +1) I can achieve about the same improvements, for example the directory I transcoded earlier, flacs to oggs:

my original bash script:
real   3m3.301s
user   3m8.952s
sys   0m3.496s

caudec:
real   1m47.993s
user   3m11.467s
sys   0m4.126s

my bash script with some parallel processes/backgrounding:
real   1m52.904s
user   3m10.826s
sys   0m3.877s
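
One way to read the three timings above is to divide user time by real time, which gives a rough idea of how many cores were kept busy on average. The sketch below assumes the `NmS.SSSs` format printed by the shell's `time` keyword and ignores sys time; `to_seconds` and `cpu_ratio` are hypothetical helper names.

```shell
#!/bin/sh
# to_seconds 1m47.993s  ->  107.993
to_seconds() {
    LC_ALL=C awk -v t="$1" 'BEGIN {
        split(t, p, "m")          # p[1] = minutes, p[2] = seconds plus "s"
        sub(/s$/, "", p[2])       # drop the trailing unit if present
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

# cpu_ratio USER REAL: average number of cores kept busy (user / real).
cpu_ratio() {
    LC_ALL=C awk -v u="$(to_seconds "$1")" -v r="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", u / r }'
}

cpu_ratio 3m8.952s  3m3.301s     # original script: 1.03
cpu_ratio 3m11.467s 1m47.993s    # caudec:          1.77
cpu_ratio 3m10.826s 1m52.904s    # parallel script: 1.69
```

On a dual-core machine the ceiling is 2.00, so both parallel runs keep most of the CPU busy, while the original script is effectively single-threaded.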

But I only have a 4-year-old dual-core AMD Athlon64 desktop, a 5-year-old Core Duo (32-bit only) and a similar-vintage Core 2 Duo... no experience of i7 here, so I can't personally scale my tests up to 4 cores and 8 threads. Has anyone with modern hardware (quad core, multi-GB RAM, SATA III etc) actually measured the difference and, if so, found it to be substantial? At the moment I can see Skamp's caudec page, which compares single-thread processing (and I assume conventional reads from the HDD) with parallel processing from tmpfs. Obviously the parallelism makes a huge difference, and perhaps that accounts for all or almost all of the difference, so what is missing is some data showing that the tmpfs is solving a problem or adding a benefit.

edited for typos.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-02 10:49:46

What Canar said. Hard drives don't like concurrent access, and you actually lose read speed (more than proportionally) as you increase the number of concurrent accesses. My laptop hard drive tops out at maybe 70 MB/s on a single access, but it's not like it gives me 17.5 MB/s per file when I'm accessing 4 files at once, it gives me less than that. Same thing with my USB3 HDD where my backup resides. I tested it a while ago so I don't have the exact figures anymore, but my observation was that single-access, sequential reading was needed.

I have a quad-core i7 with 8 threads and 8 GiB of RAM, so my objective was to get the highest transcoding speeds possible while leveraging the gear at my disposal. Copying input files to a tmpfs sequentially while transcoding them concurrently proved to be the most efficient way. The speed gains range from slight to significant, depending on the gear, the configuration (number of processes, etc…) and the set of files you're transcoding. E.g. reading 8 files at once can slow my hard drive to a crawl.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: lvqcl on 2012-06-02 10:57:51

somewhat related: http://www.hydrogenaudio.org/forums/index....showtopic=94783 (http://www.hydrogenaudio.org/forums/index.php?showtopic=94783)

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-02 11:22:38

I dug up an old version (before 1.0) that didn't copy input files to a tmpfs. Here are the results when transcoding FLACs from my hard drive to Ogg Vorbis, with 8 processes, on a 2 CD album with 37 files (same external encoders):

old version: 71.41 seconds (15.0 MB/s) (124.3x)
latest caudec: 58.71 seconds (18.2 MB/s) (151.1x)

That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless.

Obviously I dropped filesystem caches before each run.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-02 11:48:49

Thanks for the info. If I ever get an i7 I'll be keen to transcode this way. I've been trying out different numbers of parallel processes and I've found that on my Athlon 64 I get maximum transcode speed by allowing 5 parallel processes instead of 3, and this now performs at least as quickly as the tmpfs method (time difference is <1%), though it's all snail paced compared to your i7 figures; where you get 124x I get 26x (all on the same disk)
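
For anyone wanting to experiment with the number of parallel processes, here is a minimal sketch using `xargs -P` (available in GNU and BSD xargs, though not required by POSIX). It is not my actual script or caudec's: `JOBS`, `ENCODER` and `parallel_encode` are hypothetical names, and the default encoder command assumes an oggenc build that can read the input files directly.

```shell
#!/bin/sh
# Hypothetical sketch of bounded parallel encoding.
JOBS="${JOBS:-3}"                        # e.g. number of cores + 1
ENCODER="${ENCODER:-oggenc -Q -q 7}"     # assumed encoder command line

# parallel_encode FILE...
# Runs $ENCODER once per file, with at most $JOBS processes at a time.
parallel_encode() {
    # NUL-separated names keep paths with spaces intact.
    # $ENCODER is left unquoted on purpose so its options are split into words.
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$JOBS" $ENCODER
}
```

Usage would be something like `JOBS=5; parallel_encode *.flac`, timing each run with `time` to find the sweet spot for a given machine.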

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-02 12:06:05

I'm guessing your hard drive is less of a bottleneck with your configuration (CPU speed, number of concurrent reads on the HDD) than with mine.

Incidentally, the tmpfs method provides no speed gain when I'm transcoding FLACs located on my SSD. In that case, the storage medium is no longer the bottleneck. Unfortunately, my SSD is nowhere near large enough to hold my entire FLAC library, so I still have to deal with my slowish HDD.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-02 12:08:08

I got my Core 2 Duo 1.6 GHz running 64-bit Debian Stable headless with 512MB RAM to hit the heady heights of 33x. It's a champagne moment. Tomorrow I buy the (parallel) stripes, body kit and chrome exhaust.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-02 12:41:58

Quote from: Takla on 2012-06-02 08:08:28

I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux so I was keen to try to make up the difference

That's the reason I added support for Windows binaries with Wine. There are instructions (https://code.google.com/p/caudec/wiki/WindowsCodecs) on how to install and use those with caudec.
lvqcl's Ogg Vorbis AoTuV ICC build (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=74345&view=findpost&p=784966) might be of interest to you.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-02 13:20:30

I saw the info on wine and win binaries in your docs/examples and it struck a chord because I'd previously noticed a big discrepancy between the speed of oggenc in XP (with foobar as frontend) and oggenc in Debian 32-bit. But as I don't make a habit of watching the text scroll by I can live with my newly parallelised scripts doing 26x or 33x (finally quicker than AoTuV in XP on my hardware). I'll stick with native binaries so I can run the same scripts across different free OS and architectures and not have to care if wine is installed/working/worth the effort.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Takla on 2012-06-04 00:04:07

btw I booted my XP install to see what foobar2000 and oggenc were doing and discovered that the apparent gulf in encoder performance between oggenc in Debian and oggenc in XP was simply due to foobar2000 running two oggenc processes in parallel (XP version of oggenc being aoTuVb6.03 from rarewares). Once both cores are maxed out oggenc performs a little faster (very little: <1%, probably has more to do with OS services than the binary) in Debian 32-bit than in XP SP3 32-bit though the difference is very slight (if you measured it using a button-press stopwatch you'd never know there was any difference). Anyway if I happen again on an application which apparently performs hugely better or worse on a different OS I'll take a closer look before assuming something is either very wrong or inexplicably excellent....

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-04 11:48:57

Quote from: skamp on 2012-06-02 11:22:38

That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless.

The benefit gets more obvious as CPU time decreases (the HDD becomes more of a bottleneck). Here's a case where the difference becomes "dramatic": encoding WAVs to FLAC (-q 5, FLAC's default compression level).

old version: 70.63 seconds (22.2 MB/s) (125.7x)
latest caudec: 38.33 seconds (40.9 MB/s) (231.4x)

That's an 84% speed increase. YMMV of course.
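
For reference, the percentages quoted in this thread follow from the elapsed times as (old / new - 1) x 100. A quick check, with `pct_gain` being a name made up for this example:

```shell
#!/bin/sh
# pct_gain OLD_SECONDS NEW_SECONDS
# Speed increase in percent when a job drops from OLD_SECONDS to NEW_SECONDS.
pct_gain() {
    LC_ALL=C awk -v o="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (o / n - 1) * 100 }'
}

pct_gain 71.41 58.71    # FLAC to Ogg Vorbis: 21.6
pct_gain 70.63 38.33    # WAV to FLAC:        84.3
```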

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: punkrockdude on 2012-06-04 20:14:22

I am glad more Linux stuff is being done since I use Linux on my laptop and I learn new things all the time. Regards.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-05 12:49:18

I was curious, so I implemented a switch for disabling the preloading of input files to RAM, for cases where the underlying medium is a fast SSD, ramdisk or whatever. I ran a few tests with light to intensive CPU tasks, and the speed gains were negligible. Since inappropriate / uneducated use of that switch could easily cause terrible performance, I've decided to revert the change and not include it in a future release (not until everyone has terabyte SSDs, at least).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-06-27 09:11:28

I released version 1.4.0 (https://code.google.com/p/caudec/downloads/list), with many changes (pretty much as many commits as all of the other versions combined):

now runs on Mac OS X (tested on Lion)
smart handling of concurrent instances
better detection of ramdisks
don't abort if no ramdisk is available
support for e/m TAK compression parameters
removed reckless option to disable checking of available space
fixed long standing bugs in the installation script
fixed regression with empty APEv2 tags
better handling of ALAC metadata
changed handling of user interruption (Ctrl+C), removed pgrep dependency
lots of minor fixes

Upgrading is strongly recommended. Please use the tracker (http://code.google.com/p/caudec/issues/list) to report any bugs.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-07-10 12:22:49

Latest version (1.4.3) (http://code.google.com/p/caudec/downloads/list) brings support for Opus and ALAC encoding, among other improvements and fixes. See changes (http://code.google.com/p/caudec/wiki/Changes).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Manlord on 2012-07-10 22:08:23

Excellent, thank you skamp. Going to test it on Debian 6.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2012-07-30 11:16:40

I released caudec 1.5.0 (http://code.google.com/p/caudec/downloads/list). Changes:

Replaygain scanner (except for Opus and Musepack)
preservation of embedded artwork from FLAC and ALAC, to FLAC, ALAC, AAC and MP3
new -C switch disables metadata preservation
report both read and write speeds
better estimation of ramdisk space requirements with APE input files
various fixes

Thanks to Garf for his help on the RG scanner.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-03-17 10:52:35

caudec 1.6.0 (http://caudec.outpost.fr/downloads/) was released two weeks ago, with many new features (http://caudec.outpost.fr/documentation/changelog/#latest).
There's also a new website: caudec.outpost.fr (http://caudec.outpost.fr/). The googlecode.com website will no longer be updated.
Version 1.6.1 was released yesterday, with the only change from 1.6.0 being updated URLs for the project.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-04-16 18:52:52

I just released version 1.6.3 (http://caudec.outpost.fr/downloads/) (see the changelog (http://caudec.outpost.fr/documentation/changelog/)).
The most notable improvement is OS X fixes (thanks to TheLink (http://www.hydrogenaudio.org/forums/index.php?showuser=3911)).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: edwardar on 2013-04-19 21:28:51

Thanks for this. I'm using it on my Shuttle XS35GT (Atom D510 1.66GHz dual core hyperthreaded) to convert FLAC -8 to Ogg Vorbis -V 5.

Using Oggenc2.87 aoTuVb6.03 (Lancer SSE3 x64 windows build from rarewares), single core encoding gives me around 9x, while caudec yields 30x.

I realise this is pretty sedate by modern standards, but it's not bad for a 29 W fanless netbook!

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-04-20 16:02:36

Yeah, I've used caudec on my netbook a lot, too. The CPU (Intel Atom) is so slow, it does make a significant difference. I also ran a silly experiment a while ago, where I got my netbook to encode FLACs faster (with two concurrent processes, using Flake) than my AMD Phenom CPU running FLAC with a single process :-P

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-05-01 11:10:32

Version 1.6.4 (http://caudec.outpost.fr/downloads/) has a pretty cool feature for dealing with 2 mirrored music collections (e.g. lossless and lossy) at the same time: you can now specify a destination directory (with -o/O/P) after each -c parameter, in order to set per-codec directories. The files will then be written in the correct locations, without any further intervention. Example:

Code:

$ caudec -c wv -P "/data/wavpack" -c mp3 -P "/data/mp3" "Artist/Album [Year]"/*.flac

That command will transcode to both WavPack and MP3 at the same time. The resulting WavPacks will be written to "/data/wavpack/Artist/Album [Year]", and the MP3s will be written in "/data/mp3/Artist/Album [Year]".
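
The path mapping described here can be sketched as a small shell function. This is only an illustration of the behaviour in the example, not caudec's code; `dest_path` and the track name are hypothetical.

```shell
#!/bin/sh
# dest_path ROOT SOURCE_FILE NEW_EXTENSION
# Prints where the transcoded file would land: the per-codec root, followed by
# the source file's relative directory, with the extension swapped.
dest_path() {
    root=$1
    src=$2
    ext=$3
    dir=$(dirname "$src")
    base=$(basename "$src")
    printf '%s/%s/%s.%s\n' "$root" "$dir" "${base%.*}" "$ext"
}

dest_path /data/wavpack "Artist/Album [Year]/01 Track.flac" wv
dest_path /data/mp3     "Artist/Album [Year]/01 Track.flac" mp3
```

Each source file keeps its relative directory, so the two collections stay mirrored under their respective roots.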

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-06-17 11:13:35

Version 1.7.0 (http://caudec.outpost.fr/downloads/) is out, with the longest changelog (http://caudec.outpost.fr/documentation/changelog/) to date. Of note is the new -z parameter which produces machine-parsable output (http://www.hydrogenaudio.org/forums/index.php?showtopic=101151). Since I asked for it from other developers, it's only natural that I would put my money where my mouth is.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: BearcatSandor on 2013-06-18 01:46:58

Thanks skamp, that's just awesome and useful! I appreciate all the hard work you put into caudec that makes my collection so much easier to manage.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-06-18 09:07:26

That's always nice to hear

Edit: you shouldn't have any problems with multichannel files now (for WavPack, just use version 4.70.0 beta). You will now be able to transcode them as is, and optionally downsample (-b, -r) or downmix them to stereo (-2).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: BearcatSandor on 2013-06-18 09:51:26

And it just gets better and better! Thank you!

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: Ferongr on 2013-08-28 21:41:26

http://caudec.outpost.fr/ looks down at the moment.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-08-29 07:42:13

It went back up about half an hour later, after a technician intervened on it. I'm having some problems with the new server, I'm trying to sort it out.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-09-03 11:48:09

Server problems seem to be fixed. I released caudec 1.7.1 (http://caudec.outpost.fr/). The biggest change is the replacement of apetag (http://www.muth.org/Robert/Apetag/), which wasn't cutting it anymore, with my own tagger, APEv2. It supports the entire spec, including NULL-separated lists, embedded artwork and binary data. See the changelog (http://caudec.outpost.fr/documentation/changelog/).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: skamp on 2013-10-22 08:33:57

caudec 1.7.2 (http://caudec.net/downloads/) is out, with support for the recently released WavPack 4.70.0 (http://www.hydrogenaudio.org/forums/index.php?showtopic=103111).

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: ChronoSphere on 2015-05-07 12:14:38

I just tried converting a bunch of .wav files to .wv and caudec errors out if I use -c: it does encode the file, but does not write it to the output directory. If I use -C, encoding works.
First time using it, so I can only speculate it doesn't like the (lack of) tags in .wav. Those files were created by converting .tak files in deadbeef-git.

I'm on 1.7.5 according to the arch linux package.

Title: caudec: a multiprocess audio converter for Linux and OS X
Post by: BearcatSandor on 2015-06-06 05:11:25

According to the home page, the last update was almost a year ago. Is the project dead, skamp?

HydrogenAudio

Hydrogenaudio Forum => General Audio => Topic started by: skamp on 2012-02-15 18:40:13