
Topic: Best Quality for Lossy Encoding of Audiobooks?

  • encoder
Best Quality for Lossy Encoding of Audiobooks?
Obviously I don't want FLAC, but I also don't want the result to be (much) distinguishable from the original. Storage space is not really an issue nowadays. What do you use: MP3 or OGG? Which settings? Which software? I have fre:ac. I would prefer an easy-to-use Windows solution, no command lines.

By the way what is the best audio grabber program nowadays? Is it still EAC? I'm really out of this game lately.

Oh, and how to create separate audio tracks for gapless playback on any basic player? I just see your Wiki... There used to be a command line solution to cut the audio files at the block's ends so every simple player can play 'em back gaplessly. It worked for MP3. Does it work for OGG?

Thanks!
  • Last Edit: 27 November, 2012, 09:33:32 AM by encoder

  • eahm
Reply #1
You can hear a difference if you convert at 64kbps, even with AAC. I like 128kbps AAC audiobooks, but fre:ac uses FAAC, which is outdated; use foobar2000 + qaac to create them. I assume OGG aoTuV at 128kbps is also great. MP3 I don't really know; try to ABX 128-160+kbps and let us know.

  • Dynamic
Reply #2
Quote
By the way what is the best audio grabber program nowadays? Is it still EAC? I'm really out of this game lately. :)

Oh, and how to create separate audio tracks for gapless playback on any basic player? I just see your Wiki... There used to be a command line solution to cut the audio files at the block's ends so every simple player can play 'em back gaplessly. It worked for MP3. Does it work for OGG?


If by 'grabber' you mean CD ripping, EAC is still great, CUETools' CUERipper is equally good and perhaps easier to set up (that's what I use mostly, usually in Burst mode when the disc is found in CTDB and modest ripping errors can be corrected without re-ripping). Also excellent is dBpowerAmp. Its PerfectMeta could be a great time saver on a large ripping process. Foobar2000's ripper is pretty good too though I'm not as familiar with the techniques used. I've used them all from time to time. (CUETools and dBpowerAmp Converter can also handle a lot of file format conversion tasks, as can foobar2000).

Ogg Vorbis is intrinsically gapless (as is its new successor, Opus). Only a few external players support gapless MP3 via Lame's Accurate Length tags. The old --nogap solution had its problems and is deprecated in favour of the Lame Accurate Length information.

Most of the very basic players close the stream and re-open thus failing to preserve gapless playback regardless of how you encode. However, some excellent low cost players (such as Sandisk Clip) support Rockbox, which offers gapless playback.
Dynamic – the artist formerly known as DickD

  • DonP
Reply #3
Most of the books on my portable I have in speex.  With many running 6 hours I'll give up a little transparency to save space on a flash player.  Even if I want "good" sound I won't go over q0 with vorbis.  If they are done as "radio drama" (music, stereo staging, sound effects) rather than just some guy reading the book out loud I might go more.

  • jensend
Reply #4
As far as the quality-bitrate tradeoff goes, Opus (homepage, FAQ) is easily the best codec available for audiobook use. Already at 24kbps it's quite close to the original. (That's 97 hours of audiobooks per GB of storage, vs 1 hour 40 minutes for CD or 18 hours for the 128kbps AAC eahm was suggesting.)
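For anyone who wants to check those storage figures, here is a minimal sketch of the arithmetic. The function name and the unit convention are assumptions, not something stated in the post: with binary units (1 GB = 2^30 bytes, 1 kbps = 1024 bit/s) 24 kbps works out to about 97 hours, while decimal units give closer to 93.

```python
def hours_per_gb(bitrate_kbps, binary_units=True):
    """Playback hours that fit in one gigabyte at a constant bitrate."""
    # Assumption: binary units reproduce the figures quoted in the post;
    # pass binary_units=False for 1 GB = 10**9 bytes and 1 kbps = 1000 bit/s.
    k = 1024 if binary_units else 1000
    gigabyte_bytes = k ** 3
    bytes_per_second = bitrate_kbps * k / 8
    return gigabyte_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    for name, kbps in [
        ("Opus, 24 kbps", 24),
        ("AAC, 128 kbps", 128),
        ("CD audio, 1411.2 kbps", 1411.2),
    ]:
        print(f"{name}: {hours_per_gb(kbps):.1f} hours per GB")
```

This ignores container overhead and assumes the average bitrate is hit exactly, so treat the results as rough estimates.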

Codecs without any speech coding technology, like MP3, AAC, or Vorbis, tend to require about double the bitrate to get comparable results.

Almost all other codecs with modern speech coders are patented, and getting an encoder requires paying a royalty. On top of that, those codecs will have trouble on any non-speech content (music, effects) in audiobooks. Opus is the first publicly available codec to combine state-of-the-art speech and general audio coding technologies, and of course it's royalty-free.

The only drawback is that since it's a brand new format - just standardized in September - software and devices are just starting to add support. Software-side, Opus playback is supported by VLC, Foobar2000, and Firefox. Device-side, Rockbox has added support in their development builds. More applications and devices will be adding support by the end of the year.

The command line isn't as scary as you think. If the command line really is a problem you can use Foobar2000 to encode to Opus instead.

  • encoder
Reply #5
Thanks for the info! This Opus thing looks interesting. It's at 1.0.1; is it already the best (for music as well)? Will my 1st gen. non-Rockboxed Sansa Clip play it if I somehow "make it" into an OGG? I just didn't bother to Rockbox it. Default is king, ain't it?

Most important: what bitrate to use for Opus for audiobook (and music)? Let's say I am used to 256-320k MP3s.

As for ripping CDs: I have all the time in the world and I prefer the bit accurate method.

Will the gapless playback work on Opus as well? Where can I read more about this newer method of gapless? Google didn't help.

  • marc2003
Reply #6
Quote
Default is king, ain't it?


no. 

the sansa clip firmware won't play opus files and it can't do gapless either regardless of format.

rockbox does perfect gapless and plays opus files. there's simply no reason not to rockbox your player. i own both a clip and clip+ and couldn't live without rockbox - mainly for the gapless support. i don't know how anyone can put up with a player that doesn't do it.

  • DonP
Reply #7
Quote
As far as the quality-bitrate tradeoff goes, Opus (homepage, FAQ) is easily the best codec available for audiobook use. Already at 24kbps it's quite close to the original. (That's 97 hours of audiobooks per GB of storage, vs 1 hour 40 minutes for CD or 18 hours for the 128kbps AAC eahm was suggesting.)

.....
The command line isn't as scary as you think. If the command line really is a problem you can use Foobar2000 to encode to Opus instead.


OK I downloaded the stuff (Opus, new foobar, dev rockbox).  Convert in foobar is set with a command line (--bitrate 10 %s %d)
Can it use standard input?

I converted a spoken word CD and it sounds pretty good at 10 kb/s (5 MB for the whole thing). There's just a little bit of music in the intro and outro. It's not great on that, but at least it doesn't make me cringe like some speech-specific coders. So this is my new format for speech. I'll try some music too, but my bigger concerns there are getting it to work on multiple players and figuring out my transparency point.


  • Seren
Reply #8
10kb/s for music
I wonder when the day will come that this becomes the norm... probably when we have 100 petabyte HDDs for $50, but oh well.
By the way, if you're going that low, you might want to see how it sounds in mono: 16kb/s mono didn't seem to hurt my ears, whereas 16kb/s stereo would.
But if you're used to such high-bitrate MP3s, why not just start out at 64kb/s, which is a tad overkill for voice with Opus but sounds pretty decent for music as well?

  • eahm
Reply #9
10kb/s so 1.25KB/s?
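That conversion checks out, and the same arithmetic gives a quick sanity check on DonP's 5 MB file. The sketch below assumes decimal units (1 MB = 1,000,000 bytes, 1 kb/s = 1000 bit/s) and ignores container overhead; the function names are made up for illustration.

```python
def kilobytes_per_second(bitrate_kbps):
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return bitrate_kbps / 8


def duration_minutes(size_mb, bitrate_kbps):
    """Approximate playing time of a file of the given size and average bitrate."""
    # Assumption: decimal megabytes and kilobits, no container overhead.
    size_bits = size_mb * 1_000_000 * 8
    return size_bits / (bitrate_kbps * 1000) / 60


if __name__ == "__main__":
    print(kilobytes_per_second(10), "KB/s")
    print(round(duration_minutes(5, 10), 1), "minutes")
```

Under those assumptions, 5 MB at 10 kb/s comes to a bit over an hour of audio, which is plausible for a single spoken word CD.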

Regarding my previous post: of course you don't have to go to 128kbps with AAC. I don't listen to audiobooks that often; the 128kbps ones I found (from Audible etc.) are the ones that sound more like they should (good tone, good voice, good microphone?). 64kbps AAC is probably more than fine. I don't like 64kbps MP3 though.
  • Last Edit: 28 November, 2012, 10:39:38 AM by eahm

  • jensend
Reply #10
encoder: Yes, Opus is also somewhat better than the competition (AAC/Vorbis) for music, though not by as large a margin.

Being "used to 256-320k MP3s" doesn't give us enough information to tell you where your optimal point on the bitrate vs quality curve is. If you've been using a modern high-quality MP3 encoder like recent versions of LAME, did a whole lot of blind listening tests, and decided you really needed to encode MP3s at >256kbps even when you're using a space-constrained portable player like the original Clip, that would mean you're more sensitive to coding artifacts than just about anyone on the planet (or that there are only one dozen CDs you will ever want to listen to). The normal recommendation for MP3 stereo music using LAME is -V2 (~190kbps) for very sensitive listening without storage constraints and -V4 (~165 kbps) or lower for portable use.

For Opus, which is considerably better than MP3, I'd recommend you start by encoding one audiobook at 24kbps and some stereo music at 96kbps, do a little listening test (possibly using Foobar2000's ABX tool), and use that information to make a decision on how to encode your whole collection. Depending on your tastes you might want to go as high as 32kbps for audiobooks and 128kbps for music, but I don't think you'll want to go above that for a portable player.

DonP, I think you'll find that increasing the bitrate from 10kbps to 12kbps, which gives mediumband (6kHz) rather than narrowband (4kHz) audio bandwidth, will make your audiobooks sound considerably better without much of a bitrate change. Since most of the energy in sibilants (s-sounds etc) is around 4-6kHz, and since there's energy in that range even for vowels, the quality difference between narrowband and mediumband speech is quite large, probably just as large as the quality difference between mediumband and the superwideband you get at 22kbps.

  • Dynamic
Reply #11
I agree that at its speech-codec settings (when it uses SILK Linear Prediction mode), Opus is far less awful for music than most speech codecs (certainly most of the CELP and GSM types).

The Opus scalable bitrate demo is a good first take on what it sounds like with music with plenty of sparkly percussion and shows where it typically switches bandwidth and stereo mode as it sweeps from 8kbps to 64 kbps. My suggestion for great quality speech and decent music would be 32kbps, so I agree with jensend.

  • DonP
Reply #12
Quote
encoder: Yes, Opus is also somewhat better than the competition (AAC/Vorbis) for music, though not by as large a margin.

DonP, I think you'll find that increasing the bitrate from 10kbps to 12kbps, which gives mediumband (6kHz) rather than narrowband (4kHz) audio bandwidth, will make your audiobooks sound considerably better without much of a bitrate change. Since most of the energy in sibilants (s-sounds etc) is around 4-6kHz, and since there's energy in that range even for vowels, the quality difference between narrowband and mediumband speech is quite large, probably just as large as the quality difference between mediumband and the superwideband you get at 22kbps.


I'll give that a shot.  10 really sounds ok to me though.. probably just low subjective requirements for plain speech. 

I encoded some acoustic guitar music at 64kb/s and it just sounded wrong..  a little mushy.  ABX was 10/10 and I could be pretty sure which was which after listening to only one of X OR Y and not bothering with A and B.  I could also ABX that track 100% with vorbis q=0 (64kb), but with considerably more effort.  150 kb/s Opus is so far transparent to me. 
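To put a number on a 10/10 ABX score: under the usual assumption that a listener who hears no difference guesses each trial with probability 1/2, the chance of scoring that well by luck is a one-sided binomial tail. A minimal sketch (the function name is hypothetical):

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` pure guesses."""
    # Sum the binomial tail for a fair coin: P(X >= correct), X ~ Bin(trials, 0.5).
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(abx_p_value(10, 10))  # 1 chance in 1024 of guessing all ten
    print(abx_p_value(8, 10))   # 8/10 is much weaker evidence
```

A 10/10 result corresponds to 1/1024, so it is very unlikely to be guesswork, whereas 8/10 sits just above the common 5% threshold.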


  • IgorC
Reply #13
Quote
I encoded some acoustic guitar music at 64kb/s and it just sounded wrong..  a little mushy.

1.0.1 is actually restricted VBR. Tonality (guitar as well) is an issue for 1.0.1.
Maybe you will want to try the latest experimental branch, which is pretty good (especially for tonality) at this point.
  • Last Edit: 28 November, 2012, 01:29:16 PM by IgorC

  • eahm
Reply #14
Just tested AAC Apple True VBR Q18 (~75kbps) (the average of the full audiobook was 43kbps) and I couldn't distinguish it from the original. You can go much lower than expected with speech.
  • Last Edit: 28 November, 2012, 02:04:26 PM by eahm

  • jensend
Reply #15
A brief bit about Opus's ability to deal with mixed content: if you have an audiobook with quite a bit of music content, then for the time being, to get the full benefit of Opus's ability to code both speech and music, you need to be using an encoder newer than the one currently offered at opus-codec.org (for instance, this one) and you need to be encoding at 30kbps or higher.

Here's why, in case you're interested in the details. Music-oriented lossy codecs use the MDCT. To enable them to encode quality speech at lower bitrates than MDCT codecs can, speech coders use some variant of linear prediction. (I'll abbreviate that as LP- remember it has nothing to do with vinyl).

Opus has three modes: an LP mode (with bandwidths of either 4, 6, or 8kHz), an MDCT mode (with 4, 8, 12, or 20kHz bandwidth) and a hybrid mode (12 or 20kHz bandwidth). In hybrid mode, the lower frequencies with most of the speech energy (up to 8kHz) are done with LP while the higher frequencies are done with MDCT. Higher bandwidths, as well as using the MDCT mode, need more bits to code well.

The version currently on the website chooses the mode and bandwidth based on the bitrate and allows you to influence that choice a little by using a command line switch to tell it to expect either speech or music. It'll use just one mode and bandwidth for the entire file.

Newer versions remove that switch, instead detecting the type of content automatically, and will switch modes and bandwidths (seamlessly, of course) in the middle of a file if the  content changes. At 20kbps and below, only LP modes are used. For 20-30kbps it'll use hybrid modes. For 30-42kbps it will use the MDCT mode for music and the hybrid mode for speech, switching back and forth based on the content. Above 42kbps there's no longer any benefit to using hybrid mode for speech so it'll just use MDCT all the time.
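The bitrate ranges above can be written down as a small lookup. This is only a restatement of the description in this post, not the decision logic inside libopus; the function name and the exact handling of the 30 and 42 kbps boundaries are assumptions.

```python
def opus_mode(bitrate_kbps, content):
    """Return 'LP', 'hybrid' or 'MDCT' for the ranges described in the post."""
    if content not in ("speech", "music"):
        raise ValueError("content must be 'speech' or 'music'")
    if bitrate_kbps <= 20:
        # 20 kbps and below: linear prediction only.
        return "LP"
    if bitrate_kbps < 30:
        # Between 20 and 30 kbps: hybrid regardless of content.
        return "hybrid"
    if bitrate_kbps <= 42:
        # 30 to 42 kbps: the encoder switches per segment based on content.
        return "MDCT" if content == "music" else "hybrid"
    # Above 42 kbps: MDCT all the time.
    return "MDCT"


if __name__ == "__main__":
    for kbps in (12, 24, 32, 64):
        print(kbps, opus_mode(kbps, "speech"), opus_mode(kbps, "music"))
```

It also shows why 30kbps is the suggested floor for mixed content: below that, music segments never get the MDCT mode.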

  • DonP
Reply #16
Quote
1.0.1 is actually restricted VBR. Tonality (guitar as well) is an issue for 1.0.1.
Maybe you will want to try the latest experimental branch, which is pretty good (especially for tonality) at this point.


Is that "opus-tools_exp_tfself.zip"?  That certainly loosens the reins on the VBR.  Total size for that file is about 25% bigger for the same settings and foobar shows a lot more change in bit rate from frame to frame.

  • DonP
Reply #17
For speech, 12 kb/s Opus is working fine on the portable player (Sansa e200 with Rockbox), but music, not so good.  64kb/s plays, but the controls and display become sluggish (e.g. the new track doesn't display for 20 seconds or so after the first one ends).  128 kb/s hangs up the controls completely; I have to do a hard turn-off (hold the power button for a while) to get out of it.  I hope there's more efficiency to come in the Rockbox decoder.  I do accept that it is part of a development build, not a "stable release".
  • Last Edit: 30 November, 2012, 01:27:56 PM by DonP

  • jensend
Reply #18
Quote
I hope there's more efficiency to come in the rockbox decoder.  I do accept that it is part of a development build, not a "stable release"

There's plenty more efficiency to be had in the decoder. The initial Rockbox Opus work basically just got everything working without doing any optimization. That was sufficient to get better than realtime playback on a lot of the most popular devices, including Sandisk's AS3525v2-based players: the revised Fuze, the revised version of the original Clip (i.e. the ones with 2.xx.xx firmware versions, which were most of the units sold), the Clip+, and the Clip Zip. A good beginning, but just the beginning.

The revised c200 and e200 as well as the original Fuze and Clip use the original AS3525 system-on-a-chip. The CPU difference is small, but the orig. AS3525 has only 1/4 as much RAM, and initially it ran into stack space issues with MDCT Opus modes. A lot of work has been done since then to reduce the stack space required for Opus, and with those improvements these players should be fine.

The original e200 uses a rather different chipset, the pp5024, which looks like it's a fair bit slower but has plenty of RAM. Back at the beginning of October n1s, who's one of the guys working on Rockbox Opus optimizations, said on irc that he was getting faster-than-realtime 64kbps decoding on that chip, but not quite fast enough to leave sufficient CPU for the user interface. That sounds like what you were experiencing. The bit of optimization that's been done since then should be plenty enough to make 64kbps playback smooth and responsive, but 128kbps and up will need more work.

There's tons more optimization work that can be done*, and much of what has been done in the past 6 weeks hasn't made it to Rockbox's mainline development builds yet. If you'd like to know more, or if you'd like to see whether you can be of help with testing, ask around on the rockbox irc channel, forums, or mailing list. (Quite often the people best equipped to answer your question aren't in IRC at the moment, so it may take a while to get an answer, but when they are around it's probably the most convenient method for communicating with them.)

*As one example, getting other transform codecs to work as well as they do in Rockbox required a good bit of device-specific FFT/MDCT optimization, but the code they wrote for that only supports power-of-two sizes, and Opus uses non-power-of-2 FFTs. So right now whenever you ask Rockbox to decode hybrid or MDCT mode Opus, it's using generic code from the mainline libopus for the transform. The libopus code is a good algorithm, but for these kinds of things the difference between good generic code and well-tuned device specific code can be huge.

  • eahm
Reply #19
Quote
Just tested AAC Apple True VBR Q18 (~75kbps) (the average of the full audiobook was 43kbps) and I couldn't distinguish it from the original. You can go much lower than expected with speech.

My test on one audiobook:

FLAC: 2.21GB

AAC-LC True VBR (qaac/Apple) -V18 (~51kbps): 240MB

HE-AAC (qaac/Apple) -v32 --he (~33kbps): 151MB

HE-AAC (fhgaacenc/Fraunhofer) --vbr 1 (~31kbps): 145MB

Opus (0.1.5 from opus-codec.org) --vbr 32 (~34kbps): 152MB

MP3 (LAME 3.99.5) -V 7 (~77kbps): 351MB
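To compare these results at a glance, the sizes can be expressed as a share of the FLAC original. The numbers below are copied from the list above; the short labels, the function name and the conversion 1 GB = 1024 MB are assumptions made for this sketch.

```python
# Assumption: 1 GB = 1024 MB for the FLAC figure quoted above.
FLAC_MB = 2.21 * 1024

# File sizes in MB as reported in the list above.
results_mb = {
    "AAC-LC qaac -V18": 240,
    "HE-AAC qaac": 151,
    "HE-AAC fhgaacenc": 145,
    "Opus": 152,
    "MP3 LAME -V 7": 351,
}


def percent_of_flac(size_mb, flac_mb=FLAC_MB):
    """Size of an encode as a percentage of the lossless original."""
    return 100 * size_mb / flac_mb


if __name__ == "__main__":
    for name, size in sorted(results_mb.items(), key=lambda item: item[1]):
        print(f"{name:18s} {size:4d} MB  {percent_of_flac(size):5.1f}% of FLAC")
```

File size alone says nothing about quality, so this only ranks the encodes by space used.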
  • Last Edit: 04 December, 2012, 06:27:49 PM by eahm

  • IgorC
Reply #20
Quote
Opus (0.1.5 from opus-codec.org) --vbr 32 (~34kbps): 152MB


Post #16

Quote
A brief bit about Opus's ability to deal with mixed content: if you have an audiobook with quite a bit of music content, then for the time being, to get the full benefit of Opus's ability to code both speech and music, you need to be using an encoder newer than the one currently offered at opus-codec.org (for instance, this one) and you need to be encoding at 30kbps or higher.

Or this one
  • Last Edit: 04 December, 2012, 09:05:36 PM by IgorC

  • eahm
Reply #21
Thanks IgorC.

AAC-LC True VBR (qaac/Apple) -V9 (~45kbps): 211MB

AAC-LC True VBR (qaac/Apple) -V0 (~39kbps): 183MB

Opus (opusenc.exe from opus_tools_2012_11_15_sse.zip + DLLs from opusfile-0.2-win32.zip from opus-codec.org) --vbr 32 (~34kbps): 152MB

Opus (opus_v1.0.1_154_g07418d9.zip) --vbr 32 (~34kbps): 152MB

MP3 (LAME 3.99.5) -V 8 (~69kbps): 313MB

MP3 (LAME 3.99.5) -V 9 (~52kbps): 234MB


Not that everyone cares about every single codec but... I just like to test when I have some free time. I would use one of the two HE-AAC, cars with AAC capability will play them.
  • Last Edit: 05 December, 2012, 01:26:45 AM by eahm

  • IgorC
Reply #22
It is recommended to use ABR instead of VBR when encoding with LAME at 100 kbps and lower.
http://wiki.hydrogenaudio.org/index.php?title=LAME
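As a rough illustration of that rule, a script that builds the LAME command line could switch between the two modes like this. The helper name and its parameters are hypothetical; only the `--abr` and `-V` switches are LAME's own, and the 100 kbps cutoff is the one quoted above.

```python
def lame_command(src, dst, target_kbps, vbr_level):
    """Build a LAME argument list: ABR at or below 100 kbps, VBR above."""
    if target_kbps <= 100:
        # Low bitrates: average bitrate mode aimed at the target.
        mode = ["--abr", str(target_kbps)]
    else:
        # Higher bitrates: quality-based VBR at the caller's chosen -V level.
        mode = ["-V", str(vbr_level)]
    return ["lame", *mode, src, dst]


if __name__ == "__main__":
    print(" ".join(lame_command("book.wav", "book.mp3", 64, 7)))
    print(" ".join(lame_command("album.wav", "album.mp3", 190, 2)))
```

The returned list can be handed to `subprocess.run` if LAME is on the PATH; the file names here are placeholders.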

  • eahm
Reply #23
I don't test or keep track of MP3 too much, I'd actually like it to disappear and be replaced by a newer codec like AAC. Thanks for the link though, I've read it once but I didn't remember about the lower bitrate setting.

MP3 (LAME 3.99.5) --abr 96 (~97kbps): 442MB

MP3 (LAME 3.99.5) --abr 64 (~63kbps): 285MB
  • Last Edit: 05 December, 2012, 12:31:10 PM by eahm