Audiobook Encoding From CD

After searching the forums for a while, I came up with some settings for encoding audiobooks.  Unfortunately, when I tried them, the results were not at all what I expected, so I'm looking for help to understand them and refine my options.

The HA Wiki says that --abr is best for the lower bitrates/voice, and that "--preset voice" maps to "--abr 56 -mm".

Most of the forum posts about this are pretty old and focus either on older options, or on options that work around bugs in older versions.  There are also several suggestions about hardware limitations that require specific settings.  Still, almost all of them seem to suggest VBR where possible.  There are also various --lowpass, mono, and resample suggestions.

I am using an MP3 player, and I have yet to come upon any limitation in the files it will play, so that isn't a concern to me.  However, it will only play MP3, no Vorbis, Speex, etc.  My audiobooks are just people talking, so they are pretty simple.  As it is a single person, I'm perfectly fine with mono.  The time to encode is not a concern to me, and they are coming from CD so the starting sample rate is 44.1kHz.  I want the audio to be as close to the audio on the CD as possible, so that distinguishing the two is difficult under normal circumstances, but I want the files to be as small as possible (within that constraint).

I downloaded and installed foobar2000_0.9.4.2.exe and the 3.97 LAME binary from Rarewares.  CDs were ripped using fb2k straight into LAME.  I used the ABX utility in fb2k to confirm what I heard easily with my ears.

I used a base LAME command line of:
-S --noreplaygain -V 3 --vbr-new --lowpass 8 -mm --resample 22.05 -q 0 - %d

This resulted in a somewhat muffled sound.  The hiss in the man's voice, such as on the letter "S", was much less 'hissy'.  I noticed it right away.  I also tried the --abr option listed in the wiki, but it sounded the same.

I found that the resample option and the lowpass option could each create that muffled sound on its own.  If I used neither, the muffling went away.  I did not try varying their values though.  I also found that after removing those options, I could not tell the difference between -V0 and -V6.  I found that I could use this command line:
-S --noreplaygain -V6 --vbr-new -mm -q 0 - %d

And I can't tell the difference between that and the CD; strangely, it is even almost the same bitrate (57 kbps versus 55 kbps from my original command line).  I'm looking for ways to improve this though.  Unfortunately the number of permutations of the various switches is too great for me to just try them all, so I need suggestions of what should work best.
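
To put those two bitrates in terms of space, here is a rough sketch of the arithmetic. The one-hour duration is only an assumption for illustration, and a VBR file's real size depends on its actual average bitrate.

```python
def mp3_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Approximate file size in megabytes (10^6 bytes) at a given average bitrate."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000

ONE_HOUR = 3600  # seconds; assumed duration, not taken from the CD

for kbps in (55, 57):
    print(f"{kbps} kbps for one hour: about {mp3_size_mb(kbps, ONE_HOUR):.2f} MB")
```

So the two command lines differ by less than 1 MB per hour of audio.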

Here is a sample of audio from my CD (4MB).  You could see what I mean in the first 15 seconds of that.

Also, is there a way to just set a quality setting?  I know with Vorbis you have the option of setting a specific quality, and it will encode to that quality no matter what the resulting bitrate.

Reply #1
Look at this:

http://www.hydrogenaudio.org/forums/index.php?showtopic=5716

Personal answer from glen was:

Thank you!
I tried your settings for voice encoding in fastencc.exe:
-dm -hq -br 64000
and they give very nice results, almost perfect compared to the original.


Probably there's something better nowadays, but fastencc.exe worked pretty well 4 years ago.

Reply #2
I've done hundreds of hours worth of spoken audio this way. People who have listened to the results find no fault in clarity, pleasantness, or ease of listening. One has to listen very closely to find any difference from the original audio CD.

-V 8 --vbr-new --resample 22 --lowpass 11 --noreplaygain

Reply #3
-m m -V 2 --vbr-new

should already give comparatively low bitrates on speech.

Just increase the V value (decrease the quality) as you want.

Throw in a --resample if you want to lower the quality still further, but note that lower quality -V settings resample anyway.
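
If you want to step through the V values for listening tests, something like the following could generate the command lines. It is only a sketch: it prints the commands rather than running them, it uses only the switches mentioned in this thread, and the file names are made up for illustration.

```python
def lame_args(v, infile, outfile, resample_khz=None):
    """Build a LAME argument list for mono VBR at quality setting `v`."""
    args = ["lame", "-m", "m", "-V", str(v), "--vbr-new"]
    if resample_khz is not None:
        # Optional: force a lower sample rate, as suggested above.
        args += ["--resample", str(resample_khz)]
    return args + [infile, outfile]

# Hypothetical file names; one command per V setting from 2 (best) to 9 (smallest).
for v in range(2, 10):
    print(" ".join(lame_args(v, "chapter01.wav", f"chapter01_V{v}.mp3")))
```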

AndyH-ha - I think the --lowpass in your command line is redundant. Lame already cuts well below nyquist at V2, never mind at V8!!!!

Cheers,
David.

Reply #4
After a little more testing, I managed to narrow down where I can tell the difference.  I started with this as the base:
-V6 --vbr-new -mm -q 0 --noreplaygain

As mentioned before, I can't tell the difference between this and the CD.  Then I adjusted only the V setting to 7, 8, and 9.  Checking with ABX testing, 7 was indistinguishable for me, and 8 was distinguishable 100% of the time, but it was a pretty small difference.  9 was an obviously different sound.  According to this chart, which is for LAME 3.95.1 but I assume still holds for 3.97, these are the auto adjustments:
Switch               target  lowpass resample
-V 6 --vbr-new         115    16000
-V 7 --vbr-new         100    14900   32000
-V 8 --vbr-new          85    12500   32000
-V 9 --vbr-new          65    10000   24000


So I can't tell the difference if the audio is --resampled to 32kHz, but as I mentioned in my first post I can tell easily at 22.05kHz.  I also can't tell a --lowpass 14900, but I can at 12500. 
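
To see how each lowpass sits relative to the Nyquist limit (half the sample rate), here is a small sketch using the chart values. It assumes that the empty resample cell for -V 6 means the 44.1kHz CD rate is kept; that is my reading, not something the chart states.

```python
CD_RATE_HZ = 44100  # source sample rate from the CD

# (V setting, lowpass in Hz, resample rate in Hz, or None where the chart lists none)
chart = [
    (6, 16000, None),
    (7, 14900, 32000),
    (8, 12500, 32000),
    (9, 10000, 24000),
]

def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def lowpass_fraction(lowpass_hz, resample_hz=None):
    """Lowpass as a fraction of Nyquist; assumes the CD rate when no resample is listed."""
    rate = resample_hz if resample_hz is not None else CD_RATE_HZ
    return lowpass_hz / nyquist_hz(rate)

for v, lowpass, resample in chart:
    print(f"-V {v}: lowpass {lowpass} Hz is {lowpass_fraction(lowpass, resample):.0%} of Nyquist")

# My original command line: --lowpass 8 with --resample 22.05
print(f"original: {lowpass_fraction(8000, 22050):.0%} of Nyquist")
```

In every row the lowpass is the tighter limit, which may be why the lowpass value seems to matter more to my ears than the resample rate itself.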

To test a little further, I compared standard -V8 to -V8 --lowpass 10 and the difference was very pronounced to me.  So it appears that my threshold where I can't tell a difference on this sample is about --resample 32 --lowpass 15.
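
For anyone checking ABX scores like these, here is a minimal sketch of the chance of reaching a given score by guessing alone. The trial counts in the example are assumptions; I did not record how many trials I ran.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers in `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Assumed example scores, not my actual logs.
for correct, trials in [(10, 10), (8, 10), (5, 10)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

A small p-value means the score is unlikely to be luck; a score near half right is what guessing looks like.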

I guess that leaves three questions:

1. Was there a problem with my methodologies or conclusions?
2. Were there some other settings that I should have tried?
3. Why does the HA Wiki suggest --abr for speech if not a single person in the forums suggests it?

Reply #5
Actually, I produce exactly what I want in the WAV file before I give it over to encoding: 22050Hz mono, which of course has the Nyquist limit of 11025Hz. I use the switches to prevent LAME from making any changes. Results might be comparable if I just let LAME have at it, but I know what I want, and I like the results.

Doing this, when I decode the mp3, I can see that the higher frequencies (i.e. 11kHz) have not been reduced significantly.

Reply #6
If you don't want lame to do the filtering that it believes is beneficial, you should/could use -k.

Cheers,
David.