Lame resamples at 64 kbps

I'm using Lame v3.96.1, built by moi on Windows.

I'm converting standard CD wav files using "lame -h -b 64" and Lame insists upon resampling from 44.1 to 24 kHz.  If I attempt to override using "-s", still no joy.  If I let Lame use its default 128 kbps, it leaves the sampling at 44.1.

Can anyone explain this, and tell me how to force 44.1 kHz?

Why would I want 64 kbps?  The spoken word, which is still not dead.  Thanks for any advice.

Reply #1
Question: Why do you need a 44.1KHz sampling rate for voice/speech?

Reply #2
Quote
Question: Why do you need a 44.1KHz sampling rate for voice/speech?

Short answer: a somewhat knowledgeable friend told me that the standard for audiobooks is 44.1 kHz and 64 kbps, and I wanted to fit with that.

I am not expert in audio encoding, but in addition to meeting the convention, my thinking is:
1. The kbps determines the file size.
2. I need smallish file size, hence 64 kbps.
3. Why not encode using the existing 44.1 if that will give even a tiny improvement?
4. I have read that sampling is best at multiples of 11, so 24 kHz irked me.

Please enlighten me if I am mistaken!  Thanks.
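
Point 1 can be checked with a little arithmetic. The following is only a rough sketch, assuming constant bitrate and ignoring tags and frame padding; the function name is made up for illustration. Note that the sampling rate does not appear in the calculation at all, so keeping 44.1 kHz would not change the file size, only how the fixed bit budget is spent.

```python
def cbr_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bitrate stream (hypothetical helper)."""
    # MP3 bitrates use decimal kilobits: 1 kbps = 1000 bits per second.
    return bitrate_kbps * 1000 / 8 * duration_s


one_hour_s = 3600
for kbps in (64, 128):
    megabytes = cbr_size_bytes(kbps, one_hour_s) / 1e6
    print(f"{kbps} kbps: about {megabytes:.1f} MB per hour")
```

At 64 kbps this works out to roughly 28.8 MB per hour of audio, half of what 128 kbps needs.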

Reply #3
Lame wouldn't do the resampling for no reason. At that bitrate, 24 kHz audio is easier to handle than the original 44.1 kHz. The lowpass is going to be below 12 kHz anyway, so a higher sampling rate would not improve quality and would just waste precious bits.
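
To put a rough number on "wasting precious bits", here is a back-of-the-envelope sketch. It assumes the bit budget is spread evenly over every sample of a stereo stream, which is not how MP3 actually allocates bits (it is a transform codec), so treat the figures as an intuition aid only. The function name is invented for this example.

```python
def bits_per_sample(bitrate_kbps: float, sample_rate_hz: int, channels: int = 2) -> float:
    """Average bit budget per sample per channel (hypothetical helper)."""
    return bitrate_kbps * 1000 / (sample_rate_hz * channels)


for rate_hz in (44100, 24000, 22050):
    budget = bits_per_sample(64, rate_hz)
    print(f"{rate_hz} Hz: {budget:.2f} bits per sample per channel")
```

At 64 kbps stereo, 24 kHz leaves nearly twice the budget per sample that 44.1 kHz does.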

Reply #4
Quote
Quote
Question: Why do you need a 44.1KHz sampling rate for voice/speech?

Short answer: a somewhat knowledgeable friend told me that the standard for audiobooks is 44.1 kHz and 64 kbps, and I wanted to fit with that.


There's a standard for encoding audiobooks?  Why?

Quote
I am not expert in audio encoding, but in addition to meeting the convention, my thinking is:
1. The kbps determines the file size.
2. I need smallish file size, hence 64 kbps.
3. Why not encode using the existing 44.1 if that will give even a tiny improvement?
4. I have read that sampling is best at multiples of 11, so 24 kHz irked me.

Please enlighten me if I am mistaken!  Thanks.


Regarding 3, you're assuming that a higher sampling rate is better.  That's only true if you have the bitrate to encode it properly, and something in those higher frequencies worth encoding.  Spoken words are generally < 5 kHz.  A 44.1 kHz sampling rate is tremendously higher than required.  Worse, it will waste your limited bitrate.

Regarding 4, there's no reason to worry about that.  However, given that 24 kHz is probably higher than you need, 22 or 11 kHz might be a better choice.  I'm not sure how well Lame will work at those sampling rates, though, because I've never tried it.  Might be worth doing a search here for more info.

Edit:  brackets

Reply #5
You would most likely get a better result if you encode the material as mono. Add "-mm" switch to your command line. If it still resamples, then also try adding "--resample 44"

Reply #6
Btw, the "-s" switch only tells lame the frequency of the input file - not what you want in your output. Only use this when encoding from raw files or if you otherwise have problematic wav files which don't have the sampling frequency set properly in the header.

Code:
    -s sfreq        sampling frequency of input file (kHz) - default 44.1 kHz

Reply #7
I think "--resample 22" would make more sense for you. That's what all the audiobooks I have use. It wouldn't hurt the quality, since voice is at most up to 3.5 kHz, which is covered by the 11.025 kHz range of that sampling rate. It would also probably be more compatible with players than 24 kHz, and easier for LAME to resample.

No need to go 44.1, as mentioned by others.

Reply #8
Quote
since voice is at most up to 3.5kHz

Where have you got this info from?

I thought 3.5 kHz was the limit up to which you have to keep everything, or else the voice is severely distorted. That is not the same as saying the human voice can't produce sounds above 3.5 kHz, iirc. Fricatives should produce quite a lot of energy at much higher frequencies... And I'm pretty sure I could back that up with ABX tests, if only I had some voice samples here...

Reply #9
The original question lacks some information. It was not mentioned what the source exactly contains. Is it just plain speech or does it contain music or sound effects too? Is it mono (two identical channels in a stereo wave file) or is it stereo (with more or less channel separation between the channels)? If it is stereo with perhaps meaningless channel separation would it be fine to encode it in mono mode? Also, the purpose of the encoding was not specified. How is it going to be used? Is it going to be listened to personally or is it going to be distributed and needs to meet some specifications stated elsewhere.

As mentioned before some switches can change the LAME behavior. I would recommend trying a few different switches and testing the audio quality by listening to the files. Also, the intended usage should be tried. For example, how a portable player works with the files.

Here are some possible switch combinations:

-b 64 -h --resample 44
As said before, this switch would keep the original 44.1 kHz sample rate. If the file has stereo content with channel separation the overall audio quality is likely to be lower than without the resample switch.

-b 64 -h -m m
This switches to mono encoding. The encoder will not change the original 44.1 kHz sample rate.
(It "assumes" the mono files have more available space for audio data at the same bitrate, which is not exactly true if the original stereo wave file is actually 2x mono because LAME uses the joint stereo mode by default and can effectively combine the two identical channels automatically.)

-V8, -V9, -V8 -m m and -V9 -m m
Low bitrate VBR switches, perhaps worth trying. Again, the -m m switch makes the files mono.

--abr 64 or --abr 64 -m m
ABR mode. I would try these first when seeking the best quality/size ratio for speech. If this produces a lower bitrate than 64 kbps with speech (would that be unwanted?) the value can be changed. For example, --abr 89 -m m is a valid switch.

Edit: a couple of typos
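
For anyone who wants to try all of the combinations above in one go, here is a minimal sketch that only builds and prints the command lines; it does not run the encoder. The file names input.wav and test_NN.mp3 are placeholders, and the helper name is invented for this example. The switch strings are copied from the list above.

```python
import shlex

# Switch sets copied from the post above.
SWITCH_SETS = [
    "-b 64 -h --resample 44",
    "-b 64 -h -m m",
    "-V8",
    "-V9",
    "-V8 -m m",
    "-V9 -m m",
    "--abr 64",
    "--abr 64 -m m",
]


def build_commands(source="input.wav"):
    """Return one lame argument list per switch set (file names are placeholders)."""
    commands = []
    for index, switches in enumerate(SWITCH_SETS, start=1):
        target = f"test_{index:02d}.mp3"
        commands.append(["lame", *shlex.split(switches), source, target])
    return commands


for command in build_commands():
    # Printed for copy and paste; nothing is executed here.
    print(shlex.join(command))
```

Numbering the outputs makes it easier to compare the resulting files by ear and on the target player afterwards.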

Reply #10
Quote
Quote
since voice is at most up to 3.5kHz

Where have you got this info from?

I thought 3.5 kHz was the limit up to which you have to keep everything, or else the voice is severely distorted. That is not the same as saying the human voice can't produce sounds above 3.5 kHz, iirc. Fricatives should produce quite a lot of energy at much higher frequencies... And I'm pretty sure I could back that up with ABX tests, if only I had some voice samples here...

Yeah, you're right. I put it the wrong way. Should have said "where human voice is mostly kept within".
Higher frequencies are indeed produced (see http://www.transom.org/tools/editing_mixing/200402.voiceprocessing.html), but not a lot, when it comes to regular talking.

But you would agree that human voice fits snugly into the 11.025 kHz range, right? Especially when it's not singing, just talk.

Reply #11
Guys,

Thanks a million to all of you.  I read all the responses carefully, and now understand things well enough.  I think I will accept the Lame defaults such as 24 kHz sampling.  When I said that 44.1 was "standard" for audiobooks I only meant that I was told that it was a convention.  But I am now skeptical and will check with my source...

Quote
The original question lacks some information. It was not mentioned what the source exactly contains. Is it just plain speech or does it contain music or sound effects too?

The usual sort of dilemma.  Mostly just a mono voice, which might span 20 regular CDs.  However, right now I'm listening to a reading from the BBC that has some stereo music and I'd hate to lose any quality...

Quote
Here are some possible switch combinations:

Tried 'em.  I think I'll use "--abr 64 -h" and leave well-enough alone with the 24 kHz sampling.

Thanks again.

Reply #12
Quote
But you would agree that human voice fits snugly into the 11.025 kHz range, right? Especially when it's not singing, just talk.

I agree that you can transmit voice over a channel with 3.5 kHz bandwidth. Telephone is one example of that. Most of the time you can hear a voice clearly enough through the telephone, but I would also say that it's pretty far from good quality. A bandwidth cap anywhere up to ~10 kHz is imho easily detected. That would of course translate to using a sampling frequency over 20 kHz, according to the Shannon/Nyquist theorem. A bandwidth limit above 10 kHz may also be possible to detect, but since I can't try it right now I'd better not swear to it.
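
The sampling-rate arithmetic can be sketched as follows. This assumes the ideal Nyquist condition (sampling rate strictly greater than twice the bandwidth) and the nine sampling rates defined for MPEG-1, MPEG-2 and MPEG-2.5 Layer III; the function name is invented for this example. In practice the encoder's lowpass sits somewhat below half the sampling rate, so the usable bandwidth is a little less than the ideal figure.

```python
# Sampling rates available to MP3 (MPEG-1, MPEG-2 and MPEG-2.5 Layer III), in Hz.
MP3_SAMPLE_RATES_HZ = (8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000)


def lowest_rate_for(bandwidth_hz: float):
    """Lowest MP3 sampling rate whose Nyquist limit exceeds bandwidth_hz, or None."""
    for rate_hz in MP3_SAMPLE_RATES_HZ:
        if rate_hz > 2 * bandwidth_hz:
            return rate_hz
    return None


for bandwidth_hz in (3500, 10000, 11025):
    print(f"{bandwidth_hz} Hz bandwidth -> {lowest_rate_for(bandwidth_hz)} Hz sampling rate")
```

So telephone-grade 3.5 kHz speech fits even in 8 kHz sampling, while keeping everything up to 10 kHz needs at least 22.05 kHz.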

 

Reply #13
How about using a codec more suited for voice or low bandwidth?
Speex/Vorbis/(HE)AAC