Lame Settings For Speech?

Reply #25
So I have a client that is hell-bent on having their speech encoded with LAME at 8kbps.  Again, since we are going to Flash, it needs to be 8kbps, 11 kHz mono.

JohnV or anyone else, would love for you to experiment with the best settings at these parameters.  I know it won't be pretty, but I'm sure on this board we'll be able to find the best possible setting.

Thanks!

David

Reply #26
Well, to tell you the truth, 8kbps is almost hopeless..
Simple line after some testing:
-b 8 -a --resample 11 --lowpass 4.0

I really couldn't get it noticeably better than that with any switch tweaking... maybe somebody else can try?
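
A rough way to see why 8kbps is so tight is to count how many coded bits are available per audio sample. This is only a back-of-the-envelope sketch: it assumes "--resample 11" means 11025 Hz mono, and the function name is made up for illustration.

```python
def bits_per_sample(bitrate_kbps: float, sample_rate_hz: float, channels: int = 1) -> float:
    """Average number of coded bits available per input sample."""
    return bitrate_kbps * 1000 / (sample_rate_hz * channels)

# Assumption: "--resample 11" selects 11025 Hz, mono.
speech_8k = bits_per_sample(8, 11025)

# Uncompressed 16-bit mono PCM at the same sample rate, for comparison.
pcm_kbps = 16 * 11025 / 1000

print(f"{speech_8k:.2f} bits per sample at 8 kbps")
print(f"about {pcm_kbps / 8:.0f}:1 compression versus 16-bit PCM")
```

With less than one bit per sample to spend, there is very little room for switch tweaking to help.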
Juha Laaksonheimo

Reply #27
8kbps with MP3 is not realistically possible; no matter what you do, it sounds too ugly. The best I could do was 16kbps at 8 kHz (allowing the full 4 kHz bandwidth).

Maybe you could try lowpass 3 or something less, and still keep the samplerate at 11khz.
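
The relation between sample rate and usable bandwidth mentioned here is the Nyquist limit: a signal sampled at a given rate can only carry frequencies up to half that rate. A minimal sketch, assuming "11" stands for 11.025 kHz (the helper names are illustrative only):

```python
def max_bandwidth_khz(sample_rate_khz: float) -> float:
    """Nyquist limit: highest frequency a given sample rate can represent."""
    return sample_rate_khz / 2

def lowpass_fits(sample_rate_khz: float, lowpass_khz: float) -> bool:
    """True if the lowpass cutoff lies at or below the Nyquist limit."""
    return lowpass_khz <= max_bandwidth_khz(sample_rate_khz)

# (sample rate, lowpass) pairs taken from the settings discussed in this thread.
for rate, lowpass in [(8.0, 4.0), (11.025, 4.0), (11.025, 3.0), (11.025, 5.0)]:
    limit = max_bandwidth_khz(rate)
    print(f"{rate} kHz: limit {limit} kHz, lowpass {lowpass} kHz fits: {lowpass_fits(rate, lowpass)}")
```

So a lowpass of 3 or 4 kHz at 11 kHz leaves some of the representable range unused, which is one way to spend the few available bits on less audio bandwidth.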

At least Vorbis handles it better, but Speex is the right tool for this. Speex will have to be supported in hardware in the near future for this kind of use (speech recording, mini-tape replacement).
She is waiting in the air

Reply #28
JohnV,

Just wanted to add my thanks for your work in testing these low bitrates for speech.  I'll definitely use the 16kb/s line for audiobooks, lectures, etc. that I had previously been encoding with lame at 48kb/s.  I think Speex at ~9-10 kb/s compares favorably with your setting and I may go that route eventually, but for current hardware compatibility, this is excellent.
Yeah, when you call my name
I salivate like a Pavlov dog...

Reply #29
Quote
JohnV,

Just wanted to add my thanks for your work in testing these low bitrates for speech.  I'll definitely use the 16kb/s line for audiobooks, lectures, etc. that I had previously been encoding with lame at 48kb/s.  I think Speex at ~9-10 kb/s compares favorably with your setting and I may go that route eventually, but for current hardware compatibility, this is excellent.

Well, you will get even better results if you use those settings with --abr instead of CBR coding. CBR was only used because David needed it for the Flash implementation.
24kbps speech:
--alt-preset 24 -a --resample 22 --lowpass 7 -Z
16kbps speech:
--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3
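
For anyone scripting batch encodes, the two lines above could be kept in a small table and turned into an argument list. This is just a sketch: the switch strings are copied from this post as-is, the file names and the helper are hypothetical, and nothing here checks that a given LAME build actually accepts every switch.

```python
import shlex

# Switch strings copied verbatim from the post; not validated against any LAME version.
SPEECH_LINES = {
    "24kbps": "--alt-preset 24 -a --resample 22 --lowpass 7 -Z",
    "16kbps": "--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3",
}

def build_lame_argv(profile: str, infile: str, outfile: str, lame: str = "lame") -> list:
    """Return an argument list suitable for subprocess.run (nothing is executed here)."""
    return [lame, *shlex.split(SPEECH_LINES[profile]), infile, outfile]

# Hypothetical file names, for illustration only.
argv = build_lame_argv("16kbps", "lecture.wav", "lecture.mp3")
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) would run the encode, assuming a lame executable is on the PATH.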

Also the result should be a bit better, if you use Takehiro's 3.94a:
http://static.hydrogenaudio.org/extra/LAME...-394-alpha2.zip

edit: fixed link to 3.94a and added abr-lines
Juha Laaksonheimo

Reply #30
By the way, I asked the developers of hoeren.zeit.de: they are using a Fraunhofer codec at 24kbps, 16 kHz (stream) / 96kbps, 96 kHz (download), both mono.


I'm no pro at all, but the following command line also worked well:
-b 24 -m m -h --abr 24 -B 64 --resample 16 --lowpass 12 -a --nspsytune --highpass 0.06 --highpass-width 0.1  --athtype 2


But I still like JohnV's command line better (smaller file size, slightly lower quality):
--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3

Some questions:
- Do I need -m m when I'm using -a?
- Do most portable MP3 players accept files with 16kbps ABR and 11 kHz (or similar), or must I stick to MPEG-1 Layer III to stay compatible?
- What settings would you recommend for the strongest MPEG-1 Layer III compression (I guess 32 kHz, 32kbps)?
- Is there a great difference between LAME and a low-bitrate-optimized encoder?

Quote
24kbps speech:
--alt-preset 24 -a --resample 22 --lowpass 7 -Z
16kbps speech:
--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3

- Why are you using '--alt-preset' for 24 and '--abr' for 16kbps ?

Thanks, .lu

Reply #31
Quote
Some questions:

OK, it's been some time since I took part in this discussion, but I'll try to answer.
Quote
- Do I need -m m when I'm using -a?
No, you can use either -m m or -a, but -a is obviously a bit shorter.
Quote
- Do most portable MP3 players accept files with 16kbps ABR and 11 kHz (or similar), or must I stick to MPEG-1 Layer III to stay compatible?
Hmm.. I'm afraid to say anything certain about this. I'd guess that most portables support this.
Quote
- What settings would you recommend for the strongest MPEG-1 Layer III compression (I guess 32 kHz, 32kbps)?
I haven't tested 32kbps at all.. so can't say.
Quote
- Is there a great difference between LAME and a low-bitrate-optimized encoder?
Well, lame is not considered especially good at low bitrates. FhG encoders may do better.
Quote
Quote

24kbps speech:
--alt-preset 24 -a --resample 22 --lowpass 7 -Z
16kbps speech:
--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3

- Why are you using '--alt-preset' for 24 and '--abr' for 16kbps ?
IIRC I liked GPsycho better at 16kbps. --alt-preset is using NSPsytune model.
Juha Laaksonheimo

Reply #32
Quote
OK, it's been some time since I took part in this discussion

I have to admit, I chose a really old one ... 

Quote
portable MP3-Player

Finally, I've found a list of MP3 Players with some basic technical information (played Bitrates, VBR)
Reinhard Hofmann : Portable Mp3 Players (engl) | (ger.)
This might be of some help, but if anybody has more experience, I would be grateful. Is mono a problem? Do MP3 files get larger if I use joint stereo when the source *.wav is mono?

Quote
Well, lame is not considered especially good at low bitrates. FhG encoders may do better.

I've heard about it, but is it a big difference? Are there different FhG encoders? I read about Fastencc (not the Radium hack). Would you suggest this one? How much does it cost?

Surely, it is exciting to push speech compression to its limit. Nevertheless, this is quite theoretical in my case: for high compression I have to use a stream-compatible MP3 file (so no VBR/ABR is possible). Currently I'm using lame 3.90.3 with the command line '-b 24 -q1 -c -a --resample 16 --lowpass 8 --nspsytune'. This is doing all right, but maybe it can still be improved?

Now, I want to make a second MP3 file which sounds much better (none of the 24kbps / 16 kHz files really sounds wonderful), but is still acceptable for normal modem/ISDN users (the download should not take (much) longer than the playtime at 56K). They should also easily be able to listen to the files on a portable player, or burn them as an audio CD.
Currently, I'm playing around with some 64 or 56kbps ABR files. And this is what's most important to me: what command line would you (all of you out there) prefer?
What about: -b 32 -B 160 -a --abr 64 -F --resample 32 --lowpass 16 -h -c
(This is still MPEG-1.)
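
The "download should not take longer than the playtime" requirement comes down to comparing the audio bitrate with the link rate. A simple sketch, assuming nominal link rates (real modem throughput is usually lower, and for ABR the stated bitrate is only an average); the function name is made up for illustration:

```python
def download_to_playtime_ratio(audio_kbps: float, link_kbps: float) -> float:
    """Ratio above 1 means the file takes longer to download than to play."""
    return audio_kbps / link_kbps

# Assumption: nominal rates; actual throughput over a phone line is lower.
LINKS = {"56k modem": 56.0, "ISDN (one 64 kbps channel)": 64.0}

for name, link in LINKS.items():
    for audio in (24, 56, 64):
        ratio = download_to_playtime_ratio(audio, link)
        print(f"{audio} kbps over {name}: {ratio:.2f} x playtime")
```

By this estimate a 64kbps file already exceeds its own playtime on a 56K modem, while 56kbps only breaks even at the nominal rate.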

OK, that's quite a lot for now.
Thanks, .lu

(voice testfiles @ www.tnt.uni-hannover.de/project/mpeg/audio/sqam/ )

Reply #33
Quote
Is mono a problem? Do MP3 files get larger if I use joint stereo when the source *.wav is mono?

For CBR/ABR the bitrate is pretty much fixed, so the size won't change. But any kind of stereo, joint or not, takes more space to describe than mono, even if both channels are identical, so for VBR the size will increase at a fixed quality, and for CBR the quality will decrease at a fixed file size. That said, testing with LAME seems to indicate it's smarter than that: even if you use -m j on a mono source file, it outputs a mono MP3. Other MP3 encoders may allow you to create joint-stereo output from a mono input.
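
The point about CBR size can be made concrete: the size depends only on bitrate and duration, so the channel mode never enters the formula. A rough sketch that ignores tags and header overhead (the function name is illustrative):

```python
def cbr_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate CBR file size; channel mode does not appear in the formula."""
    return bitrate_kbps * 1000 * duration_s / 8

HOUR = 3600  # seconds

# Bitrates mentioned earlier in this thread for speech.
for kbps in (16, 24, 48):
    size_mb = cbr_size_bytes(kbps, HOUR) / 1e6
    print(f"{kbps} kbps: about {size_mb:.1f} MB per hour")
```

For ABR the same estimate holds on average, since the encoder aims for the requested mean bitrate.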

Reply #34
This topic often gets referenced (or at least Dibrom references it) as the definitive statement on voice-only encoding.  It seems, however, entirely focused on the lowest possible kbps settings and the best settings given the bandwidth limitations of dial-up modems.

So, I'm wondering about more ideal/transparent settings: 

1.  What are the best settings if you don't have to be concerned about modem speeds?  What do you want to encode with if you simply want to stick a 4 CD audiobook onto 1 CD for your MP3 player and want near transparency?  For the purposes of this topic, the content is strictly vocal and mono, though it may be a singing voice at times (such as an opera singer reading Hamlet).  For this, imagine it's your favorite singer reading your favorite book and there is occasionally singing, whispering, sighs, etc. and backup voices for the main characters.

2.  There's always talk of resampling in previous discussions as opposed to leaving it at the original sample rate and lowpassing.  With a CD source, is the quality better, for example, at resample 22.05 lowpass 11 than at 44.1 and lowpass 11?  What's the point of resampling if you can just lowpass (since neither size nor encoding speed improves and some decoders may even have more difficulty with non-44.1 rates)?
"All I ask is that composers wash out their ears before they sit down to compose." - Morton Feldman

Reply #35
Hi h.tuehn,

Perhaps my following test can partly help you.
I encoded a voice file from SQAM and a self-made one in two ways:

1. 32 kHz, 80kbps, CBR, mono with FhG's MP3Enc (3.1 Demo - Download, 218KB)
2. 32 kHz, 64kbps, CBR, mono with LAME

Using LAME, I decoded the MP3s back to WAV and burned them as an audio CD (Nero). Four people (non-experts) tried to identify the more heavily compressed one (using excellent speakers). They really struggled; sometimes they 'guessed' right.

To sum up: if you want to compress 4 CDs of WAV (with voice, whispering and singing) onto 1 CD of MP3, I think you are well above any critical file size. I guess you would do well with the standard presets you also use for normal music. What do the others think (as I'm only guessing)? .lu
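
To put a number on the "4 CDs onto 1 CD" case, one can work out the average bitrate that fits. The figures below are assumptions, not from the thread: four full 74-minute audio CDs and a 650 MiB data CD, with filesystem overhead ignored.

```python
def bitrate_budget_kbps(capacity_bytes: float, duration_s: float) -> float:
    """Average bitrate that exactly fills the given capacity over the given duration."""
    return capacity_bytes * 8 / duration_s / 1000

# Assumptions: four full 74-minute audio CDs, one 650 MiB data CD.
total_seconds = 4 * 74 * 60
capacity = 650 * 1024 * 1024

budget = bitrate_budget_kbps(capacity, total_seconds)
print(f"available average bitrate: about {budget:.0f} kbps")
```

Under these assumptions the budget is several times the 64 to 80kbps used in the listening test above, so space is not the limiting factor.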

Reply #36
Quote
What's the point of resampling if you can just lowpass (since neither size nor encoding speed improves and some decoders may even have more difficulty with non-44.1 rates)?


I don't know. People keep saying that Lame is optimized for 44.1 KHz and that you shouldn't use other sampling rates, so it seems logical to just lowpass instead of resampling.

I've tried encoding the "Lord of the Rings" audiobook using this command line:

--alt-preset standard -a --lowpass 10 -b 32

The average bitrate is around 60-70 kbps and it sounds great, so I'll be using this command line with all my audiobooks from now on.

Reply #37
Quote
Quote
- Do most portable MP3 players accept files with 16kbps ABR and 11 kHz (or similar), or must I stick to MPEG-1 Layer III to stay compatible?
Hmm.. I'm afraid to say anything certain about this. I'd guess that most portables support this.

I tried a lot of them: nearly all current MP3 players DO play 16kbps ABR and 11 kHz (or similar) MP3 files.
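
The compatibility question hinges on the fact that the sample rate determines which MPEG audio version a Layer III file uses: 11 kHz files fall under MPEG-2.5, an unofficial extension that a player may or may not support. A small lookup sketch (the table reflects the standard sample rates; the function name is illustrative):

```python
# Sample rate in Hz -> MPEG audio version used for Layer III at that rate.
# MPEG-2.5 is an unofficial extension for the lowest sample rates.
MPEG_VERSION_BY_RATE = {
    32000: "MPEG-1", 44100: "MPEG-1", 48000: "MPEG-1",
    16000: "MPEG-2", 22050: "MPEG-2", 24000: "MPEG-2",
    8000: "MPEG-2.5", 11025: "MPEG-2.5", 12000: "MPEG-2.5",
}

def mpeg_version(sample_rate_hz: int) -> str:
    """Return the MPEG version implied by a Layer III sample rate."""
    return MPEG_VERSION_BY_RATE[sample_rate_hz]

for rate in (11025, 16000, 22050, 32000, 44100):
    print(f"{rate} Hz -> {mpeg_version(rate)} Layer III")
```

So "sticking to MPEG-1 Layer III" means staying at 32 kHz or above, whatever the bitrate.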

PS: Still looking for best lame settings @ 80 kbps...

Reply #38
Quote
--alt-preset 24 -a --resample 22 --lowpass 7 -Z

Hi JohnV

I tried this line out and it didn't work???

Reply #39
Sounds silly but has anybody tried:

--alt-preset voice

I did and it didn't sound too bad but I'm a newbie

Reply #40
DavidHart:

--alt-preset voice
is equal to
--resample 24 --lowpass 12 --noshort yes -mm -b56
(Source: lame.exe --preset longhelp)

It works all right; most command lines discussed here were for less than 56kbps (the bitrate used by --alt-preset voice).

Reply #41
.lu,

Entering
Quote
C:\WINDOWS\lame>lame.exe --preset longhelp

gave me:

Quote
LAME version 3.90.3 MMX  (http://www.mp3dev.org/)

Error: You did not enter a valid profile and/or options with --preset

Available profiles are:

  <fast>        standard
  <fast>        extreme
                 insane
          <cbr> (ABR Mode) - The ABR Mode is implied. To use it,
                             simply specify a bitrate. For example:
                             "--preset 185" activates this
                             preset and uses 185 as an average kbps.

    Some examples:

or "C:\WINDOWS\LAME\LAME.EXE --preset fast standard <input file> <output file>"

or "C:\WINDOWS\LAME\LAME.EXE --preset cbr 192 <input file> <output file>"
or "C:\WINDOWS\LAME\LAME.EXE --preset 172 <input file> <output file>"
or "C:\WINDOWS\LAME\LAME.EXE --preset extreme <input file> <output file>"

For further information try: "C:\WINDOWS\LAME\LAME.EXE --preset help"

C:\WINDOWS\lame>

What am I doing wrong, or is it simply the version of LAME that I'm using?

 