
Topic: mp3 voice

mp3 voice

Does anyone have favorite voice settings for audiobooks on CD?

The --preset voice setting produces unacceptably large files and uses too high a cutoff for my taste.

I just played around with the settings; I have no real background other than scant information from the internet. ATH type 3 (--athtype 3) was definitely better. The --abr setting was necessary to force the bitrate to where I wanted it, and it helped reduce swishing artifacts. On the whole I preferred --vbr-new over mtrh. The --cwlimit value seemed to put emphasis on that frequency. I wasn't too sure about sfb21; it seemed to have something to do with harshness at peaks. Short blocks seemed to increase realism while making things sound pieced together at low bitrates.

Perhaps someone who has worked on the development can explain them better.

My clear favorite, after many hours of experimenting, was 34 kbps mono with a cutoff of 9.1 kHz. On the iRiver/Rio with high-end earbuds, it sounds almost identical to the 40 kbps VBR files with a 10.6 kHz cutoff. Losing 7 hours of playing time per CD is really not worth the tiny improvement in sound, especially in light of portability and sharing.
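
As a rough check of the "7 hours" figure: playing time is capacity divided by average bitrate. The sketch below assumes a 700 MB data CD counted as 700 x 10^6 bytes and ignores file-system and tag overhead. The function name is made up for illustration.

```python
def hours_of_audio(capacity_bytes: int, kbps: float) -> float:
    """Playing time in hours for a given capacity and average bitrate.

    kbps is kilobits per second (1 kbit = 1000 bits), as encoders report it.
    """
    seconds = capacity_bytes * 8 / (kbps * 1000)
    return seconds / 3600


CD_BYTES = 700 * 10**6  # assumption: a "700 MB" CD in decimal megabytes

at_34 = hours_of_audio(CD_BYTES, 34)
at_40 = hours_of_audio(CD_BYTES, 40)
print(f"34 kbps: {at_34:.1f} h, 40 kbps: {at_40:.1f} h, "
      f"difference: {at_34 - at_40:.1f} h")
```

Under these assumptions the gap comes out a little under 7 hours, which is in line with the estimate above.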

The 34 kbps (lol) line is:

%1 %2 --short --athtype 3 --lowpass 9.1 --cwlimit 7.2 --substep 0 --ns-bass 6 --ns-alto 13 --ns-treble 21 --ns-sfb21 6 --strictly-enforce-ISO --vbr-new --abr 32 --verbose --Y -q0 -b 8 -B 40 -m m

(In this one I am trying to reduce the number of short blocks and the related choppy artifacts at this low bitrate, while still using some of them to raise the nominal quality. I have not thoroughly tested whether --substep 0 really is best at this low bitrate.)

The other, more natural-sounding one was:

%1 %2 --short --athtype 3 --ns-sfb21 -3 --lowpass 10.6 --cwlimit 10.7 --verbose --vbr-new --abr 37 --Y -q0 -b 8 -B 64 -m m

(At an ABR of 45, all noticeable artifacts are gone. Yet an ABR of 35 produces a listenable file at an average of 39 kbps with a fairly natural cutoff. I set --cwlimit higher here, since this seemed to help reduce some artifacts, real or imagined.)

Perhaps an expert could comment.

It would be nice if there were a setting to write ID3 tags from file names of the form Artist-title-track.ext.
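
Until such a setting exists, a small wrapper script can do the job. Below is a minimal sketch, assuming file names follow the Artist-title-track.ext pattern exactly and that the encoder accepts LAME's --ta (artist), --tt (title) and --tn (track) tag options. The helper names and the sample file name are invented for illustration.

```python
from pathlib import Path


def id3_args_from_name(filename: str) -> list[str]:
    """Turn 'Artist-title-track.ext' into LAME ID3 tag arguments.

    The artist is split off at the first hyphen and the track at the last,
    so the title may contain hyphens. An artist name containing a hyphen
    would be split in the wrong place, and a name with fewer than two
    hyphens raises ValueError.
    """
    stem = Path(filename).stem
    artist, rest = stem.split("-", 1)
    title, track = rest.rsplit("-", 1)
    return ["--ta", artist.strip(), "--tt", title.strip(), "--tn", track.strip()]


def lame_command(wav: str, encoder_options: list[str]) -> list[str]:
    """Build an argument list: lame <options> <tags> in.wav out.mp3."""
    mp3 = str(Path(wav).with_suffix(".mp3"))
    return ["lame", *encoder_options, *id3_args_from_name(wav), wav, mp3]


print(lame_command("Tolkien-The Hobbit-01.wav", ["--abr", "40", "-m", "m"]))
```

The returned list can be handed to subprocess.run as is. Keeping it as a list rather than one string avoids quoting problems with spaces in titles.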

Also, I had no luck combining a highpass with the lowpass. Every time I tried a highpass of .2 or higher, everything sounded as if it were played on a matchbox. One would think you could exclude everything from 200 Hz down and from 11 kHz up.

Reply #1
Maybe you should try fastenc from FhG at 32 kbps mono.
Be healthy, be kind, grow rich and prosper

Reply #2
I took up the challenge, and it was no contest. The fastenc sounds horrible compared with my settings.

However, fastencc has fewer artifacts than LAME with its settings left untouched. It just has a lower cutoff than mine, by about 1.5-2 kHz, I would guess. Interestingly, I see you can set the bitrate to 34 kbps, but I can find no VBR or ABR mode that gives an acceptable file size.

Also, fastenc at 41 kbps doesn't compare with the naturalness of a same-sized LAME VBR file made with my settings, nor with its high cutoff.

Reply #3
My Gran(dmother) is registered blind and recently received a rather funky audiobook CD player. After the obligatory session of showing her how to use it, I took one of the CDs home and was super impressed with it (well, with the encoding anyway, not the story). I can't remember the sample rate, but it was a 48 kbps FhG MP3, and it sounded excellent. I will post some clips at a later date, just for interest (I'm house sitting at the moment, so it will be a couple of weeks before I get to my home computer).

Reply #4
Well, for what it's worth I've been using the following of late: --alt-preset cbr 48 -a --lowpass 15 -t

  First of all, CBR will completely turn some people off, so this one may not be for you. Here's the simple rationale, and you can decide for yourself: although the audio is downmixed to mono, the line matches --alt-preset's default cutoff for a 96 kbps joint-stereo encoding and automatically downsamples to 32 kHz. There are probably some better options, but this line gave me the most satisfactory results with Rob Inglis' reading of Tolkien's The Hobbit and The Lord of the Rings trilogy, and allows for about 34 hours per 700 MB CD.

    - M.

Reply #5
I'm just using [--abr 40 -h -m m --resample 32 --lowpass 11] for my audiobooks. It doesn't sound too bad, and at 40 kbps I can fit around 7 hours on my 128 MB player. It only accepts MPEG-1 Layer 3, so I can't go lower than 32 kHz.
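
The 32 kHz floor follows from the sample rates each MPEG audio version defines. As an illustration, here is a minimal sketch that picks the lowest sample rate a player can decode while still leaving room for a given lowpass. It uses the simple rule that the sample rate must be at least twice the lowpass frequency; a real encoder also accounts for the filter's transition band, so its own choice may differ. The function name is hypothetical.

```python
# Sample rates (Hz) defined for Layer III in each MPEG audio version.
# MPEG-2.5 is an unofficial extension that not every player supports.
SAMPLE_RATES = {
    "MPEG-1": (32000, 44100, 48000),
    "MPEG-2": (16000, 22050, 24000),
    "MPEG-2.5": (8000, 11025, 12000),
}


def lowest_usable_rate(lowpass_hz: int, supported_versions: list[str]) -> int:
    """Lowest decodable sample rate with rate >= 2 * lowpass (Nyquist)."""
    candidates = sorted(
        rate
        for version in supported_versions
        for rate in SAMPLE_RATES[version]
        if rate >= 2 * lowpass_hz
    )
    if not candidates:
        raise ValueError("no supported sample rate can carry this lowpass")
    return candidates[0]


# A player limited to MPEG-1, with an 11 kHz lowpass:
print(lowest_usable_rate(11000, ["MPEG-1"]))
# A player that also decodes MPEG-2 could drop to a lower rate:
print(lowest_usable_rate(11000, ["MPEG-1", "MPEG-2"]))
```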

Reply #6
Speex, Ogg, mp3PRO?
Why is no one using them?

Since this is the MP3 forum,
I think mp3PRO at 24 kbps mono is COOL ....
If you don't like mono, just encode with LC-stereo (32 kbps).
I think mp3PRO will perform better than LAME.

Reply #7
Quote
Speex, Ogg, mp3PRO?
Why is no one using them?

Since this is the MP3 forum,
I think mp3PRO at 24 kbps mono is COOL ....
If you don't like mono, just encode with LC-stereo (32 kbps).
I think mp3PRO will perform better than LAME.

Yes, but people want compatibility with their mp3 player and such...

Reply #8
Quote
Quote
Speex, Ogg, mp3PRO?
Why is no one using them?

Since this is the MP3 forum,
I think mp3PRO at 24 kbps mono is COOL ....
If you don't like mono, just encode with LC-stereo (32 kbps).
I think mp3PRO will perform better than LAME.

Yes, but people want compatibility with their mp3 player and such...

Those people...
I'm the one in the picture, sitting on a giant cabbage in Mexico, circa 1978.
Reseñas de Rock en Español: www.estadogeneral.com

Reply #9
Quote
Well, for what it's worth I've been using the following of late: --alt-preset cbr 48 -a --lowpass 15 -t

  First of all, CBR will completely turn some people off, so this one may not be for you. Here's the simple rationale, and you can decide for yourself: although the audio is downmixed to mono, the line matches --alt-preset's default cutoff for a 96 kbps joint-stereo encoding and automatically downsamples to 32 kHz. There are probably some better options, but this line gave me the most satisfactory results with Rob Inglis' reading of Tolkien's The Hobbit and The Lord of the Rings trilogy, and allows for about 34 hours per 700 MB CD.

    - M.


Mind if I ask what version of LAME you're using to do this?  I just tried to encode a Tony Robbins wav file to mp3 with that line (lame.exe --alt-preset cbr 48 -a --lowpass 15 -t) and I get a warning telling me I can't use anything but CBR between 80 and 320.  What am I doing wrong here?

Thanks for any help!
The difference between genius and stupidity?

Genius has limits.

Reply #10
Quote
Speex, Ogg, mp3PRO?
Why is no one using them?

I've tried Speex. It's awesome but still lacks popular support (maybe the code will improve after some time).

Thanks to John33 for the front end; the command-line encoder didn't accept WAV.
Hong Kong - International Joke Center (after 1997-06-30)

Reply #11
Quote
The fastenc sounds horrible compared with my settings

Probably because you are using the default (and buggy) low-quality mode.

For quality and compatibility (if you need to play voice on your portable MP3 player) I'd use Fraunhofer's fastenc 1.02 at a low bitrate in high-quality mode.
Example:  "fastencc audio.wav audio.mp3 -dm -hq -br 32000"

-dm = downmix stereo to mono; use it only if the original material is stereo

You could also try newer fastenc encoders (inside Cool Edit Pro 2 or MMJB 7.x).

Reply #12
Quote
Quote
Well, for what it's worth I've been using the following of late: --alt-preset cbr 48 -a --lowpass 15 -t

  First of all, CBR will completely turn some people off, so this one may not be for you. Here's the simple rationale, and you can decide for yourself: although the audio is downmixed to mono, the line matches --alt-preset's default cutoff for a 96 kbps joint-stereo encoding and automatically downsamples to 32 kHz. There are probably some better options, but this line gave me the most satisfactory results with Rob Inglis' reading of Tolkien's The Hobbit and The Lord of the Rings trilogy, and allows for about 34 hours per 700 MB CD.

    - M.


Mind if I ask what version of LAME you're using to do this?  I just tried to encode a Tony Robbins wav file to mp3 with that line (lame.exe --alt-preset cbr 48 -a --lowpass 15 -t) and I get a warning telling me I can't use anything but CBR between 80 and 320.  What am I doing wrong here?

Thanks for any help!

Sorry; I should have specified. That setting will not work with LAME 3.90.2 (or, presumably, 3.90.3), but it will work with 3.92 and higher. If you prefer the safety of Dibrom's compiles for standard encoding but want to try a lower-bitrate --alt-preset for audiobooks, Speek's ALL2LAME frontend lets you switch EXEs easily via the "Locations" button (on my system 3.90.2 is "lame.exe," while 3.92 is "lame392.exe").

    - M.

Reply #13
"--preset 40 -mm"

Reply #14
Excellent.  Encoding some test tracks now.  But what's that "--preset 40 -mm" blurb you added to that last message?  I'm confused now (but that's pretty easy to do sometimes). 
The difference between genius and stupidity?

Genius has limits.

Reply #15
Get 3.93.1 and use "--preset 40 -mm". That is probably what you want to achieve.

Reply #16
BEAUTIFUL!!! Works like a charm, thanks... just perfect for my situation.
The difference between genius and stupidity?

Genius has limits.

Reply #17
I don't want to say WMA is good, but for such purposes it must be better than MP3 (especially if you have an iRiver/RioVolt, which can play WMA). Of course Speex is better, but WMA has hardware support.
Ogg Vorbis for music and speech [q-2.0 - q6.0]
FLAC for recordings to be edited
Speex for speech

Reply #18
So what are the switches used in "--preset 40"?

As far as I know, only --aps --ape and --api use code level tweaks, so there must be a commandline that produces the same results as "--preset 40".

Reply #19
Yes, there is a command line equivalent to "--preset 40 -mm", but I do not understand the purpose of using it instead of the preset.
If you want some information on what the encoding parameters are, you can add --verbose to any command line.

Reply #20
I'm just curious, that's all.

Didn't know about the --verbose switch.

Reply #21
But "--preset 40" is optimized for stereo files, right? That means 20 kbps/channel. Adding -mm to the command line converts it to mono, which means 40 kbps/channel.

Wouldn't it be "more right"  to use the same lowpass (and other settings) as "--preset 80"? Or maybe "--preset 64", since the presets use joint stereo and not full stereo?

Something like "--preset 64 -m m --abr 40"
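
The arithmetic behind that guess can be written out as a small sketch. The joint_stereo_saving parameter is purely an assumption for illustration: it stands for the fraction of bits joint stereo might save over two independent channels, and 0.2 is chosen only because it lands on the 64 kbps figure. It is not a measured property of any encoder.

```python
def per_channel_kbps(total_kbps: float, channels: int) -> float:
    """Bit budget per channel if the bits are shared evenly."""
    return total_kbps / channels


def stereo_equivalent_kbps(mono_kbps: float, joint_stereo_saving: float = 0.0) -> float:
    """Stereo bitrate that gives each channel roughly the mono budget.

    joint_stereo_saving is a hypothetical fraction (0.0 = full stereo).
    """
    return mono_kbps * 2 * (1 - joint_stereo_saving)


print(per_channel_kbps(40, 2))           # stereo at 40 kbps
print(per_channel_kbps(40, 1))           # mono at 40 kbps
print(stereo_equivalent_kbps(40))        # full-stereo equivalent
print(stereo_equivalent_kbps(40, 0.2))   # with an assumed 20% joint-stereo saving
```

With no saving the equivalent is the 80 kbps preset; with the assumed 20% saving it is 64 kbps, which is the reasoning behind borrowing the lowpass of "--preset 64".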

Reply #22
You are right. With 3.93.1, you will need to raise the lowpass by adding --lowpass xxx.
I do not have 3.93.1 here, so my suggestion is:

use --preset 64, and look at the displayed lowpass
then use --preset 40 -mm --lowpass xxx, with xxx being the lowpass value of the 64 kbps preset.

With 3.94, this will be automatically handled.
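
For anyone who wants to script the two steps, here is a minimal sketch. It assumes the encoder reports its lowpass in a line shaped like SAMPLE_LINE below; check the real output of your version before relying on the pattern. The frequencies in the sample line are made-up illustration values, not the actual lowpass of the 64 kbps preset. Taking the lower edge of the transition band as the --lowpass value is also an assumption, and in.wav / out.mp3 are placeholder file names.

```python
import re
from typing import Optional

# Assumed shape of the encoder's lowpass report; the numbers are invented.
SAMPLE_LINE = "Using polyphase lowpass filter, transition band: 15115 Hz - 15648 Hz"


def lowpass_khz_from_log(log_text: str) -> Optional[float]:
    """Return the lower edge of the reported lowpass in kHz, or None."""
    match = re.search(r"lowpass filter, transition band:\s*(\d+)\s*Hz", log_text)
    if match is None:
        return None
    return int(match.group(1)) / 1000


def mono_command_with_lowpass(lowpass_khz: float) -> list[str]:
    """Argument list for --preset 40 -mm with an explicit lowpass in kHz."""
    return ["lame", "--preset", "40", "-mm",
            "--lowpass", f"{lowpass_khz:g}", "in.wav", "out.mp3"]


khz = lowpass_khz_from_log(SAMPLE_LINE)
if khz is not None:
    print(mono_command_with_lowpass(khz))
```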

Reply #23
What's the frequency range of a human voice anyway? I've been using --lowpass 10 for my speech encodings and the results were quite good.

But is 10 kHz necessary? I never really experimented with different lowpasses. I just tried 10 kHz and it sounded decent, so I didn't bother testing anything else.

 

Reply #24
Back home now. The sample in question was in fact a 44 kHz, 56 kbps FhG MP3. I have a 323 KB sample of it, and the sound quality (well, on the first bit, where a proper microphone or whatever was used) is absolutely superb, imo.

  I haven't got anywhere to upload it at the moment. If someone wants to offer, I can email it so the rest of you can have a listen and agree or beg to differ.