Hi, since a Speex decoder is now available for Palm OS as well, I was wondering if I can get some good results for audiobooks. Currently I'm using 22 kHz at around 20 kbps (q -2) with Ogg.
Whatever settings I tried (quality/bitrate/complexity), I can't get Speex to sound better than Ogg at the same bitrate. This doesn't make any sense, because Speex is supposed to be tuned specifically for speech.
Moreover, it doesn't obey my commands. For example, I wanted to force it to use a bitrate of 10 kbps:
--bitrate 10 --vbr --comp 6 o:\test\16khz.wav o:\out.spx
Instead, I get a bitrate of around 20, which sounds much worse than ogg@20kbps. Why?
Can anybody help me?
I think --vbr and --bitrate 10 are not meant for each other.
Can you recommend some good settings then? I've tried the sample files from the speex.org site, and they are pretty good, even at 10 kbps. I just can't get even close to that.
Speex is not meant to be used at 20 kbps. But when somebody has to encode at about 5 kbps, Speex is a very good solution.
Are you sure you want to encode at 10 *bits* per second? Also, the option is --abr when you want to control the VBR rate, so using --bitrate 10000 will already work much better. That said, I personally recommend --vbr --quality X (with X around 6, maybe?) instead. Using 16 kHz, I *think* Speex should be better than Vorbis when you use bit-rates below ~32 kbps. Above that, Vorbis would probably be better.
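As a rough sketch of the two approaches above (the file names are made up for illustration, and quality 6 is only a starting guess, not a tested recommendation), the commands could look like this. It assumes --abr takes its rate in bits per second, so 10 kbps becomes 10000:

```shell
#!/bin/sh
# Convert kbps to the bits-per-second value that --abr expects.
kbps_to_bps() {
    echo $(( $1 * 1000 ))
}

IN="16khz.wav"          # hypothetical 16 kHz WAV input
RATE=$(kbps_to_bps 10)  # 10 kbps -> 10000 bps

# Option 1: average bit-rate mode, targeting roughly 10 kbps.
ABR_CMD="speexenc --abr $RATE --comp 6 $IN out_abr.spx"

# Option 2: quality-based VBR; the bit-rate then follows from the quality.
VBR_CMD="speexenc --vbr --quality 6 --comp 6 $IN out_vbr.spx"

# Print the commands; run them only if speexenc and the input file exist.
echo "$ABR_CMD"
echo "$VBR_CMD"
if command -v speexenc >/dev/null 2>&1 && [ -f "$IN" ]; then
    $ABR_CMD
    $VBR_CMD
fi
```

Comparing the sizes of the two output files against the Vorbis encode of the same input should show whether the encoder is now honouring the target rate.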
I've been doing voice encoding of radio traffic with Ogg Vorbis (-q -1) and Speex for quite a while now. The Speex files come out sounding worse and larger than the Ogg files every single time.
Plus the speex encoder is hella unstable.
I blame the problems on the Speex encoder's immaturity. Maybe I should see if there is a newer version out (I've just been using Vorbis for the past several months... gave up on Speex).
The sad part is that the AGC and the denoising are _really_ helpful.
Nothing scientific, just a bit of anecdotal evidence that may be helpful.
arecord -f S16_LE -r 16000 | oggenc -q -1 - -o "`date +"%F %Hhr %Mmn %Ssc %Z"`.ogg"
arecord -f S16_LE -r 16000 | speexenc -w --quality 6 --vbr --denoise --agc -V - "`date +"%F %Hhr %Mmn %Ssc %Z"`.spx"
I've also given up on Speex, at least temporarily, in favor of Vorbis. I'd been using Speex to encode my audiobook CDs and cassettes ever since Windows binaries of it first appeared, and found its superiority to LAME easily ABX-able (LAME almost always produced an annoying metallic noise at lower bitrates, which Speex did not).
The latest Speex beta, I've found, often produces a similarly annoying noise, which Vorbis does not, and this was again easily ABX-able. Ultimately, I found that while Speex is SUPPOSED to be optimized for bitrates up to 44 kbps and sampling rates of 8, 16, and 32 kHz, it only seems to work consistently well at 8 kHz.
Even for those cases in which I cannot ABX a difference between Speex and Vorbis, I've found that Vorbis rarely produces significantly larger files, so I saw no reason to continue with Speex.
I've wondered if it's the case that most modern audiobook recordings can't be considered "pure speech", even when no background music or sound effects are used.
http://www.vorbis.com/faq/#speech
So... do they tell lies?
I think it kind of depends on which encoder is used. The one at xiph.org isn't that great compared to aoTuV.