Best way to encode voice with vorbis.

Hello,

I do an audio-blog-style show with a friend of mine (http://www.n37radio.com) and I wanted to know what everyone's opinion is on the best settings for encoding voice with Vorbis. Right now I'm just using straight quality 1 encoding. I would like to get closer to a -q 0 setting, but when I try this it seems like Vorbis just goes wild on the compression of the voice. The intro music at quality 0 is actually "good enough", but as soon as the music fades out and we start talking, Vorbis just tries to throw too much away.

Any suggestions would be appreciated.


Jonathan

Reply #1
try speex.

Reply #2
If your focus is on voice, resample the input to a lower sampling rate. Try 8kHz, 11.025kHz, 16kHz, 22.05kHz, and 32kHz, and see what suits you best.

Reply #3
Quote
try speex.


While I think Speex is awesome and would personally download Speex versions of the show, I don't know how popular that would be with other listeners. Maybe when I get the torrent tracker set up I might offer a Speex copy for download over BitTorrent only, but right now it would just take up valuable web host space.

Thanks for the feedback.

Jonathan

Reply #4
Quote
Quote
try speex.


While I think Speex is awesome and would personally download Speex versions of the show, I don't know how popular that would be with other listeners. Maybe when I get the torrent tracker set up I might offer a Speex copy for download over BitTorrent only, but right now it would just take up valuable web host space.


The only thing the receivers need to do is download the DirectShow Ogg codecs for Windows Media Player:

http://www.illiminable.com/ogg/

On your end, you need that plus a speex converter. Besides the speex distribution itself, dBpowerAMP has a speex codec.

As for space, Speex compresses voice recordings more efficiently than Vorbis or MP3 ever could. It would be the Vorbis copies that take up valuable web host space.
FLAC – all your bit are belong to you

Reply #5
@Involarius: Not everybody uses WMP, and not everybody uses Windows. Speex also has no hardware support that I'm aware of.

Vorbis can be used for this; the nifty bit is that you can adjust the quality down _below_ 0. -2 is the lowest supported at the moment, which equates to 32 kbps at 44.1 kHz stereo, but you can go down to about 16 kbps if you use mono and downsample, and still get very good results for voice.
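
For a rough sense of what those bitrates mean for hosting space, here is a small Python sketch that converts an average bitrate into a file size. The 30-minute episode length and the function name are assumptions picked for illustration, and real Vorbis files will vary because the -q modes are variable bitrate.

```python
def estimated_size_mb(bitrate_kbps, minutes):
    """Approximate file size in megabytes (10**6 bytes) at an average bitrate."""
    total_bits = bitrate_kbps * 1000 * minutes * 60
    return total_bits / 8 / 1_000_000


# Bitrates quoted in the post above; the 30-minute length is hypothetical.
for label, kbps in [("44.1 kHz stereo at q -2", 32), ("mono, downsampled", 16)]:
    size = estimated_size_mb(kbps, 30)
    print(f"{label}: about {size:.1f} MB per 30 minutes")
```

Halving the average bitrate halves the file, so the mono, downsampled setting would take roughly half the space of the 32 kbps stereo one for the same episode.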

I wouldn't recommend speex for this unless you absolutely require the smallest possible file. It's simply too unknown and unsupported to be of much use, imho.

People have asked about this before, see:
http://www.hydrogenaudio.org/forums/index....showtopic=37593
http://www.hydrogenaudio.org/forums/index....showtopic=36879

And I can highly recommend reading http://www.hydrogenaudio.org/forums/index....showtopic=15049 for some good information on how best to use the encoder. I think the aoTuV beta 4 version is best suited to your needs.
Veni Vidi Vorbis.

Reply #6
Quote
try speex.


Are aoTuV b4 and the last official version already optimized for sampling rates other than 44.1 kHz?
Ogg Vorbis for music and speech [q-2.0 - q6.0]
FLAC for recordings to be edited
Speex for speech

Reply #7
>Not everybody uses WMP, and not everybody uses Windows

Sadly if you take the majority as 'everybody' then I think yes:

Everybody uses Windows
Everybody uses WMP (the most popular would be split between WMP and iTunes; Winamp is losing ground these days).

This is especially true given the web integration (click something such as an .mp3 in Internet Explorer and Windows Media Player pops up by default), which makes people use it, even if by accident.

Reply #8
The majority use Windows (I do, though I may jump ship to Linux some time in the future if Microsoft's DRM schemes turn out nasty). Explaining how to support Speex on Windows is helpful to a lot of people. I expect those who use Linux have Speex support built into the OS, or will have it soon (GNOME 2 already supports Vorbis and FLAC natively). As for Macintosh, I don't know, though open-source goodies usually trickle down to OS X pretty fast.

Hardware ... ah, well, that's pretty much stuck with MP3 for most cases. Nothing much you can do but use LAME's presets for voice encoding or phone quality.
FLAC – all your bit are belong to you

Reply #9
It highly depends on the audience of his blog, however.

But yes, for the internet as a whole, that is the sad state of affairs.

Whether Vorbis is optimised for other sample rates, I have no idea; I just know 16000 Hz is more than enough to accurately represent a human voice, and it shaves off bits in the process. I suppose one could choose not to resample and lowpass at 7 or 8 kHz instead, but there doesn't seem to be a --lowpass option in oggenc2.6.
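
The reasoning behind the sample-rate suggestions is the Nyquist limit: a stream sampled at a given rate can only carry frequencies up to half that rate. A minimal Python sketch listing the rates mentioned in this thread (the function name is made up for illustration):

```python
def nyquist_khz(sample_rate_hz):
    """Highest frequency (in kHz) representable at the given sample rate."""
    return sample_rate_hz / 2 / 1000


# Sample rates suggested earlier in the thread, plus the usual 44.1 kHz.
for rate in (8000, 11025, 16000, 22050, 32000, 44100):
    print(f"{rate} Hz -> content up to {nyquist_khz(rate):.3f} kHz")
```

At 16000 Hz the ceiling is 8 kHz, which lines up with the 7 or 8 kHz lowpass mentioned above. The encoder may apply its own, lower lowpass on top of that.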
Veni Vidi Vorbis.

Reply #10
Quote
... But there doesn't seem to be a --lowpass option in oggenc2.6.

It's one of the advanced encode options.

Reply #11
Quote
Quote
try speex.


Are aoTuV b4 and the last official version already optimized for sampling rates other than 44.1 kHz?

It's primarily optimized for 44.1kHz, but it works fine with all sampling rates from 6kHz to 48kHz. I haven't tried anything higher.

edit: Um, I tested 6 kHz with Floggy, not aoTuV or the current versions, so the above is slightly incorrect. But I suppose you get what I mean.

Reply #12
Quote
Hello,

I do an audio-blog-style show with a friend of mine (http://www.n37radio.com) and I wanted to know what everyone's opinion is on the best settings for encoding voice with Vorbis. Right now I'm just using straight quality 1 encoding. I would like to get closer to a -q 0 setting, but when I try this it seems like Vorbis just goes wild on the compression of the voice. The intro music at quality 0 is actually "good enough", but as soon as the music fades out and we start talking, Vorbis just tries to throw too much away.

Any suggestions would be appreciated.


Jonathan

For voice at low q, you'd be better off encoding in mono rather than stereo.

Reply #13
Quote
It's one of the advanced encode options.


I suspected that, but I couldn't find it in oggenc2's help, and I couldn't find a list of advanced encode options at the time either, so I gave up. Perhaps it should be in the help, as lowpassing is a pretty common tuning knob across various codecs. Got it now, though.

I did a short test: resampling to 16 kHz reduces the lowpass to 6 kHz. Not resampling, but instead lowpassing at that frequency, produced an 8% higher bitrate. I suppose you could do that for the sake of staying on the only really proven sample rate.

Does anyone know how the Vorbis DAPs handle sample rates other than 44.1 kHz?
Veni Vidi Vorbis.


Reply #15
I listen mostly to audiobooks, and only use Ogg for the ones wherein music is vital, like this one, or where stereo effects are used to good effect.
Ogg Vorbis for speech is total overkill and probably (needlessly) reduces your audience.  MP3 is what you need.

Reply #16
Unless you meant mp3PRO, I disagree with you. In this test you can see Vorbis doing noticeably better than LAME for both male and female speech. Sorry, but making harsh statements without any statistical data to back them up is generally not appreciated on these forums.
Edit: And if you did mean mp3PRO, that's still not a very good idea IMO, because a big part of the audience may not have the plugin.

Reply #17
I've just checked my Ubuntu Linux LiveCD: GNOME 2 supports Speex out of the box. So anyone who's planning on Speex has one less platform to worry about, and just needs to supply the Windows DirectShow codec for those still living under the Empire. Mac-ites will probably be able to find a Fink port for their platform.

Ah, the beauty of open formats...
FLAC – all your bit are belong to you

Reply #18
Thank you everyone for your feedback (except for riggits). From most of the feedback I'm getting, it seems all you can really do is play with the sample rate and the quality setting, which I have done. I'm currently using these settings to encode the Vorbis streams:

oggenc -q 1 --resample 32000

Anything below 32000 really starts to bother me in terms of quality, and anything below -q 1 does the same. It's odd, because during the music opening it sounds decent even at 16 kHz, but when it hits the voice section and the music goes away, it seems like Vorbis really lays on the compression. It makes the voice sound very (for lack of better words) "digitally processed". I'm still open to suggestions on how to fix this problem, but for right now 32 kHz at quality 1 is the treatment it's getting.

Quote
For voice at low q you'd better encode in mono not in stereo..

I realize that I could be saving space by downmixing to mono. I know this may seem silly, but I want to keep the stereo stream. I do some very minor things to the stereo stream that I don't want to discard. The MP3 stream is mono because, well, I just don't care as much about the MP3 stream and the people who would pick that over the Vorbis download.

To address all the Speex comments that I got: yes, I already have access to both an encoder and a decoder for Speex on all the platforms I use, which include but are not limited to Linux, FreeBSD, OpenBSD, OS X, and Windows. Currently the show is produced on my PowerBook using OS X. Props to the DarwinPorts people, so I don't have to port things myself. No, I don't use Fink; it makes me want to hurt things.

I think HbG was correct in his statement about speex and hardware support. At least with vorbis there are a handful of portables that will play it.

Thanks again everyone for all the comments.

Reply #19
Quote
oggenc -q 1 --resample 32000

Anything below 32000 really starts to bother me in terms of quality

You might want to try an external resampler like SSRC.  Oggenc's internal resampling is generally believed to be rather poor.


Reply #21
As Olive said, oggenc2 uses a proper resampler, so you needn't worry about that. You could even add a -S 0 switch if you want to make sure it uses the highest quality resampling.

Perhaps there should be a "vorbis uses libsamplerate awareness campaign".

Thesaint, an issue may be that Vorbis somehow spends too few bits on the voice. You could try ABR mode; it is generally not recommended, but it may help in your case. Try something like: -b 64 --managed --resample 32000. Encoding will be _slow_, though.
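
To keep the variants suggested in this thread straight, here is a hedged Python sketch that only builds the oggenc argument lists and prints them; it does not run the encoder. The flags (-q, -b, --managed, --resample, --downmix, and --advanced-encode-option lowpass_frequency=N, with N in kHz) are the ones documented for the vorbis-tools oggenc. Whether a particular oggenc2 build accepts all of them is an assumption to check against its own help. The file names show.wav and show.ogg and the helper name oggenc_args are hypothetical.

```python
import shlex


def oggenc_args(infile, outfile, *, quality=None, bitrate_kbps=None,
                resample_hz=None, downmix=False, lowpass_khz=None):
    """Build an oggenc argument list; nothing is executed here."""
    args = ["oggenc"]
    if bitrate_kbps is not None:
        # Managed (ABR-style) mode, as in "-b 64 --managed" above.
        args += ["-b", str(bitrate_kbps), "--managed"]
    elif quality is not None:
        # Plain VBR quality mode, e.g. "-q 1".
        args += ["-q", str(quality)]
    if resample_hz is not None:
        args += ["--resample", str(resample_hz)]
    if downmix:
        args.append("--downmix")  # stereo to mono
    if lowpass_khz is not None:
        args += ["--advanced-encode-option",
                 f"lowpass_frequency={lowpass_khz}"]
    args += ["-o", outfile, infile]
    return args


variants = {
    "current settings": dict(quality=1, resample_hz=32000),
    "managed 64 kbps": dict(bitrate_kbps=64, resample_hz=32000),
    "mono, 16 kHz": dict(quality=0, resample_hz=16000, downmix=True),
    "lowpass only": dict(quality=1, lowpass_khz=8),
}
for name, options in variants.items():
    command = oggenc_args("show.wav", "show.ogg", **options)
    print(f"{name}: " + " ".join(shlex.quote(part) for part in command))
```

Printing the commands rather than running them makes it easy to paste one into a shell, encode the same clip several ways, and compare the results by ear.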

Still strange how vorbis performs poorly on speech though. The audio clips in Doom3 are 32kbps and sound just fine.
Veni Vidi Vorbis.

Reply #22
Quote
Still strange how vorbis performs poorly on speech though. The audio clips in Doom3 are 32kbps and sound just fine


It might be different if floor 0 were still being deployed. The goal at one point was to use an LPC model based on the Levinson-Durbin approach. It doesn't have much use now; it would just be easier to use Speex.
budding I.T professional

Reply #23
Please translate this for us mortals.
Veni Vidi Vorbis.

 

Reply #24
Quote
Please translate this for us mortals.


Linear prediction coefficients (LPC) are a better method for encoding speech (from what I have read in the past; not to be confused with the way they are used in lossless coding). The original release candidates and early Vorbis binaries used this model, and it was no accident that it was designed for speech and low-bitrate coding. My guess is that it was scrapped because using the model makes it impossible, for another technical reason, to couple the channels. It can still be implemented for experimental purposes, but now we have Speex, which is a CELP (Code-Excited Linear Prediction) coder. So if you plan on using Vorbis for speech, your best bet is simply to resample.

If there was a "quality" problem per se, it might have had to do with the fact that the lower sampling rates and/or quality modes don't switch to short blocks. I know this was true for -q -1 at the time; of course, they might have changed it since.

LPC models can also be used to create those nifty "vocoders" used in electronic music to do resynthesis and other fun stuff like that. Thomas Bangalter from Daft Punk likes these.
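
For the curious, the Levinson-Durbin recursion mentioned earlier in the thread can be sketched in a few lines of Python. This is a textbook illustration of how LPC coefficients are derived from a signal's autocorrelation, not code from Vorbis floor 0 or from Speex; the function names and the toy input are assumptions made for the example.

```python
def autocorrelation(samples, max_lag):
    """Return r[0..max_lag], where r[k] = sum of samples[n] * samples[n - k]."""
    return [
        sum(samples[n] * samples[n - lag] for n in range(lag, len(samples)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err) with a[0] == 1.0. The predictor is
    x_hat[n] = -(a[1]*x[n-1] + ... + a[order]*x[n-order]),
    and err is the remaining prediction-error energy.
    """
    if len(r) <= order:
        raise ValueError("need autocorrelation values r[0..order]")
    if r[0] <= 0:
        raise ValueError("r[0] must be positive (signal has no energy)")
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # signal is already perfectly predicted
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Toy input: autocorrelation of a first-order process, r[k] = 0.5 ** k.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)
```

For this toy input the recursion finds a single non-zero predictor coefficient, a[1] = -0.5 (so x_hat[n] = 0.5 * x[n-1]), and a residual energy of 0.75. With real speech you would feed it autocorrelation(frame, order) for a short windowed frame instead.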
budding I.T professional