looking for efficient speech codec

Hi. I have quite a few speech recordings (talkradio, lectures, notes to self, etc.) which I need to encode. I have done some tests with HE-AAC V2 (Nero AAC codec / 1.5.3.0) and am quite pleased with the quality/filesize ratio.

I then tried Speex (speex-1.2beta1/$). Compared with HE-AACv2 files of the same size, the Speex files fared horribly for me during VLC 1.1.11 playback, with major artefacts and lower overall quality. Not quite what I expected. (Unless VLC's decoding is broken or the Speex version is too old?)

  • How does speex fare against HE-AACv2 for you ?
  • Is there any other substantially more efficient speech codec ?
  • What software player (Windows) can you recommend ?
  • What considerations should I not overlook if I want to keep the files for life (I'm still young :-) ?


many thanks





Reply #1
I'll address point 4 first: Bluntly, if you want to keep the files for life, you probably want to keep them in a lossless format. Space is cheap, and formats, encoders, and player support will change over the years. Having a lossless copy means you can always re-encode into whatever the lossy format of the day is.

1. Speex should beat HE-AACv2 at some low bitrates, but its main advantage over HE-AAC and even Vorbis is not quality but latency. It will lose to HE-AAC and Vorbis at moderate or high bitrates.

2. First, the bad news. There's nothing substantially more efficient than HE-AAC with really solid player support right now.

The good news is that this is changing. The Opus codec is already vastly superior to both Speex and HE-AAC for speech, and there's still room for more improvements in the reference encoder. Version 1.0 is just about to be released. There is not much player support yet, but it should be there quite soon.

3. I find myself just using VLC since it's pretty handy for a lot of stuff.


Just checking: have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12 kHz to 24 kHz.
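To make the resampling step concrete, here is a minimal sketch of what "resample before encoding" amounts to: low-pass filter below the new Nyquist frequency, then keep every Nth sample. It is pure Python for illustration only; the function names, the 101-tap Hamming-windowed filter and the 0.45 cutoff factor are my own assumptions, not what any particular encoder or tool does. In practice you would let a proper resampler do this.

```python
import math

def lowpass_taps(cutoff_hz, rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unity gain at DC."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate_hz  # cutoff as a fraction of the input sample rate
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def decimate(samples, rate_hz, factor):
    """Low-pass below the new Nyquist frequency, then keep every `factor`-th sample."""
    new_rate = rate_hz // factor
    # Assumed design choice: cut off slightly below new_rate / 2 to leave a transition band.
    taps = lowpass_taps(0.45 * new_rate, rate_hz)
    half = len(taps) // 2
    out = []
    for centre in range(0, len(samples), factor):
        acc = 0.0
        for k, tap in enumerate(taps):
            i = centre + k - half
            if 0 <= i < len(samples):  # treat samples outside the clip as silence
                acc += tap * samples[i]
        out.append(acc)
    return out, new_rate

def tone(freq_hz, rate_hz, count):
    """Unit-amplitude sine test signal."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

if __name__ == "__main__":
    # 48 kHz -> 16 kHz: a 1 kHz tone passes, a 10 kHz tone (above 8 kHz) is removed.
    for freq in (1000, 10000):
        out, rate = decimate(tone(freq, 48000, 4800), 48000, 3)
        print(freq, "Hz ->", rate, "Hz output, RMS", round(rms(out), 3))
```

The point of the sketch is only that everything above half the new sample rate is gone before the encoder ever sees it, so no bits can be spent on it.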


Reply #2
More information on your scenario would help, otherwise we have to guess.
The most telling hint seems to be the word "efficient". As you mention Speex and HE-AAC, you seem to be aiming for very low bitrates. I guess that means storage space or transmission bandwidth is a problem?
- Detective work is fun but you might just tell us...

Opus seems to be a really nice replacement for both Speex and HE-AAC, given that it is a free/libre, open and royalty-free format, an IETF standard, more efficient than both, ... But player support is only just emerging; there is no broad support yet. That said, its status as an IETF standard and the interest already visible in the format (some applications even supported the development versions, ...) promise a lot.

 


Reply #3
Quote
Just checking: have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12 kHz to 24 kHz.


At least for HE-AAC, removing those high frequencies should have only minimal effects, as the entire upper part of the frequency range is encoded in a few kbps (that's the whole High Efficiency part of the codec).


Reply #4
Quote
  • Is there any other substantially more efficient speech codec ?


Opus, which is a hybrid of a very good speech codec and a very good music codec.

Quote
  • What software player (Windows) can you recommend ?


As said, Opus support is only just emerging, since it came out of the standardization process a few weeks ago. foobar2000 support appears to be right around the corner, and Firefox Nightlies have had support for it enabled by default for a few days.

Quote
  • What considerations should I not overlook if I want to keep the files for life (I'm still young :-) ?


Archive them in lossless somewhere? If it's speech then the bitrate shouldn't be very high, especially if they're resampled to say 24kHz first.
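For a rough idea of the numbers, here is a back-of-the-envelope sketch of the uncompressed PCM bitrate, which is the upper bound for a lossless archive. The function names are made up for illustration, and 16-bit mono is an assumption about the recordings; a lossless codec will come in below these figures, by how much depends on the material.

```python
def pcm_kbps(rate_hz, bits=16, channels=1):
    """Uncompressed PCM bitrate in kbit/s (1 kbit = 1000 bits)."""
    return rate_hz * bits * channels / 1000

def megabytes_per_hour(kbps):
    """Storage needed for one hour at the given bitrate, in MB (10**6 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1_000_000

if __name__ == "__main__":
    # Speech resampled to 24 kHz, 16-bit mono (assumed) vs. CD-style 44.1 kHz stereo.
    for label, rate, channels in (("24 kHz mono", 24000, 1), ("44.1 kHz stereo", 44100, 2)):
        kbps = pcm_kbps(rate, channels=channels)
        print(f"{label}: {kbps:.1f} kbit/s, {megabytes_per_hour(kbps):.1f} MB per hour")
```

So 24 kHz mono is 384 kbit/s, or about 173 MB per hour before any lossless compression, compared with roughly 635 MB per hour for 44.1 kHz stereo.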