HydrogenAudio

Lossy Audio Compression => Speech Codecs => Topic started by: chrizoo on 2012-06-15 02:12:07

Title: looking for efficient speech codec
Post by: chrizoo on 2012-06-15 02:12:07
Hi. I have quite a few speech recordings (talk radio, lectures, notes to self, etc.) which I need to encode. I have done some tests with HE-AACv2 (Nero AAC codec / 1.5.3.0) and am quite pleased with the quality/filesize ratio.

I then tried Speex (speex-1.2beta1/$). Compared with HE-AACv2 files of the same size, Speex failed horribly for me during VLC 1.1.11 playback, with major artefacts and lower overall quality. Not quite what I expected (unless VLC's decoding is broken or the Speex version is too old?).



many thanks



Title: looking for efficient speech codec
Post by: jensend on 2012-06-15 07:15:01
I'll address point 4 first: Bluntly, if you want to keep the files for life, you probably want to keep them in a lossless format. Space is cheap, and formats, encoders, and player support will change over the years. Having a lossless copy means you can always re-encode into whatever the lossy format of the day is.

1. Speex should beat HE-AACv2 at some low bitrates, but its main advantage over HE-AAC and even Vorbis is not quality but latency. It will lose to HE-AAC and Vorbis at moderate or high bitrates.

2. First, the bad news. There's nothing substantially more efficient than HE-AAC with really solid player support right now.

The good news is that this is changing. The Opus codec is already vastly superior to both Speex and HE-AAC for speech, and there's still room for more improvements in the reference encoder. It is just about to release its 1.0 version. Not much player support yet but it will be there quite soon.

3. I find myself just using VLC since it's pretty handy for a lot of stuff.


Just checking- have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12kHz to 24kHz.
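
To put rough numbers on that, here is a minimal Python sketch. It assumes only the standard sampling rule that a sample rate of N Hz can represent frequencies up to N/2; the function name and the list of rates are made up for illustration, and nothing here is specific to any particular encoder.

```python
def max_audio_frequency_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (the Nyquist limit)."""
    return sample_rate_hz / 2


# Candidate sample rates for speech, plus the CD rate for comparison
# (illustrative choices, not taken from any encoder's documentation).
for rate in (12000, 16000, 24000, 44100):
    limit = max_audio_frequency_hz(rate)
    print(f"{rate} Hz sample rate keeps audio up to {limit:.0f} Hz")
```

So resampling to 16 kHz, for example, keeps everything below 8 kHz and drops the rest before the encoder ever sees it.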
Title: looking for efficient speech codec
Post by: Speckmade on 2012-07-13 01:22:59
More information on your scenario could be important - otherwise we'd have to guess on that.
The most telling hint seems to be the word "efficient". Since you mention Speex and HE-AAC, you seem to be aiming for very low bitrates. I guess that means storage space or transmission bandwidth is a problem?
- Detective work is fun but you might just tell us...

Opus seems to be a really nice replacement for both Speex and HE-AAC, given that it is also a free/libre, open and royalty-free format, an IETF standard, more efficient than both, ... But player support is only just starting to emerge; there is no broad support yet. Still, its being an IETF standard and the interest already visible in the format (some applications even supported development versions, ...) look promising.
Title: looking for efficient speech codec
Post by: Garf on 2012-07-13 06:01:35
Quote
Just checking- have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12kHz to 24kHz.


At least for HE-AAC, removing those high frequencies should have only minimal effects, as the entire upper part of the frequency range is encoded in a few kbps (that's the whole High Efficiency part of the codec).
Title: looking for efficient speech codec
Post by: Garf on 2012-07-13 06:06:17
Quote
  • Is there any other substantially more efficient speech codec ?


Opus, which is a hybrid of a very good speech and a very good music codec.

Quote
  • What software player (Windows) can you recommend ?


As said, Opus support is just emerging, as it came out of the standardization process only weeks ago. foobar2000 support appears to be right around the corner, and Firefox Nightlies have had it enabled by default for a few days.

Quote
  • What considerations should I not overlook if I want to keep the files for life (I'm still young :-) ?


Archive them in lossless somewhere? If it's speech then the bitrate shouldn't be very high, especially if they're resampled to say 24kHz first.
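
As a rough upper bound on what such an archive costs, here is a small Python sketch of the uncompressed PCM bitrate. The 16-bit mono figures are assumptions for illustration (the thread doesn't say what format the recordings are in), and the function names are made up. A lossless codec would come in below these numbers, by an amount that depends on the material.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def megabytes_per_hour(bitrate_kbps):
    """Storage for one hour of audio at the given bitrate, in MB (10**6 bytes)."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * 3600 / 1e6


# Assumed formats: 24 kHz 16-bit mono speech vs. 44.1 kHz 16-bit stereo.
speech = pcm_bitrate_kbps(24000)
cd_like = pcm_bitrate_kbps(44100, channels=2)

print(f"24 kHz mono:     {speech:.1f} kbps, {megabytes_per_hour(speech):.1f} MB/hour")
print(f"44.1 kHz stereo: {cd_like:.1f} kbps, {megabytes_per_hour(cd_like):.1f} MB/hour")
```

Even before any lossless compression, the resampled mono version is well under a third of the size of the 44.1 kHz stereo one.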