Topic: An Idea (Read 3251 times)

An Idea

I was wondering whether it would be possible to fuse a general audio codec, such as Vorbis, with a speech codec, such as Speex. If an audio stream contained any speech-only portions, the encoder could switch to speech mode and save a large number of bits, just as good VBR codecs drop to a very low bitrate when silence is detected. This might not be so useful for music, as very few songs have any significant amount of pure speech in them. However, it could be revolutionary for audio that is mainly speech but also contains musical interludes or applause from an audience, which speech codecs often garble. I think its most interesting use might be in encoding audio from movies that have a lot of dialogue: this could, theoretically, make single-CD rips plausible in many cases where they were not before.
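
To give a feel for how much a mode switch could save, here is a minimal sketch that averages the bitrate over a stream split into segments. The per-mode bitrates and the segment lengths are made-up illustrative values, not measurements of Vorbis or Speex.

```python
# Hypothetical per-mode bitrates in kbps (assumptions, not codec measurements).
ASSUMED_KBPS = {"general": 96, "speech": 16, "silence": 2}

def average_kbps(segments, kbps=ASSUMED_KBPS):
    """Average bitrate of a stream given (seconds, mode) segments."""
    total_seconds = sum(seconds for seconds, _ in segments)
    total_kbits = sum(seconds * kbps[mode] for seconds, mode in segments)
    return total_kbits / total_seconds

# Example: 10 minutes of music, 5 minutes of pure speech, 100 s of silence.
stream = [(600, "general"), (300, "speech"), (100, "silence")]
print(average_kbps(stream))
```

The saving depends entirely on how much of the stream really is pure speech, which is the open question for movie audio.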

Cons:

-The change in sample rate for speech might not be possible. (You tell me.)
-Movies often have a lot of background noise, which might mean that the speech codec could never be used. (Possibly the threshold could be made adjustable: file size vs. background detail.)
-Compression would take longer, possibly a lot longer.
-This would definitely break compatibility with the existing form of the audio codec.
-Implementation could be excruciatingly difficult, I suppose.

Anyway, I am interested in hearing anyone's comments or thoughts, especially relating to development potential.

edit:  I didn't know where to put this post so any mods/admins can move it if they feel it is better suited elsewhere.
gentoo ~amd64 + layman | ncmpcpp/mpd | wavpack + vorbis + lame

Reply #1
It might work, but detecting which parts are speech and which aren't is tricky.
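
As a rough illustration of why it is tricky, here is a deliberately crude per-frame classifier based on short-time energy and zero-crossing rate. The function names and every threshold are hypothetical, and this is not how any real codec decides. It is only a sketch of the kind of decision the encoder would have to make.

```python
import math

def frame_features(samples, frame_len):
    """Yield (energy, zero_crossing_rate) for each full frame of `samples`."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        yield energy, crossings / (frame_len - 1)

def guess_modes(samples, frame_len=160, silence=1e-4, zcr_limit=0.25):
    """Label each frame 'silence', 'speech' or 'general'.

    The thresholds are made-up values chosen for illustration only.
    """
    modes = []
    for energy, zcr in frame_features(samples, frame_len):
        if energy < silence:
            modes.append("silence")
        elif zcr < zcr_limit:
            modes.append("speech")   # low zero-crossing rate: guessed as speech
        else:
            modes.append("general")
    return modes

# A 200 Hz tone, a 3 kHz tone and digital silence, 20 ms each at 8 kHz.
rate = 8000
low = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(160)]
high = [0.5 * math.sin(2 * math.pi * 3000 * n / rate) for n in range(160)]
print(guess_modes(low + high + [0.0] * 160))
```

Note that the plain 200 Hz tone is labelled "speech" even though nobody is talking, so a heuristic this simple would happily hand a bass note to the speech codec. A usable detector would need far more than two features.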

Reply #2
Check out MPEG-4 Audio. I think there is something on this.

Reply #3
It might be plausible, but I doubt it would help that much. For a 1 CD rip, a low-bitrate soundtrack of about 64-100 kbps is quite common; dropping that down even 30-40 kbps to an ultra-low bitrate is only going to help the video bitrate marginally.

Additionally, modern movies rarely have just dialogue; there is usually background music or other sound effects.
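
To put rough numbers on the video bitrate point, here is a back-of-the-envelope sketch. The 700 MB disc, the 100-minute running time, and the 80 and 40 kbps audio figures are illustrative assumptions, and container overhead is ignored.

```python
def video_kbps(disc_mb, minutes, audio_kbps):
    """Bitrate left for video after the audio track (kbps, 1 kbit = 1000 bits)."""
    total_bits = disc_mb * 1024 * 1024 * 8   # assumes "MB" means MiB
    total_kbps = total_bits / (minutes * 60) / 1000
    return total_kbps - audio_kbps

before = video_kbps(700, 100, audio_kbps=80)   # assumed ordinary soundtrack
after = video_kbps(700, 100, audio_kbps=40)    # assumed 40 kbps saved on audio
gain_percent = 100 * (after - before) / before
print(f"video: {before:.0f} -> {after:.0f} kbps (+{gain_percent:.1f}%)")
```

Under these assumptions the video stream gains less than 5%, even if the whole soundtrack could be halved.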

Reply #4
All speech codecs I've heard so far sound "understandable" but not much more. I think this would only be interesting in very low bitrate situations like streaming.