Topic: Fundamental difference bet. speech & audio codecs (Read 7107 times)

Fundamental difference bet. speech & audio codecs

Hi everyone,

I am wondering what the fundamental difference between speech and audio coding is that makes a speech coder (such as Speex, which uses the CELP algorithm) perform better in terms of output quality than a general audio coder (such as LAME, which uses the MP3 coding algorithm) when given speech input. Likewise, why does a general audio coder tend to produce better results when given non-speech audio input, even at low bitrates?

By the way, if there is any reading material that would help me understand their fundamental differences, would you please point me to it? Thank you very much for your time and help.

Reply #1
Quote
Hi everyone,

I am wondering what the fundamental difference between speech and audio coding is that makes a speech coder (such as Speex, which uses the CELP algorithm) perform better in terms of output quality than a general audio coder (such as LAME, which uses the MP3 coding algorithm) when given speech input. Likewise, why does a general audio coder tend to produce better results when given non-speech audio input, even at low bitrates?


I also have a very similar question. It is not about performance but about using speech coding algorithms in audio coding. All lossless audio coders use simple LPC analysis and scalar quantization of the prediction parameters, while speech coders use more complex algorithms such as CELP. Can we use CELP to improve the prediction in lossless audio coding, get a smaller residual signal, and hence improve the compression? Please shed some light on this question along with the one above.

Reply #2
Quote
I also have a very similar question. It is not about performance but about using speech coding algorithms in audio coding. All lossless audio coders use simple LPC analysis and scalar quantization of the prediction parameters, while speech coders use more complex algorithms such as CELP. Can we use CELP to improve the prediction in lossless audio coding, get a smaller residual signal, and hence improve the compression? Please shed some light on this question along with the one above.


CELP has nothing to do with prediction and will not help you in any way for lossless coding. All the VQ thing does is allow you to *minimize* the error when you have a *fixed* number of bits. For lossless, you want scalar quantization followed by entropy coding. Using VQ will result in exponentially (as a function of vector size) growing complexity and still exactly the same bit-rate at the end.
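
For readers who want to see the lossless recipe mentioned above (prediction, then entropy coding of the integer residual), here is a minimal sketch. The fixed order-2 predictor, the Rice code, and the sine test signal are illustrative assumptions, not the method of any particular codec discussed in this thread.

```python
# Sketch only: integer linear prediction followed by a Rice-code bit count.
# The predictor order, Rice parameter search, and test signal are assumptions.
import math

def predict_residual(samples):
    """Residual of the fixed predictor p[n] = 2*x[n-1] - x[n-2]."""
    residual = []
    for n, x in enumerate(samples):
        if n < 2:
            residual.append(x)  # warm-up samples are passed through unchanged
        else:
            residual.append(x - (2 * samples[n - 1] - samples[n - 2]))
    return residual

def reconstruct(residual):
    """Invert predict_residual exactly (integer arithmetic, no rounding)."""
    samples = []
    for n, e in enumerate(residual):
        if n < 2:
            samples.append(e)
        else:
            samples.append(e + 2 * samples[n - 1] - samples[n - 2])
    return samples

def rice_bits(values, k):
    """Number of bits needed to Rice-code signed integers with parameter k."""
    total = 0
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1  # zigzag map to non-negative
        total += (u >> k) + 1 + k            # unary quotient, stop bit, k low bits
    return total

def best_rice_bits(values, max_k=20):
    """Smallest bit count over Rice parameters 0..max_k."""
    return min(rice_bits(values, k) for k in range(max_k + 1))

# Hypothetical test signal: a slowly varying sine quantized to integers.
signal = [round(8000 * math.sin(2 * math.pi * n / 200)) for n in range(1000)]
residual = predict_residual(signal)
```

Calling `reconstruct(residual)` returns the original samples bit for bit, and `best_rice_bits(residual)` is smaller than `best_rice_bits(signal)` for this input. The saving comes entirely from the predictor shrinking the residual; the entropy coder then spends fewer bits on small values, with no quantization of the signal involved.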

Reply #3
A nice introduction to speech codecs can be found here.
(codec classes: waveform vs. source)

Reply #4
It would be simpler, I think, to point out that voice codecs only have to describe a very small part of the whole range of PCM signals, i.e. those signals that a human vocal tract can utter.

As such, their input spans a much smaller space than that of a music coder. Several things follow from this:

1) The speech coder can use a model of the speech production mechanism.
2) The encoding must describe a much smaller space.
and
3) Speech coders in general can afford to be very, very lossy, and still convey what the speech coder needs to convey.

Somebody else said "CELP has nothing to do with prediction". Well, actually, the "LP" in CELP stands for "linear prediction". Of course, the CE stands for "codebook excited". So, while the VQ (codebook excitation) does not have any relationship to a predictor, the LP is a predictor, end of discussion, plain and simple.
-----
J. D. (jj) Johnston
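
To make the two parts of CELP named in the reply above concrete, here is a rough sketch of an analysis-by-synthesis search: a codebook vector (the "CE" part) excites a linear-prediction synthesis filter (the "LP" part), and the encoder keeps the index and gain that best match the target frame. The coefficients, frame size, and random codebook are hypothetical, and real coders add pieces this sketch leaves out, such as perceptual weighting, a pitch (adaptive) codebook, and filter memory carried across frames.

```python
# Sketch only: fixed-codebook search through an all-pole LP synthesis filter.
# All names and numeric values here are illustrative assumptions.
import random

def synthesize(excitation, lpc):
    """All-pole LP synthesis: y[n] = e[n] + sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out

def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the entry with the least squared error."""
    best = None
    for index, vector in enumerate(codebook):
        synth = synthesize(vector, lpc)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue  # an all-zero entry cannot match anything
        # Optimal gain for this entry in the least-squares sense.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

rng = random.Random(0)
FRAME = 40
lpc = [1.3, -0.6]  # hypothetical predictor coefficients (stable filter)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(FRAME)] for _ in range(64)]

# Build the target from a known entry so the search has an exact answer.
target = [2.5 * s for s in synthesize(codebook[17], lpc)]
index, gain, error = search_codebook(target, codebook, lpc)
```

The encoder would transmit only `index`, a quantized `gain`, and the quantized predictor coefficients, so the bit budget per frame is fixed and the search minimizes error within it. The predictor and the codebook are separate stages in this structure.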

Reply #5
Quote
Somebody else said "CELP has nothing to do with prediction". Well, actually, the "LP" in CELP stands for "linear prediction". Of course, the CE stands for "codebook excited". So, while the VQ (codebook excitation) does not have any relationship to a predictor, the LP is a predictor, end of discussion, plain and simple.


Well, "somebody else" is me, and if you had looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.

Reply #6
Quote
Well, "somebody else" is me, and if you had looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.


You also could have admitted that your phrasing was suboptimal. I too choked on that sentence but I'm sure you both know how CELP works. ;-)


Sebi

Reply #7
Thank you very much for all your replies; they help me greatly. Honestly speaking, without help from you guys I would be dead.

Reply #8
Quote
Well, "somebody else" is me, and if you had looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.



Sorry, wasn't trying to start an argument, but I've seen unclear statements (like yours, which, I'm sorry, was not clear in context at all) get loose and wander off into places where they get quoted as scripture.

And I did want to prevent that. There are enough myths out there, some of them quite accidental.
-----
J. D. (jj) Johnston