Topic: MPEG4 AAC Gain Control Tool

  • wkw
MPEG4 AAC Gain Control Tool
Hi, I would like to ask about ISO Psychoacoustic Model II.
I used to work on the Sony ATRAC3 compression algorithm, and I noticed a potential difficulty with Psychoacoustic Model II. ATRAC3 uses a Gain Control tool that is almost identical to the one used in MPEG-4 AAC. Because that tool handles signal transients so efficiently, there is no need to switch to short blocks. However, I noticed that Psychoacoustic Model II MUST switch to short blocks, or the calculated threshold is badly off.

I implemented ISO Psychoacoustic Model I, but instead of calculating an FFT I used the MDCT spectrum to determine the threshold. In single-block mode, Model I seems to provide a much more accurate threshold than Model II. I wonder whether this has anything to do with the tonality calculation used in Model II. Model II uses a prediction model, which I believe becomes inaccurate during transients.

I wonder whether the MPEG-4 AAC Gain Control tool is good enough for single-block operation, as in ATRAC3. I noticed that the AAC Gain Control tool operates on bands 2 to 4, whereas Sony, the original developer of this tool, applies it to all 4 bands. Is there any performance difference in this respect? Is it recommended to disable block switching in MPEG-4 AAC?

Another thing: when I coded the PQF and IPQF filters using the Q coefficients from the draft standard, I realised that they are wrong! I had to extract the coefficients from the ISO source code and reverse engineer the ISO implementation of the Gain Control tool in order to implement the correct PQF and IPQF banks. I also noticed some interband leakage in the reconstructed tone samples (close to 5 dB).

Could anyone comment on this?


  • Ivan Dimkovic
  • Developer
MPEG4 AAC Gain Control Tool
Reply #1
If implemented correctly, Psychoacoustic Model II should provide relatively accurate thresholds for MDCT lengths of 128, 512 and 1024 points (the most widely used).

Regarding tonality calculation: this is a very common problem, and there seems to be no single best way of calculating tonality. The following things might be worth trying:

1. If the coder is MDCT based, try using a complex MDCT instead of a complex FFT in the tonality estimator and psychoacoustic module (i.e. MDCT as the real part, MDST as the imaginary part).

2. Try using a spectral flatness measure (peak detection) for frequencies where the unpredictability method fails.

3. Try using a two-way unpredictability search (two blocks ahead and two blocks before) and use the value which yields the more tonal result.
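
The three suggestions above can be sketched in a few lines of Python. This is only an illustration, not code from any encoder: the function names (`complex_mdct`, `unpredictability`, `two_way_unpredictability`, `spectral_flatness`) are made up for the example, the MDCT/MDST pair is the unwindowed textbook form, the unpredictability formula assumes the usual Model II definition (magnitude and phase extrapolated linearly from two neighbouring blocks), and treating a silent bin as fully unpredictable is an assumption.

```python
import cmath
import math


def complex_mdct(frame):
    """Frame of 2N samples -> N complex bins (MDCT real, MDST imaginary).

    Unwindowed textbook definition; a real coder would apply its window first.
    """
    n_bins = len(frame) // 2
    n0 = 0.5 + n_bins / 2.0  # standard MDCT phase offset
    bins = []
    for k in range(n_bins):
        w = math.pi / n_bins * (k + 0.5)
        re = sum(x * math.cos(w * (n + n0)) for n, x in enumerate(frame))
        im = sum(x * math.sin(w * (n + n0)) for n, x in enumerate(frame))
        bins.append(complex(re, im))
    return bins


def unpredictability(far, near, cur):
    """Model II style unpredictability of one complex bin.

    `near` is the adjacent block and `far` the one beyond it. Returns a
    value in [0, 1]: 0 means perfectly predictable (tonal), 1 noise-like.
    """
    r_pred = 2.0 * abs(near) - abs(far)
    phi_pred = 2.0 * cmath.phase(near) - cmath.phase(far)
    predicted = cmath.rect(r_pred, phi_pred)
    denom = abs(cur) + abs(r_pred)
    # Assumption: a bin with no energy at all is treated as noise-like.
    return abs(cur - predicted) / denom if denom > 0.0 else 1.0


def two_way_unpredictability(blocks, t, k):
    """Predict bin k of block t from the past and from the future,
    and keep the more tonal (smaller) of the two results.

    `blocks` is a list of per-block bin lists; blocks t-2 .. t+2 must exist.
    """
    forward = unpredictability(blocks[t - 2][k], blocks[t - 1][k], blocks[t][k])
    backward = unpredictability(blocks[t + 2][k], blocks[t + 1][k], blocks[t][k])
    return min(forward, backward)


def spectral_flatness(power):
    """Geometric mean over arithmetic mean of the power values in a band.

    Close to 1 for a flat (noise-like) band, close to 0 for a peaky one.
    """
    eps = 1e-12  # keeps log() finite for empty bins
    n = len(power)
    log_mean = sum(math.log(p + eps) for p in power) / n
    return math.exp(log_mean) / (sum(power) / n + eps)
```

For a bin that is silent for two blocks and then carries a steady tone, the forward prediction alone reports the onset block as completely unpredictable, while the backward prediction matches it almost exactly, so the two-way minimum keeps the onset block tonal. The flatness measure can then serve as a fallback for bands where neither direction predicts well.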

Regarding PQF in AAC SSR: I think that AAC SSR is not widely used, and I don't think that PQF would bring any extra quality, so I didn't bother implementing it. Sorry.

  • wkw
MPEG4 AAC Gain Control Tool
Reply #2