
AAC optimization with Intel IPP libraries

I was wondering if anyone has experience with the Intel IPP libraries for speed optimization of an AAC audio encoder/decoder on Intel processor architectures. How good are these functions from Intel? Can they really improve codec speed significantly?
Another question: what about optimization for AMD processors?

Regards,
dunn


Reply #1
Nero uses the IPP libraries for their AAC encoder.


Reply #2
Thanks roberto! If Nero uses IPP, then it probably means they help significantly. Ivan could say more, but there is no sign of him... What about AMD?


Reply #3
What does Intel IPP libraries stand for?



Reply #5
Both the Nero AAC and Dolby AAC encoders use the IPP library. It is a very nice and fast DSP toolbox, and you can get a lot of improvement if the code is written wisely.


Reply #6
Do you also use Intel's C compiler along with IPP? Or does it not matter?

What do you mean by "if written wisely..."?
I am trying to speed up some C code by using IPP, but it seems that it does not work: the code is even a few percent slower. What could be the problem? If I copy-paste a smaller piece of code like the one below into a separate small project, it runs fine, many times faster. How come it matters whether the code is in a small project or buried somewhere deep inside a big project? I am using MSVC 6.0. Are there any secrets to using IPP?


for (j = 0; j < lenj; j += 1)
{
...
for (i = 0; i < leni; i += 1)
{
p += q[i] * z[i];
}
}

inner i loop substituted by:

ippsDotProd_32f( q,z,leni,p );


Reply #7
OK, the inner loop looks like a typical dot product, so you substituted correctly.

However, IPP won't speed things up if the length is some small value, like 32 or so.

Also, I see another loop. Are you performing a convolution (it looks like ISO psymodel spreading)? If yes, maybe you can also use the IPP convolution functions, as they might work faster.

Or, if you are performing autocorrelation, you might also check the IPP function for that.

Anyway, check if the array is large enough.

Intel C++ might also help. The gain from the compiler will of course be smaller than it would be without the IPP library, but in total it should still be faster than MSVC+IPP.


Reply #8
Quote
Both Nero AAC and Dolby AAC encoders use IPP library

Where can I download the Dolby AAC encoder?


Reply #9
QuickTime uses a modified Dolby AAC encoder (modified inside Apple).

The old version 6.0 used the original Dolby consumer encoder...

Dolby offers command-line versions for evaluation, but I dunno if they are willing to give them to end-users.


Reply #10
Quote
for (j = 0; j < lenj; j += 1)
{
...
for (i = 0; i < leni; i += 1)
{
p += q[i] * z[i];
}
}

It looks like this loop:

for (i = 0; i < leni; i += 1)
{
p += q[i] * z[i];
}

gives the same value for every j, since it is completely independent of j.

Menno


Reply #11
The IPP AAC Encode sample is available on the site mentioned above. It shows how to use IPP. You can also get the AAC Decode sample, the MP3 Encode/Decode samples, etc.


Reply #12
For some strange reason, the DEBUG version optimized with IPP works faster than the non-IPP one, but... the optimized RELEASE version works s l o w e r than the non-optimized one! It seems that the compiler optimizes the version without IPP better (I have tried Intel's and Microsoft's C compilers, same thing)!

Is this possible?
Please, what could be the problem?


Reply #13
Quote
Please, what could be the problem?

Could be an alignment issue.


Reply #14
2 dunny:
Could you describe which functions you use? And please describe the way you measure performance.


Reply #15
For speed measurement I use something very simple:
BeginTime = GetTickCount();
EndTime = GetTickCount();

IPP functions:
ippsAutoCorr_32f
ippsMul_32f
ippsConv_32f
ippsConvert_16s32f
ippsDotProd_32f
ippsMulC_32f_I
ippsDurbin_32f
.
.
.
...many of them.

Considering alignment: I used ippsMalloc which should align itself automatically.


Reply #16
Quote
BeginTime = GetTickCount();
EndTime = GetTickCount();


Do not use this method: the tick count is in milliseconds, and that is too coarse.

Try to perform some large block of operations, say encoding a whole file, and measure the performance.

Or use a profiler like VTune and measure the number of cycles/samples collected.


Reply #17
PLEASE,
IF YOU HAVE Intel IPP INSTALLED, COPY-PASTE THIS PIECE OF CODE AND MAKE A SINGLE-FILE PROJECT OUT OF IT. COMPILE AND RUN IT IN TWO MODES, DEBUG AND RELEASE, AND LET ME KNOW WHAT RESULTS YOU GET WITH AND WITHOUT IPP.
DOES IPP DO IT FASTER?



// --- Intel Performance Primitives test --- //
#include <windows.h>
#include <stdio.h>
#include <ipps.h>

#define LENGTHi 2000   /* p1 p2 .... p[LENGTHi]*/
#define LENGTHj 1000000   /* repeat million times */

#define IPP /* comment or uncomment this to use or not IPP */

int main(int argc, char *argv[])
{
   int i,j;
   float p[LENGTHi],q[LENGTHi],x;   
   unsigned long BeginTime,EndTime;   

   printf("\n\n Working...\n");
#ifdef IPP
   printf("\n\n IPP ON \n");
#else
   printf("\n\n IPP OFF \n");
#endif
   /* Initialize: */
   for(i=0;i<LENGTHi;i++)
   {
      p[i]=(float)(i*3.14);
      q[i]=(float)(i*6.28);
   }

   BeginTime = GetTickCount(); /* ~10 ms timer resolution   */

   /* Nested loops, inside loop performs dot product which is in fact: 
     x = p0*q0 + p1*q1 + p2*q2 + ... +pn*qn    */
   for(j=0;j<LENGTHj;j++)
   {
      #ifdef IPP
      ippsDotProd_32f( p,q,LENGTHi,&x );/* Intel optimized...   */
      #else
      x=0.0;
      for(i=0;i<LENGTHi;i++)
      {
         x += p[i]*q[i]; /* ... or the usual, slow way. */
      }
      #endif
   }
   EndTime = GetTickCount();

   printf("\n\n Milliseconds: %d.\n\n",(int)(EndTime-BeginTime));   
   printf("\n (Dot product result: x = %f)\n\n\n",x);   
   return 0;
}

/*
Set (in case you installed IPP to C:\Program Files\Intel\IPP\):

1.
Project/Settings/Link/Input/AdditionalLibraryPath:
C:\Program Files\Intel\IPP\stublib

2.
Project/Settings/Link/Input/ObjectLibraryModules:
ipps20.lib 

3.
Project/Settings/C C++/Preprocessor/AdditionalIncludeDirectories:
C:\Program Files\Intel\IPP\include

*/


Reply #18
Dude, you can replace

for(i=0;i<LENGTHi;i++)
{
   x += p[i]*q[i]; /* ... or the usual, slow way. */
}


with

x = (2/3)*PI^2*LENGTH^3 - PI^2*LENGTH^2 + (1/3)*PI^2*LENGTH


Your summation is a power series.

Menno


Reply #19
menno, this is just an example, I had to initialize those numbers in some way so I accidentally chose 3.14 and 6.28... IPP is the problem here. Fast reaction though...


Reply #20
I thought as much, just wanted to be smart.

What CPU do you have?

Menno


Reply #21
pentium 4 1.6GHz (compile the example!!!)


Reply #22
I say, why don't you just code it in inline assembly and measure the performance with VTune? You have more flexibility there.


Reply #23
IPP definitely does not speed up the application if the vector length is too small. I have tested the above example (dot product): it helps for lengths above roughly 20-30 and works perfectly for lengths > 1000. Unfortunately, it does not help for small vectors, for example 16.
If you use IPP, keep that in mind before you replace your "slow" code. If I had known that, I could have saved some time. Ivan, thanks for your hint (how come I did not find any length limitations in Intel's manuals?).
Topic closed.


Reply #24
It's most likely because of the function call overhead being too high relative to the amount of data you're processing each call.  Not using the IPP function, the compiler just generates your code right there and there isn't the overhead of a function call.