The MP3 Polyphase Filter Bank

Hello!

For a completely different application, I designed a 32-sub-band filter bank in MATLAB
and implemented it in CUDA for a GPU.
It currently uses 64 filters and a single 64-point FFT, with decimation/interpolation by 32.
But this isn't really important, since the CUDA implementation can easily be changed.

Since I read that the MP3 filter bank also divides the signal into 32 sub-bands,
I was wondering if my GPU filter bank could be used for MP3 decoding.

My questions are:
1. Is the total number of taps in the MP3 filter 512, or are there 512 taps in the filter in every branch (512*32 total)?
2. How good is the isolation between the different sub-bands? Do they overlap (so that one input frequency may produce a response in more than one band)? Are "holes", by which I mean dead areas between the sub-bands, allowed?
3. Is this filtering considered an expensive operation in MP3 decoding? On what time scale does it operate for, let's say, 512 samples: milliseconds? microseconds?
4. Does the format require a specific filter for decoding? Is there freedom to implement the filter bank with a different structure?

Thanks for reading,
Jacob
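
On question 1, the MPEG-1 audio analysis filter bank is usually described as one 512-tap prototype window that is cosine-modulated into 32 bands, so a polyphase implementation performs 512 window multiplies per 32 new input samples rather than 512 per branch. The sketch below follows that flow (window, fold 512 values down to 64, then a 32x64 cosine matrixing). It is an illustration under assumptions: `WINDOW` is a placeholder Hann shape, not the coefficient table from the standard, and the names `analyze_block` and `multiplies_per_block` are made up for this example.

```python
import math

NUM_BANDS = 32        # sub-bands, and also the decimation factor
PROTOTYPE_TAPS = 512  # one shared prototype window, not 512 taps per band

# Placeholder window (plain Hann shape). ASSUMPTION: this is NOT the
# coefficient table defined in the standard; it only keeps the sketch runnable.
WINDOW = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / PROTOTYPE_TAPS)
          for n in range(PROTOTYPE_TAPS)]

# Cosine modulation matrix: 32 bands x 64 folded inputs.
MATRIX = [[math.cos((2 * i + 1) * (k - 16) * math.pi / 64) for k in range(64)]
          for i in range(NUM_BANDS)]


def analyze_block(fifo):
    """Return 32 sub-band samples from the 512 most recent inputs (newest first)."""
    if len(fifo) != PROTOTYPE_TAPS:
        raise ValueError("expected exactly 512 input samples")
    # Step 1: apply the shared prototype window (512 multiplies).
    windowed = [w * x for w, x in zip(WINDOW, fifo)]
    # Step 2: fold the 512 windowed values down to 64 partial sums.
    folded = [sum(windowed[k + 64 * j] for j in range(8)) for k in range(64)]
    # Step 3: direct 32x64 matrixing, one output sample per band.
    return [sum(m * y for m, y in zip(row, folded)) for row in MATRIX]


def multiplies_per_block():
    """Multiplies per 32 input samples for this direct (non-fast) matrixing."""
    return PROTOTYPE_TAPS + NUM_BANDS * 64
```

A faster implementation would typically replace the direct matrixing in step 3 with a DCT- or FFT-based transform, which is where a 64-point FFT structure like the one described above could plausibly fit.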

Reply #1
The filterbank (including the DFT) is the slowest part of the decode process, using roughly half the total time. However, for decoding at least, the filterbank only uses 10-20 MHz of CPU clock (or perhaps less with SIMD), so I'm not sure CUDA makes sense.
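
To put rough numbers on the time scale asked about in question 3, the arithmetic can be sketched as follows. The 44.1 kHz sample rate and the stereo channel count are assumptions (neither is stated in the thread), and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100  # ASSUMPTION: CD-rate audio, not stated in the thread


def block_duration_ms(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Playback time covered by num_samples samples per channel, in milliseconds."""
    return 1000.0 * num_samples / sample_rate_hz


def cycles_per_sample(cpu_hz, sample_rate_hz=SAMPLE_RATE_HZ, channels=2):
    """CPU cycles available per output sample if cpu_hz is the real-time cost.

    ASSUMPTION: the quoted MHz figure covers both channels of a stereo stream.
    """
    return cpu_hz / (sample_rate_hz * channels)


print(round(block_duration_ms(512), 2))  # 11.61
print(round(cycles_per_sample(10e6)))    # 113
print(round(cycles_per_sample(20e6)))    # 227
```

Under these assumptions, 512 samples cover about 11.6 ms of audio, and the quoted 10-20 MHz corresponds to a budget on the order of a hundred to a few hundred cycles per output sample.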

If you're interested, I've been working little by little on improving filterbank performance for various embedded devices, where performance can make quite a difference due to battery limitations.

Reply #2
Quote
The filterbank (including the DFT) is the slowest part of the decode process, using roughly half the total time. However, for decoding at least, the filterbank only uses 10-20 MHz (or perhaps lower with SIMD), so I'm not sure CUDA makes sense.

If you're interested, I've been working little by little on improving filterbank performance for various embedded devices, where performance can make quite a difference due to battery limitations.


Hi,
If I understand right, you mean that there isn't much to gain from speeding up the filter bank, since it operates at a slow rate anyway?

In your work on improving the filterbank performance, are you modifying the design itself (changing the prototype filter, structure, etc.),
or are you making the filter bank code smarter?
I am curious to know whether there is a point in experimenting with a different filter bank for MP3 decoding. For example, maybe a different prototype filter could give better sound quality?

Reply #3
http://wiki.hydrogenaudio.org/index.php?ti...ng_of_MP3_audio

Quote
Decoding [...] is carefully defined in the standard. Most decoders are "bitstream compliant", meaning that the decompressed output they produce from a given MP3 file will be the same (within a specified degree of rounding tolerance) as the output specified mathematically in the ISO/IEC standard document.
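
A tolerance check of the kind the quote describes can be sketched as a comparison of the decoder output against a reference signal. The two limits below (an RMS difference under 2^-15/sqrt(12) and a peak difference of at most 2^-14, relative to full scale) are the figures commonly cited for the compliance test; treat them as assumptions and check the standard before relying on them. The function names are hypothetical.

```python
import math

# ASSUMED limits, relative to a full scale of 1.0; verify against the standard.
RMS_LIMIT = 2.0 ** -15 / math.sqrt(12.0)
PEAK_LIMIT = 2.0 ** -14


def difference_stats(decoded, reference):
    """Return (rms, peak) of the sample-wise difference between two signals."""
    if len(decoded) != len(reference) or not reference:
        raise ValueError("signals must be non-empty and equal in length")
    diffs = [d - r for d, r in zip(decoded, reference)]
    rms = math.sqrt(sum(e * e for e in diffs) / len(diffs))
    peak = max(abs(e) for e in diffs)
    return rms, peak


def within_tolerance(decoded, reference,
                     rms_limit=RMS_LIMIT, peak_limit=PEAK_LIMIT):
    """True if the decoded signal stays inside both difference limits."""
    rms, peak = difference_stats(decoded, reference)
    return rms < rms_limit and peak <= peak_limit
```

A decoder built on a different prototype filter would generally be expected to fail a check like this, since its output would differ from the reference by much more than rounding error.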

Reply #4
Quote
If I understand right, you mean that there isn't much to gain from speeding up the filter bank, since it operates at a slow rate anyway?


Speeding up the filterbank is generally useful, but I don't see much point in using a GPU.  It happens fast enough that the CPU is sufficient.  Not sure the overhead and synchronization with another processor is worthwhile. 

Quote
In your work on improving the filterbank performance, are you modifying the design itself (changing the prototype filter, structure, etc.),
or are you making the filter bank code smarter?


Just better implementations for various CPUs.  We've come up with different variations for different CPUs:

http://git.rockbox.org/?p=rockbox.git;a=bl...c920f3b;hb=HEAD
http://git.rockbox.org/?p=rockbox.git;a=bl...7d540b9;hb=HEAD

I haven't updated it much, but here is a work-in-progress ARMv5 version:

http://www.rockbox.org/tracker/task/11759

Quote
I am curious to know whether there is a point in experimenting with a different filter bank for MP3 decoding. For example, maybe a different prototype filter could give better sound quality?


You can cheat a little to improve speed at the expense of accuracy, but usually it's not a good idea.
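
One possible example of such a shortcut (an assumption for illustration, not necessarily what is meant here) is storing filter coefficients with fewer fractional bits so that cheaper fixed-point multiplies can be used. The sketch below measures the worst-case coefficient error that this introduces; the coefficient row and the function names are hypothetical.

```python
import math


def quantize(value, fractional_bits):
    """Round value to the nearest multiple of 2**-fractional_bits."""
    scale = 1 << fractional_bits
    return round(value * scale) / scale


def max_coefficient_error(coeffs, fractional_bits):
    """Largest absolute error introduced by quantizing the coefficients."""
    return max(abs(c - quantize(c, fractional_bits)) for c in coeffs)


# Hypothetical coefficients: the first row of a 32-band cosine modulation matrix.
coeffs = [math.cos((k - 16) * math.pi / 64) for k in range(64)]

for bits in (8, 12, 16):
    err = max_coefficient_error(coeffs, bits)
    # Rounding to the nearest step can be off by at most half a step.
    assert err <= 0.5 / (1 << bits)
    print(bits, err)
```

Each bit removed roughly doubles the worst-case coefficient error, which is why aggressive shortcuts can push a decoder outside the rounding tolerance that compliant decoders are expected to meet.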

Reply #5
Perhaps GPU-based transcoding could be interesting. Doing 1000s of files in a batch means large potential for threading/vectorization.

-k

Reply #6
Quote
Perhaps GPU-based transcoding could be interesting. Doing 1000s of files in a batch means large potential for threading/vectorization.

You're going to run into I/O slowdown far before you can utilize that degree of parallelism.

Reply #7
Quote
Perhaps GPU-based transcoding could be interesting. Doing 1000s of files in a batch means large potential for threading/vectorization.
Quote
You're going to run into I/O slowdown far before you can utilize that degree of parallelism.


Being I/O limited would be a nice problem to have.  Particularly for people with SSDs.

 

Reply #8
Quote
Perhaps GPU-based transcoding could be interesting. Doing 1000s of files in a batch means large potential for threading/vectorization.
Quote
You're going to run into I/O slowdown far before you can utilize that degree of parallelism.

Perhaps. But if the source and destination formats are both low-bandwidth, incompatible formats with complex transforms, you can fit many of those within 50 MB/s or 400 MB/s.
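
As a rough sketch of that arithmetic: the 128 kbit/s bitrate, the decimal megabytes, and the speed factors below are assumptions chosen for illustration, and the function name is hypothetical. Only the read side is counted; the write side would need a similar estimate.

```python
def concurrent_streams(disk_bytes_per_s, bitrate_bits_per_s, speed_factor=1.0):
    """Compressed streams that fit in the read bandwidth.

    speed_factor is how many times faster than real time each stream is
    consumed (1.0 means playback speed).
    """
    bytes_per_stream = bitrate_bits_per_s / 8 * speed_factor
    return int(disk_bytes_per_s // bytes_per_stream)


# ASSUMPTION: 128 kbit/s sources, 1 MB = 1,000,000 bytes.
print(concurrent_streams(50e6, 128_000))        # 3125
print(concurrent_streams(400e6, 128_000))       # 25000
print(concurrent_streams(50e6, 128_000, 50.0))  # 62
```

Even if each file is processed at 50x real time, a 50 MB/s disk could under these assumptions still feed dozens of streams at once.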

I think that exploiting the theoretical FLOP numbers of current GPUs for non-graphics work is very hard. However, even a 2x or 3x speedup compared to the CPU might be worthwhile for some, especially if this frees up a precious resource (the CPU) while using an otherwise unused one (the GPU).
-k