Using GPU when convolving

After digging into the theory of convolution, I understand that FIR filters with many taps need a lot of CPU power, and that there is a tradeoff between CPU load and latency.
I then stumbled across this site, FIR on GPU. Their conclusion is that a modern GPU such as a GeForce 6600 outperforms an SSE-enabled Pentium 4 HT @ 3.2 GHz when it comes to long FIRs.
A fb2k convolving plugin that uses the GPU would be nice, but hey, I don't even know if Windows would allow it. If it does, I wouldn't mind a black screen when convolving...
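
To put rough numbers on the CPU-versus-latency tradeoff mentioned above, here is a minimal sketch. The filter length and sample rate are example values chosen for illustration, not figures from the linked site, and the function names are made up for this post.

```python
def direct_fir_macs_per_second(num_taps, sample_rate_hz):
    """Multiply-accumulates per second for a direct-form FIR, one channel."""
    return num_taps * sample_rate_hz


def block_latency_ms(block_size, sample_rate_hz):
    """Buffering delay from collecting one full block before processing it."""
    return 1000.0 * block_size / sample_rate_hz


# Assumed example: a 65536-tap impulse response at 44.1 kHz.
taps = 65536
fs = 44100

# Sample-by-sample filtering: no buffering delay, but a huge MAC rate.
print(direct_fir_macs_per_second(taps, fs))

# Processing in one filter-length block: cheap per sample with an FFT,
# but the plugin has to wait for the whole block first.
print(block_latency_ms(taps, fs))
```

Smaller blocks cut the delay but raise the per-sample cost, which is the tradeoff in a nutshell.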

Reply #1
Why would you get a black screen? You sure don't get one when running a 3D application in windowed mode.

Reply #2
Unfortunately, the paper is garbage, because they seem to have used the most stupid algorithm possible for their benchmarks (but of course, the most stupid one happened to have a good speedup on the GPU).

It is pointless to do this on the GPU.

Reply #3
Quote
Unfortunately, the paper is garbage, because they seem to have used the most stupid algorithm possible for their benchmarks (but of course, the most stupid one happened to have a good speedup on the GPU).

It is pointless to do this on the GPU.

How is the algorithm stupid? Couldn't the additional power of the graphics card benefit the computer anyway? What about the memory bandwidth? Some cards use DDR2, don't they?

Reply #4
GPUs are fast at massively parallel computations, "massively" meaning several thousand similar computations at once.
A big drawback is the time it takes to transfer data to the graphics card.

This means that right now they are efficient only for very large data sets with identical computations.
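
A minimal sketch of that argument as a cost model. Every number here is hypothetical and only there to show the shape of the tradeoff; the function names are made up, and real transfer overheads depend on the card, the bus and the driver.

```python
def cpu_time_s(n_items, cpu_items_per_s):
    """Time for the CPU to process n_items with no transfer step."""
    return n_items / cpu_items_per_s


def gpu_time_s(n_items, gpu_items_per_s, fixed_overhead_s):
    """GPU time: a fixed upload/download overhead plus the computation."""
    return fixed_overhead_s + n_items / gpu_items_per_s


def break_even_items(cpu_items_per_s, gpu_items_per_s, fixed_overhead_s):
    """Batch size at which the GPU path stops being slower than the CPU."""
    if gpu_items_per_s <= cpu_items_per_s:
        return float("inf")  # the GPU never catches up
    return fixed_overhead_s / (1.0 / cpu_items_per_s - 1.0 / gpu_items_per_s)


# Hypothetical: GPU 10x faster per item, but 1 ms of overhead per batch.
n = break_even_items(cpu_items_per_s=1e8, gpu_items_per_s=1e9,
                     fixed_overhead_s=1e-3)
print(round(n))
```

Below the break-even batch size the fixed overhead dominates and the CPU wins, however fast the GPU is per item.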

Reply #5
Quote
Unfortunately, the paper is garbage, because they seem to have used the most stupid algorithm possible for their benchmarks (but of course, the most stupid one happened to have a good speedup on the GPU).

I see your point, Garf. If I understand things correctly, they're using direct (time-domain) convolution, a multiply-accumulate over every tap for each output sample, which scales far worse than FFT methods for long filters, right?

Here is an example of FFT calculation using GPU.

Couldn't circular buffers be used to minimize bandwidth? After all, if the input is audio samples and the FIR is constant, there really is no point shuffling the coefficients back and forth.
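
A minimal CPU-side sketch of the two approaches, assuming overlap-add as the FFT method (the thread does not say which block scheme a plugin would use). It is plain Python with a toy radix-2 FFT, so it shows the structure rather than the speed. Note that the filter spectrum is computed once and reused for every block, which is the "no shuffling of coefficients" point above.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def direct_convolve(signal, taps):
    """Time-domain convolution: one multiply-accumulate per tap per sample."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


def overlap_add_convolve(signal, taps, block_size=8):
    """Block FFT convolution; the filter spectrum H is computed only once."""
    n_fft = 1
    while n_fft < block_size + len(taps) - 1:
        n_fft *= 2
    H = fft(list(taps) + [0.0] * (n_fft - len(taps)))
    out = [0.0] * (len(signal) + len(taps) - 1)
    for start in range(0, len(signal), block_size):
        block = list(signal[start:start + block_size])
        X = fft(block + [0.0] * (n_fft - len(block)))
        y = fft([a * b for a, b in zip(X, H)], inverse=True)
        # Add this block's tail onto the region shared with the next block.
        for k in range(min(n_fft, len(out) - start)):
            out[start + k] += y[k].real / n_fft
    return out


signal = [float((7 * i) % 11 - 5) for i in range(50)]
taps = [0.5, 0.25, -0.125, 0.0625, 0.03125]
a = direct_convolve(signal, taps)
b = overlap_add_convolve(signal, taps)
print(max(abs(p - q) for p, q in zip(a, b)))  # tiny rounding error only
```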

Reply #6
Quote
Quote
Unfortunately, the paper is garbage, because they seem to have used the most stupid algorithm possible for their benchmarks (but of course, the most stupid one happened to have a good speedup on the GPU).

I see your point, Garf. If I understand things correctly, they're using direct (time-domain) convolution, a multiply-accumulate over every tap for each output sample, which scales far worse than FFT methods for long filters, right?

Here is an example of FFT calculation using GPU.


Yes, exactly. The FFT is much more irregular and complex and hence much harder to implement on a GPU quickly. If you check their results, the GPU is outperformed by the CPU by a factor of 5. Oops, that's not so promising anymore.

Of course, as GPUs get faster and better at complex algorithms, this may become worthwhile.

But wanting to do convolutions on the GPU because it's fast at doing FIR filters... now that's just stupid.
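
For a sense of scale, here is a back-of-the-envelope operation count. This is an order-of-magnitude sketch only: it assumes one forward and one inverse FFT of size 2N per block of N samples, counts each FFT as n*log2(n) operations, and ignores constant factors and memory access patterns. The function names are made up.

```python
import math


def direct_ops_per_sample(num_taps):
    """Direct FIR: one multiply-accumulate per tap for every output sample."""
    return num_taps


def fft_ops_per_sample(num_taps):
    """Rough per-sample cost of block FFT convolution, block size = num_taps.

    Assumes num_taps is a power of two and an FFT size of 2 * num_taps.
    """
    n_fft = 2 * num_taps
    # Forward FFT + inverse FFT, plus one spectrum multiply per bin.
    per_block = 2 * n_fft * math.log2(n_fft) + n_fft
    return per_block / num_taps


taps = 65536  # assumed example filter length
ratio = direct_ops_per_sample(taps) / fft_ops_per_sample(taps)
print(fft_ops_per_sample(taps), round(ratio))
```

With a gap like that for long filters, a GPU would need an enormous speedup on the direct method just to match an ordinary FFT implementation on the CPU.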

Reply #7
Quote
Yes, exactly. The FFT is much more irregular and complex and hence much harder to implement on a GPU quickly. If you check their results, the GPU is outperformed by the CPU by a factor of 5. Oops, that's not so promising anymore.

Of course, as GPUs get faster and better at complex algorithms, this may become worthwhile.

I don't know if enough time has passed since that paper for what you say to be true, but keep in mind that the paper is from 2003 and the comparison is between a GeForce FX 5800 Ultra and a 1.7 GHz Xeon. Granted, today's CPUs are a good deal faster than that (I'll be relatively generous and guess 5x faster once architectural improvements such as faster FSB and memory, plus new SIMD instructions, are factored in). But today's video cards vastly outperform the "dustbuster" (the FX 5800 Ultra), which is known to be a dog in many shader operations unless you run at reduced precision.

Reply #8
GPUFFTW
Quote
GPUFFTW is a fast FFT library designed to exploit the computational performance and memory bandwidth on GPUs. Our library exploits the data parallelism available on current GPUs and pipelines the computation to the different stages of the graphics processor. Moreover, our library uses an efficient tiling strategy to further improve the memory performance of our algorithm. GPUFFTW can efficiently handle large real and complex 1-D arrays at 32-bit floating point precision on commodity GPUs. Furthermore, our FFT algorithm achieves comparable precision to the IEEE 32-bit FFT algorithms on CPUs even on large 1-D arrays. The library supports both Windows and Linux platforms.


A benchmark on their site shows that a single NVIDIA GeForce 7900 GTX outperforms a dual Opteron 280 workstation (4 cores @ 2.4 GHz) by a factor of 5.
