Subject: Re: [linux-audio-dev] Traps in floating point code
From: Jussi Laako (jussi.laako_AT_pp.inet.fi)
Date: Fri Jul 02 2004 - 22:14:04 EEST
On Fri, 2004-07-02 at 00:40, Erik de Castro Lopo wrote:
> > Erik, what do you think? Can something like that be coded efficiently
> > using SSE/SSE2?
>
> Probably not. There are some algorithms which simply can't be vectorized.
SSE2 is usually significantly faster for non-vectorized (scalar) code
too, at least on P4 and AMD64. I usually profile the code the compiler
generates and then hand-code SSE2 for the parts where the compiler's
output is the bottleneck. An IIR filter was one good example where
compilers did badly.
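
To give a rough idea of what hand-coded SSE2 for an IIR filter can look
like, here is a minimal sketch of a single biquad section in transposed
direct form II. The struct layout and the names biquad, biquad_run_c and
biquad_run_sse2 are made up for illustration and are not from any real
code base. The recursion still runs one sample at a time; the only
packing is across the two state variables. It assumes an x86 compiler
with SSE2 enabled (the default on AMD64, -msse2 for 32-bit gcc). Whether
it actually beats the compiler's output has to be checked by profiling.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hypothetical biquad layout, for illustration only. */
typedef struct {
    double b0, b1, b2;   /* feed-forward coefficients */
    double a1, a2;       /* feedback coefficients, a0 normalised to 1 */
    double z1, z2;       /* state (transposed direct form II) */
} biquad;

/* Plain C reference version. */
void biquad_run_c(biquad *f, const double *in, double *out, size_t n)
{
    assert(f && in && out);
    double z1 = f->z1, z2 = f->z2;
    for (size_t i = 0; i < n; i++) {
        double x = in[i];
        double y = f->b0 * x + z1;
        z1 = f->b1 * x - f->a1 * y + z2;
        z2 = f->b2 * x - f->a2 * y;
        out[i] = y;
    }
    f->z1 = z1;
    f->z2 = z2;
}

/* SSE2 version: still one sample per iteration, but both state
 * variables are updated together in one 2-lane register. */
void biquad_run_sse2(biquad *f, const double *in, double *out, size_t n)
{
    assert(f && in && out);
    const __m128d b0   = _mm_set_sd(f->b0);          /* [b0, 0]  */
    const __m128d b12  = _mm_set_pd(f->b2, f->b1);   /* [b1, b2] */
    const __m128d a12  = _mm_set_pd(f->a2, f->a1);   /* [a1, a2] */
    const __m128d zero = _mm_setzero_pd();
    __m128d z = _mm_set_pd(f->z2, f->z1);            /* [z1, z2] */

    for (size_t i = 0; i < n; i++) {
        __m128d x = _mm_load_sd(&in[i]);               /* [x, 0] */
        __m128d y = _mm_add_sd(_mm_mul_sd(b0, x), z);  /* [b0*x + z1, 0] */
        _mm_store_sd(&out[i], y);

        __m128d xx = _mm_unpacklo_pd(x, x);            /* [x, x]  */
        __m128d yy = _mm_unpacklo_pd(y, y);            /* [y, y]  */
        __m128d zs = _mm_unpackhi_pd(z, zero);         /* [z2, 0] */

        /* [b1*x - a1*y + z2, b2*x - a2*y] */
        z = _mm_add_pd(_mm_sub_pd(_mm_mul_pd(b12, xx),
                                  _mm_mul_pd(a12, yy)), zs);
    }
    _mm_storel_pd(&f->z1, z);   /* low lane  -> z1 */
    _mm_storeh_pd(&f->z2, z);   /* high lane -> z2 */
}
```

Both functions take the same arguments, so the C version can stay
around as a reference to check the hand-coded one against.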
-- Jussi Laako <jussi.laako_AT_pp.inet.fi>