Re: [LAD] vectorization

From: Jens M Andreasen <jens.andreasen@email-addr-hidden>
Date: Mon May 05 2008 - 19:00:30 EEST

On Mon, 2008-05-05 at 16:07 +0200, Christian Schoenebeck wrote:

> Uhm, stupid question: already tried if GCC's special "complex" attribute type
> leads to a better result with auto vectorization? At least that could give
> the optimizer a better chance.
>

No, I did not, but that's an idea. This certainly looks nice and clean:

  _Complex float cxA[N], cxB[N], cxD[N];

  for (i = 0;i < N; ++i)
     cxD[i] += cxA[i] * cxB[i];
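
For anyone who wants to reproduce the loop in isolation, here is a minimal sketch of the same multiply-accumulate as a standalone function. The function name cmac_complex and the restrict qualifiers are assumptions added for illustration; they are not part of the test program the timings below come from.

```c
#include <complex.h>
#include <stddef.h>

/* Multiply-accumulate over interleaved complex arrays: d[i] += a[i] * b[i].
 * "restrict" (an assumption, not in the original snippet) tells the compiler
 * that the three arrays do not overlap, which may help the vectorizer. */
void cmac_complex(float complex *restrict d,
                  const float complex *restrict a,
                  const float complex *restrict b,
                  size_t n)
{
    for (size_t i = 0; i < n; ++i)
        d[i] += a[i] * b[i];
}
```

Calling it as cmac_complex(cxD, cxA, cxB, N) should be equivalent to the loop above. Whether the compiler actually vectorizes it depends on the version and flags.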

A comparison with the other two versions, using gcc -O3 -msse
-ftree-vectorize, suggests a slight advantage over the original
(non-vectorized) two-dimensional array:

> clock: 13920 ms (_Complex)
> clock: 7040 ms (cvec_t)
> clock: 14470 ms (original array of complex)
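
The definition of cvec_t is not repeated in this message, so the following is only a hedged sketch of one layout that tends to vectorize well: real and imaginary parts kept in separate arrays, with the complex product written out as plain float arithmetic. The function name cmac_split and all parameter names are hypothetical, and this is not necessarily how cvec_t is laid out.

```c
#include <stddef.h>

/* Split-layout multiply-accumulate (hypothetical, for illustration only).
 * (ar + i*ai) * (br + i*bi) = (ar*br - ai*bi) + i*(ar*bi + ai*br)
 * Each output array is updated with unit-stride float operations,
 * so no shuffling between real and imaginary lanes is needed. */
void cmac_split(float *restrict d_re, float *restrict d_im,
                const float *restrict a_re, const float *restrict a_im,
                const float *restrict b_re, const float *restrict b_im,
                size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float ar = a_re[i], ai = a_im[i];
        float br = b_re[i], bi = b_im[i];
        d_re[i] += ar * br - ai * bi;  /* real part */
        d_im[i] += ar * bi + ai * br;  /* imaginary part */
    }
}
```

With the interleaved _Complex layout the compiler has to separate the real and imaginary lanes itself before it can use SIMD multiplies, which is one plausible reason for the gap in the timings.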

With icc -O3 -msse the difference is even more pronounced:

> clock: 3850 ms (_Complex)
> clock: 1410 ms (cvec_t)
> clock: 13290 ms (original array of complex)

Moving from 'gcc 4.2.2' to '4.3 20070713 (experimental)' is very
disappointing:

> clock: 46180 (_Complex) <-- we have found a loser!
> clock: 7030 (cvec_t)
> clock: 14340 (original array of complex)

/j

> CU
> Christian
> _______________________________________________
> Linux-audio-dev mailing list
> Linux-audio-dev@email-addr-hidden
> http://lists.linuxaudio.org/mailman/listinfo/linux-audio-dev

Received on Mon May 5 20:15:01 2008
