On Mon, 2008-05-05 at 16:07 +0200, Christian Schoenebeck wrote:
> Uhm, stupid question: already tried if GCC's special "complex" attribute type
> leads to a better result with auto vectorization? At least that could give
> the optimizer a better chance.
>
No, I did not, but that's an idea. This certainly looks nice and clean:
    _Complex float cxA[N], cxB[N], cxD[N];
    for (i = 0; i < N; ++i)
        cxD[i] += cxA[i] * cxB[i];
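
For anyone who wants to try this, here is a minimal, compilable sketch of the loop above. The function names cmac and time_cmac_ms are made up for illustration, and the timing helper is only an assumption about how such a test could be driven; it is not the benchmark harness that produced the numbers in this message.

```c
#include <assert.h>
#include <complex.h>
#include <time.h>

/* d[i] += a[i] * b[i] for n complex elements (the loop from the message). */
void cmac(float complex *d, const float complex *a,
          const float complex *b, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        d[i] += a[i] * b[i];
}

/* Hypothetical timing helper: runs cmac() reps times and returns the
 * CPU time used, in milliseconds, as measured by clock(). */
double time_cmac_ms(float complex *d, const float complex *a,
                    const float complex *b, int n, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        cmac(d, a, b, n);
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Build it with the same flags as in the comparison (for example gcc -std=c99 -O3 -msse -ftree-vectorize) and call time_cmac_ms() on arrays of whatever length and repeat count you want to measure.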
Compared with the other two versions, built with gcc -O3 -msse
-ftree-vectorize, it shows only a slight advantage over the original
(non-vectorized) two-dimensional array:
> clock: 13920 ms (_Complex)
> clock: 7040 ms (cvec_t)
> clock: 14470 ms (original array of complex)
With icc -O3 -msse the difference is even more pronounced:
> clock: 3850 ms (_Complex)
> clock: 1410 ms (cvec_t)
> clock: 13290 ms (original array of complex)
Moving from 'gcc 4.2.2' to '4.3 20070713 (experimental)' is very
disappointing:
> clock: 46180 ms (_Complex) <-- we have found a loser!
> clock: 7030 ms (cvec_t)
> clock: 14340 ms (original array of complex)
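
The definition of cvec_t is not repeated in this message, so the following is only a sketch under an assumption: it keeps real and imaginary parts in separate float arrays, which is one common layout that gives an auto-vectorizer simple, unit-stride loops. The name split_cmac and its parameters are hypothetical and are not claimed to match the cvec_t code that was timed above.

```c
#include <assert.h>

/* Complex multiply-accumulate on split (real/imaginary) arrays:
 *   d += a * b, with (ar + i*ai)(br + i*bi) = (ar*br - ai*bi) + i*(ar*bi + ai*br)
 * restrict tells the compiler that the arrays do not overlap. */
void split_cmac(float *restrict d_re, float *restrict d_im,
                const float *restrict a_re, const float *restrict a_im,
                const float *restrict b_re, const float *restrict b_im,
                int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        d_re[i] += a_re[i] * b_re[i] - a_im[i] * b_im[i];
        d_im[i] += a_re[i] * b_im[i] + a_im[i] * b_re[i];
    }
}
```

Whether a given compiler actually vectorizes this loop depends on the version and flags, so check the generated code or the vectorizer report before drawing conclusions.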
/j
> CU
> Christian
> _______________________________________________
> Linux-audio-dev mailing list
> Linux-audio-dev@email-addr-hidden
> http://lists.linuxaudio.org/mailman/listinfo/linux-audio-dev
Received on Mon May 5 20:15:01 2008