Re: [LAD] vectorization

From: Christian Schoenebeck <cuse@email-addr-hidden>
Date: Wed Apr 16 2008 - 10:19:19 EEST

On Wednesday, 16 April 2008 02:10:20, Jens M Andreasen wrote:
> On Tue, 2008-04-15 at 19:45 +0200, Christian Schoenebeck wrote:
> > Yeah, I'm respawning this topic ...
>
> There is something funny with this benchmark. If we compare your
[snip]
> Benchmarking mixdown (WITH coeff):
> ASM SSE : 160 ms <-- faster?
>
> .. or leave in C++ as well:
>
> Benchmarking mixdown (WITH coeff):
> pure C++ : 400 ms <-- slower?
> ASM SSE : 170 ms
>
> .. or take out only the ASM:
>
> Benchmarking mixdown (WITH coeff):
> pure C++ : 380 ms <-- faster?
> GCC vector extensions : 160 ms <-- slower?

Yeah, it gets even funnier: you may have noticed the ALLOC_BUFFERS macro. The
timing results vary depending on whether you allocate the buffers at runtime
(memalign()) or use statically allocated buffers fixed at compile time. It
could be the same over-optimization issue you already noticed. I haven't
investigated it yet.
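
To make the two allocation paths concrete, here is a minimal sketch of how
such a switch could look. Only the ALLOC_BUFFERS name and the use of a
memalign()-style allocation come from the benchmark; the buffer size, the
helper names and the use of posix_memalign() are assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical values: the real benchmark's sizes are not shown in this thread.
constexpr std::size_t kFrames = 1024;
constexpr std::size_t kAlign  = 16;  // SSE aligned loads/stores need 16 bytes

#ifdef ALLOC_BUFFERS
// Runtime path: the buffer address is only known when the program runs,
// so the compiler cannot reason about the buffer contents ahead of time.
inline float* get_buffer() {
    void* p = nullptr;
    if (posix_memalign(&p, kAlign, kFrames * sizeof(float)) != 0) return nullptr;
    return static_cast<float*>(p);
}
inline void release_buffer(float* p) { std::free(p); }
#else
// Static path: address and size are fixed at compile/link time, which gives
// the optimizer more room to specialize code that touches the buffer.
alignas(kAlign) static float g_buffer[kFrames];
inline float* get_buffer() { return g_buffer; }
inline void release_buffer(float*) {}
#endif

// True if p satisfies the alignment assumed by the SSE code paths.
inline bool is_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAlign == 0;
}

// Debug-time sanity check for a buffer returned by get_buffer().
inline void check_buffer(const float* p) {
    assert(p != nullptr && is_aligned(p));
}
```

Building once with -DALLOC_BUFFERS and once without, then comparing the
timings, would show whether the allocation strategy alone shifts the results.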

>
> Me thinks it is very difficult to predict what -O3 will or will not do.

Yep, but as you already pointed out, the relative speed of those three
solutions is clear, no matter what the absolute timings are. I also flipped
the order of the benchmarks and the rough ranking was always the same.
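
The order-flipping check can be sketched as a small harness. This is not the
code from the benchmark; Candidate, Timing, time_in_order(), ranking() and
ranking_is_order_independent() are hypothetical names, and the timing uses
std::chrono::steady_clock as an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <vector>

struct Candidate { std::string name; std::function<void()> run; };
struct Timing    { std::string name; double ms; };

// Runs each candidate `repeats` times, in the order given, and records the
// elapsed wall-clock time per candidate in milliseconds.
std::vector<Timing> time_in_order(const std::vector<Candidate>& cands, int repeats) {
    std::vector<Timing> out;
    for (const auto& c : cands) {
        const auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < repeats; ++i) c.run();
        const auto t1 = std::chrono::steady_clock::now();
        out.push_back({c.name,
                       std::chrono::duration<double, std::milli>(t1 - t0).count()});
    }
    return out;
}

// Returns the candidate names sorted from fastest to slowest.
std::vector<std::string> ranking(std::vector<Timing> timings) {
    std::sort(timings.begin(), timings.end(),
              [](const Timing& a, const Timing& b) { return a.ms < b.ms; });
    std::vector<std::string> names;
    for (const auto& t : timings) names.push_back(t.name);
    return names;
}

// Times the candidates forwards and backwards and reports whether the
// fastest-to-slowest ranking came out the same both times.
bool ranking_is_order_independent(std::vector<Candidate> cands, int repeats) {
    const auto forward = ranking(time_in_order(cands, repeats));
    std::reverse(cands.begin(), cands.end());
    const auto backward = ranking(time_in_order(cands, repeats));
    return forward == backward;
}
```

If the ranking survives the reversal, effects such as cache warm-up from an
earlier run are unlikely to explain the gap between the implementations.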

But if you're totally sceptical, you could simply move the mixing functions
into their own C++ file, compile that object file with maximum optimization,
and compile the actual benchmark application with just "-O1" or something
similar. That way the optimizer cannot fold the mixing code into the benchmark
loop.
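
A minimal sketch of that split, under stated assumptions: the file names
mix.cpp and bench.cpp and the function mix_with_coeff() with this signature
are hypothetical, chosen only to mirror the "mixdown (WITH coeff)" case. The
g++ flags in the comment are standard GCC options.

```cpp
#include <cassert>
#include <cstddef>

// Contents of a hypothetical mix.cpp, compiled on its own with full
// optimization:
//
//   g++ -O3 -c mix.cpp -o mix.o
//   g++ -O1 bench.cpp mix.o -o bench
//
// Without link-time optimization the compiler cannot inline this function
// into the benchmark loop in bench.cpp, so it cannot hoist or eliminate the
// work based on what the benchmark does with the result.

// Mixes `src` into `dst`, scaling each source sample by `coeff`.
void mix_with_coeff(float* dst, const float* src, float coeff, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i] * coeff;
    }
}
```

bench.cpp would then only need a matching declaration of mix_with_coeff() and
the timing loop around it.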

CU
Christian
Received on Wed Apr 16 12:15:02 2008