linux-audio-dev: Re: [LAD] GCC Vector extensions

From: Gabriel Beddingfield <gabrbedd@email-addr-hidden>
Date: Tue Jul 26 2011 - 15:30:16 EEST

On 07/26/2011 03:15 AM, Maurizio De Cecco wrote:
>> So are you now considering using some #ifdef to select float/4 instead of
>> double/8 vectors in jMax, or just changing all of them?
>
> Well, at the moment on gcc the performance with vector types is the same
> as without vector types, so I'll leave the Linux version without vector
> types (the code is #ifdef'ed).

When I was playing around with this last night... the best performance
came from your non-optimized, non-vectored code.

Why?

Because GCC translated it to optimized, vectored code.
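
To illustrate that point: a plain scalar loop of roughly the following shape is the kind of code GCC's auto-vectorizer can typically turn into SIMD instructions at -O3 (on 32-bit x86 SSE usually has to be enabled as well, e.g. with -msse2). This is only a sketch. The name `add3_scalar` and its signature are assumptions, since the original jMax loop is not quoted in this message.

```c
#include <stddef.h>

/*
 * Hypothetical scalar version: no vector types, no manual unrolling.
 * The `restrict` qualifiers promise the compiler that the buffers do
 * not overlap, which is what lets it vectorize the loop safely.
 */
static void add3_scalar(const float * restrict in0,
                        const float * restrict in1,
                        float * restrict out, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        out[i] = in0[i] + in1[i];
}
```

On recent GCC releases, adding -fopt-info-vec to the command line reports which loops were actually vectorized, so the claim can be checked on a given machine rather than taken on faith.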

> By the way, I forgot to mention that all my tests were at 64 bits;
> I'll try later on a 32 bit Ubuntu.

I was on 32 bit Ubuntu. Also, with GCC the 64-bit optimizer is known to
be better at optimising SIMD code.

Because I'm a sucker for these kinds of diversions, I came up with a
scheme that shaved about 1 second off your test (on my machine). It
assumes that `vecsize` is a power-of-two (at least 16, since each pass
handles 16 floats). The idea is to keep values in the processor
registers, and access each buffer one cache line at a time (a cache
line is 64 bytes on x86... 16 floats).

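/* Assumed definition (not shown in this mail): GCC vector of 4 floats. */
typedef float v4sf __attribute__((vector_size(16)));

/*
 * arg2[i] = arg0[i] + arg1[i], one 64-byte cache line per iteration.
 * All three buffers must be 16-byte aligned, and vecsize must be a
 * multiple of 16 floats.
 */
static inline void add3_vec(float * restrict arg0, float * restrict arg1,
                            float * restrict arg2, unsigned int vecsize)
{
   v4sf *v0, *v1, *v2;
   v4sf c0, c1, c2, c3, c4, c5, c6, c7;
   const unsigned cache_size = 4;   /* v4sf vectors per cache line */

   v0 = (v4sf*)arg0;
   v1 = (v4sf*)arg1;
   v2 = (v4sf*)arg2;
   vecsize /= 4*cache_size;         /* floats -> cache lines */

   while(vecsize--) {
      /* load one cache line from each input into registers */
      c0 = *v0++;
      c1 = *v0++;
      c2 = *v0++;
      c3 = *v0++;
      c4 = *v1++;
      c5 = *v1++;
      c6 = *v1++;
      c7 = *v1++;
      /* add and store one cache line of output */
      *v2++ = c0 + c4;
      *v2++ = c1 + c5;
      *v2++ = c2 + c6;
      *v2++ = c3 + c7;
   }
}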
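
A small self-check along the following lines can confirm that the routine above matches a plain scalar sum. It is a sketch under stated assumptions: the `v4sf` typedef uses GCC's `vector_size` attribute (the mail does not show its definition), and the buffer names, the size `N`, and the helper `check_add3_vec` are hypothetical. The buffers are declared 16-byte aligned because dereferencing a `v4sf *` assumes that alignment.

```c
#include <stddef.h>

/* Assumed definition: a GCC vector of four floats (16 bytes). */
typedef float v4sf __attribute__((vector_size(16)));

#define N 64   /* hypothetical test size: a power of two, multiple of 16 */

/* Hypothetical test buffers, aligned for v4sf loads and stores. */
static float in0[N] __attribute__((aligned(16)));
static float in1[N] __attribute__((aligned(16)));
static float out[N] __attribute__((aligned(16)));

/* Same routine as in the message, repeated so the block compiles alone. */
static inline void add3_vec(float * restrict arg0, float * restrict arg1,
                            float * restrict arg2, unsigned int vecsize)
{
    v4sf *v0 = (v4sf *)arg0, *v1 = (v4sf *)arg1, *v2 = (v4sf *)arg2;
    v4sf c0, c1, c2, c3, c4, c5, c6, c7;
    const unsigned cache_size = 4;   /* v4sf vectors per 64-byte line */

    vecsize /= 4 * cache_size;
    while (vecsize--) {
        c0 = *v0++; c1 = *v0++; c2 = *v0++; c3 = *v0++;
        c4 = *v1++; c5 = *v1++; c6 = *v1++; c7 = *v1++;
        *v2++ = c0 + c4;
        *v2++ = c1 + c5;
        *v2++ = c2 + c6;
        *v2++ = c3 + c7;
    }
}

/* Fills the inputs, runs add3_vec, and returns the number of
 * elements that differ from the scalar sum (0 means success). */
static unsigned check_add3_vec(void)
{
    unsigned i, bad = 0;

    for (i = 0; i < N; ++i) {
        in0[i] = (float)i;
        in1[i] = 0.5f * (float)i;
        out[i] = -1.0f;
    }
    add3_vec(in0, in1, out, N);
    for (i = 0; i < N; ++i)
        if (out[i] != in0[i] + in1[i])
            ++bad;
    return bad;
}
```

Calling `check_add3_vec()` from a test program and checking that it returns 0 exercises four full iterations of the unrolled loop. The input values are small multiples of 0.5, so the sums are exactly representable and a direct `!=` comparison is safe here.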

-gabriel
Received on Tue Jul 26 16:15:01 2011