Re: [LAD] vectorization

From: Jens M Andreasen <jens.andreasen@email-addr-hidden>
Date: Wed Apr 16 2008 - 20:07:28 EEST

On Wed, 2008-04-16 at 17:37 +0200, Remon wrote:

> I only modified the example to use gettimeofday() instead of clock().
>
> Maybe a gcc developer can shed some light on this issue ?
>

Not a gcc developer, but 'man clock' says that clock is supposed to
measure the processor time used by the application, thus excluding the
time spent by other applications, whereas gettimeofday is the wall clock
including everything (as well as using a different unit of measurement:
seconds and microseconds rather than clock ticks.)

To get the real clock-count for running one fragment - as opposed to the
pseudo GHz return from 'clock' - you could use something like this:

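/* high resolution processor clock count (x86 time-stamp counter)
 */
static inline unsigned long long
rdtsc(void)
{
     unsigned int lo, hi;
     /* rdtsc leaves the counter in EDX:EAX; reading the two halves
        separately also works on x86-64, where "=A" does not */
     __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
     return ((unsigned long long) hi << 32) | lo;
}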

... and then:

  unsigned long long clck = rdtsc();

  for (int i = 0; i < RUNS; i++) {
     cpp_mix(pSampleInputBuf, pOutput, FRAGMENTSIZE, coeff);
  }

  /* average clock count per run */
  printf ("pure C fragment\t\t: %d clck\n", (int)((rdtsc()-clck)/RUNS));

This could reveal something about how well you are utilizing the added
execution units in later Intel hardware compared to a simple PIII
celeron:

Benchmarking mixdown (WITH coeff):
pure C fragment : 1493 clck

For your C2D, compile with -mssse3 rather than -msse.

> Greetings,
>
> Remon
> _______________________________________________
> Linux-audio-dev mailing list
> Linux-audio-dev@email-addr-hidden
> http://lists.linuxaudio.org/mailman/listinfo/linux-audio-dev

Received on Wed Apr 16 20:15:19 2008
