Re: [LAD] vectorization

From: Pieter Palmers <pieterp@email-addr-hidden>
Date: Tue May 06 2008 - 12:14:35 EEST

Fons Adriaensen wrote:
> On Mon, May 05, 2008 at 07:18:39PM +0200, Jens M Andreasen wrote:
>
>> Could you try this out with your proposed compiler options on your own
>> hardware?
>>
>> ...
>> #define N 1024
>> ...
>> int n = 1000000;
>> ...
>
> Looping a million times over the same small data vector
> is _not_ very realistic.
>
> In a real app, the data size would be much longer (there's
> no need to optimise otherwise), that data would be rewritten
> for each iteration (no need to redo the calculation otherwise),
> and the work would not be done in a single long run but be
> divided over a number of e.g. jack process callbacks.
>
> I've again performed some tests on zita-convolver used by
> jconv to do the York Minster config. That means around 240
> different blocks of 8192 complex values each. The differences
> between plain C++, hand vectorized, and optimised assembly
> code are absolutely marginal in that case.

The main problem there is that you are not measuring the speed of the
CPU, but running into memory bandwidth limits. In my own experiments
it was very apparent that it pays off to restructure the code so that
memory access is kept to a minimum, even if that results in more
instructions. (I'm not suggesting that this is possible in your case,
though.)
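
To put a rough number on that, here is a small sketch of the working-set
arithmetic for the configuration quoted above. It assumes single-precision
complex values (2 x 4 bytes each); that is my assumption, since the sample
format is not stated in this thread. The name working_set_bytes is made up
for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: total bytes touched when every block is visited once.
 * Passing bytes_per_value = 8 assumes single-precision complex values,
 * which is an assumption and not something stated in the thread. */
size_t working_set_bytes(size_t blocks, size_t values_per_block,
                         size_t bytes_per_value)
{
    return blocks * values_per_block * bytes_per_value;
}
```

With the figures above, working_set_bytes(240, 8192, 8) comes to 15728640
bytes, i.e. 15 MiB. That is likely more than the cache of many desktop CPUs,
in which case every pass over the data has to stream it from main memory.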

The key point seems to be that you have to apply all operations to a
sample in one pass instead of looping over the buffer multiple times,
which is not really a surprise of course...

e.g.:
for(all samples) {sample = operation1(sample)}
for(all samples) {sample = operation2(sample)}
for(all samples) {sample = operation3(sample)}

can easily be an order of magnitude slower than:

for(all samples) {
  sample = operation1(sample)
  sample = operation2(sample)
  sample = operation3(sample)
}
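
As a concrete version of the two variants, a minimal C sketch follows. The
three stage functions (op_gain, op_offset, op_square) are hypothetical
stand-ins for operation1..3, and the float buffer layout is an assumption;
only the loop structure matters here. Both functions compute the same result,
but the fused one loads and stores each sample once instead of three times.

```c
#include <stddef.h>

/* Hypothetical per-sample stages standing in for operation1..3. */
static inline float op_gain(float x)   { return 0.5f * x; }
static inline float op_offset(float x) { return x + 0.25f; }
static inline float op_square(float x) { return x * x; }

/* Three passes: the whole buffer is read and written three times. */
void process_multipass(float *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) buf[i] = op_gain(buf[i]);
    for (size_t i = 0; i < n; i++) buf[i] = op_offset(buf[i]);
    for (size_t i = 0; i < n; i++) buf[i] = op_square(buf[i]);
}

/* One pass: each sample is loaded once, run through all stages
 * while it sits in a register, and stored once. */
void process_fused(float *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float s = buf[i];
        s = op_gain(s);
        s = op_offset(s);
        s = op_square(s);
        buf[i] = s;
    }
}
```

How much the fused version gains depends on the buffer size relative to the
cache and on how cheap the stages are. For a buffer that fits in cache the
difference may be small, so it is worth measuring on the target hardware.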

It is essential that the data does not leave the processor's cache;
that's for sure. Apart from that, I think modern processors are
very good at optimizing their own computation.

Greets,

Pieter
Received on Tue May 6 16:15:02 2008
