[linux-audio-dev] PPC AltiVec performance !


Subject: [linux-audio-dev] PPC AltiVec performance !
From: Benno Senoner (sbenno_AT_gardena.net)
Date: Sat May 13 2000 - 02:07:08 EEST


Hi,

Last week I installed the distributed.net RC5-64 cracking client on some of my
servers and looked at the keys/sec rates of several CPUs.
(Note that the app has cores written in assembly which take advantage of
special features of the CPU, like MMX, SIMD, AltiVec etc.)

I had the chance to play with a new G4 Mac (500-550 MHz, I believe).

Of course the first thing I did was install the RC5 client to see how
many keys/sec this box can deliver.

I was simply AMAZED:

A 500 MHz PIII with PIII (SIMD) optimized code delivered about 1.3 Mega keys/sec.

The PPC G4 delivered an incredible 3.4 Mega keys/sec using the AltiVec optimized
code, which is more than 2.5 times faster while running at about the same
frequency as the PIII.

I downloaded the AltiVec PDFs from http://www.altivec.org and checked out some
interesting parts.

The AltiVec engine does up to 4 x 32-bit float ops per cycle.
And there is even a multiply-add which does
4 multiplications and 4 additions within one cycle.
That is a dream for us software-DSP folks.
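
To put a number on that, here is a back-of-the-envelope sketch in C. It assumes one 4-way multiply-add is issued every cycle, which real code will rarely sustain, so read the result as a theoretical ceiling and not a measurement. The function name peak_mflops is made up for illustration.

```c
/* Theoretical peak in MFLOPS = clock (MHz) x SIMD lanes x float ops per lane
   per cycle. Assumption: one vector multiply-add is issued every cycle, so
   each of the 4 lanes performs 2 ops (one multiply, one add) per cycle. */
double peak_mflops(double clock_mhz, int lanes, int ops_per_lane)
{
    return clock_mhz * lanes * ops_per_lane;
}
```

For a 500 MHz G4, peak_mflops(500.0, 4, 2) gives 4000, i.e. 4 GFLOPS on paper.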

The really nice thing is that Motorola even provides native C/C++ extensions
(vector datatypes and vector operations which get translated directly into
AltiVec assembly instructions).
Look below to see how simple it is to do a 4-way multiply-add which runs at full
speed: d = vec_madd(a,b,c)
And the nicest thing is that these C/C++ extensions exist as a gcc patch!
See here for some info:
http://www.terrasoftsolutions.com/news/2000-03-23.shtml

I think the PPC G4 hardware platform really has a future in the studio from a
raw MFLOPS point of view.

Can you imagine how beautiful it will be to have a PPC box running PPC-Linux with
ALSA and soundcards like the Hammerfall?

But again, the cross-platform capabilities of Linux will make this
transition quite easy: port ALSA, recompile apps adding
AltiVec cores written in C/C++ (!!) without touching a single assembly instruction,
and ... continue working as usual ... but with the difference of running at
2-3 times your previous speed. :-)

(Assume Steinberg ports Cubase/Nuendo to Linux/x86: then it would be quite
easy, if they paid attention to big-endian/little-endian issues, to port the
whole app to Linux/PPC, since it would basically involve a recompile with some
minor tweaks.)
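
As an illustration of what "paying attention to endian issues" can look like, here is a minimal sketch in C that decodes a little-endian signed 16-bit sample (the byte layout used by WAV files, for instance) from raw bytes. Because it assembles the value with shifts and never casts the buffer to a 16-bit pointer, the same source gives the same result on little-endian x86 and big-endian PPC. The helper name read_le16 is hypothetical.

```c
#include <stdint.h>

/* Decode one little-endian signed 16-bit sample from two bytes.
   p[0] is the low byte, p[1] the high byte, whatever the host byte order.
   The final cast assumes two's complement integers, as on x86 and PPC. */
int16_t read_le16(const unsigned char *p)
{
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}
```

Code that instead reads samples by pointing an int16_t * straight at the file buffer works on x86 but returns byte-swapped values on PPC, which is the kind of thing the "minor tweaks" would have to catch.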

Write once ... run anywhere (on any Linux platform at native speed)

Now I know where to invest unneeded money: a Mac G4 :-)
(but wipe out MacOS first :-) )

-----
Vector Multiply Add
d = vec_madd(a,b,c)
do i=0 to 3
d(i) := RndToFPNearest(a(i) * b(i) + c(i) )
end
Each element of the result is the sum of the corresponding element of c and the product of
the corresponding elements of a and b.
-----
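
For the curious, the operation above can be sketched in C roughly as follows. On a compiler with the AltiVec extensions enabled (which normally defines __ALTIVEC__), it uses the vec_ld, vec_madd and vec_st operations from <altivec.h>; everywhere else it falls back to a plain scalar loop so the same source still builds. The wrapper name madd4 is made up for this example, and the AltiVec path assumes all four arrays are 16-byte aligned.

```c
#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* d[i] = a[i] * b[i] + c[i] for four 32-bit floats.
   AltiVec path: a, b, c and d must be 16-byte aligned, because vec_ld and
   vec_st ignore the low four bits of the address. */
void madd4(const float *a, const float *b, const float *c, float *d)
{
#ifdef __ALTIVEC__
    vector float va = vec_ld(0, a);
    vector float vb = vec_ld(0, b);
    vector float vc = vec_ld(0, c);
    vec_st(vec_madd(va, vb, vc), 0, d);   /* one fused multiply-add, 4 lanes */
#else
    /* Scalar fallback: may round twice (after the multiply and after the add),
       whereas the pseudocode above rounds only once. */
    for (int i = 0; i < 4; i++)
        d[i] = a[i] * b[i] + c[i];
#endif
}
```

A typical DSP use would be mixing: with a as the input samples, b as a gain vector and c as the current contents of the mix bus, each call accumulates four scaled samples into the bus.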

Benno.



This archive was generated by hypermail 2b28 : Sat May 13 2000 - 02:25:57 EEST