2009/8/17 Chris Cannam <cannam@email-addr-hidden-day-breakfast.com>:
> On Mon, Aug 17, 2009 at 5:21 AM, Ken Restivo<ken@email-addr-hidden> wrote:
>> I'm trying to squeeze the last little bit of juice out of my EEE.
>>
>> The CPU I have is this:
>> http://restivo.org/projects/eee/cpu.txt
>>
>> This nifty script at http://www.pixelbeat.org/scripts/gcccpuopt says I should use "-march=core2 -mtune=pentium -mfpmath=sse"
>>
>> However, the Gentoo people (who I take to be an -funroll-loops authority on performance tuning), say I should use "-march=core2 -mtune=generic -fomit-frame-pointer -pipe".
>>
>> And then there is -march=native which many say is just easier and faster. And others recommend putting "-msse2" and other such things.
>>
>> What say you-all?
>
> If you want the fastest possible floating point code, then you
> probably want something like:
>
> -march=core2 -msse -msse2 -mfpmath=sse -ffast-math -fomit-frame-pointer -O3
>
> ... but with caveats.
>
> Discussion:
>
> Supplying -ffast-math causes the use of non-IEEE-compliant math
> functions. Among other things, this screws up any code that
> explicitly deals with infinity or NaN values or signed zeroes, and
> makes assumptions about properties like associativity for the purposes
> of optimisation which may not be true in the floating-point world. In
> other words, it can give you the wrong results. In _most_ cases,
> audio applications are fine with it, but you need to be aware that it
> can be problematic.
>
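To make the associativity point concrete, here is a minimal C sketch (not from the original thread; the function names are made up for illustration). It adds the same three floats in two different groupings. The volatile temporaries are only there to stop the compiler from regrouping the sums itself, which is exactly the liberty -ffast-math grants.

```c
/* Adds three floats as (a + b) + c. */
float sum_left(float a, float b, float c)
{
    volatile float t = a + b;   /* volatile: keep this grouping as written */
    return t + c;
}

/* Adds the same three floats as a + (b + c). */
float sum_right(float a, float b, float c)
{
    volatile float t = b + c;   /* volatile: keep this grouping as written */
    return a + t;
}
```

With a = 1e20f, b = -1e20f, c = 1.0f, sum_left gives 1.0f but sum_right gives 0.0f, because 1.0f is too small to survive being added to -1e20f. A compiler that is allowed to reassociate may pick either grouping.
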
> However, -ffast-math in combination with -mfpmath=sse has the very
> nice quality that it enables denormal flush to zero throughout, thus
> avoiding denormal slowdowns in filters and the like. It's also much
> faster for some of the apparently simple operations like floor() that
> are surprisingly slow in IEEE compliant mode.
>
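For a feel of how a decaying signal ends up in denormal territory, here is a small C sketch (hypothetical helper names, and the decay factor of 0.5 is chosen only to keep the arithmetic exact). It halves a value until it drops below FLT_MIN, the smallest normal float, which is roughly what the tail of a feedback filter does once its input goes silent.

```c
#include <float.h>

/* Halve 1.0f until it falls below the smallest normal float;
   return how many halvings that took. */
int steps_until_below_normal(void)
{
    volatile float y = 1.0f;   /* volatile: store every step as a real float */
    int steps = 0;
    while (y >= FLT_MIN) {
        y = y * 0.5f;
        steps++;
    }
    return steps;
}

/* True for non-zero values whose magnitude is below FLT_MIN. */
int is_subnormal(float x)
{
    return x != 0.0f && x > -FLT_MIN && x < FLT_MIN;
}
```

From that point on, a strictly IEEE build keeps computing with the tiny values, while a build with flush to zero enabled treats them as zero.
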
> It might be interesting to know what the authors of the programs
> you're trying to optimise thought about the use of -ffast-math...
> Perhaps you could compile them both ways and compare the output.

On the SuperCollider dev list we're having a conversation about
exactly this. NaNs are used in some cases for signalling, and since
compiling with -ffast-math implies -ffinite-math-only, that breaks
the NaN signalling. This combination seems OK though: "-ffast-math
-fno-finite-math-only". The moral of the story is probably that it
depends strongly on the app. Does your chosen software make use of
NaNs and infinities? Hard to tell.

Dan

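As a rough illustration of the kind of code that is at risk (a made-up sketch, not SuperCollider's actual code; both function names are hypothetical), consider a NaN used as a "no value" marker:

```c
#include <math.h>

/* Hypothetical convention: NaN marks an input that carries no value. */
float no_value(void)
{
    return NAN;
}

/* Return the fallback when the input carries the NaN marker. */
float value_or_default(float in, float fallback)
{
    return isnan(in) ? fallback : in;
}
```

In a normal build, value_or_default(no_value(), 440.0f) returns 440.0f. Under -ffinite-math-only the compiler is allowed to assume NaNs never occur, so it may drop the isnan() test and pass the NaN straight through. Adding -fno-finite-math-only tells it not to make that assumption.
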
> -fomit-frame-pointer is pretty much guaranteed to make things
> marginally faster but harder to debug. It won't break anything and it
> won't make any huge improvements.
>
> -O3 rather than -O2 because it enables -ftree-vectorize, which does
> some limited auto-vectorization of loops (for things like
> floating-point copy) into SSE operations. This doesn't always do
> anything (depends on the code, obviously) but sometimes it makes a
> significant difference; for example, it helps when compiling my Rubber
> Band library. I've never yet seen any problems with the results, but
> of course there's always an increased risk of running into
> optimisation bugs the more optimisation you do. You can get
> interesting (?) debug output about vectorization successes and
> failures (mostly failures) with e.g. -ftree-vectorizer-verbose=2.
>
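The sort of loop the vectorizer tends to handle looks something like the following C sketch (a hypothetical function, not taken from any of the programs mentioned). The restrict qualifiers promise that the buffers do not overlap, which is usually what the compiler needs to know before it will turn the loop into SSE operations. Whether it actually does so depends on the compiler version and flags.

```c
#include <stddef.h>

/* out[i] = a[i] * gain + b[i] for n samples.
   restrict: the caller promises the three buffers do not overlap. */
void mix_with_gain(float *restrict out, const float *restrict a,
                   const float *restrict b, float gain, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * gain + b[i];
    }
}
```
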
> I would be slightly suspicious of anyone who recommends -pipe as an
> optimisation -- it makes no difference to the resulting code, it just
> makes compiling faster.
>
> If you're using a 64-bit distro, then you can omit the options with
> SSE in them (they're all enabled by default in 64-bit gcc).
>
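One way to check what a given set of flags actually turned on is to look at the macros GCC predefines. The sketch below (function names are made up) wraps three of them: __FAST_MATH__, __FINITE_MATH_ONLY__ and __SSE2_MATH__. These are GCC conventions, so other compilers may not define them the same way.

```c
/* 1 if the build was told to assume there are no NaNs or infinities. */
int finite_math_only(void)
{
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler reports a fast-math build. */
int fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* 1 if floating-point math is being done with SSE2 instructions. */
int sse2_math(void)
{
#ifdef __SSE2_MATH__
    return 1;
#else
    return 0;
#endif
}
```

Compiling this with and without the flags under discussion, and printing the three results, shows which assumptions each build is making. The full list of predefined macros can be dumped with "gcc -dM -E - < /dev/null" plus whatever flags you want to test.
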
>
> Chris
> _______________________________________________
> Linux-audio-user mailing list
> Linux-audio-user@email-addr-hidden
> http://lists.linuxaudio.org/mailman/listinfo/linux-audio-user
>
-- 
http://www.mcld.co.uk

Received on Mon Aug 17 16:15:02 2009