Re: [linux-audio-dev] Traps in floating point code


Subject: Re: [linux-audio-dev] Traps in floating point code
From: Ruben van Royen (rvroyen_AT_guidedbees.com)
Date: Thu Jul 01 2004 - 11:25:51 EEST


Hi all,

please note that SSE2 supports 64-bit floats (doubles) and contains an
instruction that truncates to int regardless of the control word. A new enough
gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use SSE instead of
the old x87 FPU. This has a further advantage: SSE math uses normal
registers instead of the register stack of the old FPU.
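
A minimal sketch of what this buys you in C. The helper name split_index is
hypothetical, and the claim about the generated instruction is my
expectation rather than something checked for every gcc version: with
-msse2 -mfpmath=sse the cast below should become a single truncating convert
(cvttsd2si), with no control word changes around it. Compare the output of
gcc -S with and without the flags to confirm on your compiler.

```c
/* Hypothetical helper: split a non-negative position into an integer
   sample index and a fraction in [0, 1). */
int split_index(double pos, double *frac)
{
    /* A C cast to int always truncates toward zero, independent of the
       current rounding mode. */
    int i = (int) pos;

    /* For pos >= 0 the remainder lies in [0, 1). For negative input it
       would lie in (-1, 0], so this assumes pos >= 0. */
    *frac = pos - (double) i;
    return i;
}
```

Since the source is plain C, the same code still builds for processors
without SSE2; only the generated instructions (and the speed) differ.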

The disadvantage is of course that it does not run on older processors. I'm
also not sure what level of SSE the Athlon currently supports. The last time I
looked, it only supported the original SSE. That is still useful, but it lacks
support for double-precision floating point.

Ruben

On Wednesday 30 June 2004 23:09, Tim Goetze wrote:
> [Jens M Andreasen]
>
> >On tis, 2004-06-29 at 17:15, Steve Harris wrote:
> >> > integer = lrintf(fullindex);
> >> > fractional = fullindex - integer;
> >>
> >> I don't think this is right, fractional will be [-0.5, 0.5], rather than
> >> [0,1] which is more normal, as lrintf() rounds to the nearest.
> >>
> >> I think you should be using lrintf(floor(x)) or (int)x.
> >
> >Why not just use modf?
> >
> > double fullindex, increment, integer, fraction;
> > // int i;
> >
> > fullindex += increment;
> > fraction = modf(fullindex, &integer);
> > // i = integer;
>
> if we want to use the integer part as sample index for a memory
> lookup, having it in double/float doesn't buy us much: we still need
> to convert to int type, which is costly.
>
> regarding lrintf() on x86: checking the output of gcc-3.0 -S, you can
> see that it compiles to a simple "fistpl" instruction. this
> instruction relies on the current FPU control word, which defaults to
> 'round' not 'truncate' (which isn't what we want for memory indices).
>
> now, "i = lrintf (floor (f))" compiles to "frndint" surrounded by
> "fldcw" (load control word, which is __slow__), and finally "fistpl".
>
> in contrast, "i = (int) f" compiles to a single "fistpl", but of
> course (the gcc default FPU control word defaulting to 'round') also
> surrounded by "fldcw".
>
> so if you want quick fractional sample lookups, the best option on x86
> i see is to manually "fldcw" before and after your sample loop, and
> use lrintf() or "fistpl" directly to obtain integer indices inside
> the loop.
>
> incidentally, you can find a portable implementation in the caps
> package.
>
> tim



This archive was generated by hypermail 2b28 : Thu Jul 01 2004 - 11:17:22 EEST