[Mesa-dev] [PATCH] Add an accelerated version of F_TO_I for x86_64
Roland Scheidegger
sroland at vmware.com
Mon Jul 28 13:29:12 PDT 2014
On 28.07.2014 20:38, Jason Ekstrand wrote:
>
>
>
> On Mon, Jul 28, 2014 at 8:34 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>
> Looking all good, though it got me thinking about these numerous
> float-to-int rounding functions in mesa in various places a bit more.
> There is, for instance, a _mesa_round_to_even() function, which claims
> that the c99 lrintf() and friends functions can't be used because the
> environment might use a different rounding mode. This is of course true
> though I wonder why it would matter there and not here for instance,
> since of course cvtss2si is affected by that as well.
>
>
> Yes, it does matter for cvtss2si. I briefly considered replacing F_TO_I
> with lrint and just trusting in the standard library to do it. However,
> that turns out to have really bad performance (on my system) if you
> don't enable -ffast-math (which we don't by default). The reason why
> using cvtss2si is ok is that we only use F_TO_I for things where the GL
> spec allows us to be sloppy on the rounding. In particular, we use it
> for texture format conversion and for things like swrast where we care
> more about fast than perfect.
>
>
> Also, I'm wondering about the IROUND fallback. Richard Sandiford
> discovered a problem in llvmpipe (using lp_build_iround()) which was
> using the same method (that is, (int)(val + 0.5) for unsigned numbers,
> similar but more complex for signed numbers). This works ok for nearly
> all numbers (though it is definitely not round to nearest even) except
> numbers between [-]2^23 and [-]2^24-1, in which case it will always
> return the next higher even number for odd numbers. So
> IROUND((float)val) != val even for numbers which can be represented as
> floats exactly. I'd guess though mesa probably doesn't use IROUND for
> cases where this would matter (most likely some conversion of z24
> numbers), worst case there would probably not just be the inaccuracy but
> if you'd have clamped to z24 max (as float, 2^24-1) then done the IROUND
> you'd get back 2^24. And FWIW some gallium util math functions seem to
> have this problem as well, though again I don't know if it would matter
> (the gallium util code will only use them if c99 isn't available).
>
> But anyway I guess that's slightly off-topic...
>
>
> That's interesting. I'll give that some thought, but at that point
> you're right at the boundary of floating-point precision and things get
> interesting. I'll think about it and if I come up with a better way to
> do it, I'll send a patch.
Yes, it's right at the boundary but does not exceed it (there's a reason
z24 unorm is very common and z32 unorm is paper spec only :-)).
FWIW the easiest solution for the gallivm code was to use the largest
float smaller than 0.5 for the add in iround, though I'm not 100% sure
yet of the consequences this could have. At first sight this looks quite
ok: the previous rounding using +0.5f gives round-up semantics for
positive tied values (0.5, 1.5, 2.5 become 1, 2, 3, whereas you'd usually
expect round-nearest-even, hence 0, 2, 2) and round-down semantics for
negative tied values, whereas with this you get round-nearest-trunc.
Oh, and since you're working on it, it would also be nice if the rounding
macro / function stuff could be shared between gallium/util and mesa.
Roland
>
>
>
> Roland
>
>
> On 23.07.2014 05:15, Jason Ekstrand wrote:
> > According to a quick micro-benchmark, this new version is 20% faster on my
> > Haswell laptop.
> >
> > v2: Removed the XXX note about x86_64 from the comment
> >
> > Signed-off-by: Jason Ekstrand <jason.ekstrand at intel.com>
> > ---
> > src/mesa/main/imports.h | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
> > index af780b2..c8ae7f2 100644
> > --- a/src/mesa/main/imports.h
> > +++ b/src/mesa/main/imports.h
> > @@ -277,7 +277,6 @@ static inline int IROUND_POS(float f)
> >
> > /**
> > * Convert float to int using a fast method. The rounding mode may vary.
> > - * XXX We could use an x86-64/SSE2 version here.
> > */
> > static inline int F_TO_I(float f)
> > {
> > @@ -285,6 +284,10 @@ static inline int F_TO_I(float f)
> > int r;
> > __asm__ ("fistpl %0" : "=m" (r) : "t" (f) : "st");
> > return r;
> > +#elif defined(USE_X86_64_ASM) && defined(__GNUC__)
> > + int r;
> > + __asm__ ("cvtss2si %1, %0" : "=r" (r) : "xm" (f));
> > + return r;
> > #elif defined(USE_X86_ASM) && defined(_MSC_VER)
> > int r;
> > _asm {
> >
>
>