[Mesa-dev] [PATCH 1/2] mesa: remove buggy assertions in unpack Z24

Mon Dec 12 04:24:14 PST 2011

I saw this email yesterday, but did not have time to reply.

----- Original Message -----
> On Mon, Nov 21, 2011 at 6:52 PM, Eric Anholt <eric at anholt.net> wrote:
> > On Sun, 20 Nov 2011 15:08:55 +0100, Marek Olšák <maraeo at gmail.com>
> > wrote:
> >> unpack_float_z_Z24_X8 fails with -O2 in:
> >> - fbo-blit-d24s8
> >> - fbo-depth-sample-compare
> >> - fbo-readpixels-depth-formats
> >> - glean/depthStencil
> >>
> >> With -O0, it works fine.
> >>
> >> I am removing all the assertions. There's not much point in having
> >> them,
> >> is there?
> >
> > Is -ffast-math involved at all?
> 
> Yes, -fno-fast-math makes the problem go away.
> 
> >
> > At a guess, replacing "* scale" with "/ (float)0xffffff" makes the
> > failure happen either way?
> 
> Yes. Only using GLdouble fixes it.
> 
> It makes sense. The mantissa without the sign has 23 bits, but 24
> bits
> are required to exactly represent 0xffffff if I am not mistaken.

I'm afraid this is wrong: single precision floating point can describe 24bits uints (and therefore unorms) without any loss, because although the mantissa has 23bits, the mantissa is only used to represent all but the most significant bit, ie., there's an implicit 24th bit always set to one.

The fact that -fno-fast-math makes the problem go away is yet another clear proof that the issue is _not_ lack of precision of single-precision floating point -- as -fno-fast-math will still use single-precision floating point numbers.  Actually, fno-fast-math, it will ensure that all intermediate steps are done in single precision instead of higher intel x87 80bit floating points, on the 32bit x86 architecture.  And I suspect the problem here is that the rounding w/ x80 yields wrong results.

I also disagree that using double precision is a good solution. Either fast-math serves our purposes, or it doesn't.  Using double precision to work-around issues with single-precision fast-math is the *worst* thing we could do.

Does the assertion failure even happen on 64bits? I doubt, as x64 ABI establishes the use of sse for all floating point, and x87 80bit floating point will never be used. So effectively this is making all archs slower.

Honestly, I think the patch should be recalled.  And we should consider disabling fast-math.  And maybe enabling -mfpmath=sse on 32bits x86 (i.e., require sse).

Jose