[Mesa-dev] [PATCH 1/2] mesa: remove buggy assertions in unpack Z24

Brian Paul brian.e.paul at gmail.com
Mon Dec 12 07:50:52 PST 2011


On Mon, Dec 12, 2011 at 8:20 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
> ----- Original Message -----
>> On Mon, Dec 12, 2011 at 2:10 PM, Jose Fonseca <jfonseca at vmware.com>
>> wrote:
>> >
>> >
>> > ----- Original Message -----
>> >> On Mon, Dec 12, 2011 at 1:24 PM, Jose Fonseca
>> >> <jfonseca at vmware.com>
>> >> wrote:
>> >> > I saw this email yesterday, but did not have time to reply.
>> >> >
>> >> > ----- Original Message -----
>> >> >> On Mon, Nov 21, 2011 at 6:52 PM, Eric Anholt <eric at anholt.net>
>> >> >> wrote:
>> >> >> > On Sun, 20 Nov 2011 15:08:55 +0100, Marek Olšák
>> >> >> > <maraeo at gmail.com>
>> >> >> > wrote:
>> >> >> >> unpack_float_z_Z24_X8 fails with -O2 in:
>> >> >> >> - fbo-blit-d24s8
>> >> >> >> - fbo-depth-sample-compare
>> >> >> >> - fbo-readpixels-depth-formats
>> >> >> >> - glean/depthStencil
>> >> >> >>
>> >> >> >> With -O0, it works fine.
>> >> >> >>
>> >> >> >> I am removing all the assertions. There's not much point in
>> >> >> >> having
>> >> >> >> them,
>> >> >> >> is there?
>> >> >> >
>> >> >> > Is -ffast-math involved at all?
>> >> >>
>> >> >> Yes, -fno-fast-math makes the problem go away.
>> >> >>
>> >> >> >
>> >> >> > At a guess, replacing "* scale" with "/ (float)0xffffff"
>> >> >> > makes
>> >> >> > the
>> >> >> > failure happen either way?
>> >> >>
>> >> >> Yes. Only using GLdouble fixes it.
>> >> >>
>> >> >> It makes sense. The mantissa without the sign has 23 bits, but
>> >> >> 24
>> >> >> bits
>> >> >> are required to exactly represent 0xffffff if I am not
>> >> >> mistaken.
>> >> >
>> >> > I'm afraid this is wrong: single precision floating point can
>> >> > describe 24bits uints (and therefore unorms) without any loss,
>> >> > because although the mantissa has 23bits, the mantissa is only
>> >> > used to represent all but the most significant bit, ie., there's
>> >> > an implicit 24th bit always set to one.
>> >> >
>> >> > The fact that -fno-fast-math makes the problem go away is yet
>> >> > another clear proof that the issue is _not_ lack of precision of
>> >> > single-precision floating point -- as -fno-fast-math will still
>> >> > use single-precision floating point numbers.  Actually,
>> >> > fno-fast-math, it will ensure that all intermediate steps are
>> >> > done
>> >> > in single precision instead of higher intel x87 80bit floating
>> >> > points, on the 32bit x86 architecture.  And I suspect the
>> >> > problem
>> >> > here is that the rounding w/ x80 yields wrong results.
>> >> >
>> >> >
>> >> > I also disagree that using double precision is a good solution.
>> >> > Either fast-math serves our purposes, or it doesn't.  Using
>> >> > double
>> >> > precision to work-around issues with single-precision fast-math
>> >> > is
>> >> > the *worst* thing we could do.
>> >> >
>> >> >
>> >> > Does the assertion failure even happen on 64bits? I doubt, as
>> >> > x64
>> >> > ABI establishes the use of sse for all floating point, and x87
>> >> > 80bit floating point will never be used. So effectively this is
>> >> > making all archs slower.
>> >> >
>> >> >
>> >> > Honestly, I think the patch should be recalled.  And we should
>> >> > consider disabling fast-math.  And maybe enabling -mfpmath=sse
>> >> > on
>> >> > 32bits x86 (i.e., require sse).
>> >>
>> >> You're probably right. Feel free to revert & fix the issue in some
>> >> other way.
>> >
>> > OK.
>> >
>> > Could you please confirm whether the assertions failures you saw
>> > were on 32bits or 64bits mode?
>>
>> 32-bit.
>
> Thanks.
>
> In that case I propose dropping fast-math, at least on x86 32bits, as it makes (in particular the unorm24 <-> f32 conversions happen in a lot of places in Mesa's source tree) and some times worse code [1], and simply use the subset of finer grained options  [2]:
>
>  -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and -fcx-limited-range.
>
> which really meets our needs.
>
> About -mfpmath=sse on 32bits, although I believe that depending on sse would be alright, it looks like -mfpmath=sse  it's not a clear winner on 32bits , because calls to libc still need to use x87 registers.
>
>
> Jose
>
>
> [1] http://programerror.com/2009/09/when-gccs-ffast-math-isnt/
> [2] http://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc/Optimize-Options.html


Thanks for digging into this, guys.

I'm happy to drop -ffast-math (or use -fno-fast-math) but it would be
interesting to do at least a few before/after comparisons just to make
sure there's no surprises in performance.

In any case, don't we still need to use double-precision for Z32
packing/unpacking?  I ran into a failure in _mesa_pack_float_z_row()
for Z32 on Saturday (debug build on x64).

I'd like to leave Marek's patch committed as-is for now until we
settle on another solution.

-Brian


More information about the mesa-dev mailing list