[Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

Axel Davy axel.davy at ens.fr
Fri Jan 13 22:04:02 UTC 2017


On 13/01/2017 19:50, Matteo Bruni wrote:
> 2017-01-13 3:37 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
>> On Thu, Jan 12, 2017 at 9:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>> Unless, of course, it's controlled by the same hardware bit... Clearly, we
>>> can give you abs on rsq without denorm flushing (easy shader hacks) but
>>> not the other way around.
>> OK, so somehow I missed that earlier. However there's an interesting
>> section in the PRM:
>>
>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf
>>
>> on PDF page 854, "Dismissed Legacy Behaviors" which has a list of
>> suggested IEEE 754 deviations for DX9. One of them is indeed that 0 *
>> x = 0, but another is that input NaNs be propagated with certain
>> exceptions. Also they suggest that RCP(0)/RSQ(0) = fmax. Interesting.
>>
>> So at this point, the zero_wins thing is pretty much blown. i965
>> appears to have an all-or-nothing approach, and additionally that
>> approach doesn't match up exactly to what NVIDIA does (or at least I'm
>> not aware of a clamp-everything mode).
>>
>> This will take some thought to figure out how something can be
>> specified so that a single spec works for both i965 and nv/amd. OTOH
>> we could have two different specs that just expose different things -
>> e.g. i965 could expose a MESA_shader_float_alt_mode or whatever which
>> is spec'd to do the things that the PRM says, and nv/amd have the
>> MESA_shader_float_zero_wins ext which does what we were talking about
>> earlier.
>>
>> I'm open to other suggestions too.
> Maybe we can go back to the original idea and have the extension
> require that no NaNs can be generated by GLSL mathematical operators
> and builtin functions (if no operand is a NaN?) It's possible that's
> not exactly it but in any case the idea is to just specify expected
> results, without requiring a specific route to get there. The
> extension could introduce undefined behavior where necessary e.g.
> allowing (but not requiring) INF results to be always flushed to fmax
> when enabled.
>
> For Intel that would work trivially. For AMD it should be a matter of
> using the special instructions where necessary and "be careful" in a
> few places (in the same vein as the RSQ and POW opcodes of ARB
> programs Marek mentioned). Not sure about nouveau, I guess it should
> be similar to AMD in the end.
>
> Would that be too messy? Am I completely missing the point?

Specifying only the NaN behaviour doesn't solve the 0*inf issue for
MAD operations: 24 + 0*inf evaluates to NaN, which then gets converted
to 0 instead of the expected 24.
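
To make the MAD case concrete, here is a minimal sketch in Python (plain
IEEE 754 doubles, not GLSL and not any driver's actual code). The function
names are hypothetical, and "NaN gets converted to 0" is modelled here as a
flush applied to the final result, which is an assumption about how a
"no NaNs" rule would be implemented:

```python
import math

def mad_ieee(a, b, c):
    # Plain IEEE 754 multiply-add: 0 * inf yields NaN, which propagates.
    return a * b + c

def mad_nan_to_zero(a, b, c):
    # Hypothetical "no NaN results" rule: flush a NaN result to 0.
    # The addend c is lost, because the NaN is only seen after the add.
    r = a * b + c
    return 0.0 if math.isnan(r) else r

def mad_zero_wins(a, b, c):
    # Hypothetical "zero wins" multiply: 0 * x = 0 for any x, then add c.
    prod = 0.0 if (a == 0.0 or b == 0.0) else a * b
    return prod + c

print(mad_ieee(0.0, math.inf, 24.0))
print(mad_nan_to_zero(0.0, math.inf, 24.0))
print(mad_zero_wins(0.0, math.inf, 24.0))
```

Only the third variant gives 24, so the multiply itself would have to be
specified, not just the absence of NaN in the result.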


Axel


