[Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

Fri Jan 13 18:06:36 UTC 2017

On 13.01.2017 18:53, Jason Ekstrand wrote:
> On Fri, Jan 13, 2017 at 8:43 AM, Marek Olšák <maraeo at gmail.com> wrote:
>
>     On Fri, Jan 13, 2017 at 5:25 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>     > On Fri, Jan 13, 2017 at 4:05 AM, Marek Olšák <maraeo at gmail.com> wrote:
>     >>
>     >> On Fri, Jan 13, 2017 at 3:37 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>     >> > On Thu, Jan 12, 2017 at 9:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>     >> >> Unless, of course, it's controlled by the same hardware bit...
>     >> >> Clearly, we can give you abs on rsq without denorm flushing
>     >> >> (easy shader hacks) but not the other way around.
>     >> >
>     >> > OK, so somehow I missed that earlier. However there's an interesting
>     >> > section in the PRM:
>     >> >
>     >> > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf
>     >> >
>     >> > on PDF page 854, "Dismissed Legacy Behaviors" which has a list of
>     >> > suggested IEEE 754 deviations for DX9. One of them is indeed that
>     >> > 0 * x = 0, but another is that input NaNs be propagated with certain
>     >> > exceptions. Also they suggest that RCP(0)/RSQ(0) = fmax. Interesting.
>     >> >
>     >> > So at this point, the zero_wins thing is pretty much blown. i965
>     >> > appears to have an all-or-nothing approach, and additionally that
>     >> > approach doesn't match up exactly to what NVIDIA does (or at least
>     >> > I'm not aware of a clamp-everything mode).
>     >> >
>     >> > This will take some thought to figure out how something can be
>     >> > specified so that a single spec works for both i965 and nv/amd. OTOH
>     >> > we could have two different specs that just expose different things -
>     >> > e.g. i965 could expose a MESA_shader_float_alt_mode or whatever which
>     >> > is spec'd to do the things that the PRM says, and nv/amd have the
>     >> > MESA_shader_float_zero_wins ext which does what we were talking about
>     >> > earlier.
>     >> >
>     >> > I'm open to other suggestions too.
>     >>
>     >> There is also the "small" problem that it would take a non-trivial
>     >> effort for us on the LLVM side. You guys can flip a switch. We can't.
>     >
>     >
>     > Don't you have to expend that effort for ARB programs anyway?  I thought
>     > they weren't supposed to generate NaN either.
>
>     No, we don't, because st/mesa adds abs before RSQ and the driver
>     implements POW as log+mul+exp, where mul follows the rule
>     0*anything=0. I don't think any other opcode follows that rule though.
>
>
> Ah.  That makes sense.  Do you also implement DIV as MUL+RCP?

For single-precision, yes. For double-precision, it seems we need to 
move away from that due to precision issues (which is itself a bit odd, 
since you don't seem to have encountered that?).
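
To illustrate the general shape of the precision concern, here is a
CPU-side sketch only, not the driver's code. It assumes plain IEEE 754
double arithmetic on the host (e.g. SSE2 on x86-64); the names
div_mul_rcp, check_exact_case and compare_div are made up for this
example. The point is that MUL+RCP rounds twice, so it can differ from a
correctly rounded division even when the reciprocal itself is correctly
rounded. A hardware RCP approximation would likely be less accurate still.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical lowering: DIV expressed as MUL + RCP, in double precision. */
static double div_mul_rcp(double a, double b)
{
    volatile double rcp = 1.0 / b;  /* first rounding: the reciprocal */
    return a * rcp;                 /* second rounding: the multiply */
}

/* Sanity check: when the reciprocal is exactly representable, the
 * lowered form and the true division agree. */
static void check_exact_case(void)
{
    assert(div_mul_rcp(1.0, 2.0) == 1.0 / 2.0);
}

/* Print both results so that a difference in the last bit is visible. */
static void compare_div(double a, double b)
{
    printf("a/b = %.17g   a*rcp(b) = %.17g\n", a / b, div_mul_rcp(a, b));
}
```

Calling compare_div(49.0, 49.0) is a quick way to see the two forms
disagree in the last bit on such a host.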

Nicolai

>  If so,
> the two of those should take care of NaN getting generated in the
> shader.  We'd still have to do something about inf and maybe denorms.