[Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft
idr at freedesktop.org
Mon Feb 24 18:10:30 UTC 2020
On 2/23/20 5:57 PM, Ilia Mirkin wrote:
> We talked about something like this a while back, but the end result
> was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> But it'd be nice for wine to be able to control this too.
> I couldn't actually find any evidence of the discussion from 2017 or so,
> so ... let's have another one.
> docs/specs/MESA_ieee_fp_alu_mode.spec | 136 ++++++++++++++++++++++++++
> 1 file changed, 136 insertions(+)
> create mode 100644 docs/specs/MESA_ieee_fp_alu_mode.spec
> diff --git a/docs/specs/MESA_ieee_fp_alu_mode.spec b/docs/specs/MESA_ieee_fp_alu_mode.spec
> new file mode 100644
> index 00000000000..cb274f06571
> --- /dev/null
> +++ b/docs/specs/MESA_ieee_fp_alu_mode.spec
> @@ -0,0 +1,136 @@
> + MESA_ieee_fp_alu_mode
> +Name Strings
> + GL_MESA_ieee_fp_alu_mode
> + Ilia Mirkin, ilia 'at' x.org
> +IP Status
> + No known IP issues.
> + Proposed
> + TBD
> + OpenGL 3.0 or OpenGL ES 3.0 is required.
> + The extension is written against the OpenGL 3.0 and OpenGL ES 3.0
> + specifications.
> + Pre-GL3 hardware did not generally have full IEEE floating point operation
> + support. Among other things, 0 * Infinity would work out to 0, and NaN's
> + might not be generated, or otherwise be treated improperly. GL3-class and
> + later hardware introduced full IEEE FP support, including NaN, Infinity,
> + and the proper generation of these.
> + Some software targeted at older hardware makes assumptions about how the
> + shader ALU works. And to accommodate these, GL3-class hardware has a way to
> + change how the shader ALU behaves. There are no standards around this, and
> + different hardware has different ways of dealing with it. However these
> + modes were designed specifically with such older software in mind.
> + This extension introduces a way to configure a context to be in non-IEEE
> + ALU mode. This extension does not specify precisely what this means, as
> + each vendor has something different. Generally it means non-IEEE compliant
> + handling of multiplication, as well as any other unspecified changes.
I think many of the other things are specified. They're the non-IEEE
behaviors of GL_ARB_vertex_program and GL_ARB_fragment_program, and
those mimic the required behavior of early DX shader models. There are
a bunch of cases that specify that zero is generated when IEEE would
If there's just a small handful of things like this, we'd probably be
better adding a couple new built-in functions to do the job. The
problem on Intel hardware is... we really, really don't want to switch
to non-IEEE mode because it changes how a bunch of things work, and we
haven't tested any of that in many years. I'd much rather put in some
kind of work-arounds for things that don't want multiplication or pow()
to generate NaN.
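
To make the multiplication case concrete, here is a minimal C sketch of the
kind of work-around meant above. The helper name mul_zero_wins is
hypothetical, and the "zero wins" rule (0 times anything, including Infinity
or NaN, is 0) is an assumption modelled on the old shader model behavior, not
something any spec in this thread defines:

```c
#include <math.h>

/* Hypothetical helper: legacy "zero wins" multiply.
 * Assumption: if either operand is (plus or minus) zero the result is +0,
 * even when the other operand is Infinity or NaN. Otherwise this is a
 * plain IEEE multiply. */
static float mul_zero_wins(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}
```

With plain IEEE arithmetic 0.0f * INFINITY is NaN, whereas
mul_zero_wins(0.0f, INFINITY) is 0. A pow() variant would need the same kind
of explicit guard.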
As for the mechanism, I'm very strongly in favor of something that would
be locked-in when the shader is compiled. I really want to avoid any
potential that an external glEnable could trigger a recompile.
The more I think about it... having an extension that adds a handful of
built-in functions that give old shader model behavior would be a good
idea. We could even test it. :) I've looked at a lot of shaders, and I've
seen a lot of not-quite-what-they-wanted methods for avoiding NaN
behavior in a bunch of these functions. Having a special version of
inversesqrt() that returns FLT_MAX for 0 would be useful to a lot of
users. As part of the spec we could even provide canonical versions of
the functions so that users could copy-and-paste
float inversesqrt_nonIEEE(float x)
> +New Tokens
> + Accepted by the <cap> parameter of Enable, Disable, and IsEnabled, by
> + the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv, and
> + GetDoublev:
> + IEEE_FP_ALU_MODE_MESA 0x????
> +Changes to GLSL Section 4.1.4 Floats:
> + Add the following paragraph:
> + In case that the shader is being executed in a context with
> + IEEE_FP_ALU_MODE_MESA disabled, multiplication shall produce the following
> + (non-IEEE-compliant) result:
> + float a = 0;
> + float b = Infinity;
> + float c = a * b; // c == 0
> + There may be other implications from this mode being enabled, including
> + clamping of non-finite values, or anything else the hardware mode happens
> + to enable to achieve compatibility.
> +New State
> + (add to table 6.52, Miscellaneous, p.392)
> +                                            Initial
> + Get Value              Type   Get Command  Value    Description         Sec.    Attribute
> + ---------------------  -----  -----------  -------  ------------------  ------  ---------
> + IEEE_FP_ALU_MODE_MESA  B      IsEnabled    TRUE     Whether shader ALU          enable
> +                                                     is in IEEE FP mode
> + (1) This specification does not precisely specify what non-IEEE FP mode is.
> + RESOLVED. Shipping hardware has different ways of dealing with it. For
> + example, Intel clamps all values. NVIDIA Tesla series has a
> + context-wide mode for controlling whether zero wins in multiplication
> + or follows IEEE rules. NVIDIA Fermi+ series as well as ATI/AMD Radeon
> + R600+ have separate opcodes which control this (but again, a different
> + set of operations are covered).
> + A single extension which is going to be easy to use for emulation
> + software is thus much harder to write if it's to precisely specify
> + this.
> + The applications that want these have already been written and tested
> + against these approaches, so we know they all work with whatever the
> + hardware has to offer.
> + (2) Why use an Enable instead of a shader layout token?
> + RESOLVED. Because some hardware implementations don't allow
> + controlling this on a per-stage level. While one could come up with
> + rules requiring linked program stages to have the same setting, this
> + is going to be extra validation for the implementations to
> + implement. Furthermore, one would want these rules to also apply to
> + fixed-function-generated shaders equally. Instead a simple mode should
> + be able to flip this on and off.
> + (3) What about FP denorms?
> + RESOLVED. The same hardware tends to also have a way to control
> + whether denorm FP values are flushed to zero. GLSL does not specify
> + this explicitly, but some software relies on denorms being
> + flushed. Should there be a desire to allow denorms to work, this can
> + be done by another extension.
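
As a rough illustration of the flush-to-zero behavior that issue (3) refers
to, a C sketch follows. The helper name flush_denorm is hypothetical, and
keeping the sign of the flushed value is an assumption; actual hardware may
differ:

```c
#include <math.h>

/* Hypothetical helper: emulate "denorms are flushed to zero".
 * Assumption: a subnormal input becomes a zero with the same sign;
 * every other value (normal, zero, Infinity, NaN) passes through. */
static float flush_denorm(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return copysignf(0.0f, x);
    return x;
}
```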
> + (4) What is the expected usage for this?
> + RESOLVED. Software which enables older games to operate,
> + e.g. emulators, will now be able to do shader translation without
> + copious checks for these "error" conditions.
> +Revision History
> + Revision 1, ilia, 2020-02-23
> + - Initial draft