[Mesa-stable] [Mesa-dev] [PATCH 01/11] glsl/lower_instruction: handle denorms and overflow in ldexp correctly

Thu Sep 28 22:54:20 UTC 2017

On Sat, Sep 16, 2017 at 4:23 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>
> GLSL ES requires both, and while GLSL explicitly doesn't require correct
> overflow handling, it does appear to require handling input inf/denorms
> correctly.
>
> Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
>
> Cc: mesa-stable at lists.freedesktop.org
> ---
>  src/compiler/glsl/lower_instructions.cpp | 171 +++++++++++++++++++------------
>  1 file changed, 107 insertions(+), 64 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
> index 0c1408911d2..362562a4af6 100644
> --- a/src/compiler/glsl/lower_instructions.cpp
> +++ b/src/compiler/glsl/lower_instructions.cpp
> @@ -358,140 +358,183 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
>  }
>
>  void
>  lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
>  {
>     /* Translates
>      *    ir_binop_ldexp x exp
>      * into
>      *
>      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
> -    *    resulting_biased_exp = extracted_biased_exp + exp;
> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>      *
> -    *    if (resulting_biased_exp < 1 || x == 0.0f) {
> -    *       return copysign(0.0, x);
> +    *    if (extracted_biased_exp >= 255)
> +    *       return x; // +/-inf, NaN
> +    *
> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
> +    *
> +    *    if (min(resulting_biased_exp, extracted_biased_exp) < 1)
> +    *       resulting_biased_exp = 0;
> +    *    if (resulting_biased_exp >= 255 ||
> +    *        min(resulting_biased_exp, extracted_biased_exp) < 1) {
> +    *       sign_mantissa &= sign_mask;
>      *    }
>      *
> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
> +    *    return bitcast_u2f(sign_mantissa |
>      *                       lshift(i2u(resulting_biased_exp), exp_shift));
>      *
>      * which we can't actually implement as such, since the GLSL IR doesn't
>      * have vectorized if-statements. We actually implement it without branches
>      * using conditional-select:
>      *
>      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
> -    *    resulting_biased_exp = extracted_biased_exp + exp;
> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>      *
> -    *    is_not_zero_or_underflow = logic_and(nequal(x, 0.0f),
> -    *                                         gequal(resulting_biased_exp, 1);
> -    *    x = csel(is_not_zero_or_underflow, x, copysign(0.0f, x));
> -    *    resulting_biased_exp = csel(is_not_zero_or_underflow,
> -    *                                resulting_biased_exp, 0);
> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
>      *
> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
> -    *                       lshift(i2u(resulting_biased_exp), exp_shift));
> +    *    flush_to_zero = lequal(min(resulting_biased_exp, extracted_biased_exp), 0);
> +    *    resulting_biased_exp = csel(flush_to_zero, 0, resulting_biased_exp)
> +    *    zero_mantissa = logic_or(flush_to_zero,
> +    *                             gequal(resulting_biased_exp, 255));
> +    *    sign_mantissa = csel(zero_mantissa, sign_mantissa & sign_mask, sign_mantissa);
> +    *
> +    *    result = sign_mantissa |
> +    *             lshift(i2u(resulting_biased_exp), exp_shift));
> +    *
> +    *    return csel(extracted_biased_exp >= 255, x, bitcast_u2f(result));
> +    *
> +    * The definition of ldexp in the GLSL spec says:
> +    *
> +    *    "If this product is too large to be represented in the
> +    *     floating-point type, the result is undefined."
> +    *
> +    * However, the definition of ldexp in the GLSL ES spec does not contain
> +    * this sentence, so we do need to handle overflow correctly.

Without contradictory evidence, I would expect that was an oversight
and that it's also undefined in GLSL ES.

Is this required to fix the dEQP test you mention? If it's actually
spec'd to behave differently between desktop and ES then that's really
embarrassing.

As mentioned, you may want to just replace this whole lowering pass
with what Jason's done in nir_opt_algebraic.py for ldexp.

Whatever you decide to do, this patch is:

Acked-by: Matt Turner <mattst88 at gmail.com>