[Mesa-dev] [PATCH 01/11] glsl/lower_instruction: handle denorms and overflow in ldexp correctly

Fri Sep 29 10:06:46 UTC 2017

On 29.09.2017 00:54, Matt Turner wrote:
> On Sat, Sep 16, 2017 at 4:23 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>
>> GLSL ES requires both, and while GLSL explicitly doesn't require correct
>> overflow handling, it does appear to require handling input inf/denorms
>> correctly.
>>
>> Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
>>
>> Cc: mesa-stable at lists.freedesktop.org
>> ---
>>   src/compiler/glsl/lower_instructions.cpp | 171 +++++++++++++++++++------------
>>   1 file changed, 107 insertions(+), 64 deletions(-)
>>
>> diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
>> index 0c1408911d2..362562a4af6 100644
>> --- a/src/compiler/glsl/lower_instructions.cpp
>> +++ b/src/compiler/glsl/lower_instructions.cpp
>> @@ -358,140 +358,183 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
>>   }
>>
>>   void
>>   lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
>>   {
>>      /* Translates
>>       *    ir_binop_ldexp x exp
>>       * into
>>       *
>>       *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
>> -    *    resulting_biased_exp = extracted_biased_exp + exp;
>> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>>       *
>> -    *    if (resulting_biased_exp < 1 || x == 0.0f) {
>> -    *       return copysign(0.0, x);
>> +    *    if (extracted_biased_exp >= 255)
>> +    *       return x; // +/-inf, NaN
>> +    *
>> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
>> +    *
>> +    *    if (min(resulting_biased_exp, extracted_biased_exp) < 1)
>> +    *       resulting_biased_exp = 0;
>> +    *    if (resulting_biased_exp >= 255 ||
>> +    *        min(resulting_biased_exp, extracted_biased_exp) < 1) {
>> +    *       sign_mantissa &= sign_mask;
>>       *    }
>>       *
>> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
>> +    *    return bitcast_u2f(sign_mantissa |
>>       *                       lshift(i2u(resulting_biased_exp), exp_shift));
>>       *
>>       * which we can't actually implement as such, since the GLSL IR doesn't
>>       * have vectorized if-statements. We actually implement it without branches
>>       * using conditional-select:
>>       *
>>       *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
>> -    *    resulting_biased_exp = extracted_biased_exp + exp;
>> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>>       *
>> -    *    is_not_zero_or_underflow = logic_and(nequal(x, 0.0f),
>> -    *                                         gequal(resulting_biased_exp, 1);
>> -    *    x = csel(is_not_zero_or_underflow, x, copysign(0.0f, x));
>> -    *    resulting_biased_exp = csel(is_not_zero_or_underflow,
>> -    *                                resulting_biased_exp, 0);
>> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
>>       *
>> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
>> -    *                       lshift(i2u(resulting_biased_exp), exp_shift));
>> +    *    flush_to_zero = lequal(min(resulting_biased_exp, extracted_biased_exp), 0);
>> +    *    resulting_biased_exp = csel(flush_to_zero, 0, resulting_biased_exp)
>> +    *    zero_mantissa = logic_or(flush_to_zero,
>> +    *                             gequal(resulting_biased_exp, 255));
>> +    *    sign_mantissa = csel(zero_mantissa, sign_mantissa & sign_mask, sign_mantissa);
>> +    *
>> +    *    result = sign_mantissa |
>> +    *             lshift(i2u(resulting_biased_exp), exp_shift));
>> +    *
>> +    *    return csel(extracted_biased_exp >= 255, x, bitcast_u2f(result));
>> +    *
>> +    * The definition of ldexp in the GLSL spec says:
>> +    *
>> +    *    "If this product is too large to be represented in the
>> +    *     floating-point type, the result is undefined."
>> +    *
>> +    * However, the definition of ldexp in the GLSL ES spec does not contain
>> +    * this sentence, so we do need to handle overflow correctly.
> 
> Without contradictory evidence, I would expect that was an oversight
> and that it's also undefined in GLSL ES.
> 
> Is this required to fix the dEQP test you mention? If it's actually
> spec'd to behave differently between desktop and ES then that's really
> embarrassing.

Yes, it is, and yes, it is...

> 
> As mentioned, you may want to just replace this whole lowering pass
> with what Jason's done in nir_opt_algebraic.py for ldexp.
> 
> Whatever you decide to do, this patch is:
> 
> Acked-by: Matt Turner <mattst88 at gmail.com>

Thanks.

Cheers,
Nicolai

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.