[Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

Ian Romanick idr at freedesktop.org
Thu Jan 26 19:06:17 UTC 2017


On 01/25/2017 10:53 AM, Francisco Jerez wrote:
> Hi Ian, and thank you for your comments,
> 
> Ian Romanick <idr at freedesktop.org> writes:
> 
>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>> Will avoid a regression in a future commit that introduces some
>>> additional rcp operations.
>>
>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>> keep all the expressions the same.  rcp and div had these weird guards.
>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>> don't.  If the atan2 implementation depends on NaN being generated by
>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>> what they generate, but it's not NaN and it's probably not 0.0.
> 
> The atan2 implementation from patch 5 doesn't rely on NaNs being
> generated, but it does rely on the reciprocal operation handling zero
> and infinity correctly as specified by GLSL for the division operation.

Okay.  That is the problem on older GPUs that I was referring.
Specifically, all versions of the GLSL prior to 4.40 (!) say:

    Similarly, treatment of conditions such as divide by 0 may lead to
    an unspecified result, but in no case should such a condition lead
    to the interruption or termination of processing.

I believe that DX11 requires the GLSL 4.40+ behavior, so all even
somewhat modern devices should just work.  It's all the pre-DX11
hardware that's might be a problem.

Now... talking to Jason just now, he reminded me that the spec also says
the following about built-in functions:

    Function parameters specified as angle are assumed to be in units
    of radians. In no case will any of these functions result in a
    divide by zero error. If the divisor of a ratio is 0, then results
    will be undefined.

We may be fine even on old, clunky hardware.  Looking at the code in
patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
undefined results.  It looks like
tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
have r300 or r400 hardware to test that?

This patch doesn't affect that, and, even with the "unspecified result"
rule, it's clearly correct.  This patch is

Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

>> That said, this matches NIR, and it's probably fine.
>>
>>> ---
>>>  src/compiler/glsl/ir_expression_operation.py | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
>>> index f91ac9b..4ac1ffb 100644
>>> --- a/src/compiler/glsl/ir_expression_operation.py
>>> +++ b/src/compiler/glsl/ir_expression_operation.py
>>> @@ -422,7 +422,7 @@ ir_expression_operation = [
>>>     operation("neg", 1, source_types=numeric_types, c_expression={'u': "-((int) {src0})", 'default': "-{src0}"}),
>>>     operation("abs", 1, source_types=signed_numeric_types, c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}),
>>>     operation("sign", 1, source_types=signed_numeric_types, c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 'i64': "({src0} > 0) - ({src0} < 0)"}),
>>> -   operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : 0.0"}),
>>> +   operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / {src0}", 'd': "1.0 / {src0}"}),
>>>     operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}),
>>>     operation("sqrt", 1, source_types=real_types, c_expression={'f': "sqrtf({src0})", 'd': "sqrt({src0})"}),
>>>     operation("exp", 1, source_types=(float_type,), c_expression="expf({src0})"),         # Log base e on gentype
>>>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170126/c6df4b45/attachment.sig>


More information about the mesa-dev mailing list