[Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

Thu Jan 26 20:20:45 UTC 2017

Ian Romanick <idr at freedesktop.org> writes:

> On 01/25/2017 10:53 AM, Francisco Jerez wrote:
>> Hi Ian, and thank you for your comments,
>> 
>> Ian Romanick <idr at freedesktop.org> writes:
>> 
>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>>> Will avoid a regression in a future commit that introduces some
>>>> additional rcp operations.
>>>
>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>>> keep all the expressions the same.  rcp and div had these weird guards.
>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>>> don't.  If the atan2 implementation depends on NaN being generated by
>>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>>> what they generate, but it's not NaN and it's probably not 0.0.
>> 
>> The atan2 implementation from patch 5 doesn't rely on NaNs being
>> generated, but it does rely on the reciprocal operation handling zero
>> and infinity correctly as specified by GLSL for the division operation.
>
> Okay.  That is the problem on older GPUs that I was referring.
> Specifically, all versions of the GLSL prior to 4.40 (!) say:
>
>     Similarly, treatment of conditions such as divide by 0 may lead to
>     an unspecified result, but in no case should such a condition lead
>     to the interruption or termination of processing.
>

I cannot find this paragraph in any of the GLSL 4.1+ specs.

> I believe that DX11 requires the GLSL 4.40+ behavior, so all even
> somewhat modern devices should just work.  It's all the pre-DX11
> hardware that's might be a problem.
>

Actually I don't think PATCH 5 necessarily cares about infinities not
getting generated -- The only requirement is for rcp(0) to return a
fairly large value in absolute value (the larger the more accurate the
result will be), even the sign of the result is pretty much irrelevant.

That said there's a relatively straightforward change we could apply to
PATCH 5 and 6 in order to make the calculation more robust against
division by zero (it will probably make this patch unnecessary to avoid
piglit regressions but I think we want it anyway because of GLSL 4.1+):
We could leverage the coordinate rotation to turn (y, 0) into (0, y),
avoiding division by zero along the whole vertical line -- The only
remaining case where we could potentially divide by zero is along the
left y=0 half-line, but the function jumps from -π to π along that line
so returning an undefined value within [-π, π] for y=0 and x < 0 is very
unlikely to hurt, because AFAICT all GLSL versions lacking well-defined
divide by zero didn't require the implementation to represent the sign
of zero consistently either, so the shader is unlikely to be able to
generate a signed zero value accurately enough to notice the problem...

> Now... talking to Jason just now, he reminded me that the spec also says
> the following about built-in functions:
>
>     Function parameters specified as angle are assumed to be in units
>     of radians. In no case will any of these functions result in a
>     divide by zero error. If the divisor of a ratio is 0, then results
>     will be undefined.
>
Heh... That could be interpreted as if atan2(y, 0) is undefined which
would make this discussion moot -- I'll send a v2 of PATCH 5 and 6
anyway since it should be easy enough to get right.

> We may be fine even on old, clunky hardware.  Looking at the code in
> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
> undefined results.  It looks like
> tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
> have r300 or r400 hardware to test that?
>
> This patch doesn't affect that, and, even with the "unspecified result"
> rule, it's clearly correct.  This patch is
>
> Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
>
>>> That said, this matches NIR, and it's probably fine.
>>>
>>>> ---
>>>>  src/compiler/glsl/ir_expression_operation.py | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
>>>> index f91ac9b..4ac1ffb 100644
>>>> --- a/src/compiler/glsl/ir_expression_operation.py
>>>> +++ b/src/compiler/glsl/ir_expression_operation.py
>>>> @@ -422,7 +422,7 @@ ir_expression_operation = [
>>>>     operation("neg", 1, source_types=numeric_types, c_expression={'u': "-((int) {src0})", 'default': "-{src0}"}),
>>>>     operation("abs", 1, source_types=signed_numeric_types, c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}),
>>>>     operation("sign", 1, source_types=signed_numeric_types, c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 'i64': "({src0} > 0) - ({src0} < 0)"}),
>>>> -   operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : 0.0"}),
>>>> +   operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / {src0}", 'd': "1.0 / {src0}"}),
>>>>     operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}),
>>>>     operation("sqrt", 1, source_types=real_types, c_expression={'f': "sqrtf({src0})", 'd': "sqrt({src0})"}),
>>>>     operation("exp", 1, source_types=(float_type,), c_expression="expf({src0})"),         # Log base e on gentype
>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170126/23eed410/attachment.sig>