[Mesa-dev] [PATCH 2/2] r600: use the clamped versions of rcp/rsq for eg/cayman.

Ilia Mirkin imirkin at alum.mit.edu
Wed Nov 8 06:20:07 UTC 2017


Actually cayman gets half of it - it gets the abs, but not clamped. I
wonder what happens if you go the other way -- use the IEEE version of
the op for RSQ() (presumably you're not testing this on cayman).

On Wed, Nov 8, 2017 at 1:18 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> tgsi_rsq appears to ignore the passed-in op and always puts in
> ALU_OP1_RECIPSQRT_CLAMPED anyways. It also sticks an absolute value on
> the RSQ() argument. This only happens for eg, not cayman. (Probably
> why only the rcp_clamped change appeared to be necessary.)
>
> This is odd though, because there's no clamping like that in other
> drivers. The trace you made looks fine on both nvc0 and nv50.
>
> On Tue, Nov 7, 2017 at 11:01 PM,  <sroland at vmware.com> wrote:
>> From: Roland Scheidegger <sroland at vmware.com>
>>
>> r600 already used the clamped versions, but for some reason this was
>> different to eg/cayman.
>> (Note that it has been different since essentially forever, 7 years, since
>> df62338c491f2cace1a48f99de78e83b5edd82fd in particular, which changed
>> this for r600 but not eg (cayman wasn't supported back then, but probably
>> copied this from the eg part later). The commit does not mention any reason
>> why this difference should exist.)
>> This seems a bit unfortunate, since it would be nice to use ieee arithmetic,
>> I have no idea what this could potentially break and no idea if it really
>> makes sense going back to legacy-style rcp/rsq...
>> This however prevents misrenderings in This War of Mine since using ieee
>> muls (ce7a045feeef8cad155f1c9aa07f166e146e3d00), albeit strictly speaking
>> only rcp_clamped is necessary for this. It seems likely the root cause is
>> some x * rcp(y) calculation where both x and y evaluate to 0. Albeit it
>> apparently works with other drivers, not sure what's up with that...
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544
>> ---
>>  src/gallium/drivers/r600/r600_shader.c | 16 ++++++++++++----
>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
>> index 6a755bb3fd..62fc4da901 100644
>> --- a/src/gallium/drivers/r600/r600_shader.c
>> +++ b/src/gallium/drivers/r600/r600_shader.c
>> @@ -9033,8 +9033,12 @@ static const struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] =
>>         [TGSI_OPCODE_ARL]       = { ALU_OP0_NOP, tgsi_eg_arl},
>>         [TGSI_OPCODE_MOV]       = { ALU_OP1_MOV, tgsi_op2},
>>         [TGSI_OPCODE_LIT]       = { ALU_OP0_NOP, tgsi_lit},
>> -       [TGSI_OPCODE_RCP]       = { ALU_OP1_RECIP_IEEE, tgsi_trans_srcx_replicate},
>> -       [TGSI_OPCODE_RSQ]       = { ALU_OP1_RECIPSQRT_IEEE, tgsi_rsq},
>> +       /* XXX:
>> +        * For state trackers other than OpenGL, we'll want to use
>> +        * _RECIP_IEEE/_RECIPSQRT_IEEE instead.
>> +        */
>> +       [TGSI_OPCODE_RCP]       = { ALU_OP1_RECIP_CLAMPED, tgsi_trans_srcx_replicate},
>> +       [TGSI_OPCODE_RSQ]       = { ALU_OP1_RECIPSQRT_CLAMPED, tgsi_rsq},
>>         [TGSI_OPCODE_EXP]       = { ALU_OP0_NOP, tgsi_exp},
>>         [TGSI_OPCODE_LOG]       = { ALU_OP0_NOP, tgsi_log},
>>         [TGSI_OPCODE_MUL]       = { ALU_OP2_MUL_IEEE, tgsi_op2},
>> @@ -9256,8 +9260,12 @@ static const struct r600_shader_tgsi_instruction cm_shader_tgsi_instruction[] =
>>         [TGSI_OPCODE_ARL]       = { ALU_OP0_NOP, tgsi_eg_arl},
>>         [TGSI_OPCODE_MOV]       = { ALU_OP1_MOV, tgsi_op2},
>>         [TGSI_OPCODE_LIT]       = { ALU_OP0_NOP, tgsi_lit},
>> -       [TGSI_OPCODE_RCP]       = { ALU_OP1_RECIP_IEEE, cayman_emit_float_instr},
>> -       [TGSI_OPCODE_RSQ]       = { ALU_OP1_RECIPSQRT_IEEE, cayman_emit_float_instr},
>> +       /* XXX:
>> +        * For state trackers other than OpenGL, we'll want to use
>> +        * _RECIP_IEEE/_RECIPSQRT_IEEE instead.
>> +        */
>> +       [TGSI_OPCODE_RCP]       = { ALU_OP1_RECIP_CLAMPED, cayman_emit_float_instr},
>> +       [TGSI_OPCODE_RSQ]       = { ALU_OP1_RECIPSQRT_CLAMPED, cayman_emit_float_instr},
>>         [TGSI_OPCODE_EXP]       = { ALU_OP0_NOP, tgsi_exp},
>>         [TGSI_OPCODE_LOG]       = { ALU_OP0_NOP, tgsi_log},
>>         [TGSI_OPCODE_MUL]       = { ALU_OP2_MUL_IEEE, tgsi_op2},
>> --
>> 2.12.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list