[Mesa-dev] [PATCH 3/4] r600: use ieee version of rcp
Roland Scheidegger
sroland at vmware.com
Thu Nov 9 17:58:46 UTC 2017
Am 09.11.2017 um 18:27 schrieb Jan Vesely:
> On Thu, 2017-11-09 at 03:58 +0100, sroland at vmware.com wrote:
>> From: Roland Scheidegger <sroland at vmware.com>
>>
>> r600 used the clamped version for rcp, whereas both evergreen and cayman
>> used the ieee version. I don't know why that discrepancy exists (it does so
>> since day 1) but there does not seem to be a valid reason for this, so make
>> it consistent. This seems now safer than before the previous commit (using
>> the mystery dx10 clamp).
>> Note that rsq still uses clamped version (as before even though the table
>> may have suggested otherwise for evergreen) for r600/eg, but not for cayman.
>
> just layman's opinion here. Does TGSI not mandate specific behaviour
> wrt nans and infinities for this OP?
No, not really. Ideally all (non-legacy such as LIT) opcodes would
follow ieee754 (or d3d10) semantics (and that's how they are implemented
at least in llvmpipe). But we don't enforce denorm behavior for instance
neither (llvmpipe will disable them - on x86 at least...).
Some hw supported by gallium drivers also simply can't generate NaNs, no
matter the opcode.
I think in general drivers should try to stick as close to ieee754 (and
d3d10) semantics as they can, regardless what GL may allow, so yes, I
guess the ieee version should be used (albeit the abs modifier doesn't
really fit in there neither, and that is done because there's problems
otherwise).
Roland
>
>> I just don't feel lucky enough to change this (it should also be noted r600
>> supports sqrt natively, which is always ieee, therefore might not really see
>> rsqrt with glsl often presumably).
>
> why would that be? isn't RECIPSQRT_IEEE(x) still optimization over
> RECIP_IEEE(SQRT(x))?
>
> Jan
>
>> Compile tested only...
>> ---
>> src/gallium/drivers/r600/r600_shader.c | 8 ++------
>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
>> index 6a755bb3fd..628c33787e 100644
>> --- a/src/gallium/drivers/r600/r600_shader.c
>> +++ b/src/gallium/drivers/r600/r600_shader.c
>> @@ -8830,11 +8830,7 @@ static const struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[]
>> [TGSI_OPCODE_MOV] = { ALU_OP1_MOV, tgsi_op2},
>> [TGSI_OPCODE_LIT] = { ALU_OP0_NOP, tgsi_lit},
>>
>> - /* XXX:
>> - * For state trackers other than OpenGL, we'll want to use
>> - * _RECIP_IEEE instead.
>> - */
>> - [TGSI_OPCODE_RCP] = { ALU_OP1_RECIP_CLAMPED, tgsi_trans_srcx_replicate},
>> + [TGSI_OPCODE_RCP] = { ALU_OP1_RECIP_IEEE, tgsi_trans_srcx_replicate},
>>
>> [TGSI_OPCODE_RSQ] = { ALU_OP0_NOP, tgsi_rsq},
>> [TGSI_OPCODE_EXP] = { ALU_OP0_NOP, tgsi_exp},
>> @@ -9034,7 +9030,7 @@ static const struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] =
>> [TGSI_OPCODE_MOV] = { ALU_OP1_MOV, tgsi_op2},
>> [TGSI_OPCODE_LIT] = { ALU_OP0_NOP, tgsi_lit},
>> [TGSI_OPCODE_RCP] = { ALU_OP1_RECIP_IEEE, tgsi_trans_srcx_replicate},
>> - [TGSI_OPCODE_RSQ] = { ALU_OP1_RECIPSQRT_IEEE, tgsi_rsq},
>> + [TGSI_OPCODE_RSQ] = { ALU_OP0_NOP, tgsi_rsq},
>> [TGSI_OPCODE_EXP] = { ALU_OP0_NOP, tgsi_exp},
>> [TGSI_OPCODE_LOG] = { ALU_OP0_NOP, tgsi_log},
>> [TGSI_OPCODE_MUL] = { ALU_OP2_MUL_IEEE, tgsi_op2},
More information about the mesa-dev
mailing list