[Mesa-dev] [PATCH 1/4] r600: use min_dx10/max_dx10 instead of min/max

Thu Nov 9 17:39:44 UTC 2017

On 09.11.2017 18:26, Roland Scheidegger wrote:
> Am 09.11.2017 um 18:19 schrieb Jan Vesely:
>> On Thu, 2017-11-09 at 03:58 +0100, sroland at vmware.com wrote:
>>> From: Roland Scheidegger <sroland at vmware.com>
>>>
>>> I believe this is the safe thing to do, especially ever since the driver
>>> actually generates NaNs for muls too.
>>> Albeit since the radeon ISA docs are inaccurate/wrong there, I'm not
>>> entirely sure what the non-dx10 versions do,
>>
>> non-dx10 version return nan if one of the operands is nan (tested on
>> Turks).
> 
> Yes, I've modified a piglit test and came to the same conclusion. (I
> will put that up for review shortly.)
> I don't know why you'd ever want that, though (ieee fmin/fmax also
> should return non-nan).

My guess is that DX9-level hardware had that behavior by virtue of just 
not caring about NaN at all, and the hardware folks were just being 
conservative in adding a new opcode rather than changing the behavior of 
the old one. I don't think GCN has the old-style min/max.

Cheers,
Nicolai

> 
> Roland
> 
> 
>>
>> Jan
>>
>>>   but (as required by dx10)
>>> the dx10 versions should pick a non-nan source over a nan source.
>>> Other drivers presumably do the same (radeonsi, llvmpipe).
>>> This was shown to make some difference for bug 103544, albeit it is not
>>> required to fix it.
>>> ---
>>>   src/gallium/drivers/r600/r600_shader.c  | 12 ++++++------
>>>   src/gallium/drivers/r600/sb/sb_expr.cpp |  2 ++
>>>   2 files changed, 8 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
>>> index 188fbc9d47..6a755bb3fd 100644
>>> --- a/src/gallium/drivers/r600/r600_shader.c
>>> +++ b/src/gallium/drivers/r600/r600_shader.c
>>> @@ -8844,8 +8844,8 @@ static const struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[]
>>>   	[TGSI_OPCODE_DP3]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DP4]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DST]	= { ALU_OP0_NOP, tgsi_opdst},
>>> -	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN, tgsi_op2},
>>> -	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX, tgsi_op2},
>>> +	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN_DX10, tgsi_op2},
>>> +	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX_DX10, tgsi_op2},
>>>   	[TGSI_OPCODE_SLT]	= { ALU_OP2_SETGT, tgsi_op2_swap},
>>>   	[TGSI_OPCODE_SGE]	= { ALU_OP2_SETGE, tgsi_op2},
>>>   	[TGSI_OPCODE_MAD]	= { ALU_OP3_MULADD_IEEE, tgsi_op3},
>>> @@ -9042,8 +9042,8 @@ static const struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] =
>>>   	[TGSI_OPCODE_DP3]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DP4]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DST]	= { ALU_OP0_NOP, tgsi_opdst},
>>> -	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN, tgsi_op2},
>>> -	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX, tgsi_op2},
>>> +	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN_DX10, tgsi_op2},
>>> +	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX_DX10, tgsi_op2},
>>>   	[TGSI_OPCODE_SLT]	= { ALU_OP2_SETGT, tgsi_op2_swap},
>>>   	[TGSI_OPCODE_SGE]	= { ALU_OP2_SETGE, tgsi_op2},
>>>   	[TGSI_OPCODE_MAD]	= { ALU_OP3_MULADD_IEEE, tgsi_op3},
>>> @@ -9265,8 +9265,8 @@ static const struct r600_shader_tgsi_instruction cm_shader_tgsi_instruction[] =
>>>   	[TGSI_OPCODE_DP3]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DP4]	= { ALU_OP2_DOT4_IEEE, tgsi_dp},
>>>   	[TGSI_OPCODE_DST]	= { ALU_OP0_NOP, tgsi_opdst},
>>> -	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN, tgsi_op2},
>>> -	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX, tgsi_op2},
>>> +	[TGSI_OPCODE_MIN]	= { ALU_OP2_MIN_DX10, tgsi_op2},
>>> +	[TGSI_OPCODE_MAX]	= { ALU_OP2_MAX_DX10, tgsi_op2},
>>>   	[TGSI_OPCODE_SLT]	= { ALU_OP2_SETGT, tgsi_op2_swap},
>>>   	[TGSI_OPCODE_SGE]	= { ALU_OP2_SETGE, tgsi_op2},
>>>   	[TGSI_OPCODE_MAD]	= { ALU_OP3_MULADD_IEEE, tgsi_op3},
>>> diff --git a/src/gallium/drivers/r600/sb/sb_expr.cpp b/src/gallium/drivers/r600/sb/sb_expr.cpp
>>> index 3dd3a4815b..7a5d62c8e8 100644
>>> --- a/src/gallium/drivers/r600/sb/sb_expr.cpp
>>> +++ b/src/gallium/drivers/r600/sb/sb_expr.cpp
>>> @@ -753,7 +753,9 @@ bool expr_handler::fold_alu_op2(alu_node& n) {
>>>   				n.bc.src[0].abs == n.bc.src[1].abs) {
>>>   			switch (n.bc.op) {
>>>   			case ALU_OP2_MIN: // (MIN x, x) => (MOV x)
>>> +			case ALU_OP2_MIN_DX10:
>>>   			case ALU_OP2_MAX:
>>> +			case ALU_OP2_MAX_DX10:
>>>   				convert_to_mov(n, v0, n.bc.src[0].neg, n.bc.src[0].abs);
>>>   				return fold_alu_op1(n);
>>>   			case ALU_OP2_ADD:  // (ADD x, x) => (MUL x, 2)
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.