[Mesa-stable] [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0
Axel Davy
davyaxel0 at gmail.com
Wed Sep 12 06:17:48 UTC 2018
On 9/11/18 11:28 PM, Roland Scheidegger wrote:
> On 09.09.2018 at 21:19, Axel Davy wrote:
>> Tests done on several devices of all 3 vendors and
>> of different generations showed that there are several
>> ways of handling infs and NaN for d3d9.
>>
>> Tests showed that Intel on Windows always clamps
>> RCP, RSQ and LOG (thus preventing inf/NaN generation)
>> for all shader versions (some vendors' behaviour varies
>> with the shader version).
>> Doing this in nine avoids 0*inf issues for drivers
>> that can't generate 0*inf=0 (which is controlled by
>> TGSI's MUL_ZERO_WINS).
>>
>> For now, clamp for all drivers. A later optimization
>> would be to avoid clamping for drivers with MUL_ZERO_WINS
>> for the specific shader versions where NV or AMD don't
>> clamp.
>>
>> LOG and RSQ being already clamped, this patch only
>> clamps RCP.
>>
>> Fixes: https://github.com/iXit/Mesa-3D/issues/316
>>
>> Signed-off-by: Axel Davy <davyaxel0 at gmail.com>
>> CC: <mesa-stable at lists.freedesktop.org>
>> ---
>> src/gallium/state_trackers/nine/nine_shader.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
>> index 7db07d8f69..5b8ad3f161 100644
>> --- a/src/gallium/state_trackers/nine/nine_shader.c
>> +++ b/src/gallium/state_trackers/nine/nine_shader.c
>> @@ -2273,6 +2273,18 @@ DECL_SPECIAL(POW)
>> return D3D_OK;
>> }
>>
>> +DECL_SPECIAL(RCP)
>> +{
>> + struct ureg_program *ureg = tx->ureg;
>> + struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
>> + struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
>> + struct ureg_dst tmp = tx_scratch(tx);
>> + ureg_RCP(ureg, tmp, src);
>> + ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
>> + ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));
> I'm not sure what the ureg_MAX is supposed to do?
> The min already gets rid of all NaNs (iff the driver follows the
> d3d10-mandated behavior of picking the non-NaN number for min/max if one
> of the values is a NaN; if not, doing both min and max isn't going to
> help either...).
>
> Roland
The goal is to catch inf and -inf and replace them with FLT_MAX and -FLT_MAX.
Without the clamp, a NaN would appear in a later mul or mad (0*inf).
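
As a rough illustration of the clamp described above, here is a CPU-side
sketch in plain C. It is not the TGSI the patch emits: the helper names
(rcp_raw, rcp_clamped, min_non_nan, max_non_nan) are made up for this
example, and the min/max helpers assume the pick-the-non-NaN behaviour
mentioned in the review. It also assumes IEEE 754 float arithmetic without
fast-math.

```c
#include <float.h>

/* Hypothetical helpers: min/max that return the non-NaN operand when
 * exactly one operand is NaN (x != x is true only for NaN). */
static float min_non_nan(float a, float b)
{
    if (a != a) return b;
    if (b != b) return a;
    return a < b ? a : b;
}

static float max_non_nan(float a, float b)
{
    if (a != a) return b;
    if (b != b) return a;
    return a > b ? a : b;
}

/* Unclamped reciprocal: 1/0 gives +inf, 1/-0 gives -inf. */
static float rcp_raw(float x)
{
    return 1.0f / x;
}

/* Model of the RCP, MIN, MAX sequence: the min turns +inf into FLT_MAX,
 * the max turns -inf into -FLT_MAX. */
static float rcp_clamped(float x)
{
    float r = 1.0f / x;
    r = min_non_nan(FLT_MAX, r);
    r = max_non_nan(-FLT_MAX, r);
    return r;
}
```

With these helpers, 0.0f * rcp_raw(0.0f) is a NaN, while
0.0f * rcp_clamped(0.0f) is 0. The min alone would leave -inf untouched,
which is why the max is there as well.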
Axel
>
>
>> + return D3D_OK;
>> +}
>> +
>> DECL_SPECIAL(RSQ)
>> {
>> struct ureg_program *ureg = tx->ureg;
>> @@ -2909,7 +2921,7 @@ static const struct sm1_op_info inst_table[] =
>> _OPI(SUB, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(SUB)), /* 3 */
>> _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
>> _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
>> - _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
>> + _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RCP)), /* 6 */
>> _OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
>> _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
>> _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
>>
>