[Mesa-stable] [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0
Roland Scheidegger
sroland at vmware.com
Tue Sep 11 21:28:11 UTC 2018
On 09.09.2018 at 21:19, Axel Davy wrote:
> Tests done on several devices of all 3 vendors and
> of different generations showed that there are several
> ways of handling infs and NaN for d3d9.
>
> Tests showed Intel on Windows always clamps
> RCP, RSQ and LOG (thus preventing inf/nan generation),
> for all shader versions (some vendor behaviours vary
> with shader versions).
> Doing this in nine avoids 0*inf issues for drivers
> that can't generate 0*inf=0 (which is controlled by
> TGSI's MUL_ZERO_WINS).
>
> For now clamp for all drivers. A later optimization
> would be to avoid clamping for drivers with MUL_ZERO_WINS
> for the specific shader versions where NV or AMD don't
> clamp.
>
> LOG and RSQ being already clamped, this patch only
> clamps RCP.
>
> Fixes: https://github.com/iXit/Mesa-3D/issues/316
>
> Signed-off-by: Axel Davy <davyaxel0 at gmail.com>
> CC: <mesa-stable at lists.freedesktop.org>
> ---
> src/gallium/state_trackers/nine/nine_shader.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
> index 7db07d8f69..5b8ad3f161 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -2273,6 +2273,18 @@ DECL_SPECIAL(POW)
> return D3D_OK;
> }
>
> +DECL_SPECIAL(RCP)
> +{
> +    struct ureg_program *ureg = tx->ureg;
> +    struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
> +    struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
> +    struct ureg_dst tmp = tx_scratch(tx);
> +    ureg_RCP(ureg, tmp, src);
> +    ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
> +    ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));
I'm not sure what the ureg_MAX is supposed to do?
The MIN already gets rid of all NaNs (iff the driver follows the
d3d10-mandated behavior of picking the non-NaN operand for min/max when
one of the values is a NaN; if it doesn't, doing both min and max isn't
going to help either...).
Roland
> +    return D3D_OK;
> +}
> +
> DECL_SPECIAL(RSQ)
> {
> struct ureg_program *ureg = tx->ureg;
> @@ -2909,7 +2921,7 @@ static const struct sm1_op_info inst_table[] =
> _OPI(SUB, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(SUB)), /* 3 */
> _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
> _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
> - _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
> + _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RCP)), /* 6 */
> _OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
> _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
> _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
>