[Mesa-dev] [PATCH] nir/spirv/glsl450: rewrite atan2 to deal with denorms / infinities
Jason Ekstrand
jason at jlekstrand.net
Mon Jan 16 18:21:38 UTC 2017
+curro
I'm not sure what I think here. TBH, I haven't actually read it in detail
yet, but here are some first impressions:
1) There are two implementations of atan2 (SPIR-V and GLSL) and they
should be kept in sync. The same dEQP tests exist in both cases.
2) The atan2 tests are not in any mustpass list and I doubt we'll see them
show up soon (I could be wrong on that one) so I'm not sure fixing them is
all that high priority.
3) There's a good reason why they aren't on the mustpass: atan2 has lots
of hard-to-get-right corners and, unless you have a very good hardware
atan2 instruction, any "correct" implementation is liable to emit a *lot*
of shader code. A quick grep through shader-db for atan yields a number of
shaders that use the atan2 variant so this may have a minor perf impact.
The above does not necessarily sum to "we shouldn't fix it" but it probably
does mean it's low-priority at best and we need to be careful.
Looking a bit into the math of atan2, I'm not convinced flushing denorms to
zero is actually what we want to do. For x, I think it's valid, but as y
approaches 0, you get +/- pi/2 depending on whether y positive or
negative. As both approach zero, it's undefined. As you approach
infinity, it approaches something in the interval [-pi/2, pi/2] but it
depends on the direction of approach. If you do "man atan2" you'll see
this all layed out in a horrifying amount of detail (somewhere around a
dozen different cases).
On Mon, Jan 16, 2017 at 4:08 AM, Juan A. Suarez Romero <jasuarez at igalia.com>
wrote:
> Rewrite atan2(y,x) to cover (+/-)INF values.
>
> Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the
> very beginning.
>
> The reason is that in other case, the hardware will do the flush in some
> of the steps, but not in order. So we end up handling in some steps a
> denorm value and in others a 0. This causes wrong results.
>
> Doing it at the very beginning we ensure always the same value is used
> (a 0) in all the steps.
>
> This fixes several test cases in Vulkan CTS
> (dEQP-VK.glsl.builtin.precision.atan2.*)
> ---
> src/compiler/spirv/vtn_glsl450.c | 68 ++++++++++++++++++++++++++++++
> ++++------
> 1 file changed, 58 insertions(+), 10 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_
> glsl450.c
> index 0d32fdd..d2743a8 100644
> --- a/src/compiler/spirv/vtn_glsl450.c
> +++ b/src/compiler/spirv/vtn_glsl450.c
> @@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
> return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
> }
>
> +/*
> + * Computes atan2(y,x)
> + *
> + * If any of the parameters is a denorm value, it is flushed to 0 at the
> very
> + * beginning to avoid precision errors
> + */
> static nir_ssa_def *
> build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
> {
> nir_ssa_def *zero = nir_imm_float(b, 0.0f);
> -
> - /* If |x| >= 1.0e-8 * |y|: */
> + nir_ssa_def *inf = nir_imm_float(b, INFINITY);
> + nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
> + nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
> + nir_imm_float(b, M_PI_4f));
> +
> + nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b,
> nir_fabs(b, y)),
> + zero),
> + zero,
> + y);
> + nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b,
> nir_fabs(b, x)),
> + zero),
> + zero,
> + x);
> +
> + /* if y == +-INF */
> + nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
> +
> + /* if x == +-INF */
> + nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
> +
> + /* Case: y is +-INF */
> + nir_ssa_def *y_is_inf_then =
> + nir_fmul(b, nir_fsign(b, y),
> + nir_bcsel(b, nir_feq(b, x, inf),
> + nir_imm_float(b, M_PI_4f),
> + nir_bcsel(b, nir_feq(b, x, minus_inf),
> + m_3_pi_4,
> + nir_imm_float(b, M_PI_2f))));
> +
> + /* Case: x is +-INF */
> + nir_ssa_def *x_is_inf_then =
> + nir_fmul(b, nir_fsign(b, y),
> + nir_bcsel(b, nir_feq(b, x, inf),
> + zero,
> + nir_imm_float(b, M_PIf)));
> +
> + /* If x > 0 */
> nir_ssa_def *condition =
> - nir_fge(b, nir_fabs(b, x),
> - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
> + nir_fne(b, denorm_x, zero);
>
> /* Then...call atan(y/x) and fix it up: */
> - nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x));
> + nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x));
> +
> nir_ssa_def *r_then =
> - nir_bcsel(b, nir_flt(b, x, zero),
> + nir_bcsel(b, nir_flt(b, denorm_x, zero),
> nir_fadd(b, atan1,
> - nir_bcsel(b, nir_fge(b, y, zero),
> + nir_bcsel(b, nir_fge(b, denorm_y, zero),
> nir_imm_float(b, M_PIf),
> nir_imm_float(b, -M_PIf))),
> atan1);
>
> /* Else... */
> nir_ssa_def *r_else =
> - nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f));
> -
> - return nir_bcsel(b, condition, r_then, r_else);
> + nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f));
> +
> + /* Everything together */
> + return nir_bcsel(b, y_is_inf,
> + y_is_inf_then,
> + nir_bcsel(b, x_is_inf,
> + x_is_inf_then,
> + nir_bcsel(b, condition,
> + r_then,
> + r_else)));
> }
>
> static nir_ssa_def *
> --
> 2.9.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170116/89a71855/attachment-0001.html>
More information about the mesa-dev
mailing list