[Mesa-dev] [PATCH] nir/spirv/glsl450: rewrite atan2 to deal with denorms / infinities
Juan A. Suarez Romero
jasuarez at igalia.com
Tue Jan 17 10:34:41 UTC 2017
On Mon, 2017-01-16 at 10:21 -0800, Jason Ekstrand wrote:
> +curro
>
> I'm not sure what I think here. TBH, I haven't actually read it in detail yet, but here are some first impressions:
> 1) There are two implementations of atan2 (SPIR-V and GLSL) and they should be kept in sync. The same dEQP tests exist in both cases.
Yes. I focused in Vulkan, but I agree these changes (if accepted) could be ported to the GLSL version.
> 2) The atan2 tests are not in any mustpass list and I doubt we'll see them show up soon (I could be wrong on that one) so I'm not sure fixing them is all that high priority.
Agree. There were removed from mustpass precisely due the corner cases it has.
> 3) There's a good reason why they aren't on the mustpass: atan2 has lots of hard-to-get-right corners and, unless you have a very good hardware atan2 instruction, any "correct" implementation is liable to emit a *lot* of shader code. A quick grep through shader-db for atan yields a number of shaders that use the atan2 variant so this may have a minor perf impact.
>
This was the motivation to submit a change here: current atan2 implementation, already emit a lot of code. So adding some instructions more I think won't make a big change here. And atan2 doesn't seem a very used function. In shader-db (including private shaders), I only found 36 shaders using it, out of of 26k.
> The above does not necessarily sum to "we shouldn't fix it" but it probably does mean it's low-priority at best and we need to be careful.
>
> Looking a bit into the math of atan2, I'm not convinced flushing denorms to zero is actually what we want to do. For x, I think it's valid, but as y approaches 0, you get +/- pi/2 depending on whether y positive or negative. As both approach zero, it's undefined. As you approach infinity, it approaches something in the interval [-pi/2, pi/2] but it depends on the direction of approach. If you do "man atan2" you'll see this all layed out in a horrifying amount of detail (somewhere around a dozen different cases).
>
>
Yes, originally I only flushed 'x'. But then also flushed 'y' to keep coherence. But didn't think about that corner case you mention. I think we can remove the flush to 0 in 'y'.
> On Mon, Jan 16, 2017 at 4:08 AM, Juan A. Suarez Romero <jasuarez at igalia.com> wrote:
> > Rewrite atan2(y,x) to cover (+/-)INF values.
> >
> >
> >
> > Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the
> >
> > very beginning.
> >
> >
> >
> > The reason is that in other case, the hardware will do the flush in some
> >
> > of the steps, but not in order. So we end up handling in some steps a
> >
> > denorm value and in others a 0. This causes wrong results.
> >
> >
> >
> > Doing it at the very beginning we ensure always the same value is used
> >
> > (a 0) in all the steps.
> >
> >
> >
> > This fixes several test cases in Vulkan CTS
> >
> > (dEQP-VK.glsl.builtin.precision.atan2.*)
> >
> > ---
> >
> > src/compiler/spirv/vtn_glsl450.c | 68 ++++++++++++++++++++++++++++++++++------
> >
> > 1 file changed, 58 insertions(+), 10 deletions(-)
> >
> >
> >
> > diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
> >
> > index 0d32fdd..d2743a8 100644
> >
> > --- a/src/compiler/spirv/vtn_glsl450.c
> >
> > +++ b/src/compiler/spirv/vtn_glsl450.c
> >
> > @@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
> >
> > return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
> >
> > }
> >
> >
> >
> > +/*
> >
> > + * Computes atan2(y,x)
> >
> > + *
> >
> > + * If any of the parameters is a denorm value, it is flushed to 0 at the very
> >
> > + * beginning to avoid precision errors
> >
> > + */
> >
> > static nir_ssa_def *
> >
> > build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
> >
> > {
> >
> > nir_ssa_def *zero = nir_imm_float(b, 0.0f);
> >
> > -
> >
> > - /* If |x| >= 1.0e-8 * |y|: */
> >
> > + nir_ssa_def *inf = nir_imm_float(b, INFINITY);
> >
> > + nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
> >
> > + nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
> >
> > + nir_imm_float(b, M_PI_4f));
> >
> > +
> >
> > + nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, y)),
> >
> > + zero),
> >
> > + zero,
> >
> > + y);
> >
> > + nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, x)),
> >
> > + zero),
> >
> > + zero,
> >
> > + x);
> >
> > +
> >
> > + /* if y == +-INF */
> >
> > + nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
> >
> > +
> >
> > + /* if x == +-INF */
> >
> > + nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
> >
> > +
> >
> > + /* Case: y is +-INF */
> >
> > + nir_ssa_def *y_is_inf_then =
> >
> > + nir_fmul(b, nir_fsign(b, y),
> >
> > + nir_bcsel(b, nir_feq(b, x, inf),
> >
> > + nir_imm_float(b, M_PI_4f),
> >
> > + nir_bcsel(b, nir_feq(b, x, minus_inf),
> >
> > + m_3_pi_4,
> >
> > + nir_imm_float(b, M_PI_2f))));
> >
> > +
> >
> > + /* Case: x is +-INF */
> >
> > + nir_ssa_def *x_is_inf_then =
> >
> > + nir_fmul(b, nir_fsign(b, y),
> >
> > + nir_bcsel(b, nir_feq(b, x, inf),
> >
> > + zero,
> >
> > + nir_imm_float(b, M_PIf)));
> >
> > +
> >
> > + /* If x > 0 */
> >
> > nir_ssa_def *condition =
> >
> > - nir_fge(b, nir_fabs(b, x),
> >
> > - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
> >
> > + nir_fne(b, denorm_x, zero);
> >
> >
> >
> > /* Then...call atan(y/x) and fix it up: */
> >
> > - nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x));
> >
> > + nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x));
> >
> > +
> >
> > nir_ssa_def *r_then =
> >
> > - nir_bcsel(b, nir_flt(b, x, zero),
> >
> > + nir_bcsel(b, nir_flt(b, denorm_x, zero),
> >
> > nir_fadd(b, atan1,
> >
> > - nir_bcsel(b, nir_fge(b, y, zero),
> >
> > + nir_bcsel(b, nir_fge(b, denorm_y, zero),
> >
> > nir_imm_float(b, M_PIf),
> >
> > nir_imm_float(b, -M_PIf))),
> >
> > atan1);
> >
> >
> >
> > /* Else... */
> >
> > nir_ssa_def *r_else =
> >
> > - nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f));
> >
> > -
> >
> > - return nir_bcsel(b, condition, r_then, r_else);
> >
> > + nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f));
> >
> > +
> >
> > + /* Everything together */
> >
> > + return nir_bcsel(b, y_is_inf,
> >
> > + y_is_inf_then,
> >
> > + nir_bcsel(b, x_is_inf,
> >
> > + x_is_inf_then,
> >
> > + nir_bcsel(b, condition,
> >
> > + r_then,
> >
> > + r_else)));
> >
> > }
> >
> >
> >
> > static nir_ssa_def *
> >
> > --
> >
> > 2.9.3
> >
> >
> >
> > _______________________________________________
> >
> > mesa-dev mailing list
> >
> > mesa-dev at lists.freedesktop.org
> >
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170117/410cab1c/attachment-0001.html>
More information about the mesa-dev
mailing list