<div dir="ltr"><div><div>+curro </div>I'm not sure what I think here. TBH, I haven't actually read it in detail yet, but here are some first impressions: </div><div> 1) There are two implementations of atan2 (SPIR-V and GLSL) and they should be kept in sync. The same dEQP tests exist in both cases. 2) The atan2 tests are not in any mustpass list and I doubt we'll see them show up soon (I could be wrong on that one) so I'm not sure fixing them is all that high priority. </div><div> 3) There's a good reason why they aren't on the mustpass: atan2 has lots of hard-to-get-right corners and, unless you have a very good hardware atan2 instruction, any "correct" implementation is liable to emit a *lot* of shader code. A quick grep through shader-db for atan yields a number of shaders that use the atan2 variant so this may have a minor perf impact. </div><div>The above does not necessarily sum to "we shouldn't fix it" but it probably does mean it's low-priority at best and we need to be careful. </div><div>Looking a bit into the math of atan2, I'm not convinced flushing denorms to zero is actually what we want to do. For x, I think it's valid, but as y approaches 0, you get +/- pi/2 depending on whether y positive or negative. As both approach zero, it's undefined. As you approach infinity, it approaches something in the interval [-pi/2, pi/2] but it depends on the direction of approach. If you do "man atan2" you'll see this all layed out in a horrifying amount of detail (somewhere around a dozen different cases). </div></div><div class="gmail_extra"> <div class="gmail_quote">On Mon, Jan 16, 2017 at 4:08 AM, Juan A. Suarez Romero <<a href="mailto:jasuarez@igalia.com" target="_blank">jasuarez@igalia.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Rewrite atan2(y,x) to cover (+/-)INF values. Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the very beginning. The reason is that in other case, the hardware will do the flush in some of the steps, but not in order. So we end up handling in some steps a denorm value and in others a 0. This causes wrong results. Doing it at the very beginning we ensure always the same value is used (a 0) in all the steps. This fixes several test cases in Vulkan CTS (dEQP-VK.glsl.builtin.precision.atan2.*) --- src/compiler/spirv/vtn_glsl450.c | 68 ++++++++++++++++++++++++++++++++++------ 1 file changed, 58 insertions(+), 10 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 0d32fdd..d2743a8 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x) return nir_fmul(b, tmp, nir_fsign(b, y_over_x)); } +/* + * Computes atan2(y,x) + * + * If any of the parameters is a denorm value, it is flushed to 0 at the very + * beginning to avoid precision errors + */ static nir_ssa_def * build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) { nir_ssa_def *zero = nir_imm_float(b, 0.0f); - - /* If |x| >= 1.0e-8 * |y|: */ + nir_ssa_def *inf = nir_imm_float(b, INFINITY); + nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY); + nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f), + nir_imm_float(b, M_PI_4f)); + + nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, y)), + zero), + zero, + y); + nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, x)), + zero), + zero, + x); + + /* if y == +-INF */ + nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf); + + /* if x == +-INF */ + nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf); + + /* Case: y is +-INF */ + nir_ssa_def *y_is_inf_then = + nir_fmul(b, nir_fsign(b, y), + nir_bcsel(b, nir_feq(b, x, inf), + nir_imm_float(b, M_PI_4f), + nir_bcsel(b, nir_feq(b, x, minus_inf), + m_3_pi_4, + nir_imm_float(b, M_PI_2f)))); + + /* Case: x is +-INF */ + nir_ssa_def *x_is_inf_then = + nir_fmul(b, nir_fsign(b, y), + nir_bcsel(b, nir_feq(b, x, inf), + zero, + nir_imm_float(b, M_PIf))); + + /* If x > 0 */ nir_ssa_def *condition = - nir_fge(b, nir_fabs(b, x), - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y))); + nir_fne(b, denorm_x, zero); /* Then...call atan(y/x) and fix it up: */ - nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x)); + nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x)); + nir_ssa_def *r_then = - nir_bcsel(b, nir_flt(b, x, zero), + nir_bcsel(b, nir_flt(b, denorm_x, zero), nir_fadd(b, atan1, - nir_bcsel(b, nir_fge(b, y, zero), + nir_bcsel(b, nir_fge(b, denorm_y, zero), nir_imm_float(b, M_PIf), nir_imm_float(b, -M_PIf))), atan1); /* Else... */ nir_ssa_def *r_else = - nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f)); - - return nir_bcsel(b, condition, r_then, r_else); + nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f)); + + /* Everything together */ + return nir_bcsel(b, y_is_inf, + y_is_inf_then, + nir_bcsel(b, x_is_inf, + x_is_inf_then, + nir_bcsel(b, condition, + r_then, + r_else))); } static nir_ssa_def * -- 2.9.3 _______________________________________________ mesa-dev mailing list <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a> </blockquote></div> </div>