<div dir="ltr">Jason<div><br></div><div>Clamping to +- 10 sounds good to me. I have sent a patch to the mesa-dev list.</div><div><br></div><div>Please port spirv_to_nir in a separate patch if you could, I'm afraid I'm not really familiar with that part and don't know how to test it.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 8, 2016 at 4:54 PM, Jason Ekstrand <span dir="ltr"><<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="">On Thu, Dec 8, 2016 at 3:31 PM, Haixia Shi <span dir="ltr"><<a href="mailto:hshi@chromium.org" target="_blank">hshi@chromium.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">We're encountering failures in the latest version of dEQP (specifically, dEQP-GLES3.functional.shaders.<wbr>builtin_functions.precision.ta<wbr>nh.highp_*) on all Intel devices.<div><br></div><div>The problem is that when the abs value of input is too large (say 80), the function should return +/-1, but the actual shader output is 0.</div><div><br></div><div>The following patch in glsl compiler fixes the problem, but I'm not really sure if a fix in the compiler frontend is the right place. On the other hand, once the intermediate code is emitted, it's not really feasible for drivers to handle precision problems in backend. Any ideas?</div></div></blockquote><div><br></div></span><div>I think this is the correct way to handle this. It isn't really a back-end precision problem. The problem is that, once x gets large enough, exp(x) tends towards infinity and you may end up with inf/inf.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Here's the patch</div><div><br></div><div><div>diff --git a/src/compiler/glsl/builtin_fu<wbr>nctions.cpp b/src/compiler/glsl/builtin_fu<wbr>nctions.cpp</div><div>index 3e4bcbb..6cfeb1b 100644</div><div>--- a/src/compiler/glsl/builtin_fu<wbr>nctions.cpp</div><div>+++ b/src/compiler/glsl/builtin_fu<wbr>nctions.cpp</div><div>@@ -3563,9 +3563,13 @@ builtin_builder::_tanh(const glsl_type *type)</div><div> ir_variable *x = in_var(type, "x");</div><div> MAKE_SIG(type, v130, 1, x);</div><div> </div><div>+ /* clamp x to [-20, +20] to avoid numeric problems */</div><div>+ ir_variable *t = body.make_temp(type, "tmp");</div><div>+ body.emit(assign(t, min2(max2(x, imm(-20.0f)), imm(20.0f))));</div></div></div></blockquote><div><br></div></span><div>I think I would feel more comfortable if we clamped to +- 10 instead. I'm a bit more comfortable with the stability of the calculation at lower values. Also, when x > 10, e^(-x) is so small relative to e^x that it won't actually count in the floating-point calculation. In particular e^(-10)/e^(10) = e^(-20) = 2^-27.182 and you only have 23 mantissa bits.<br><br></div><div>Also, would you mind fixing spirv_to_nir while you're at it? It should be in vtn_glsl450.c. I can port it myself if you'd prefer.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>+</div><div> /* (e^x - e^(-x)) / (e^x + e^(-x)) */</div><div>- body.emit(ret(div(sub(exp(x), exp(neg(x))),</div><div>- add(exp(x), exp(neg(x))))));</div><div>+ body.emit(ret(div(sub(exp(t), exp(neg(t))),</div><div>+ add(exp(t), exp(neg(t))))));</div><div> </div><div> return sig;</div><div> }</div></div><div><br></div><div><br></div><div><br></div></div> </blockquote></span></div><br></div></div> </blockquote></div><br></div>