[Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

Fri Dec 9 04:41:10 UTC 2016

I'm wondering whether this is actually a problem with the test: can it
really expect reasonable results for such input values?
In the shader languages, functions that are composed of multiple other
functions are usually allowed to accumulate all the errors of those
functions. Though I agree that results outside [-1, 1] would be odd...
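
To make the large-input case concrete, here is a minimal host-side sketch in
32-bit float. It is only an illustration, not a reproduction of the dEQP
failure: the names naive_tanh, clamped_tanh and self_check are hypothetical,
and a GPU's exp implementation may behave differently from std::exp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical host-side stand-in for the GLSL expression
// (e^x - e^(-x)) / (e^x + e^(-x)), evaluated in 32-bit float.
float naive_tanh(float x)
{
    return (std::exp(x) - std::exp(-x)) / (std::exp(x) + std::exp(-x));
}

// Same expression with the input clamped to [-10, +10] first,
// mirroring the min2(max2(x, -10), 10) in the quoted patch.
float clamped_tanh(float x)
{
    const float t = std::min(std::max(x, -10.0f), 10.0f);
    return (std::exp(t) - std::exp(-t)) / (std::exp(t) + std::exp(-t));
}

// Quick self-check of the behaviour described above.
void self_check()
{
    // e^100 overflows float to +inf, so the quotient is inf / inf.
    assert(std::isnan(naive_tanh(100.0f)));
    // With the clamp the result stays finite and inside [-1, 1].
    assert(std::fabs(clamped_tanh(100.0f)) <= 1.0f);
    assert(std::fabs(clamped_tanh(-100.0f)) <= 1.0f);
}
```

On the host the unclamped form fails through overflow rather than through
accumulated rounding error, so this says nothing about how much error the
test should tolerate for moderate inputs.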

By the way, I wonder whether some vendors implement this with a slightly
simplified formula, e.g. (e^(2x) - 1) / (e^(2x) + 1). According to the
docs, this is apparently what NVIDIA used for Cg; it saves one of the
exponentials. It might be worse for accuracy, though, and it won't solve
this problem, although it would then only need a one-sided clamp.
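
A rough host-side sketch of that one-exponential form with a one-sided clamp
follows. The function names and the upper bound of 10 (borrowed from the
quoted patch) are assumptions made for illustration; this is not taken from
Cg or from any driver.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical host-side version of (e^(2x) - 1) / (e^(2x) + 1).
// Only the upper side is clamped: for very negative x, e^(2x)
// underflows towards 0 and the quotient tends to -1 on its own.
float tanh_one_exp(float x)
{
    const float t = std::min(x, 10.0f);    // one-sided clamp (bound assumed)
    const float e2t = std::exp(2.0f * t);  // the single exponential
    return (e2t - 1.0f) / (e2t + 1.0f);
}

// Quick self-check of both tails.
void self_check_one_exp()
{
    // e^(-200) underflows to 0 in float, giving (0 - 1) / (0 + 1).
    assert(tanh_one_exp(-100.0f) == -1.0f);
    // The clamp keeps e^(2t) finite, so the positive tail stays in range.
    assert(tanh_one_exp(100.0f) <= 1.0f);
}
```

Without the std::min, e^(2x) would overflow to +inf for large positive x and
the quotient would again be inf / inf, which is why the clamp is still needed
on that side.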

Roland

On 09.12.2016 at 02:41, Haixia Shi wrote:
> Clamp input scalar value to range [-10, +10] to avoid precision problems
> when the absolute value of input is too large.
> 
> Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
> failures.
> 
> v2: added more explanation in the comment.
> v3: fixed a typo in the comment.
> 
> Signed-off-by: Haixia Shi <hshi at chromium.org>
> Cc: Jason Ekstrand <jason at jlekstrand.net>,
> Cc: Stéphane Marchesin <marcheu at chromium.org>,
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> 
> Change-Id: I324c948b3323ff8107127c42934f14459e124b95
> ---
>  src/compiler/glsl/builtin_functions.cpp | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index 3e4bcbb..0bacffb 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
>     ir_variable *x = in_var(type, "x");
>     MAKE_SIG(type, v130, 1, x);
>  
> +   /*
> +    * Clamp x to [-10, +10] to avoid precision problems.
> +    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> +    * zero in the computation e^x + e^(-x). The same happens in the other
> +    * direction when x < -10.
> +    */
> +   ir_variable *t = body.make_temp(type, "tmp");
> +   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> +
>     /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> -   body.emit(ret(div(sub(exp(x), exp(neg(x))),
> -                     add(exp(x), exp(neg(x))))));
> +   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> +                     add(exp(t), exp(neg(t))))));
>  
>     return sig;
>  }
>