[Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

Roland Scheidegger sroland at vmware.com
Tue Dec 20 02:48:17 UTC 2016

```On 20.12.2016 at 00:12, Giuseppe Bilotta wrote:
> Hello,
>
> the formula used for
> tanh should be changed again. Specifically, as suggested by Roland
>
> On Fri, Dec 9, 2016 at 5:41 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>> btw I'm wondering if some vendors wouldn't implement that with slightly
>> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
>> used for cg apparently according to docs, saving one of the
>> exponentials). Might be worse for accuracy though (and won't solve this
>> problem, though it would now only need a one-sided clamp).

It was changed to this formula.

>
> Another option is the 1 - 2/(1+expf(2x)), or even better 1 -
> 2/(2+expm1f(2x)). I've run some tests and this seems to have the same
> accuracy as the
> one mentioned by Roland, with the bonus benefit of not needing any
> clamping. The accuracy seems to actually be better
> than the direct evaluation (difference over sum of exps), except
> around 0 (say, when abs(x) < 1).

The 1 - 2/(1+expf(2x)) is worse for numbers close to zero (probably
provably so, I think you might have one bit more to play with there with
the other formula due to the division by essentially 2). e.g. if you
have 8e-8, libm tanhf() gives me 8e-8 as a result (it looks like it's
actually hard-coded to return the input as result for sufficiently small
values), the (e^2x - 1) / (e^2x + 1) formula gives 5.960464e-08 whereas
1 - 2/(1+expf(2x)) will give you back 0.0f (but, with even smaller
values like 2e-8, both methods will return 0.0f which is pretty wrong in
any case, the relative error can get to enormous levels there).
I'm not sure which method is better for larger values, I think they
might be about the same. Nvidia docs stating they use the slightly more
complex formula for cg though may be a hint that this indeed has some
properties which are nice-to-have. Though arguably it's not that much more
complex, since the only part it saves is the one-sided clamp - the most
expensive parts are the exp and the div, neither of which you can get
rid of.
Not sure it really matters though one way or another. If you wanted good
accuracy around 0, you'd have to use a different formula plus a select
(seems like libm implementations actually use 3 cases depending on input
value magnitude - not so hot with vectors, but thankfully glsl doesn't
require 1 ULP accuracy).

Roland

>
> I've found the relative error away from 0 to be typically in the same
> order of magnitude as the error in tanhf() itself (compared to tanh()),
> and generally less than machine epsilon. I'm currently looking at
> options to improve the accuracy without clamping and without excessive
> additional computations, might propose a patch in the next couple of
> days.
>
> Just one question though - not knowing much of the shader language, can
> I expect expm1 to be available?
>

```
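The small-input behaviour discussed in the reply can be reproduced with a short sketch. This is not Mesa code: it emulates single precision in Python by rounding every intermediate to binary32 through `struct`, and it uses a double-precision `math.exp` rounded to single precision as a stand-in for `expf()`. The helper names (`f32`, `exp32`, `tanh_ratio`, `tanh_one_minus`) are made up for illustration.

```python
import math
import struct


def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def exp32(x):
    # Assumption: stand-in for expf(), i.e. double exp rounded to single.
    return f32(math.exp(x))


def tanh_ratio(x):
    # (e^2x - 1) / (e^2x + 1), every step rounded to single precision
    e = exp32(f32(2.0 * x))
    return f32(f32(e - 1.0) / f32(e + 1.0))


def tanh_one_minus(x):
    # 1 - 2 / (1 + e^2x), every step rounded to single precision
    e = exp32(f32(2.0 * x))
    return f32(1.0 - f32(2.0 / f32(1.0 + e)))


if __name__ == "__main__":
    for x in (8e-8, 2e-8):
        x = f32(x)
        print(x, tanh_ratio(x), tanh_one_minus(x))
```

Under these assumptions, an input of 8e-8 should yield 2^-24 (about 5.96e-08) from the ratio form and 0.0 from the `1 - 2/(1+e)` form, while 2e-8 should yield 0.0 from both, in line with the figures quoted in the message. A real `expf()` or a GPU exponential may round differently, so this only illustrates where the cancellation comes from.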