[Mesa-dev] [PATCH 3/3] nir/algebraic: Add lowering for ldexp

Wed Apr 13 19:54:49 UTC 2016

On Wed, Apr 13, 2016 at 12:37 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> The algorithm used is different from both the naieve suggestion from the
> GLSL spec and the one used in GLSL IR today.  Unfortunately, the GLSL IR
> implementation doesn't handle some of the corner cases correctly and

Let's change this to say exactly what the GLSL IR ldexp lowering
doesn't handle properly: it doesn't return infinity when the input is
infinity. I've sent a patch that fixes ldexp(0.0f, exp) to return
0.0f.
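
For reference, the two corner cases mentioned above can be written down
as a small check. This is only a sketch: check_ldexp_corner_cases is a
hypothetical helper, not part of Mesa, and Python's double-precision
math.ldexp stands in for the expected behavior.

```python
import math

def check_ldexp_corner_cases(ldexp):
    """Return the (input, exp) pairs that the given ldexp gets wrong.

    Hypothetical helper for illustration only. It checks that infinities
    stay infinite and that zero stays zero for any exponent.
    """
    failures = []
    for exp in (-200, -1, 0, 1, 200):
        if ldexp(math.inf, exp) != math.inf:
            failures.append(("+inf", exp))
        if ldexp(-math.inf, exp) != -math.inf:
            failures.append(("-inf", exp))
        if ldexp(0.0, exp) != 0.0:
            failures.append(("0.0", exp))
    return failures

# The library implementation is used here as the reference behavior.
print(check_ldexp_corner_cases(math.ldexp))
```

Any candidate lowering could be passed in place of math.ldexp to see
which of these inputs it mishandles.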

> neither does a naieve f * 2.0^exp implementation.  Assuming that hardware

naive (or naïve if you want to be pedantic)

> does the sane thing when multiplying by an exact power of two, this
> implementation should actually be correct.  It does pass all of the Vulkan
> CTS tests (which a simple port of the GLSL IR implementation does not).

I'd cut this text starting from "Assuming".

>
> Cc: Matt Turner <mattst88 at gmail.com>
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index 2749b06..7d1937e 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -371,6 +371,37 @@ optimizations = [
>       'options->lower_unpack_snorm_4x8'),
>  ]
>
> +def fexp2i(exp):
> +   # We assume that exp is already in the range [-126, 127].
> +   return ('ishl', ('iadd', exp, 127), 23)
> +
> +def ldexp32(f, exp):
> +   # First, we clamp exp to a reasonable range.  The maximum possible range
> +   # for a normal exponent is [-126, 127] and, throwing in denormals, you get
> +   # a maximum range of [-149, 127].  This means that we can potentially have
> +   # a swing of +-276.  If you start with FLT_MAX, you actually have to do
> +   # ldexp(FLT_MAX, -278) to get it to flush all the way to zero.  The GLSL
> +   # spec, on the other hand, only requires that we handle an exponent value
> +   # in the range [-127, 128].  This implementation is *mostly* correct; it

The GLSL spec range is actually [-126, 128].
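
To illustrate why the range matters for the quoted fexp2i helper, here
is a rough host-side model of it. It is a sketch under assumptions: it
reinterprets the 32-bit integer as an IEEE 754 single with Python's
struct module, which is not how NIR evaluates the expression, and the
mask to 32 bits is added only for the model.

```python
import struct

def fexp2i(exp):
    # Models ('ishl', ('iadd', exp, 127), 23) from the quoted patch:
    # place the biased exponent in bits 23..30 of a 32-bit word.
    bits = ((exp + 127) << 23) & 0xFFFFFFFF
    # Reinterpret the integer bit pattern as a single-precision float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

for exp in (-127, -126, 0, 127, 128):
    print(exp, fexp2i(exp))
```

Inside [-126, 127] the result is exactly 2^exp. At exp = -127 the bit
pattern is all zeros (0.0), and at exp = 128 the exponent field is 255,
which is the encoding of +infinity, so the endpoints need to be handled
before this helper is reached.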

The series is

Reviewed-by: Matt Turner <mattst88 at gmail.com>