[Mesa-dev] [PATCH] i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.

Jason Ekstrand jason at jlekstrand.net
Mon Apr 4 18:34:01 UTC 2016


On Sun, Apr 3, 2016 at 5:34 PM, Kenneth Graunke <kenneth at whitecape.org>
wrote:

> The SIN and COS instructions on Intel hardware can produce values
> slightly outside of the [-1.0, 1.0] range for a small set of values.
> Obviously, this can break everyone's expectations about trig functions.
>
> According to an internal presentation, the COS instruction can produce
> a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
> suggested workaround is to multiply by 0.99997, scaling down the
> amplitude slightly.  Apparently this also minimizes the error function,
> reducing the maximum error from 0.00006 to about 0.00003.
>
> When enabled, fixes 16 dEQP precision tests
>
>    dEQP-GLES31.functional.shaders.builtin_functions.precision.
>    {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}.
>
> at the cost of making every sin and cos call more expensive (about
> twice the number of cycles on recent hardware).  Enabling this
> option has been shown to reduce GPUTest Volplosion performance by
> about 10%.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.c   |  2 ++
>  src/mesa/drivers/dri/i965/brw_compiler.h   |  6 ++++++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 16 ++++++++++++++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++++++++++++++--
>  4 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
> b/src/mesa/drivers/dri/i965/brw_compiler.c
> index 3da6aac..6509267 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> @@ -147,6 +147,8 @@ brw_compiler_create(void *mem_ctx, const struct
> brw_device_info *devinfo)
>     brw_fs_alloc_reg_sets(compiler);
>     brw_vec4_alloc_reg_set(compiler);
>
> +   compiler->precise_trig = env_var_as_boolean("INTEL_PRECISE_TRIG",
> false);
> +
>     compiler->scalar_stage[MESA_SHADER_VERTEX] =
>        devinfo->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS);
>     compiler->scalar_stage[MESA_SHADER_TESS_CTRL] = false;
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 27a95a3..231e000 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -92,6 +92,12 @@ struct brw_compiler {
>
>     bool scalar_stage[MESA_SHADER_STAGES];
>     struct gl_shader_compiler_options
> glsl_compiler_options[MESA_SHADER_STAGES];
> +
> +   /**
> +    * Apply workarounds for SIN and COS output range problems.
> +    * This can negatively impact performance.
> +    */
> +   bool precise_trig;
>

This seems like the most reasonable thing to do

Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>


>  };
>
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 7839428..5cca91e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -775,12 +775,24 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
>        break;
>
>     case nir_op_fsin:
> -      inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> +      if (!compiler->precise_trig) {
> +         inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> +      } else {
> +         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> +         inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]);
> +         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> +      }
>        inst->saturate = instr->dest.saturate;
>        break;
>
>     case nir_op_fcos:
> -      inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> +      if (!compiler->precise_trig) {
> +         inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> +      } else {
> +         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> +         inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]);
> +         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> +      }
>        inst->saturate = instr->dest.saturate;
>        break;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ee6929b..6c8fd06 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1101,12 +1101,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>        break;
>
>     case nir_op_fsin:
> -      inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> +      if (!compiler->precise_trig) {
> +         inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> +      } else {
> +         src_reg tmp = src_reg(this, glsl_type::vec4_type);
> +         inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]);
> +         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> +      }
>        inst->saturate = instr->dest.saturate;
>        break;
>
>     case nir_op_fcos:
> -      inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> +      if (!compiler->precise_trig) {
> +         inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> +      } else {
> +         src_reg tmp = src_reg(this, glsl_type::vec4_type);
> +         inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]);
> +         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> +      }
>        inst->saturate = instr->dest.saturate;
>        break;
>
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160404/8a95db37/attachment.html>


More information about the mesa-dev mailing list