Mesa (master): i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/ COS output range.

Kenneth Graunke kwg at kemper.freedesktop.org
Mon Apr 4 19:04:17 UTC 2016


Module: Mesa
Branch: master
Commit: 65fbc43d54403905e3eaea02372b5a364dc1d773
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=65fbc43d54403905e3eaea02372b5a364dc1d773

Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Jan 27 12:21:04 2016 -0800

i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.

The SIN and COS instructions on Intel hardware can produce values
slightly outside of the [-1.0, 1.0] range for a small set of values.
Obviously, this can break everyone's expectations about trig functions.

According to an internal presentation, the COS instruction can produce
a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
suggested workaround is to multiply by 0.99997, scaling down the
amplitude slightly.  Apparently this also minimizes the error function,
reducing the maximum error from 0.00006 to about 0.00003.

When enabled, fixes 16 dEQP precision tests

   dEQP-GLES31.functional.shaders.builtin_functions.precision.
   {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}.

at the cost of making every sin and cos call more expensive (about
twice the number of cycles on recent hardware).  Enabling this
option has been shown to reduce GPUTest Volplosion performance by
about 10%.

Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
Reviewed-by: Matt Turner <mattst88 at gmail.com>
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

---

 src/mesa/drivers/dri/i965/brw_compiler.c   |  2 ++
 src/mesa/drivers/dri/i965/brw_compiler.h   |  6 ++++++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 16 ++++++++++++++--
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++++++++++++++--
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c b/src/mesa/drivers/dri/i965/brw_compiler.c
index 3da6aac..6509267 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.c
+++ b/src/mesa/drivers/dri/i965/brw_compiler.c
@@ -147,6 +147,8 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
    brw_fs_alloc_reg_sets(compiler);
    brw_vec4_alloc_reg_set(compiler);
 
+   compiler->precise_trig = env_var_as_boolean("INTEL_PRECISE_TRIG", false);
+
    compiler->scalar_stage[MESA_SHADER_VERTEX] =
       devinfo->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS);
    compiler->scalar_stage[MESA_SHADER_TESS_CTRL] = false;
diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h
index 27a95a3..231e000 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -92,6 +92,12 @@ struct brw_compiler {
 
    bool scalar_stage[MESA_SHADER_STAGES];
    struct gl_shader_compiler_options glsl_compiler_options[MESA_SHADER_STAGES];
+
+   /**
+    * Apply workarounds for SIN and COS output range problems.
+    * This can negatively impact performance.
+    */
+   bool precise_trig;
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 7839428..5cca91e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -775,12 +775,24 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
       break;
 
    case nir_op_fsin:
-      inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
+      if (!compiler->precise_trig) {
+         inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
+      } else {
+         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
+         inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]);
+         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
+      }
       inst->saturate = instr->dest.saturate;
       break;
 
    case nir_op_fcos:
-      inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
+      if (!compiler->precise_trig) {
+         inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
+      } else {
+         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
+         inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]);
+         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
+      }
       inst->saturate = instr->dest.saturate;
       break;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index ee6929b..6c8fd06 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -1101,12 +1101,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
       break;
 
    case nir_op_fsin:
-      inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
+      if (!compiler->precise_trig) {
+         inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
+      } else {
+         src_reg tmp = src_reg(this, glsl_type::vec4_type);
+         inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]);
+         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
+      }
       inst->saturate = instr->dest.saturate;
       break;
 
    case nir_op_fcos:
-      inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
+      if (!compiler->precise_trig) {
+         inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
+      } else {
+         src_reg tmp = src_reg(this, glsl_type::vec4_type);
+         inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]);
+         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
+      }
       inst->saturate = instr->dest.saturate;
       break;
 




More information about the mesa-commit mailing list