[Mesa-dev] [PATCH] i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.
Jason Ekstrand
jason at jlekstrand.net
Mon Apr 4 18:34:01 UTC 2016
On Sun, Apr 3, 2016 at 5:34 PM, Kenneth Graunke <kenneth at whitecape.org>
wrote:
> The SIN and COS instructions on Intel hardware can produce values
> slightly outside of the [-1.0, 1.0] range for a small set of values.
> Obviously, this can break everyone's expectations about trig functions.
>
> According to an internal presentation, the COS instruction can produce
> a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One
> suggested workaround is to multiply by 0.99997, scaling down the
> amplitude slightly. Apparently this also minimizes the error function,
> reducing the maximum error from 0.00006 to about 0.00003.
>
> When enabled, fixes 16 dEQP precision tests
>
> dEQP-GLES31.functional.shaders.builtin_functions.precision.
> {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}.
>
> at the cost of making every sin and cos call more expensive (about
> twice the number of cycles on recent hardware). Enabling this
> option has been shown to reduce GPUTest Volplosion performance by
> about 10%.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/brw_compiler.c | 2 ++
> src/mesa/drivers/dri/i965/brw_compiler.h | 6 ++++++
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 16 ++++++++++++++--
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++++++++++++++--
> 4 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
> b/src/mesa/drivers/dri/i965/brw_compiler.c
> index 3da6aac..6509267 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> @@ -147,6 +147,8 @@ brw_compiler_create(void *mem_ctx, const struct
> brw_device_info *devinfo)
> brw_fs_alloc_reg_sets(compiler);
> brw_vec4_alloc_reg_set(compiler);
>
> + compiler->precise_trig = env_var_as_boolean("INTEL_PRECISE_TRIG",
> false);
> +
> compiler->scalar_stage[MESA_SHADER_VERTEX] =
> devinfo->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS);
> compiler->scalar_stage[MESA_SHADER_TESS_CTRL] = false;
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 27a95a3..231e000 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -92,6 +92,12 @@ struct brw_compiler {
>
> bool scalar_stage[MESA_SHADER_STAGES];
> struct gl_shader_compiler_options
> glsl_compiler_options[MESA_SHADER_STAGES];
> +
> + /**
> + * Apply workarounds for SIN and COS output range problems.
> + * This can negatively impact performance.
> + */
> + bool precise_trig;
>
This seems like the most reasonable thing to do
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> };
>
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 7839428..5cca91e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -775,12 +775,24 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
> break;
>
> case nir_op_fsin:
> - inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> + if (!compiler->precise_trig) {
> + inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> + } else {
> + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> + inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]);
> + inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> + }
> inst->saturate = instr->dest.saturate;
> break;
>
> case nir_op_fcos:
> - inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> + if (!compiler->precise_trig) {
> + inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> + } else {
> + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> + inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]);
> + inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> + }
> inst->saturate = instr->dest.saturate;
> break;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ee6929b..6c8fd06 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1101,12 +1101,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
> break;
>
> case nir_op_fsin:
> - inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> + if (!compiler->precise_trig) {
> + inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> + } else {
> + src_reg tmp = src_reg(this, glsl_type::vec4_type);
> + inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]);
> + inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> + }
> inst->saturate = instr->dest.saturate;
> break;
>
> case nir_op_fcos:
> - inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> + if (!compiler->precise_trig) {
> + inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> + } else {
> + src_reg tmp = src_reg(this, glsl_type::vec4_type);
> + inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]);
> + inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> + }
> inst->saturate = instr->dest.saturate;
> break;
>
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160404/8a95db37/attachment.html>
More information about the mesa-dev
mailing list