[Mesa-dev] [PATCH 4/4] i965: Port INTEL_PRECISE_TRIG=1 to NIR.
Eduardo Lima Mitev
elima at igalia.com
Fri Apr 8 16:36:16 UTC 2016
On 04/08/2016 01:35 AM, Kenneth Graunke wrote:
> This makes the extra multiply visible to NIR's algebraic optimizations
> (for constant reassociation) as well as constant folding. This means
> that when the result of sin/cos are multiplied by an constant, we can
> eliminate the extra multiply altogether, reducing the cost of the
> workaround.
>
> It also means we only have to implement it one place, rather than in
> both backends.
>
> This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
> which has a ton of sin() calls, but always multiplies them by an
> immediate constant. The extra multiply gets folded away.
>
Cleaner, indeed. Patch (and series) is:
Reviewed-by: Eduardo Lima Mitev <elima at igalia.com>
Thanks!
Eduardo
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/Makefile.am | 5 +++
> src/mesa/drivers/dri/i965/Makefile.sources | 1 +
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 16 +-------
> src/mesa/drivers/dri/i965/brw_nir.c | 3 ++
> src/mesa/drivers/dri/i965/brw_nir.h | 2 +
> .../drivers/dri/i965/brw_nir_trig_workarounds.py | 43 ++++++++++++++++++++++
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 +-------
> 7 files changed, 58 insertions(+), 28 deletions(-)
> create mode 100755 src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am
> index 0db5a51..a41c830 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -33,6 +33,7 @@ AM_CFLAGS = \
> -I$(top_srcdir)/src/mesa/drivers/dri/common \
> -I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
> -I$(top_srcdir)/src/gtest/include \
> + -I$(top_srcdir)/src/compiler/nir \
> -I$(top_builddir)/src/compiler/nir \
> -I$(top_builddir)/src/mesa/drivers/dri/common \
> $(DEFINES) \
> @@ -41,6 +42,10 @@ AM_CFLAGS = \
>
> AM_CXXFLAGS = $(AM_CFLAGS)
>
> +brw_nir_trig_workarounds.c: brw_nir_trig_workarounds.py $(top_srcdir)/src/compiler/nir/nir_algebraic.py
> + $(MKDIR_GEN)
> + $(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_nir_trig_workarounds.py > $@ || ($(RM) $@; false)
> +
> noinst_LTLIBRARIES = libi965_dri.la libi965_compiler.la
> libi965_dri_la_SOURCES = $(i965_FILES)
> libi965_dri_la_LIBADD = libi965_compiler.la $(INTEL_LIBS)
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index 4689588..2619e43 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -44,6 +44,7 @@ i965_compiler_FILES = \
> brw_nir.c \
> brw_nir_analyze_boolean_resolves.c \
> brw_nir_attribute_workarounds.c \
> + brw_nir_trig_workarounds.c \
> brw_nir_opt_peephole_ffma.c \
> brw_nir_uniforms.cpp \
> brw_packed_float.c \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 90b8789..bd6314a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -775,24 +775,12 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
> break;
>
> case nir_op_fsin:
> - if (!compiler->precise_trig) {
> - inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> - } else {
> - fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> - inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]);
> - inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> - }
> + inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
> inst->saturate = instr->dest.saturate;
> break;
>
> case nir_op_fcos:
> - if (!compiler->precise_trig) {
> - inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> - } else {
> - fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
> - inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]);
> - inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
> - }
> + inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
> inst->saturate = instr->dest.saturate;
> break;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index 1821c0d..932979a 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -447,6 +447,9 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
> if (nir->stage == MESA_SHADER_GEOMETRY)
> OPT(nir_lower_gs_intrinsics);
>
> + if (compiler->precise_trig)
> + OPT(brw_nir_apply_trig_workarounds);
> +
> static const nir_lower_tex_options tex_options = {
> .lower_txp = ~0,
> };
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h b/src/mesa/drivers/dri/i965/brw_nir.h
> index b10c083..2711606 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.h
> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
> @@ -106,6 +106,8 @@ bool brw_nir_apply_attribute_workarounds(nir_shader *nir,
> bool use_legacy_snorm_formula,
> const uint8_t *attrib_wa_flags);
>
> +bool brw_nir_apply_trig_workarounds(nir_shader *nir);
> +
> nir_shader *brw_nir_apply_sampler_key(nir_shader *nir,
> const struct brw_device_info *devinfo,
> const struct brw_sampler_prog_key_data *key,
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> new file mode 100755
> index 0000000..67dab9a
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> @@ -0,0 +1,43 @@
> +#! /usr/bin/env python
> +#
> +# Copyright (C) 2016 Intel Corporation
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the "Software"),
> +# to deal in the Software without restriction, including without limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice (including the next
> +# paragraph) shall be included in all copies or substantial portions of the
> +# Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> +# IN THE SOFTWARE.
> +
> +import nir_algebraic
> +
> +# The SIN and COS instructions on Intel hardware can produce values
> +# slightly outside of the [-1.0, 1.0] range for a small set of values.
> +# Obviously, this can break everyone's expectations about trig functions.
> +#
> +# According to an internal presentation, the COS instruction can produce
> +# a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One
> +# suggested workaround is to multiply by 0.99997, scaling down the
> +# amplitude slightly. Apparently this also minimizes the error function,
> +# reducing the maximum error from 0.00006 to about 0.00003.
> +
> +trig_workarounds = [
> + (('fsin', 'x'), ('fmul', ('fsin', 'x'), 0.99997)),
> + (('fcos', 'x'), ('fmul', ('fcos', 'x'), 0.99997)),
> +]
> +
> +print '#include "brw_nir.h"'
> +print nir_algebraic.AlgebraicPass("brw_nir_apply_trig_workarounds",
> + trig_workarounds).render()
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index d9f96c5..e4e8c38 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1101,24 +1101,12 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
> break;
>
> case nir_op_fsin:
> - if (!compiler->precise_trig) {
> - inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> - } else {
> - src_reg tmp = src_reg(this, glsl_type::vec4_type);
> - inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]);
> - inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> - }
> + inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
> inst->saturate = instr->dest.saturate;
> break;
>
> case nir_op_fcos:
> - if (!compiler->precise_trig) {
> - inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> - } else {
> - src_reg tmp = src_reg(this, glsl_type::vec4_type);
> - inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]);
> - inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
> - }
> + inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
> inst->saturate = instr->dest.saturate;
> break;
>
>
More information about the mesa-dev
mailing list