[Mesa-dev] [PATCH 4/4] i965: Port INTEL_PRECISE_TRIG=1 to NIR.

Jason Ekstrand jason at jlekstrand.net
Fri Apr 8 22:24:21 UTC 2016


On Fri, Apr 8, 2016 at 9:36 AM, Eduardo Lima Mitev <elima at igalia.com> wrote:

> On 04/08/2016 01:35 AM, Kenneth Graunke wrote:
>
>> This makes the extra multiply visible to NIR's algebraic optimizations
>> (for constant reassociation) as well as constant folding.  This means
>> that when the result of sin/cos are multiplied by an constant, we can
>> eliminate the extra multiply altogether, reducing the cost of the
>> workaround.
>>
>> It also means we only have to implement it one place, rather than in
>> both backends.
>>
>> This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
>> which has a ton of sin() calls, but always multiplies them by an
>> immediate constant.  The extra multiply gets folded away.
>>
>>
> Cleaner, indeed. Patch (and series) is:
>
> Reviewed-by: Eduardo Lima Mitev <elima at igalia.com>
>
> Thanks!
>
> Eduardo
>
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> ---
>>   src/mesa/drivers/dri/i965/Makefile.am              |  5 +++
>>   src/mesa/drivers/dri/i965/Makefile.sources         |  1 +
>>   src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 16 +-------
>>   src/mesa/drivers/dri/i965/brw_nir.c                |  3 ++
>>   src/mesa/drivers/dri/i965/brw_nir.h                |  2 +
>>   .../drivers/dri/i965/brw_nir_trig_workarounds.py   | 43
>> ++++++++++++++++++++++
>>   src/mesa/drivers/dri/i965/brw_vec4_nir.cpp         | 16 +-------
>>   7 files changed, 58 insertions(+), 28 deletions(-)
>>   create mode 100755 src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
>>
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.am
>> b/src/mesa/drivers/dri/i965/Makefile.am
>> index 0db5a51..a41c830 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.am
>> +++ b/src/mesa/drivers/dri/i965/Makefile.am
>> @@ -33,6 +33,7 @@ AM_CFLAGS = \
>>         -I$(top_srcdir)/src/mesa/drivers/dri/common \
>>         -I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
>>         -I$(top_srcdir)/src/gtest/include \
>> +       -I$(top_srcdir)/src/compiler/nir \
>>         -I$(top_builddir)/src/compiler/nir \
>>         -I$(top_builddir)/src/mesa/drivers/dri/common \
>>         $(DEFINES) \
>> @@ -41,6 +42,10 @@ AM_CFLAGS = \
>>
>>   AM_CXXFLAGS = $(AM_CFLAGS)
>>
>> +brw_nir_trig_workarounds.c: brw_nir_trig_workarounds.py
>> $(top_srcdir)/src/compiler/nir/nir_algebraic.py
>> +       $(MKDIR_GEN)
>> +       $(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2)
>> $(PYTHON_FLAGS) $(srcdir)/brw_nir_trig_workarounds.py > $@ || ($(RM) $@;
>> false)
>> +
>>   noinst_LTLIBRARIES = libi965_dri.la libi965_compiler.la
>>   libi965_dri_la_SOURCES = $(i965_FILES)
>>   libi965_dri_la_LIBADD = libi965_compiler.la $(INTEL_LIBS)
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources
>> b/src/mesa/drivers/dri/i965/Makefile.sources
>> index 4689588..2619e43 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.sources
>> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
>> @@ -44,6 +44,7 @@ i965_compiler_FILES = \
>>         brw_nir.c \
>>         brw_nir_analyze_boolean_resolves.c \
>>         brw_nir_attribute_workarounds.c \
>> +       brw_nir_trig_workarounds.c \
>>         brw_nir_opt_peephole_ffma.c \
>>         brw_nir_uniforms.cpp \
>>         brw_packed_float.c \
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 90b8789..bd6314a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -775,24 +775,12 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
>> nir_alu_instr *instr)
>>         break;
>>
>>      case nir_op_fsin:
>> -      if (!compiler->precise_trig) {
>> -         inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
>> -      } else {
>> -         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
>> -         inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]);
>> -         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
>> -      }
>> +      inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]);
>>         inst->saturate = instr->dest.saturate;
>>         break;
>>
>>      case nir_op_fcos:
>> -      if (!compiler->precise_trig) {
>> -         inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
>> -      } else {
>> -         fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F);
>> -         inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]);
>> -         inst = bld.MUL(result, tmp, brw_imm_f(0.99997));
>> -      }
>> +      inst = bld.emit(SHADER_OPCODE_COS, result, op[0]);
>>         inst->saturate = instr->dest.saturate;
>>         break;
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> index 1821c0d..932979a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -447,6 +447,9 @@ brw_preprocess_nir(const struct brw_compiler
>> *compiler, nir_shader *nir)
>>      if (nir->stage == MESA_SHADER_GEOMETRY)
>>         OPT(nir_lower_gs_intrinsics);
>>
>> +   if (compiler->precise_trig)
>> +      OPT(brw_nir_apply_trig_workarounds);
>> +
>>      static const nir_lower_tex_options tex_options = {
>>         .lower_txp = ~0,
>>      };
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h
>> b/src/mesa/drivers/dri/i965/brw_nir.h
>> index b10c083..2711606 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.h
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
>> @@ -106,6 +106,8 @@ bool brw_nir_apply_attribute_workarounds(nir_shader
>> *nir,
>>                                            bool use_legacy_snorm_formula,
>>                                            const uint8_t
>> *attrib_wa_flags);
>>
>> +bool brw_nir_apply_trig_workarounds(nir_shader *nir);
>> +
>>   nir_shader *brw_nir_apply_sampler_key(nir_shader *nir,
>>                                         const struct brw_device_info
>> *devinfo,
>>                                         const struct
>> brw_sampler_prog_key_data *key,
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
>> b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
>> new file mode 100755
>> index 0000000..67dab9a
>> --- /dev/null
>> +++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
>> @@ -0,0 +1,43 @@
>> +#! /usr/bin/env python
>> +#
>> +# Copyright (C) 2016 Intel Corporation
>> +#
>> +# Permission is hereby granted, free of charge, to any person obtaining a
>> +# copy of this software and associated documentation files (the
>> "Software"),
>> +# to deal in the Software without restriction, including without
>> limitation
>> +# the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> +# and/or sell copies of the Software, and to permit persons to whom the
>> +# Software is furnished to do so, subject to the following conditions:
>> +#
>> +# The above copyright notice and this permission notice (including the
>> next
>> +# paragraph) shall be included in all copies or substantial portions of
>> the
>> +# Software.
>> +#
>> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> DEALINGS
>> +# IN THE SOFTWARE.
>> +
>> +import nir_algebraic
>> +
>> +# The SIN and COS instructions on Intel hardware can produce values
>> +# slightly outside of the [-1.0, 1.0] range for a small set of values.
>> +# Obviously, this can break everyone's expectations about trig functions.
>> +#
>> +# According to an internal presentation, the COS instruction can produce
>> +# a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
>> +# suggested workaround is to multiply by 0.99997, scaling down the
>> +# amplitude slightly.  Apparently this also minimizes the error function,
>> +# reducing the maximum error from 0.00006 to about 0.00003.
>> +
>> +trig_workarounds = [
>> +   (('fsin', 'x'), ('fmul', ('fsin', 'x'), 0.99997)),
>> +   (('fcos', 'x'), ('fmul', ('fcos', 'x'), 0.99997)),
>> +]
>> +
>> +print '#include "brw_nir.h"'
>> +print nir_algebraic.AlgebraicPass("brw_nir_apply_trig_workarounds",
>> +                                  trig_workarounds).render()
>>
>
Side note: Part of the intention of nir_algebraic when I wrote it over a
year ago was to make adding custom algebraic transformation passes very
light-weight and easy.  Glad to see someone finally taking advantage of
that. :-)

Series Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>


> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index d9f96c5..e4e8c38 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -1101,24 +1101,12 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>>         break;
>>
>>      case nir_op_fsin:
>> -      if (!compiler->precise_trig) {
>> -         inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
>> -      } else {
>> -         src_reg tmp = src_reg(this, glsl_type::vec4_type);
>> -         inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]);
>> -         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
>> -      }
>> +      inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]);
>>         inst->saturate = instr->dest.saturate;
>>         break;
>>
>>      case nir_op_fcos:
>> -      if (!compiler->precise_trig) {
>> -         inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
>> -      } else {
>> -         src_reg tmp = src_reg(this, glsl_type::vec4_type);
>> -         inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]);
>> -         inst = emit(MUL(dst, tmp, brw_imm_f(0.99997)));
>> -      }
>> +      inst = emit_math(SHADER_OPCODE_COS, dst, op[0]);
>>         inst->saturate = instr->dest.saturate;
>>         break;
>>
>>
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160408/d461266f/attachment-0001.html>


More information about the mesa-dev mailing list