<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 24, 2018 at 7:38 AM, Rob Clark <span dir="ltr"><<a href="mailto:robdclark@gmail.com" target="_blank">robdclark@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">side-note, not sure if it really affects what you are doing here, but<br>
Karol ran into some cases, like 8-bit signed imax, which need to be<br>
"lowered" to 16b (or 32b) and converted back for hw that doesn't<br>
support smaller than 16b (or 32b). I think I have the same case with<br>
ir3, which also has 16b but no 8b (but he is further along with the CL<br>
CTS than I am).<br>
<br>
I think there will be more of this sort of thing coming for more<br>
instructions and for more than just 16b vs 32b. So not sure if<br>
writing rules for each in nir_opt_algebraic.py will be so fun..<br></blockquote><div><br></div><div>Yeah, it may be that what we want is a generic "lower this to something with more bits" pass. If this is a problem for the CL people, maybe we just want some way to make it configurable and put it in core NIR. I don't really have a huge preference. I'm just trying to make sure we explore the solution space.<br><br></div><div>--Jason<br><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
BR,<br>
-R<br>
<div class="HOEnZb"><div class="h5"><br>
On Tue, Apr 24, 2018 at 9:56 AM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
> It may be useful to just use nir_algebraic for this. We already do for trig<br>
> workarounds. It's more painful from a build-system perspective but, in<br>
> general, the fewer hand-rolled algebraic lowering passes we have, the<br>
> better.<br>
><br>
> On Wed, Apr 11, 2018 at 12:20 AM, Iago Toral Quiroga <<a href="mailto:itoral@igalia.com">itoral@igalia.com</a>><br>
> wrote:<br>
>><br>
>> The hardware doesn't support 16-bit integer types, so we need to implement<br>
>> these using 32-bit integer instructions and then convert the result back<br>
>> to 16-bit.<br>
>> ---<br>
>> src/intel/Makefile.sources | 1 +<br>
>> src/intel/compiler/brw_nir.c | 2 +<br>
>> src/intel/compiler/brw_nir.h | 2 +<br>
>> src/intel/compiler/brw_nir_<wbr>lower_16bit_int_math.c | 108<br>
>> ++++++++++++++++++++++<br>
>> src/intel/compiler/meson.build | 1 +<br>
>> 5 files changed, 114 insertions(+)<br>
>> create mode 100644 src/intel/compiler/brw_nir_<wbr>lower_16bit_int_math.c<br>
>><br>
>> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources<br>
>> index 91c71a8dfaf..2cd76961ea4 100644<br>
>> --- a/src/intel/Makefile.sources<br>
>> +++ b/src/intel/Makefile.sources<br>
>> @@ -79,6 +79,7 @@ COMPILER_FILES = \<br>
>> compiler/brw_nir_analyze_<wbr>boolean_resolves.c \<br>
>> compiler/brw_nir_analyze_ubo_<wbr>ranges.c \<br>
>> compiler/brw_nir_attribute_<wbr>workarounds.c \<br>
>> + compiler/brw_nir_lower_16bit_<wbr>int_math.c \<br>
>> compiler/brw_nir_lower_cs_<wbr>intrinsics.c \<br>
>> compiler/brw_nir_opt_peephole_<wbr>ffma.c \<br>
>> compiler/brw_nir_tcs_<wbr>workarounds.c \<br>
>> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c<br>
>> index 69ab162f888..2e5754076ed 100644<br>
>> --- a/src/intel/compiler/brw_nir.c<br>
>> +++ b/src/intel/compiler/brw_nir.c<br>
>> @@ -638,6 +638,8 @@ brw_preprocess_nir(const struct brw_compiler<br>
>> *compiler, nir_shader *nir)<br>
>> nir_lower_isign64 |<br>
>> nir_lower_divmod64);<br>
>><br>
>> + brw_nir_lower_16bit_int_math(<wbr>nir);<br>
>> +<br>
>> nir = brw_nir_optimize(nir, compiler, is_scalar);<br>
>><br>
>> if (is_scalar) {<br>
>> diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h<br>
>> index 03f52da08e5..6ba1a8bc654 100644<br>
>> --- a/src/intel/compiler/brw_nir.h<br>
>> +++ b/src/intel/compiler/brw_nir.h<br>
>> @@ -152,6 +152,8 @@ void brw_nir_analyze_ubo_ranges(<wbr>const struct<br>
>> brw_compiler *compiler,<br>
>><br>
>> bool brw_nir_opt_peephole_ffma(nir_<wbr>shader *shader);<br>
>><br>
>> +bool brw_nir_lower_16bit_int_math(<wbr>nir_shader *shader);<br>
>> +<br>
>> nir_shader *brw_nir_optimize(nir_shader *nir,<br>
>> const struct brw_compiler *compiler,<br>
>> bool is_scalar);<br>
>> diff --git a/src/intel/compiler/brw_nir_<wbr>lower_16bit_int_math.c<br>
>> b/src/intel/compiler/brw_nir_<wbr>lower_16bit_int_math.c<br>
>> new file mode 100644<br>
>> index 00000000000..6876309a822<br>
>> --- /dev/null<br>
>> +++ b/src/intel/compiler/brw_nir_<wbr>lower_16bit_int_math.c<br>
>> @@ -0,0 +1,108 @@<br>
>> +/*<br>
>> + * Copyright © 2018 Intel Corporation<br>
>> + *<br>
>> + * Permission is hereby granted, free of charge, to any person obtaining<br>
>> a<br>
>> + * copy of this software and associated documentation files (the<br>
>> "Software"),<br>
>> + * to deal in the Software without restriction, including without<br>
>> limitation<br>
>> + * the rights to use, copy, modify, merge, publish, distribute,<br>
>> sublicense,<br>
>> + * and/or sell copies of the Software, and to permit persons to whom the<br>
>> + * Software is furnished to do so, subject to the following conditions:<br>
>> + *<br>
>> + * The above copyright notice and this permission notice (including the<br>
>> next<br>
>> + * paragraph) shall be included in all copies or substantial portions of<br>
>> the<br>
>> + * Software.<br>
>> + *<br>
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,<br>
>> EXPRESS OR<br>
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF<br>
>> MERCHANTABILITY,<br>
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT<br>
>> SHALL<br>
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR<br>
>> OTHER<br>
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,<br>
>> ARISING<br>
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER<br>
>> DEALINGS<br>
>> + * IN THE SOFTWARE.<br>
>> + */<br>
>> +<br>
>> +#include "brw_nir.h"<br>
>> +#include "nir_builder.h"<br>
>> +<br>
>> +/**<br>
>> + * Intel hardware doesn't support 16-bit integer Math instructions so<br>
>> this<br>
>> + * pass implements them in 32-bit and then converts the result back to<br>
>> 16-bit.<br>
>> + */<br>
>> +static void<br>
>> +lower_math_instr(nir_builder *bld, nir_alu_instr *alu, bool is_signed)<br>
>> +{<br>
>> + const nir_op op = alu->op;<br>
>> +<br>
>> + bld->cursor = nir_before_instr(&alu->instr);<br>
>> +<br>
>> + nir_ssa_def *srcs_32[4] = { NULL, NULL, NULL, NULL };<br>
>> + const uint32_t num_inputs = nir_op_infos[op].num_inputs;<br>
>> + for (uint32_t i = 0; i < num_inputs; i++) {<br>
>> + nir_ssa_def *src = nir_ssa_for_alu_src(bld, alu, i);<br>
>> + srcs_32[i] = is_signed ? nir_i2i32(bld, src) : nir_u2u32(bld, src);<br>
>> + }<br>
>> +<br>
>> + nir_ssa_def *dst_32 =<br>
>> + nir_build_alu(bld, op, srcs_32[0], srcs_32[1], srcs_32[2],<br>
>> srcs_32[3]);<br>
>> +<br>
>> + nir_ssa_def *dst_16 =<br>
>> + is_signed ? nir_i2i16(bld, dst_32) : nir_u2u16(bld, dst_32);<br>
>> +<br>
>> + nir_ssa_def_rewrite_uses(&alu-<wbr>>dest.dest.ssa,<br>
>> nir_src_for_ssa(dst_16));<br>
>> +}<br>
>> +<br>
>> +static bool<br>
>> +lower_instr(nir_builder *bld, nir_alu_instr *alu)<br>
>> +{<br>
>> + assert(alu->dest.dest.is_ssa);<br>
>> + if (alu->dest.dest.ssa.bit_size != 16)<br>
>> + return false;<br>
>> +<br>
>> + bool is_signed = false;<br>
>> + switch (alu->op) {<br>
>> + case nir_op_idiv:<br>
>> + case nir_op_imod:<br>
>> + is_signed = true;<br>
><br>
><br>
> You can get is_signed from nir_op_infos<br>
><br>
>><br>
>> + /* Fallthrough */<br>
>> + case nir_op_udiv:<br>
>> + case nir_op_umod:<br>
>> + case nir_op_irem:<br>
><br>
><br>
> How is irem unsigned?<br>
><br>
>><br>
>> + lower_math_instr(bld, alu, is_signed);<br>
>> + return true;<br>
>> + default:<br>
>> + return false;<br>
>> + }<br>
>> +}<br>
>> +<br>
>> +static bool<br>
>> +lower_impl(nir_function_impl *impl)<br>
>> +{<br>
>> + nir_builder b;<br>
>> + nir_builder_init(&b, impl);<br>
>> + bool progress = false;<br>
>> +<br>
>> + nir_foreach_block(block, impl) {<br>
>> + nir_foreach_instr_safe(instr, block) {<br>
>> + if (instr->type == nir_instr_type_alu)<br>
>> + progress |= lower_instr(&b, nir_instr_as_alu(instr));<br>
>> + }<br>
>> + }<br>
>> +<br>
>> + nir_metadata_preserve(impl, nir_metadata_block_index |<br>
>> + nir_metadata_dominance);<br>
>> +<br>
>> + return progress;<br>
>> +}<br>
>> +<br>
>> +bool<br>
>> +brw_nir_lower_16bit_int_math(<wbr>nir_shader *shader)<br>
>> +{<br>
>> + bool progress = false;<br>
>> +<br>
>> + nir_foreach_function(function, shader) {<br>
>> + if (function->impl)<br>
>> + progress |= lower_impl(function->impl);<br>
>> + }<br>
>> +<br>
>> + return progress;<br>
>> +}<br>
>> diff --git a/src/intel/compiler/meson.<wbr>build<br>
>> b/src/intel/compiler/meson.<wbr>build<br>
>> index 72b7a6796cb..d80fcd6e31b 100644<br>
>> --- a/src/intel/compiler/meson.<wbr>build<br>
>> +++ b/src/intel/compiler/meson.<wbr>build<br>
>> @@ -76,6 +76,7 @@ libintel_compiler_files = files(<br>
>> 'brw_nir_analyze_boolean_<wbr>resolves.c',<br>
>> 'brw_nir_analyze_ubo_ranges.c'<wbr>,<br>
>> 'brw_nir_attribute_<wbr>workarounds.c',<br>
>> + 'brw_nir_lower_16bit_int_math.<wbr>c',<br>
>> 'brw_nir_lower_cs_intrinsics.<wbr>c',<br>
>> 'brw_nir_opt_peephole_ffma.c',<br>
>> 'brw_nir_tcs_workarounds.c',<br>
>> --<br>
>> 2.14.1<br>
>><br>
>> ______________________________<wbr>_________________<br>
>> mesa-dev mailing list<br>
>> <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
>> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
><br>
><br>
</div></div></blockquote></div><br></div></div>
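To make the irem question in the review concrete, here is a minimal standalone C sketch of the widen, operate, narrow sequence for a 16-bit remainder. It is not Mesa code: the helper names (sext16, irem16_sign_extended, irem16_zero_extended) are made up for illustration, and plain C casts stand in for the i2i32/u2u32 and i2i16/u2u16 conversions. It assumes irem follows C's `%` semantics (result takes the sign of the dividend).

```c
#include <assert.h>
#include <stdint.h>

/* Portable sign extension of a 16-bit pattern to 32 bits (stand-in for i2i32). */
static int32_t sext16(uint16_t v)
{
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Signed remainder computed in 32 bits: sign-extend, operate, truncate. */
static uint16_t irem16_sign_extended(uint16_t a_bits, uint16_t b_bits)
{
    int32_t a32 = sext16(a_bits);
    int32_t b32 = sext16(b_bits);
    assert(b32 != 0);                 /* remainder by zero is undefined in C */
    int32_t r32 = a32 % b32;          /* C99: sign follows the dividend */
    return (uint16_t)(uint32_t)r32;   /* keep the low 16 bits (stand-in for i2i16) */
}

/* Same operation with zero-extension, i.e. treating the operands as unsigned. */
static uint16_t irem16_zero_extended(uint16_t a_bits, uint16_t b_bits)
{
    uint32_t a32 = a_bits;            /* stand-in for u2u32 */
    uint32_t b32 = b_bits;
    assert(b32 != 0);
    uint32_t r32 = a32 % b32;
    return (uint16_t)r32;             /* stand-in for u2u16 */
}
```

With the bit pattern 0xFFF9 (that is, -7 as a signed 16-bit value) and a divisor of 3, irem16_sign_extended returns 0xFFFF (-1), while irem16_zero_extended returns 0 because 65529 is a multiple of 3. For non-negative inputs the two agree, which is why the wrong extension can go unnoticed in simple tests.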