[Mesa-dev] [PATCH 2/2] i965/nir: Re-emit instructions instead of doing mov-to-flag when possible
Jason Ekstrand
jason at jlekstrand.net
Mon Mar 16 21:26:06 PDT 2015
On Mon, Mar 16, 2015 at 9:21 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> Because of the way that NIR does conditionals, we get them in any old SSA
> value. The actual boolean value used in the select or if is x != 0.
> Previously, we handled this by emitting a mov.nz to move the value to the
> flag register. However, this almost always adds at least one if not two
> instructions because we have to go through the VGRF when we could be
> comparing directly to the flag and then using the flag.
>
> With this commit, we simply re-emit the instruction that produces the value
> when we can. By doing so, we can use the flag directly and we trust in CSE
> to clean up for us if it can.
>
> Shader-db results:
>
> total instructions in shared programs: 4164120 -> 4110511 (-1.29%)
> instructions in affected programs: 2397042 -> 2343433 (-2.24%)
> helped: 13167
> HURT: 31
> GAINED: 4
> LOST: 4
With this series, the peephole series I sent out earlier, and one more
patch to fix up types in the NIR -> FS pass, we have the following
delta between GLSL IR and NIR:
total instructions in shared programs: 4090061 -> 4085083 (-0.12%)
instructions in affected programs: 2554907 -> 2549929 (-0.19%)
helped: 6311
HURT: 9448
GAINED: 67
LOST: 30
Yes, NIR is doing better!
--Jason
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 87 ++++++++++++++++++++++++++++++--
> 1 file changed, 84 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 492767b..4ff1b4d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1283,12 +1283,93 @@ fs_visitor::get_nir_src(nir_src src, brw_reg_type type)
> }
> }
>
> +static nir_alu_instr *
> +get_bool_alu_parent_instr(nir_src src)
> +{
> + nir_instr *parent_instr = src.is_ssa ? src.ssa->parent_instr :
> + src.reg.reg->parent_instr;
> + if (!parent_instr || parent_instr->type != nir_instr_type_alu)
> + return NULL;
> +
> + nir_alu_instr *alu = nir_instr_as_alu(parent_instr);
> +
> + /* Instead of trying to algorithmically determine what instructions can
> + * or cannot be reconstructed to make a boolean, we give and explicit
> + * list for now. This has three primary benifits. 1) It's simple.
> + * 2) It's not liable to hit strange edge-cases that don't have piglit
> + * tests. 3) All of these instructions we *know* get emitted in a
> + * single instruction so we won't hurt too much by re-emitting them.
> + */
> + switch (alu->op) {
> + case nir_op_flt:
> + case nir_op_ilt:
> + case nir_op_ult:
> + case nir_op_fge:
> + case nir_op_ige:
> + case nir_op_uge:
> + case nir_op_feq:
> + case nir_op_ieq:
> + case nir_op_fne:
> + case nir_op_ine:
> + case nir_op_inot:
> + case nir_op_ixor:
> + case nir_op_ior:
> + case nir_op_iand:
> + case nir_op_f2b:
> + case nir_op_i2b:
> + break;
> + default:
> + return NULL;
> + }
> +
> + /* Now we need to check that all of the sources are at least psudo-ssa.
> + * If one of them was involved in a phi node then we can't be sure that
> + * just re-creating the value will work.
> + */
> + for (unsigned i = 0; i < nir_op_infos[alu->op].num_inputs; i++)
> + if (!alu->src[i].src.is_ssa &&
> + alu->src[i].src.reg.reg->parent_instr == NULL)
> + return NULL;
> +
> + switch (nir_op_infos[alu->op].output_type) {
> + case nir_type_int:
> + case nir_type_unsigned:
> + case nir_type_bool:
> + return alu;
> +
> + case nir_type_float:
> + default:
> + /* We can't treat a float-destination instruction as if it were a
> + * bool. Doing so would require messing with the types which might
> + * be bad and could even hang the GPU.
> + */
> + return NULL;
> + }
> +}
> +
> void
> fs_visitor::get_nir_src_as_flag(nir_src src, unsigned comp)
> {
> - fs_reg cond_src = get_nir_src(src, BRW_REGISTER_TYPE_UD);
> - fs_inst *inst = emit(MOV(reg_null_d, offset(cond_src, comp)));
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> + nir_alu_instr *bool_alu = get_bool_alu_parent_instr(src);
> +
> + if (bool_alu) {
> + /* If it's an ALU instruction, we're *probably* better off just
> + * re-emitting it with a conditional mod than actually saving off the
> + * value and copying it to the flag register.
> + */
> + assert(bool_alu->dest.write_mask == 1);
> + nir_emit_alu(bool_alu);
> + fs_inst *alu_inst = (fs_inst *) this->instructions.get_tail();
> + alu_inst->dst = retype(reg_null_d, alu_inst->dst.type);
> +
> + if (alu_inst->conditional_mod == BRW_CONDITIONAL_NONE)
> + alu_inst->conditional_mod = BRW_CONDITIONAL_NZ;
> + } else {
> + /* The tried-and-true solution */
> + fs_reg cond_src = get_nir_src(src, BRW_REGISTER_TYPE_UD);
> + fs_inst *inst = emit(MOV(reg_null_d, offset(cond_src, comp)));
> + inst->conditional_mod = BRW_CONDITIONAL_NZ;
> + }
> }
>
> fs_reg
> --
> 2.3.2
>
More information about the mesa-dev
mailing list