[Mesa-dev] [PATCH 20/21] i965/ir: Make BROADCAST emit an unmasked single-channel move.

Francisco Jerez currojerez at riseup.net
Wed May 25 00:21:39 UTC 2016


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Tue, May 24, 2016 at 12:18 AM, Francisco Jerez <currojerez at riseup.net>
> wrote:
>
>> Alternatively we could have extended the current semantics to 32-wide
>> mode by changing brw_broadcast() to emit multiple indexed MOV
>> instructions in the generator copying the selected value to all
>> destination registers, but it seemed rather silly to waste EU cycles
>> unnecessarily copying the exact same value 32 times in the GRF.
>>
>
> It appears as if emit_uniformize in fs_builder sets stride == 0 on the
> result the version in vec4_visitor does not.  It's probably not needed for
> correctness since I think the generator will always take channel 0 anyway,
> but we should fix it none the less.

Unfortunately we can't without substantial churn, because the VEC4 IR
currently has no way to represent scalar regions in the VGRF.

>
>
>> The vstride change in the Align16 path is required to avoid assertions
>> in validate_reg() since the change causes the execution size of the
>> MOV and SEL instructions to be equal to the source region width.
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h          |  6 ++++++
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 12 +++++++++---
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  1 +
>>  4 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index 702eb5a..8794d44 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -1080,6 +1080,12 @@ enum opcode {
>>     /**
>>      * Pick the channel from its first source register given by the index
>>      * specified as second source.  Useful for variable indexing of
>> surfaces.
>> +    *
>> +    * Note that because the result of this instruction is by definition
>> +    * uniform and it can always be splatted to multiple channels using a
>> +    * scalar regioning mode, only the first channel of the destination
>> region
>> +    * is guaranteed to be updated, which implies that BROADCAST
>> instructions
>> +    * should usually be marked force_writemask_all.
>>      */
>>     SHADER_OPCODE_BROADCAST,
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> index b11398c..ee7462f 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> @@ -3425,6 +3425,10 @@ brw_broadcast(struct brw_codegen *p,
>>     const bool align1 = brw_inst_access_mode(devinfo, p->current) ==
>> BRW_ALIGN_1;
>>     brw_inst *inst;
>>
>> +   brw_push_insn_state(p);
>> +   brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>> +   brw_set_default_exec_size(p, align1 ? BRW_EXECUTE_1 : BRW_EXECUTE_4);
>> +
>>     assert(src.file == BRW_GENERAL_REGISTER_FILE &&
>>            src.address_mode == BRW_ADDRESS_DIRECT);
>>
>> @@ -3475,19 +3479,21 @@ brw_broadcast(struct brw_codegen *p,
>>            */
>>           inst = brw_MOV(p,
>>                          brw_null_reg(),
>> -                        stride(brw_swizzle(idx, BRW_SWIZZLE_XXXX), 0, 4,
>> 1));
>> +                        stride(brw_swizzle(idx, BRW_SWIZZLE_XXXX), 4, 4,
>> 1));
>>           brw_inst_set_pred_control(devinfo, inst, BRW_PREDICATE_NONE);
>>           brw_inst_set_cond_modifier(devinfo, inst, BRW_CONDITIONAL_NZ);
>>           brw_inst_set_flag_reg_nr(devinfo, inst, 1);
>>
>>           /* and use predicated SEL to pick the right channel. */
>>           inst = brw_SEL(p, dst,
>> -                        stride(suboffset(src, 4), 0, 4, 1),
>> -                        stride(src, 0, 4, 1));
>> +                        stride(suboffset(src, 4), 4, 4, 1),
>> +                        stride(src, 4, 4, 1));
>>           brw_inst_set_pred_control(devinfo, inst, BRW_PREDICATE_NORMAL);
>>           brw_inst_set_flag_reg_nr(devinfo, inst, 1);
>>        }
>>     }
>> +
>> +   brw_pop_insn_state(p);
>>  }
>>
>>  /**
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index 6421870..804c639 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -2017,6 +2017,7 @@ fs_generator::generate_code(const cfg_t *cfg, int
>> dispatch_width)
>>           break;
>>
>>        case SHADER_OPCODE_BROADCAST:
>> +         assert(inst->force_writemask_all);
>>           brw_broadcast(p, dst, src[0], src[1]);
>>           break;
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> index baf4422..bb0254e 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> @@ -1886,6 +1886,7 @@ generate_code(struct brw_codegen *p,
>>           break;
>>
>>        case SHADER_OPCODE_BROADCAST:
>> +         assert(inst->force_writemask_all);
>>           brw_broadcast(p, dst, src[0], src[1]);
>>           break;
>>
>> --
>> 2.7.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160524/62d4ef65/attachment.sig>


More information about the mesa-dev mailing list