[Mesa-dev] [PATCH 18/21] i965/fs: Allow specifying arbitrary execution sizes up to 32 to FIND_LIVE_CHANNEL.

Kenneth Graunke kenneth at whitecape.org
Wed May 25 00:34:50 UTC 2016


On Tuesday, May 24, 2016 5:27:59 PM PDT Francisco Jerez wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
> 
> > On Tue, May 24, 2016 at 12:18 AM, Francisco Jerez <currojerez at riseup.net>
> > wrote:
> >
> >> Due to a Gen7-specific hardware bug native 32-wide instructions get
> >> the lower 16 bits of the execution mask applied incorrectly to both
> >> halves of the instruction, so the MOV trick we currently use wouldn't
> >> work.  Instead emit multiple 16-wide MOV instructions in 32-wide mode
> >> in order to cover the whole execution mask.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 25 +++++++++++++++++--------
> >>  1 file changed, 17 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> index af7caed..d36877c 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> @@ -3330,6 +3330,7 @@ void
> >>  brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)
> >>  {
> >>     const struct brw_device_info *devinfo = p->devinfo;
> >> +   const unsigned exec_size = 1 << brw_inst_exec_size(devinfo, p->current);
> >>     brw_inst *inst;
> >>
> >>     assert(devinfo->gen >= 7);
> >> @@ -3359,15 +3360,23 @@ brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)
> >>
> >>           brw_MOV(p, flag, brw_imm_ud(0));
> >>
> >> -         /* Run a 16-wide instruction returning zero with execution masking
> >> -          * and a conditional modifier enabled in order to get the current
> >> -          * execution mask in f1.0.
> >> +         /* Run enough instructions returning zero with execution masking and
> >> +          * a conditional modifier enabled in order to get the full execution
> >> +          * mask in f1.0.  We could use a single 32-wide move here if it
> >> +          * weren't because of the hardware bug that causes channel enables to
> >> +          * be applied incorrectly to the second half of 32-wide instructions
> >> +          * on Gen7.
> >>            */
> >> -         inst = brw_MOV(p, brw_null_reg(), brw_imm_ud(0));
> >> -         brw_inst_set_exec_size(devinfo, inst, BRW_EXECUTE_16);
> >> -         brw_inst_set_mask_control(devinfo, inst, BRW_MASK_ENABLE);
> >> -         brw_inst_set_cond_modifier(devinfo, inst, BRW_CONDITIONAL_Z);
> >> -         brw_inst_set_flag_reg_nr(devinfo, inst, 1);
> >> +         const unsigned lower_size = MIN2(16, exec_size);
> >> +         for (unsigned i = 0; i < exec_size / lower_size; i++) {
> >> +            inst = brw_MOV(p, retype(brw_null_reg(), BRW_REGISTER_TYPE_UW),
> >> +                           brw_imm_uw(0));
> >>
> >
> > Is there a reason this is changing from D to UW?
> >
> 
> It's likely to have lower execution latency than an instruction with
> 32-bit integer execution type.  It shouldn't have any practical
> implications other than that, the result of the instruction is only used
> to set bits of the flag register.

I've never heard anything about them having different latencies.
That doesn't mean that you're wrong, though. :)
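
For anyone following the thread, the splitting loop in the quoted patch can
be modelled roughly as below.  This is only an illustrative sketch in plain
C, not driver code: collect_exec_mask(), buggy_simd32_mask() and
first_live_channel() are hypothetical names, the "hardware" is reduced to
bit operations on a 32-bit mask, and treating the first live channel as the
lowest set bit of the collected mask is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical model of the patched loop: each simulated MOV covers one
 * group of at most 16 channels and sets the flag bits of the channels in
 * that group that are enabled in the execution mask.
 */
static uint32_t
collect_exec_mask(uint32_t live_channels, unsigned exec_size)
{
   assert(exec_size >= 1 && exec_size <= 32);

   const unsigned lower_size = MIN2(16, exec_size);
   uint32_t flag = 0; /* models the initial MOV of zero into the flag */

   for (unsigned i = 0; i < exec_size / lower_size; i++) {
      /* Bits covered by the i-th group of lower_size channels. */
      const uint32_t group_mask = ((1u << lower_size) - 1) << (i * lower_size);
      flag |= live_channels & group_mask;
   }

   return flag;
}

/* Simplified model of the Gen7 bug as described in the commit message:
 * a native 32-wide instruction sees the lower 16 bits of the execution
 * mask in both halves.
 */
static uint32_t
buggy_simd32_mask(uint32_t live_channels)
{
   const uint32_t low = live_channels & 0xffffu;
   return low | (low << 16);
}

/* Assumption for this sketch: the first live channel is the index of the
 * lowest set bit in the collected mask, or -1 if no channel is live.
 */
static int
first_live_channel(uint32_t mask)
{
   for (int i = 0; i < 32; i++) {
      if (mask & (1u << i))
         return i;
   }
   return -1;
}
```

With only channel 16 live, the split version keeps bit 16 set, while the
modelled single 32-wide instruction would see an empty mask, which is the
case the patch is trying to avoid.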

--Ken