[Mesa-dev] [PATCH] i965/fs: Take the sample mask into account in FIND_LIVE_CHANNEL

Jason Ekstrand jason at jlekstrand.net
Wed Sep 14 07:22:25 UTC 2016


On Wed, Sep 14, 2016 at 12:02 AM, Francisco Jerez <currojerez at riseup.net>
wrote:

> Jason Ekstrand <jason at jlekstrand.net> writes:
>
> > Just looking at the channel enables is not sufficient, at least not on
> Sky
> > Lake.  Channels that are disabled by the sample_mask may show up in the
> > channel enable register as being enabled even if they are not executing.
> > This can cause FIND_LIVE_CHANNEL to return a channel that isn't actually
> > executing.  In our handling of interpolateAtSample we do a clever trick
> > with emit_uniformize to call the interpolator once for each unique sample
> > id.  Thanks to FIND_LIVE_CHANNEL returning a dead channel, we can get an
> > infinite loop which hangs the GPU.
> >
> > Signed-off-by: Jason Ekstrand <jason at jlekstrand.net>
>
> NAK, FIND_LIVE_CHANNEL returns channels from the EU execution mask by
> design (see the doxygen comment in brw_defines.h), which is necessary
> for the instruction to return a well-defined result when only helper
> invocations are enabled in the execution mask.  Several users of the
> instruction are likely to be relying on this.
>

Perhaps I need to be a bit more specific about what problem is being fixed
here.  It's not just that we need to take sample mask into account.  It's
actually a far more subtle problem.  It can happen (Yes, I've seen this in
practice) that a group of channels is disabled (i.e., doesn't execute, not
just helper invocations) but the corresponding bits in ec0 are set to 1.
Maybe this means that ec0 is still broken on Sky Lake.  Maybe it just means
that ec0 doesn't mean what it looks like it means.  I'm not sure.  What I
do know is that FIND_LIVE_CHANNEL is returning a 100% dead channel.

I'm fine if this is the wrong fix.  Maybe we need to just do the gen7 thing
everywhere.


> And isn't interpolateAtSample supposed to give a well-defined result too
> when it's run from a helper invocation?
>

Probably?


> > ---
> >  src/mesa/drivers/dri/i965/brw_eu.h               |  3 ++-
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 22
> +++++++++++++++-------
> >  src/mesa/drivers/dri/i965/brw_fs_builder.h       |  3 ++-
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  2 +-
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  2 +-
> >  5 files changed, 21 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu.h
> b/src/mesa/drivers/dri/i965/brw_eu.h
> > index 3e52764..9aaab78 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu.h
> > +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> > @@ -488,7 +488,8 @@ brw_pixel_interpolator_query(struct brw_codegen *p,
> >
> >  void
> >  brw_find_live_channel(struct brw_codegen *p,
> > -                      struct brw_reg dst);
> > +                      struct brw_reg dst,
> > +                      struct brw_reg sample_mask);
> >
> >  void
> >  brw_broadcast(struct brw_codegen *p,
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 3b12030..f593a8d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -3361,7 +3361,8 @@ brw_pixel_interpolator_query(struct brw_codegen
> *p,
> >  }
> >
> >  void
> > -brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)
> > +brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst,
> > +                      struct brw_reg sample_mask)
> >  {
> >     const struct gen_device_info *devinfo = p->devinfo;
> >     const unsigned exec_size = 1 << brw_inst_exec_size(devinfo,
> p->current);
> > @@ -3377,13 +3378,20 @@ brw_find_live_channel(struct brw_codegen *p,
> struct brw_reg dst)
> >
> >        if (devinfo->gen >= 8) {
> >           /* Getting the first active channel index is easy on Gen8:
> Just find
> > -          * the first bit set in the mask register.  The same register
> exists
> > -          * on HSW already but it reads back as all ones when the
> current
> > -          * instruction has execution masking disabled, so it's kind of
> > -          * useless.
> > +          * the first bit set in the mask register AND the sample
> mask.  The
> > +          * same register exists on HSW already but it reads back as
> all ones
> > +          * when the current instruction has execution masking
> disabled, so
> > +          * it's kind of useless.
> >            */
> > -         inst = brw_FBL(p, vec1(dst),
> > -                        retype(brw_mask_reg(0), BRW_REGISTER_TYPE_UD));
> > +         struct brw_reg mask_reg = retype(brw_mask_reg(0),
> > +                                          BRW_REGISTER_TYPE_UD);
> > +         if (sample_mask.file != BRW_IMMEDIATE_VALUE ||
> > +             sample_mask.ud != 0xffffffff) {
> > +            brw_AND(p, vec1(dst), mask_reg, sample_mask);
> > +            mask_reg = vec1(dst);
> > +         }
> > +
> > +         inst = brw_FBL(p, vec1(dst), mask_reg);
> >
> >           /* Quarter control has the effect of magically shifting the
> value of
> >            * this register so you'll get the first active channel
> relative to
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> > index 483672f..45b5f88 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> > @@ -407,7 +407,8 @@ namespace brw {
> >           const dst_reg chan_index = vgrf(BRW_REGISTER_TYPE_UD);
> >           const dst_reg dst = vgrf(src.type);
> >
> > -         ubld.emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, chan_index);
> > +         ubld.emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, chan_index,
> > +                   sample_mask_reg());
> >           ubld.emit(SHADER_OPCODE_BROADCAST, dst, src,
> component(chan_index, 0));
> >
> >           return src_reg(component(dst, 0));
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > index 2f4ba7b..d923b0b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > @@ -2041,7 +2041,7 @@ fs_generator::generate_code(const cfg_t *cfg, int
> dispatch_width)
> >           break;
> >
> >        case SHADER_OPCODE_FIND_LIVE_CHANNEL:
> > -         brw_find_live_channel(p, dst);
> > +         brw_find_live_channel(p, dst, src[0]);
> >           break;
> >
> >        case SHADER_OPCODE_BROADCAST:
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > index 256abae..63fca6f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > @@ -1863,7 +1863,7 @@ generate_code(struct brw_codegen *p,
> >           break;
> >
> >        case SHADER_OPCODE_FIND_LIVE_CHANNEL:
> > -         brw_find_live_channel(p, dst);
> > +         brw_find_live_channel(p, dst, brw_null_reg());
> >           break;
> >
> >        case SHADER_OPCODE_BROADCAST:
> > --
> > 2.5.0.400.gff86faf
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160914/1d60a190/attachment-0001.html>


More information about the mesa-dev mailing list