[Mesa-dev] [PATCH v3] i965: skip reading unused slots at the begining of the URB for the FS

Kenneth Graunke kenneth at whitecape.org
Fri Sep 29 19:33:44 UTC 2017


On Friday, September 29, 2017 4:08:17 AM PDT Iago Toral wrote:
> On Fri, 2017-09-29 at 03:53 -0700, Kenneth Graunke wrote:
> > On Friday, September 29, 2017 3:36:34 AM PDT Iago Toral Quiroga
> > wrote:
> > > We can start reading the URB at the first offset that contains
> > > varyings
> > > that are actually read in the URB. We still need to make sure that
> > > we
> > > read at least one varying to honor hardware requirements.
> > > 
> > > This helps alleviate a problem introduced with 99df02ca26f61 for
> > > separate shader objects: without separate shader objects we assign
> > > locations sequentially, however, since that commit we have changed
> > > the
> > > method for SSO so that the VUE slot assigned depends on the number
> > > of
> > > builtin slots plus the location assigned to the varying. This fixed
> > > layout is intended to help SSO programs by avoiding on-the-fly
> > > recompiles
> > > when swapping out shaders, however, it also means that if a varying
> > > uses
> > > a large location number close to the maximum allowed by the SF/FS
> > > units
> > > (31), then the offset introduced by the number of builtin slots can
> > > push
> > > the location outside the range and trigger an assertion.
> > > 
> > > This problem is affecting at least the following CTS tests for
> > > enhanced layouts:
> > > 
> > > KHR-GL45.enhanced_layouts.varying_array_components
> > > KHR-GL45.enhanced_layouts.varying_array_locations
> > > KHR-GL45.enhanced_layouts.varying_components
> > > KHR-GL45.enhanced_layouts.varying_locations
> > > 
> > > which use SSO and the the location layout qualifier to select such
> > > location numbers explicitly.
> > > 
> > > This change helps these tests because for SSO we always have to
> > > include
> > > things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment
> > > shader is
> > > very unlikely to read them, so by doing this we free builtin slots
> > > from
> > > the fixed VUE layout and we avoid the tests to crash in this
> > > scenario.
> > > 
> > > Of course, this is not a proper fix, we'd still run into problems
> > > if someone
> > > tries to use an explicit max location and read gl_ViewportIndex,
> > > gl_LayerID or
> > > gl_CullDistancein in the FS, but that would be a much less common
> > > bug and we
> > > can probably wait to see if anyone actually runs into that
> > > situation in a real
> > > world scenario before making the decision that more aggresive
> > > changes are
> > > required to support this without reverting 99df02ca26f61.
> > > 
> > > v2:
> > > - Add a debug message when we skip clip distances (Ilia)
> > > - we also need to account for this when we compute the urb setup
> > >   for the fragment shader stage, so add a compiler util to compute
> > >   the first slot that we need to read from the URB instead of
> > >   replicating the logic in both places.
> > > 
> > > v3:
> > > - Make the util more generic so it can account for all unused slots
> > >   at the beginning of the URB, that will make it more useful (Ken).
> > > - Drop the debug message, it was not what Ilia was asking for.
> > > 
> > > Suggested-by: Kenneth Graunke <kenneth at whitecape.org>
> > > ---
> > >  src/intel/compiler/brw_compiler.h             | 29
> > > +++++++++++++++++++++++++++
> > >  src/intel/compiler/brw_fs.cpp                 |  7 +++----
> > >  src/mesa/drivers/dri/i965/genX_state_upload.c | 16 ++++++---------
> > >  3 files changed, 38 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/src/intel/compiler/brw_compiler.h
> > > b/src/intel/compiler/brw_compiler.h
> > > index 6753a8daf0..038f3f9551 100644
> > > --- a/src/intel/compiler/brw_compiler.h
> > > +++ b/src/intel/compiler/brw_compiler.h
> > > @@ -1222,6 +1222,35 @@ brw_stage_has_packed_dispatch(const struct
> > > gen_device_info *devinfo,
> > >     }
> > >  }
> > >  
> > > +/**
> > > + * Computes the first varying slot in the URB produced by the
> > > previous stage
> > > + * that is used in the next stage. We do this by testing the
> > > varying slots in
> > > + * the previous stage's vue map against the inputs read in the
> > > next stage.
> > > + *
> > > + * Note that:
> > > + *
> > > + * - Each URB offset contains two varying slots and we can only
> > > skip a
> > > + *   full offset if both slots are unused, so the value we return
> > > here is always
> > > + *   rounded down to the closest multiple of two.
> > > + *
> > > + * - gl_Layer and gl_ViewportIndex don't have their own varying
> > > slots, they are
> > > + *   part of the vue header, so if these are read we can't skip
> > > anything.
> > > + */
> > > +static inline int
> > > +brw_compute_first_urb_slot_required(uint64_t inputs_read,
> > > +                                    const struct brw_vue_map
> > > *prev_stage_vue_map)
> > > +{
> > > +   if ((inputs_read & (VARYING_BIT_LAYER | VARYING_BIT_VIEWPORT))
> > > == 0) {
> > 
> > You don't actually need to do this, they will be in slot 0, so the
> > first
> > iteration of the loop will pick them up. :)
> 
> Actually, we need this, there are regressions without this condition. 
> 
> The problem is that these two varyings correspond to varying slots 22
> and 23, but we have them in the vue header, so slot 0, so we remove the
> slots when we setup the vue map and we need to check for them
> explicitly, see this code from brw_vue_map.c:
> 
>   /* gl_Layer and gl_ViewportIndex don't get their own varying slots --
> they
>     * are stored in the first VUE slot (VARYING_SLOT_PSIZ).
>     */
>    slots_valid &= ~(VARYING_BIT_LAYER | VARYING_BIT_VIEWPORT);

Gah.  That makes sense.  As is,

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170929/151581a7/attachment.sig>


More information about the mesa-dev mailing list