[Mesa-dev] [PATCH v2] i965: Disable auxiliary buffers when there are self-dependencies.

Jason Ekstrand jason at jlekstrand.net
Sat Oct 7 15:03:54 UTC 2017


On October 6, 2017 10:25:55 PM Kenneth Graunke <kenneth at whitecape.org> wrote:

> On Friday, October 6, 2017 8:09:33 PM PDT Jason Ekstrand wrote:
>> On October 6, 2017 8:00:04 PM Kenneth Graunke <kenneth at whitecape.org> wrote:
>> > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
>> > b/src/mesa/drivers/dri/i965/brw_draw.c
>> > index c7ed7284501..fcb194dbe86 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_draw.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_draw.c
>> > @@ -336,21 +336,39 @@ brw_merge_inputs(struct brw_context *brw,
>> >     }
>> >  }
>> >
>> > +/* Disable auxiliary buffers if a renderbuffer is also bound as a texture
>> > + * or shader image.  This causes a self-dependency, where both rendering
>> > + * and sampling may concurrently read or write the CCS buffer, causing
>> > + * incorrect pixels.
>> > + *
>> > + * We don't support sampling from CCS_D, so this only matters for CCS_E.
>> > + */
>> >  static bool
>> > -intel_disable_rb_aux_buffer(struct brw_context *brw, const struct brw_bo 
>> *bo)
>> > +intel_disable_rb_aux_buffer(struct brw_context *brw,
>> > +                            struct intel_mipmap_tree *tex_mt,
>> > +                            const char *usage)
>> >  {
>> >     const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
>> >     bool found = false;
>> >
>> > +   /* Nothing to disable, don't bother looking */
>> > +   if (tex_mt->aux_usage != ISL_AUX_USAGE_CCS_E)
>> > +      return false;
>>
>> As I mentioned in the office today, I'm not convinced this is actually
>> needed.  When it isn't CCS_E, we'll resolve anyway so passing disable_aux =
>> true won't hurt anything.
>
> Yeah, that's true - but I figured we could avoid the overhead of the
> loop in other cases, since it doesn't really matter either way.
>
> Make sense?  Should we leave it?  Or would you rather drop it?

Leave it in but for a completely different reason than the one you stated. 
:-)  I don't care all that much about trying to avoid a loop whose number 
of iterations was bounded above by 8.

If the have a multisampled self dependency, it should work fine because 
compression is limited to a single pixel and we dispatch all the samples 
for a given pixel together.  We shouldn't force a resolve so we don't want 
to do this of its MCS.  In the case of CCS_D, I'm not sure if it's safe.  
We will potentially try to sample from a CCS_D image as CCS_E on gen9+ so 
it's probably best to disable it for that too.

>> > +
>> >     for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
>> >        const struct intel_renderbuffer *irb =
>> >           intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>> >
>> > -      if (irb && irb->mt->bo == bo) {
>> > +      if (irb && irb->mt->bo == tex_mt->bo) {
>> >           found = brw->draw_aux_buffer_disabled[i] = true;
>> >        }
>> >     }
>> >
>> > +   if (found) {
>> > +      perf_debug("Disabling CCS because a renderbuffer is also bound %s.\n",
>> > +                 usage);
>> > +   }
>> > +
>> >     return found;
>> >  }
>> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > index 670a92c1168..48392e7494a 100644
>> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > @@ -2678,7 +2676,7 @@ intel_miptree_prepare_fb_fetch(struct brw_context *brw,
>> >                                 uint32_t start_layer, uint32_t num_layers)
>> >  {
>> >     intel_miptree_prepare_texture_slices(brw, mt, mt->surf.format, level, 1,
>> > -                                        start_layer, num_layers, NULL);
>> > +                                        start_layer, num_layers, false);
>>
>> I think we probably want true here.  It is for fb_fetch after all.  :-)
>> Also, we probably want to do disable_rb_aux_buffer for framebuffer fetch as
>> well.
>
> Sure, I can do that.  It won't actually do anything, however, as this is
> the code for non-coherent framebuffer fetch...and on Gen9+, we always do
> a coherent fetch using the Render Target Read messages (even if the app
> says that a non-coherent fetch would be fine).  And, Gen9+ is the only
> platform where CCS_E exists.  So, the only platforms where this matters
> don't use this code :)
>
> Still, I think your suggestion would make the code clearer, and would
> future-proof it in case we ever decide that non-coherent fetches are
> useful on modern hardware (say, if they're cheaper someday).




More information about the mesa-dev mailing list