[Mesa-dev] [v3 4/6] i965/rbc: Consult rb settings for texture surface setup
Pohjolainen, Topi
topi.pohjolainen at gmail.com
Fri Sep 9 18:00:10 UTC 2016
On Fri, Sep 09, 2016 at 07:11:16AM -0700, Jason Ekstrand wrote:
> On Sep 8, 2016 11:13 PM, "Pohjolainen, Topi"
> <[1]topi.pohjolainen at gmail.com> wrote:
> >
> > On Thu, Sep 08, 2016 at 08:49:56AM -0700, Jason Ekstrand wrote:
> > > On Sep 7, 2016 9:30 PM, "Pohjolainen, Topi"
> > > <[1][2]topi.pohjolainen at gmail.com> wrote:
> > > >
> > > > On Wed, Sep 07, 2016 at 03:25:30PM -0700, Jason Ekstrand
> wrote:
> > > > > On Sep 7, 2016 10:24 AM, "Topi Pohjolainen"
> > > > > <[1][2][3]topi.pohjolainen at gmail.com> wrote:
> > > > > >
> > > > > > Once mcs buffer gets allocated without delay for
> lossless
> > > > > > compression (same as we do for msaa), one gets
> regression in:
> > > > > >
> > > > > > GL45-CTS.texture_barrier_ARB.same-texel-rw
> > > > > >
> > > > > > Setting the auxiliary surface for both sampling engine
> and
> > > data
> > > > > > port seems to fix this. I haven't found any hardware
> > > documentation
> > > > > > backing this though.
> > > > > >
> > > > > > v2 (Jason): Prepare also for the case where surface is
> sampled
> > > with
> > > > > > non-compressible format forcing also
> rendering
> > > without
> > > > > > compression.
> > > > > > v3: Split asserts and decision making.
> > > > > >
> > > > > > Signed-off-by: Topi Pohjolainen
> > > <[2][3][4]topi.pohjolainen at intel.com>
> > > > > > ---
> > > > > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 63
> > > > > +++++++++++++++++++++---
> > > > > > 1 file changed, 56 insertions(+), 7 deletions(-)
> > > > > >
> > > > > > diff --git
> a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > > > > index c1273c5..054c5c8 100644
> > > > > > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > > > > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > > > > @@ -140,9 +140,7 @@ brw_emit_surface_state(struct
> brw_context
> > > *brw,
> > > > > > struct isl_surf *aux_surf = NULL, aux_surf_s;
> > > > > > uint64_t aux_offset = 0;
> > > > > > enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
> > > > > > - if (mt->mcs_mt &&
> > > > > > - ((view.usage &
> ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
> > > > > > - mt->fast_clear_state !=
> > > INTEL_FAST_CLEAR_STATE_RESOLVED)) {
> > > > > > + if (mt->mcs_mt && !(flags &
> INTEL_AUX_BUFFER_DISABLED)) {
> > > > > > intel_miptree_get_aux_isl_surf(brw, mt,
> &aux_surf_s,
> > > > > &aux_usage);
> > > > > > aux_surf = &aux_surf_s;
> > > > > > assert(mt->mcs_mt->offset == 0);
> > > > > > @@ -425,6 +423,54 @@ swizzle_to_scs(GLenum swizzle,
> bool
> > > > > need_green_to_blue)
> > > > > > return (need_green_to_blue && scs == HSW_SCS_GREEN)
> ?
> > > > > HSW_SCS_BLUE : scs;
> > > > > > }
> > > > > >
> > > > > > +static unsigned
> > > > > > +brw_find_matching_rb(const struct gl_framebuffer *fb,
> > > > > > + const struct intel_mipmap_tree
> *mt)
> > > > > > +{
> > > > > > + for (unsigned i = 0; i < fb->_NumColorDrawBuffers;
> i++) {
> > > > > > + const struct intel_renderbuffer *irb =
> > > > > > + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
> > > > > > +
> > > > > > + if (irb->mt == mt)
> > > > > > + return i;
> > > > > > + }
> > > > > > +
> > > > > > + return fb->_NumColorDrawBuffers;
> > > > > > +}
> > > > > > +
> > > > > > +static bool
> > > > > > +brw_disable_aux_surface(const struct brw_context *brw,
> > > > > > + const struct intel_mipmap_tree
> *mt)
> > > > > > +{
> > > > > > + /* Nothing to disable. */
> > > > > > + if (!mt->mcs_mt)
> > > > > > + return false;
> > > > > > +
> > > > > > + /* There are special cases only for lossless
> compression.
> > > */
> > > > > > + if (!intel_miptree_is_lossless_compressed(brw, mt))
> > > > > > + return mt->fast_clear_state ==
> > > > > INTEL_FAST_CLEAR_STATE_RESOLVED;
> > > > > > +
> > > > > > + const struct gl_framebuffer *fb =
> brw->ctx.DrawBuffer;
> > > > > > + const unsigned rb_index = brw_find_matching_rb(fb,
> mt);
> > > > > > +
> > > > > > + /* In practise it looks that setting the same
> lossless
> > > compressed
> > > > > surface
> > > > > > + * to be sampled without auxiliary surface and to
> be
> > > written with
> > > > > auxiliary
> > > > > > + * surface confuses the hardware. Therefore sampler
> engine
> > > must
> > > > > be provided
> > > > > > + * with auxiliary buffer regardless of the fast
> clear
> > > state if
> > > > > the same
> > > > > > + * surface is also going to be written during the
> same
> > > rendering
> > > > > pass with
> > > > > > + * auxiliary buffer enabled.
> > > > > > + */
> > > > > > + if (rb_index < fb->_NumColorDrawBuffers) {
> > > > > > + if (brw->draw_aux_buffer_disabled[rb_index]) {
> > > > > > + assert(mt->fast_clear_state ==
> > > > > INTEL_FAST_CLEAR_STATE_RESOLVED);
> > > > > > + }
> > > > > > +
> > > > > > + return brw->draw_aux_buffer_disabled[rb_index];
> > > > >
> > > > > Can we use "mt->fast_clear_state ==
> > > INTEL_FAST_CLEAR_STATE_RESOLVED"
> > > > > here as well? For render targets we want to look at
> > > > > aux_buffer_disabled but for textures (which is the only
> case
> > > that uses
> > > > > this function), fast_clear_state should be sufficient.
> > > >
> > > > Like the comment says, we can't skip setting aux buffer for
> sampling
> > > in case
> > > > the same surface is written as render target with aux enabled.
> > > Hardware
> > > > doesn't seem to like this. This is specifically for that
> purpose
> > > checking
> > > > the flag.
> > >
> > > Let me see if I understand this correctly: this handles the case
> where
> > > the miptree is bound as both a render target and a texture and
> there is
> > > nothing about the texture view scenario causing it to skip
> compression
> > > and, for some reason, the surface starts of the draw call fully
> > > resolved. If *any* drawing has been done without a resolve, we
> would
> > > be in the UNRESOLVED state and this wouldn't be needed. That's
> > > terribly specific but also a valid use case.
> > >
> > > We could potentially handle this by setting the render target
> surfaces
> > > to UNRESOLVED a bit earlier in the pipeline setup code. I
> believe we
> > > flag things as unresolved rather late; in particular, after
> we've done
> > > our texture state setup. The other option is to do exactly what
> you did
> > > and add a comment. That's probably the best option. With a
> good
> > > comment added,
> >
> > So, above there is already:
> >
> > * In practise it looks that setting the same lossless compressed
> surface
> > * to be sampled without auxiliary surface and to be written with
> auxiliary
> > * surface confuses the hardware. Therefore sampler engine must be
> provided
> > * with auxiliary buffer regardless of the fast clear state if the
> same
> > * surface is also going to be written during the same rendering
> pass with
> > * auxiliary buffer enabled.
> >
> > What would you add?
>
> Hmmm... I think my reservations about that convent are because it
> sounds like a workaround. How about this:
>
> If we are drawing into this with compression enabled, then we must also
> enable compression when texturing from it regardless of
> fast_clear_state. If we don't then, after the first draw call with
> this setup, there will be data in the CCS which won't get picked up by
> subsequent texturing operations as required by ARB_texture_barrier.
> Since we don't want to re-emit the binding table or do a resolve
> operation every draw call, the easiest thing to do is just enable
> compression on the texturing side. This is completely safe to do
> since, if compressed texturing weren't allowed, we would have disabled
> compression of render targets in whatever_that_function_is_called().
This sounds really good, I'll make the change.
>
> > >
> > > Reviewed-by: Jason Ekstrand <[4][5]jason at jlekstrand.net>
Thanks!
More information about the mesa-dev
mailing list