[Mesa-dev] [PATCH 1/2] i965/skl: Enable fast clears for SKL
Ben Widawsky
ben at bwidawsk.net
Tue Mar 10 22:37:20 PDT 2015
On Tue, Mar 10, 2015 at 09:19:30PM -0700, Kenneth Graunke wrote:
> On Thursday, February 26, 2015 03:42:52 PM Ben Widawsky wrote:
> > From: Kristian Høgsberg <krh at bitplanet.net>
> >
> > v2 by Ben:
> > Rebase with conflict resolution
> > Add the SKL scaling factors
>
> I'd probably keep those as a separate patch, but commit that before this
> one.
>
> > Squashed in i965/skl: Dont zero surf[12]. The patch can't work properly without
> > that, so there's no point in a separate patch.
> >
> > Signed-off-by: Kristian Høgsberg <krh at bitplanet.net>
> > Reviewed-by: Ben Widawsky <ben at bwidawsk.net>
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > ---
> > src/mesa/drivers/dri/i965/brw_clear.c | 2 +-
> > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 16 +++++++++++----
> > src/mesa/drivers/dri/i965/gen8_surface_state.c | 26 ++++++++++++++++++++-----
> > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 1 +
> > 4 files changed, 35 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
> > index 1231420..a67b8a0 100644
> > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > @@ -242,7 +242,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
> > }
> >
> > /* Clear color buffers with fast clear or at least rep16 writes. */
> > - if (brw->gen >= 6 && brw->gen < 9 && (mask & BUFFER_BITS_COLOR)) {
> > + if (brw->gen >= 6 && (mask & BUFFER_BITS_COLOR)) {
> > if (brw_meta_fast_clear(brw, fb, mask, partial_clear)) {
> > debug_mask("blorp color", mask & BUFFER_BITS_COLOR);
> > mask &= ~BUFFER_BITS_COLOR;
> > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> > index c8f2a14..c4d4b68 100644
> > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> > @@ -336,12 +336,17 @@ get_buffer_rect(struct brw_context *brw, struct gl_framebuffer *fb,
> > */
> > static bool
> > is_color_fast_clear_compatible(struct brw_context *brw,
> > - mesa_format format,
> > + struct intel_mipmap_tree *mt,
> > const union gl_color_union *color)
> > {
> > + mesa_format format = _mesa_get_render_format(&brw->ctx, mt->format);
> > +
> > if (_mesa_is_format_integer_color(format))
> > return false;
> >
>
> Perhaps mention:
>
> /* WaZeroOneClearValuesMSAA: "If Number of Multisamples is not
> * NUMSAMPLES_1, only 0/1 values allowed."
> */
>
> I sort of expected num_samples == 0, but checking the MSAA layout seems
> fine too (and avoids the whole num_samples == 0? 1? WAT? problem).
>
I am guessing I own this patch now, so I will do it.
> > + if (brw->gen >= 9 && mt->msaa_layout == INTEL_MSAA_LAYOUT_NONE)
> > + return true;
> > +
> > for (int i = 0; i < 4; i++) {
> > if (color->f[i] != 0.0 && color->f[i] != 1.0 &&
> > _mesa_format_has_color_component(format, i)) {
> > @@ -409,7 +414,6 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb,
> > GLbitfield buffers, bool partial_clear)
> > {
> > struct gl_context *ctx = &brw->ctx;
> > - mesa_format format;
> > enum { FAST_CLEAR, REP_CLEAR, PLAIN_CLEAR } clear_type;
> > GLbitfield plain_clear_buffers, meta_save, rep_clear_buffers, fast_clear_buffers;
> > struct rect fast_clear_rect, clear_rect;
> > @@ -455,8 +459,7 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb,
> > /* Fast clear is only supported for colors where all components are
> > * either 0 or 1.
> > */
> > - format = _mesa_get_render_format(ctx, irb->mt->format);
> > - if (!is_color_fast_clear_compatible(brw, format, &ctx->Color.ClearColor))
> > + if (!is_color_fast_clear_compatible(brw, irb->mt, &ctx->Color.ClearColor))
> > clear_type = REP_CLEAR;
> >
> > /* From the SNB PRM (Vol4_Part1):
> > @@ -492,6 +495,7 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb,
> >
> > switch (clear_type) {
> > case FAST_CLEAR:
> > + irb->mt->fast_clear_color = ctx->Color.ClearColor;
> > irb->mt->fast_clear_color_value =
> > compute_fast_clear_color_bits(&ctx->Color.ClearColor);
> > irb->need_downsample = true;
> > @@ -689,6 +693,10 @@ brw_meta_resolve_color(struct brw_context *brw,
> >
> > brw_bind_rep_write_shader(brw, (float *) fast_clear_color);
> >
> > + /* SKL+ also has a resolve mode for compressed render targets and thus more
> > + * bits to let us select the type of resolve. For fast clear resolves, it
> > + * turns out we can use the same value as pre-SKL though.
> > + */
> > set_fast_clear_op(brw, GEN7_PS_RENDER_TARGET_RESOLVE_ENABLE);
> >
> > mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
> > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > index d6b870e..48b2da9 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > @@ -146,6 +146,22 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
> > }
> >
> > static void
> > +set_fast_clear_color(struct brw_context *brw, struct intel_mipmap_tree *mt, uint32_t *surf)
> > +{
> > + if (brw->gen >= 9) {
> > + /* SKL+ can set a wider range of fast clear colors in certain cases, so
> > + * we have to program the colors as four dwords.
> > + */
> > + surf[12] = mt->fast_clear_color.ui[0];
> > + surf[13] = mt->fast_clear_color.ui[1];
> > + surf[14] = mt->fast_clear_color.ui[2];
> > + surf[15] = mt->fast_clear_color.ui[3];
> > + } else {
> > + surf[7] = mt->fast_clear_color_value;
> > + }
> > +}
> > +
> > +static void
> > gen8_update_texture_surface(struct gl_context *ctx,
> > unsigned unit,
> > uint32_t *surf_offset,
> > @@ -243,7 +259,7 @@ gen8_update_texture_surface(struct gl_context *ctx,
> > (firstImage->_BaseFormat == GL_DEPTH_COMPONENT ||
> > firstImage->_BaseFormat == GL_DEPTH_STENCIL);
> >
> > - surf[7] = mt->fast_clear_color_value;
> > + set_fast_clear_color(brw, mt, surf);
> >
> > const int swizzle =
> > unlikely(alpha_depth) ? SWIZZLE_XYZW : brw_get_texture_swizzle(ctx, tObj);
> > @@ -264,7 +280,6 @@ gen8_update_texture_surface(struct gl_context *ctx,
> > surf[10] = 0;
> > surf[11] = 0;
> > }
> > - surf[12] = 0;
> >
> > /* Emit relocation to surface contents */
> > drm_intel_bo_emit_reloc(brw->batch.bo,
> > @@ -421,13 +436,15 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
> > surf[6] = 0;
> > }
> >
> > - surf[7] = mt->fast_clear_color_value |
> > - SET_FIELD(HSW_SCS_RED, GEN7_SURFACE_SCS_R) |
> > + surf[7] = SET_FIELD(HSW_SCS_RED, GEN7_SURFACE_SCS_R) |
> > SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
> > SET_FIELD(HSW_SCS_BLUE, GEN7_SURFACE_SCS_B) |
> > SET_FIELD(HSW_SCS_ALPHA, GEN7_SURFACE_SCS_A);
> >
> > assert(mt->offset % mt->cpp == 0);
> > +
> > + set_fast_clear_color(brw, mt, surf);
>
> set_fast_clear_color does "surf[7] = ..." which will clobber the SCS
> values set here. You might make both use |= instead. surf[7] should
> already be memset to 0 by allocate_surface_state().
>
> Have you tested this on both Skylake and Broadwell?
Well, this patch by itself, no. With both patches, as I said it in the other
one, it was very minimally tested on SKL, and I didn't test Broadwell. I was
actually hoping to get the Chrome OS people to pick this up and run with it,
but, well, it never happened. Thanks for catching that. I hope I would have seen
it in piglit when I got around to sending it out to Jenkins :-)
ANYWAY... If you wanted to preserve, it should have been
surf[7] |= mt->fast_clear_color_value;
I'm fine with |= on both. It is a bit safer in case someone reorders things, and
it allows reuse of some of the bits which have new SKL meanings.
To be honest, when I looked at the patch I saw this, but I assumed the SCS
wasn't useful when we have the 32bit clear values, since I didn't really know
what SCS does. In fairness, for me, it wasn't negligence, it was incompetence.
>
> While reviewing, I also came across the text "If programmed to non 0/1 values,
> SW must ensure a render target partial resolve pass before binding a cleared
> RT to texture." which I believe is called WaZeroOneClearValuesAtSampler.
>
> I think we're OK on that, since it looks like intel_update_state calls
> intel_miptree_resolve_color for all textures. Assuming the "render
> target partial resolve" is what that's doing...
Yeah, I looked for this and it seemed okay to me too. I can add a comment
somewhere if you'd like.
>
> With the surf[7] thing fixed, this is
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>
Thank you.
> > +
> > *((uint64_t *) &surf[8]) = mt->bo->offset64 + mt->offset; /* reloc */
> >
> > if (aux_mt) {
> > @@ -440,7 +457,6 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
> > surf[10] = 0;
> > surf[11] = 0;
> > }
> > - surf[12] = 0;
> >
> > drm_intel_bo_emit_reloc(brw->batch.bo,
> > brw->wm.base.surf_offset[surf_index] + 8 * 4,
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > index ee9cf1e..5e9d006 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > @@ -468,6 +468,7 @@ struct intel_mipmap_tree
> > * OR into dword 7 of SURFACE_STATE.
> > */
> > uint32_t fast_clear_color_value;
> > + union gl_color_union fast_clear_color;
> >
> > /* These are also refcounted:
> > */
> >
More information about the mesa-dev
mailing list