[Mesa-dev] [PATCH 03/10] i965/skl: Enable fast color clears on SKL
Chad Versace
chad.versace at intel.com
Mon Nov 9 11:43:06 PST 2015
On Tue 03 Nov 2015, Ben Widawsky wrote:
> On Fri, Oct 16, 2015 at 04:10:02PM -0700, Chad Versace wrote:
> > But this patch doesn't enable fast clears! The reverts in pathches 6 and
> > 7 need to be folded into this patch, otherwise the patch does not do
> > what it claims.
> >
> > Also, you can't enable fast clears before patches 4 and 5 without introducing
> > regressions. Patches 4 and 5 must precede this patch.
> >
>
> I did mention this in a later patch. I should have at least added that comment
> here as well and probably changed the short summary. To repeat what I had there
> though, I really liked having the differing SKL functionality be split out in
> the separate patches and couldn't do it in my earlier patches because I didn't
> have the tiny one liners that you added for strictly disabling the fast clears.
> I still prefer having these split out, and I consider that a pretty strong
> distinction given that I usually opt for fewer patches.
>
> I've modified this locally to summarize the above. I haven't read your review
> comments for the patches after this yet. It's likely that you noticed this, and
> we can continue the discussion there.
The commit subject says "Enable fast color clears on Skylake", but this
patch doesn't actually enable them. It's not until the revert in patch
7 that they're actually enabled.
The commit subject needs to reflect what the patch actually does.
Claiming it enables fast color clears creates a misleading git history.
>
> > On Tue 13 Oct 2015, Ben Widawsky wrote:
> > > Based on a patch originally from Kristian. Skylake has extended capabilities
> > > with regard to fast clears, but that is saved for another patch.
> > >
> > > The same effect could be acheived with the following, however I think the way
> > > I've done it is more in line with how the docs explain it.
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -150,9 +150,13 @@ intel_get_non_msrt_mcs_alignment(struct brw_context *brw,
> > > /* In release builds, fall through */
> > > case I915_TILING_Y:
> > > *width_px = 32 / mt->cpp;
> > > - *height = 4;
> > > + if (brw->gen >= 9)
> > > + *height = 2;
> > > + else
> > > + *height = 4;
> > >
> > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > > ---
> > > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 54 +++++++++++++++++--------
> > > src/mesa/drivers/dri/i965/gen8_surface_state.c | 34 ++++++++++++----
> > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++++
> > > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 7 +++-
> > > 4 files changed, 78 insertions(+), 26 deletions(-)
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index b6e3520..32f0ff7 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -192,6 +192,12 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling)
> > > *
> > > * - MCS buffer for non-MSRT is supported only for RT formats 32bpp,
> > > * 64bpp, and 128bpp.
> > > + *
> > > + * From the Skylake documentation, it is made clear that X-tiling is no longer
> > > + * supported:
> > > + *
> > > + * - MCS and Lossless compression is supported for TiledY/TileYs/TileYf
> > > + * non-MSRTs only.
> > > */
> > > static bool
> > > intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw,
> > > @@ -1485,6 +1491,9 @@ intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
> > > intel_get_non_msrt_mcs_alignment(mt, &block_width_px, &block_height);
> > > unsigned width_divisor = block_width_px * 4;
> > > unsigned height_divisor = block_height * 8;
> >
> > > + if (brw->gen >= 9)
> > > + height_divisor /= 2;
> > > +
> >
> > This hunk really needs a comment. Please explain why Skylake's MCS is
> > twice as tall as previous generations'.
> >
>
> To my knowledge we've yet to actually articulate in a comment what the MCS
> buffer's layout is (perhaps I just don't see it). Therefore I am uncertain what
> comment I could provide here which isn't obvious from the code. I had a
> different path to finding this than you did which I believe gives me some bias
> (I just needed a reminder that divisor needs to be halved instead of doubled to
> obtain twice the height).
>
> If you suggest something, I will happily add it.
/* The Skylake MCS is twice as tall as the Broadwell MCS.
*
* In pre-Skylake, each bit in the MCS contained the state of 2 cachelines
* in the main surface. In Skylake, it's two bits. The extra bit
* doubles the MCS height, not width, because in Skylake the MCS is always
* Y-tiled.
*/
>
> > > unsigned mcs_width =
> > > ALIGN(mt->logical_width0, width_divisor) / width_divisor;
> > > unsigned mcs_height =
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > index 805cd71..d9cdba0 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > @@ -634,14 +634,17 @@ struct intel_mipmap_tree
> > > * color mipmap tree, if any.
> > > *
> > > * This value will only ever contain ones in bits 28-31, so it is safe to
> > > - * OR into dword 7 of SURFACE_STATE.
> > > + * OR into dword 7 of SURFACE_STATE. (dword 12-15 on SKL)
> >
> > The comment is wrong. "This value will only ever contain ones in bits
> > 28-31" applies only to fast_clear_color_value but not to
> > gen9_fast_clear_color. And it doesn't make sense to bit-or
> > gen9_fast_color_clear to dwords 12-15, as gen9_fast_color_clear is
> > exactly 4 dwords big, and all the bits are significant.
> >
>
>
> How does this sound?
That sounds good.
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index d9cdba0..64f73ea 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -633,8 +633,12 @@ struct intel_mipmap_tree
> * The SURFACE_STATE bits associated with the last fast color clear to this
> * color mipmap tree, if any.
> *
> - * This value will only ever contain ones in bits 28-31, so it is safe to
> - * OR into dword 7 of SURFACE_STATE. (dword 12-15 on SKL)
> + * Prior to GEN9 there is a single bit for RGBA clear values which gives you
> + * the option of 2^4 clear colors. Each bit determines if the color channel
> + * is fully saturated or unsaturated (Cherryview does add a 32b value per
> + * channel, but it is globally applied instead of being part of the render
> + * surface state). Starting with GEN9, the surface state accepts a 32b value
> + * for each color channel.
> *
> * @see RENDER_SURFACE_STATE.RedClearColor
> * @see RENDER_SURFACE_STATE.GreenClearColor
More information about the mesa-dev
mailing list