[Mesa-dev] [PATCH] intel/blorp: Only double the fast-clear rect alignment on HSW

Sat Feb 10 17:51:01 UTC 2018

One side comment:  The only tests that caught anything were dEQP or CTS
tests.  Somehow, Piglit didn't catch any of this.  This means that we don't
have any real data on IVB.  I'd be happy to make the check
ISL_DEV_GEN(dev) == 7 instead, since the extra alignment won't hurt anything
on IVB (no mipmapping).
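
To make the two options concrete, here is a minimal sketch of the two gating conditions side by side.  It deliberately does not use the real ISL macros: struct dev_sketch and both function names are made up for illustration, and the only facts assumed are that Ivy Bridge and Haswell are both gen 7.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct isl_device that matter
 * here; the real code would use ISL_DEV_GEN() and ISL_DEV_IS_HASWELL().
 */
struct dev_sketch {
   int gen;          /* 7 for both Ivy Bridge and Haswell */
   bool is_haswell;
};

/* Variant in the patch: double the fast-clear alignment only on Haswell. */
static unsigned
align_scale_hsw_only(const struct dev_sketch *dev)
{
   return dev->is_haswell ? 2 : 1;
}

/* Alternative floated above: double on all of gen 7, so Ivy Bridge keeps
 * the extra alignment it had before the patch.
 */
static unsigned
align_scale_gen7(const struct dev_sketch *dev)
{
   return dev->gen == 7 ? 2 : 1;
}
```

The two variants only differ on gen 7 parts that are not Haswell, which is exactly the case we have no test data for.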

On Sat, Feb 10, 2018 at 9:48 AM, Jason Ekstrand <jason at jlekstrand.net>
wrote:

> ---
>  src/intel/blorp/blorp_clear.c | 66 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 56 insertions(+), 10 deletions(-)
>
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index a2dbcd1..63b74e3 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -235,16 +235,62 @@ get_fast_clear_rect(const struct isl_device *dev,
>        x_scaledown = x_align / 2;
>        y_scaledown = y_align / 2;
>
> -      /* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel
> -       * Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table "Color
> -       * Clear of Non-MultiSampled Render Target Restrictions":
> -       *
> -       *   Clear rectangle must be aligned to two times the number of
> -       *   pixels in the table shown below due to 16x16 hashing across the
> -       *   slice.
> -       */
> -      x_align *= 2;
> -      y_align *= 2;
> +      if (ISL_DEV_IS_HASWELL(dev)) {
> +         /* The following text was added in the Haswell PRM, "3D Media GPGPU
> +          * Engine" >> "MCS Buffer for Render Target(s)" >> Table "Color Clear
> +          * of Non-MultiSampled Render Target Restrictions":
> +          *
> +          *    "Clear rectangle must be aligned to two times the number of
> +          *    pixels in the table shown below due to 16X16 hashing across the
> +          *    slice."
> +          *
> +          * It has persisted in the documentation for all platforms up until
> +          * Cannonlake and possibly even beyond.  However, we believe that it
> +          * is only needed on Haswell.
> +          *
> +          * There are a couple possible explanations for this restriction:
> +          *
> +          * 1) If you assume that the hardware is writing to the CCS as
> +          *    bytes, then the x/y_align computed above gives you an alignment
> +          *    in the CCS of 8x8 bytes and, if 16x16 is needed for hashing, we
> +          *    need to multiply by 2.
> +          *
> +          * 2) Haswell is a bit unique in that its CCS tiling does not line
> +          *    up with Y-tiling on a cache-line granularity.  Instead, it has
> +          *    an extra bit of swizzling in bit 9.  Also, bit 6 swizzling
> +          *    applies to the CCS on Haswell.  This means that Haswell CCS
> +          *    does not match on a cache-line granularity but it does match on
> +          *    a 2x2 cache line granularity.
> +          *
> +          * Clearly, the first explanation seems to follow documentation the
> +          * best but they may be related.  In any case, empirical evidence
> +          * seems to confirm that it is, indeed, required on Haswell.
> +          *
> +          * On Broadwell things get a bit stickier.  Broadwell adds support
> +          * for mip-mapped CCS with an alignment in the CCS of 256x128.  For a
> +          * 32bpb main surface, the above computation will yield a x/y_align
> +          * of 128x128 for a Y-tiled main surface and 256x64 for X-tiled.  In
> +          * either case, if we double the alignment, we will get an alignment
> +          * bigger than horizontal and vertical alignment of the CCS and fast
> +          * clears of one LOD may leak into others.
> +          *
> +          * Starting with Skylake, the image alignment for the CCS is only
> +          * 128x64 which is exactly the x/y_align computed above if the main
> +          * surface has a 32bpb format.  Also, the "Render Target Resolve"
> +          * page in the bspec (not the PRM) says, "The Resolve Rectangle size
> +          * is same as Clear Rectangle size from SKL+".  The x/y_align
> +          * computed above (without doubling) match the resolve rectangle
> +          * calculation perfectly.
> +          *
> +          * Finally, to confirm all this, a full test run was performed on
> +          * Feb. 9, 2018 with this doubling removed and the only platform
> +          * which seemed to be affected was Haswell.  The run consisted of
> +          * piglit, dEQP, the Vulkan CTS 1.0.2, the OpenGL 4.5 CTS, and the
> +          * OpenGL ES 3.2 CTS.
> +          */
> +         x_align *= 2;
> +         y_align *= 2;
> +      }
>     } else {
>        assert(aux_surf->usage == ISL_SURF_USAGE_MCS_BIT);
>
> --
> 2.5.0.400.gff86faf
>
>
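
The Broadwell and Skylake arithmetic in the comment can be checked with a small sketch.  This is only an illustration: align_fits() and doubled_align_fits() are made-up helpers, not ISL or blorp functions, and it takes the figures quoted in the comment at face value, assuming the clear-rect alignment and the CCS image alignment are expressed in the same units.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of ISL or blorp: does a clear-rect
 * alignment of align_w x align_h stay within one CCS image alignment of
 * ccs_w x ccs_h?  If it does not, aligning the clear rect outward could
 * reach past one miplevel and into a neighbouring LOD.
 */
static bool
align_fits(unsigned align_w, unsigned align_h, unsigned ccs_w, unsigned ccs_h)
{
   assert(align_w > 0 && align_h > 0);
   return align_w <= ccs_w && align_h <= ccs_h;
}

/* Same check with the "two times" restriction applied first. */
static bool
doubled_align_fits(unsigned align_w, unsigned align_h,
                   unsigned ccs_w, unsigned ccs_h)
{
   return align_fits(align_w * 2, align_h * 2, ccs_w, ccs_h);
}
```

With the numbers from the comment (32bpb main surface), the undoubled alignment fits in every case, while doubling overflows vertically for Broadwell Y-tiled (256x256 against 256x128), horizontally for Broadwell X-tiled (512x128 against 256x128), and in both directions for Skylake (256x128 against 128x64).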