[Mesa-dev] [PATCH 1/9] intel/blorp: Only double the fast-clear rect alignment on HSW
Jason Ekstrand
jason at jlekstrand.net
Fri Jun 7 20:16:36 UTC 2019
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1052
I just kicked it off on Jenkins, but it was fine the last time I did this a
year ago.
On Fri, May 31, 2019 at 3:55 PM Nanley Chery <nanleychery at gmail.com> wrote:
> Thanks for reaching out to the HW team. Given that the internal
> documentation was updated to set the Project field of this restriction
> to HSW:GT3, what do you think about shortening the comment to mention
> that? I'd like to give this a RB as is, but there are a lot of truth
> claims I'd have to verify in order to do so.
>
> -Nanley
>
> On Mon, Dec 3, 2018 at 2:48 PM Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> >
> > I've received confirmation from the HW team that the extra doubling is
> > only needed on Haswell GT3.
> >
> > On Tue, May 15, 2018 at 5:28 PM Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> >>
> >> The data in the commit message is a bit sketchy for Ivybridge. We don't
> >> run dEQP or any of the CTSs on Ivybridge in CI so all the data we have
> >> is piglit. On Haswell, piglit didn't catch anything so we don't have
> >> anything to go off of for Ivybridge besides the fact that the
> restriction
> >> wasn't added until Haswell.
> >> ---
> >> src/intel/blorp/blorp_clear.c | 66
> ++++++++++++++++++++++++++++++++++++-------
> >> 1 file changed, 56 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/src/intel/blorp/blorp_clear.c
> b/src/intel/blorp/blorp_clear.c
> >> index 832e8ee..618625b 100644
> >> --- a/src/intel/blorp/blorp_clear.c
> >> +++ b/src/intel/blorp/blorp_clear.c
> >> @@ -235,16 +235,62 @@ get_fast_clear_rect(const struct isl_device *dev,
> >> x_scaledown = x_align / 2;
> >> y_scaledown = y_align / 2;
> >>
> >> - /* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel >
> Pixel
> >> - * Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table
> "Color
> >> - * Clear of Non-MultiSampled Render Target Restrictions":
> >> - *
> >> - * Clear rectangle must be aligned to two times the number of
> >> - * pixels in the table shown below due to 16x16 hashing across
> the
> >> - * slice.
> >> - */
> >> - x_align *= 2;
> >> - y_align *= 2;
> >> + if (ISL_DEV_IS_HASWELL(dev)) {
> >> + /* The following text was added in the Haswell PRM, "3D Media
> GPGPU
> >> + * Engine" >> "MCS Buffer for Render Target(s)" >> Table
> "Color Clear
> >> + * of Non-MultiSampled Render Target Restrictions":
> >> + *
> >> + * "Clear rectangle must be aligned to two times the
> number of
> >> + * pixels in the table shown below due to 16X16 hashing
> across the
> >> + * slice."
> >> + *
> >> + * It has persisted in the documentation for all platforms up
> until
> >> + * Cannonlake and possibly even beyond. However, we believe
> that it
> >> + * is only needed on Haswell.
> >> + *
> >> + * There are a couple possible explanations for this
> restriction:
> >> + *
> >> + * 1) If you assume that the hardware is writing to the CCS as
> >> + * bytes, then the x/y_align computed above gives you an
> alignment
> >> + * in the CCS of 8x8 bytes and, if 16x16 is needed for
> hashing, we
> >> + * need to multiply by 2.
> >> + *
> >> + * 2) Haswell is a bit unique in that its CCS tiling does
> not line
> >> + * up with Y-tiling on a cache-line granularity. Instead,
> it has
> >> + * an extra bit of swizzling in bit 9. Also, bit 6
> swizzling
> >> + * applies to the CCS on Haswell. This means that Haswell
> CCS
> >> + * does not match on a cache-line granularity but it does
> match on
> >> + * a 2x2 cache line granularity.
> >> + *
> >> + * Clearly, the first explanation seems to follow
> documentation the
> >> + * best but they may be related. In any case, empirical
> evidence
> >> + * seems to confirm that it is, indeed, required on Haswell.
> >> + *
> >> + * On Broadwell things get a bit stickier. Broadwell adds
> support
> >> + * for mip-mapped CCS with an alignment in the CCS of
> 256x128. For a
> >> + * 32bpb main surface, the above computation will yield a
> x/y_align
> >> + * of 128x128 for a Y-tiled main surface and 256x64 for
> X-tiled. In
> >> + * either case, if we double the alignment, we will get an
> alignment
> >> + * bigger than horizontal and vertical alignment of the CCS
> and fast
> >> + * clears of one LOD may leak into others.
> >> + *
> >> + * Starting with Skylake, the image alignment for the CCS is
> only
> >> + * 128x64 which is exactly the x/y_align computed above if
> the main
> >> + * surface has a 32bpb format. Also, the "Render Target
> Resolve"
> >> + * page in the bspec (not the PRM) says, "The Resolve
> Rectangle size
> >> + * is same as Clear Rectangle size from SKL+". The x/y_align
> >> + * computed above (without doubling) match the resolve
> rectangle
> >> + * calculation perfectly.
> >> + *
> >> + * Finally, to confirm all this, a full test run was
> performed on
> >> + * Feb. 9, 2018 with this doubling removed and the only
> platform
> >> + * which seemed to be affected was Haswell. The run
> consisted of
> >> + * piglit, dEQP, the Vulkan CTS 1.0.2, the OpenGL 4.5 CTS,
> and the
> >> + * OpenGL ES 3.2 CTS.
> >> + */
> >> + x_align *= 2;
> >> + y_align *= 2;
> >> + }
> >> } else {
> >> assert(aux_surf->usage == ISL_SURF_USAGE_MCS_BIT);
> >>
> >> --
> >> 2.5.0.400.gff86faf
> >>
>
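For anyone checking the Broadwell and Skylake arithmetic in the quoted comment, here is a minimal sketch of the comparison it describes. The helper name and its parameters are hypothetical and are not part of blorp or isl. The alignment numbers are the ones quoted in the comment for a 32bpb main surface: 128x128 (Y-tiled) and 256x64 (X-tiled) against a 256x128 CCS alignment on Broadwell, and 128x64 against a 128x64 CCS image alignment on Skylake.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of blorp or isl.
 *
 * Returns true if doubling the fast-clear rect alignment
 * (align_w x align_h) makes it larger than the CCS image alignment
 * (ccs_w x ccs_h) in either dimension, which is the case where a
 * fast clear of one LOD could leak into another.
 */
static bool
doubled_align_exceeds_ccs(unsigned align_w, unsigned align_h,
                          unsigned ccs_w, unsigned ccs_h)
{
   /* Alignments of zero would make the comparison meaningless. */
   assert(align_w > 0 && align_h > 0);
   assert(ccs_w > 0 && ccs_h > 0);

   return align_w * 2 > ccs_w || align_h * 2 > ccs_h;
}
```

With the numbers from the comment, the Broadwell Y-tiled case doubles to 256x256 and the X-tiled case to 512x128, and each exceeds 256x128 in one dimension. The Skylake case doubles to 256x128, which exceeds 128x64 in both.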