<div dir="ltr"><div><a href="https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1052">https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1052</a></div><div> </div><div>I just kicked it off to Jenkins but it was fine last time I did this a year ago. </div></div> <div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, May 31, 2019 at 3:55 PM Nanley Chery <<a href="mailto:nanleychery@gmail.com">nanleychery@gmail.com</a>> wrote: </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thanks for reaching out to the HW team. Given that the internal documentation was updated to set the Project field of this restriction to HSW:GT3, what do you think about shortening the comment to mention that? I'd like to give this a RB as is, but there are a lot of truth claims I'd have to verify in order to do so.. -Nanley On Mon, Dec 3, 2018 at 2:48 PM Jason Ekstrand <<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>> wrote: > > I've received confirmation from the HW team that the extra doubling is only needed on Haswell GT3. > > On Tue, May 15, 2018 at 5:28 PM Jason Ekstrand <<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>> wrote: >> >> The data in the commit message is a bit sketchy for Ivybridge. We don't >> run dEQP or any of the CTSs on Ivybridge in CI so all the data we have >> is piglit. On Haswell, piglit didn't catch anything so we don't have >> anything to go off of for Ivybridge besides the fact that the restriction >> wasn't added until Haswell. 
>> --- >> src/intel/blorp/blorp_clear.c | 66 ++++++++++++++++++++++++++++++++++++------- >> 1 file changed, 56 insertions(+), 10 deletions(-) >> >> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c >> index 832e8ee..618625b 100644 >> --- a/src/intel/blorp/blorp_clear.c >> +++ b/src/intel/blorp/blorp_clear.c >> @@ -235,16 +235,62 @@ get_fast_clear_rect(const struct isl_device *dev, >> x_scaledown = x_align / 2; >> y_scaledown = y_align / 2; >> >> - /* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel >> - * Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table "Color >> - * Clear of Non-MultiSampled Render Target Restrictions": >> - * >> - * Clear rectangle must be aligned to two times the number of >> - * pixels in the table shown below due to 16x16 hashing across the >> - * slice. >> - */ >> - x_align *= 2; >> - y_align *= 2; >> + if (ISL_DEV_IS_HASWELL(dev)) { >> + /* The following text was added in the Haswell PRM, "3D Media GPGPU >> + * Engine" >> "MCS Buffer for Render Target(s)" >> Table "Color Clear >> + * of Non-MultiSampled Render Target Restrictions": >> + * >> + * "Clear rectangle must be aligned to two times the number of >> + * pixels in the table shown below due to 16X16 hashing across the >> + * slice." >> + * >> + * It has persisted in the documentation for all platforms up until >> + * Cannonlake and possibly even beyond. However, we believe that it >> + * is only needed on Haswell. >> + * >> + * There are a couple possible explanations for this restriction: >> + * >> + * 1) If you assume that the hardware is writing to the CCS as >> + * bytes, then the x/y_align computed above gives you an alignment >> + * in the CCS of 8x8 bytes and, if 16x16 is needed for hashing, we >> + * need to multiply by 2. >> + * >> + * 2) Haswell is a bit unique in that its CCS tiling does not line >> + * up with Y-tiling on a cache-line granularity. Instead, it has >> + * an extra bit of swizzling in bit 9. 
Also, bit 6 swizzling >> + * applies to the CCS on Haswell. This means that Haswell CCS >> + * does not match on a cache-line granularity but it does match on >> + * a 2x2 cache line granularity. >> + * >> + * Clearly, the first explanation seems to follow documentation the >> + * best but they may be related. In any case, empirical evidence >> + * seems to confirm that it is, indeed, required on Haswell. >> + * >> + * On Broadwell things get a bit stickier. Broadwell adds support >> + * for mip-mapped CCS with an alignment in the CCS of 256x128. For a >> + * 32bpb main surface, the above computation will yield an x/y_align >> + * of 128x128 for a Y-tiled main surface and 256x64 for X-tiled. In >> + * either case, if we double the alignment, we will get an alignment >> + * bigger than horizontal and vertical alignment of the CCS and fast >> + * clears of one LOD may leak into others. >> + * >> + * Starting with Skylake, the image alignment for the CCS is only >> + * 128x64 which is exactly the x/y_align computed above if the main >> + * surface has a 32bpb format. Also, the "Render Target Resolve" >> + * page in the bspec (not the PRM) says, "The Resolve Rectangle size >> + * is same as Clear Rectangle size from SKL+". The x/y_align >> + * computed above (without doubling) match the resolve rectangle >> + * calculation perfectly. >> + * >> + * Finally, to confirm all this, a full test run was performed on >> + * Feb. 9, 2018 with this doubling removed and the only platform >> + * which seemed to be affected was Haswell. The run consisted of >> + * piglit, dEQP, the Vulkan CTS 1.0.2, the OpenGL 4.5 CTS, and the >> + * OpenGL ES 3.2 CTS. >> + */ >> + x_align *= 2; >> + y_align *= 2; >> + } >> } else { >> assert(aux_surf->usage == ISL_SURF_USAGE_MCS_BIT); >> >> -- >> 2.5.0.400.gff86faf >> </blockquote></div>