[Intel-gfx] [PATCH] tests/kms_ccs: Fix the color/ccs surface generation

Mon Aug 7 15:56:46 UTC 2017

On Fri, Aug 04, 2017 at 07:56:11AM -0700, Jason Ekstrand wrote:
> On August 4, 2017 2:59:56 AM Daniel Stone <daniel at fooishbar.org> wrote:
> 
> > Hi Jason,
> > 
> > On 4 August 2017 at 01:52, Jason Ekstrand <jason at jlekstrand.net> wrote:
> > > Previously, the test used the old 64x64 convention that Ville introduced
> > > for CCS tiles and not the current 128x32 Y-tile convention.  Also, the
> > > original scheme for generating the CCS data was over-complicated and
> > > didn't work correctly because it assumed you could cut the main surface
> > > at an arbitrary Y coordinate.  While you clearly *can* do this (the
> > > hardware does), it's not a good idea for a generator in a test.  The new
> > > scheme, introduced here, is entirely based on the relationship between
> > > cache-lines in the main surface and the CCS that's documented in the
> > > PRM.  By keeping everything CCS cache-line aligned, our chances of
> > > generating correct data for an arbitrary-size surface are much higher.
> > 
> > Thanks for fixing this!
> > 
> > > +                * We need to cut the surface on a CCS cache-line boundary,
> > > +                * otherwise, we're going to be in trouble when we try to
> > > +                * generate CCS data for the surface.  A cache line in the
> > > +                * CCS is 16x16 cache-line-pairs in the main surface.  16
> > > +                * cache lines is 64 rows high.
> > > +                */
> > > +               half_height = ALIGN(height, 128) / 2;
> > > +               half_size = half_height * stride;
> > > +               for (i = 0; i < fb->size / 4; i++) {
> > > +                       if (i < half_size / 4)
> > > +                               ptr[i] = RED;
> > > +                       else
> > > +                               ptr[i] = COMPRESSED_RED;
> > 
> > I toyed with:
> > else if (!(i & 0xe))
> >         ptr[i] = COMPRESSED_RED;
> > else
> >         ptr[i] = BLACK;
> > 
> > ... to make the failure mode even more obvious. But that still writes
> > in (far) more compressed-red pixels than we strictly need for CCS,
> > right? Anyway, follow-up patch.
> 
> Yeah, we can probably do quite a bit of patterning so long as we keep the
> compressed/uncompressed split simple and make sure our pattern works in
> whole cache lines.
> 
> > > +static unsigned int
> > > +y_tile_y_pos(unsigned int offset, unsigned int stride)
> > >  {
> > > -       return ptr +
> > > -               ((y & ~0x3f) * stride) +
> > > -               ((x & ~0x7) * 64) +
> > > -               ((y & 0x3f) * 8) +
> > > -               (x & 7);
> > > +       unsigned int y_tiles, y;
> > > +       y_tiles = (offset / 4096) / (stride / 128);
> > > +       y = y_tiles * 32 + ((offset & 0x1f0) >> 4);
> > > +       return y;
> > >  }
> > > 
> > > @@ -143,12 +168,26 @@ static void display_fb(data_t *data, int compressed)
> > >         size[0] = f.pitches[0] * ALIGN(height, 32);
> > > 
> > >         if (compressed) {
> > > -               width = ALIGN(f.width, 16) / 16;
> > > -               height = ALIGN(f.height, 8) / 8;
> > > -               f.pitches[1] = ALIGN(width * 1, 64);
> > > +               /* From the Sky Lake PRM, Vol 12, "Color Control Surface":
> > > +                *
> > > +                *    "The compression state of the cache-line pair is
> > > +                *    specified by 2 bits in the CCS.  Each CCS cache-line
> > > +                *    represents an area on the main surface of 16x16 sets
> > > +                *    of 128 byte Y-tiled cache-line-pairs. CCS is always Y
> > > +                *    tiled."
> > > +                *
> > > +                * A "cache-line-pair" for a Y-tiled surface is two
> > > +                * horizontally adjacent cache lines.  When operating in
> > > +                * bytes and rows, this gives us a scale-down factor of
> > > +                * 32x16.  Since the main surface has a 32-bit format, we
> > > +                * need to multiply width by 4 to get bytes.
> > > +                */
> > 
> > Yeah, this code and comment match better (& helped clarify) my
> > understanding of how it works. But given my understanding was mostly
> > gleaned from other, borderline incomprehensible, comments, as well as
> > manually poking bits and observing the result, maybe that's not a
> > ringing endorsement.
> > 
> > > +               width = ALIGN(f.width * 4, 32) / 32;
> > > +               height = ALIGN(f.height, 16) / 16;
> > > +               f.pitches[1] = ALIGN(width * 1, 128);
> > >                 f.modifier[1] = modifier;
> > >                 f.offsets[1] = size[0];
> > > -               size[1] = f.pitches[1] * ALIGN(height, 64);
> > > +               size[1] = f.pitches[1] * ALIGN(height, 32);
> > 
> > I changed this to f.height rather than height, because otherwise the
> > kernel was rejecting the aux buffer for being too small.
> 
> Congratulations, you found a bug in the kernel branch you're running.  The
> downsized height is definitely what we want and it works fine with my kernel
> branch.

This is why I'd like someone (post or pre merging the kernel patches) to
add a pile of subtests here (in the spirit of kms_addfb_basic) which goes
through all kinds of invalid alignment/settings and makes sure the kernel
catches them all. The current single invalid testcase we have is imo way
too trustful of kernel hacker's ability to get this right :-)

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch