[PATCH 2/2] drm/vkms: Compute CRC with Cursor Plane

Daniel Vetter daniel at ffwll.ch
Tue Aug 14 19:52:33 UTC 2018


On Tue, Aug 14, 2018 at 9:03 PM, Haneen Mohammed
<hamohammed.sa at gmail.com> wrote:
> On Tue, Aug 14, 2018 at 10:21:29AM +0200, Daniel Vetter wrote:
>> On Mon, Aug 13, 2018 at 11:04:11PM +0300, Haneen Mohammed wrote:
>> > On Wed, Aug 08, 2018 at 10:23:27AM +0200, Daniel Vetter wrote:
>> > > On Wed, Aug 08, 2018 at 06:53:17AM +0300, Haneen Mohammed wrote:
>> > > > On Tue, Aug 07, 2018 at 06:33:36PM +0200, Daniel Vetter wrote:
>> > > > > On Mon, Aug 06, 2018 at 06:58:29AM +0300, Haneen Mohammed wrote:
>> > > > > > This patch compute CRC for output frame with cursor and primary plane.
>> > > > > > Blend cursor with primary plane and compute CRC on the resulted frame.
>> > > > > >
>> > > > > > Signed-off-by: Haneen Mohammed <hamohammed.sa at gmail.com>
>> > > > >
>> > > > > Nice!
>> > > > >
>> > > > > I assume with this you're passing all the crc based cursor tests in igt?
>> > > > > If so, please mention this in the commit message, so that there's a record
>> > > > > of the testing done on this.
>> > > > >
>> > > >
>> > > > Sure, I'll update the commit message.
>> > > > Is there any other change I should add/fix to this patchset?
>> > > >
>> > > > > One fairly huge gap we iirc have in our cursor testing is that there's not
>> > > > > much (if any?) alpha blending coverage. We kinda need that to make sure
>> > > > > this all works correctly. The usual trick is to only check the extreme
>> > > > > alpha values, i.e. fully opaque and fully transparent, since intermediate
>> > > > > values are affected by hw-specific rounding modes.
>> > > > > > ---
>> > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 149 +++++++++++++++++++++++++-----
>> > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   |   5 +-
>> > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  10 +-
>> > > > > >  3 files changed, 137 insertions(+), 27 deletions(-)
>> > > > > >
>> > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
>> > > > > > index 37d717f38e3c..4853a739ae5a 100644
>> > > > > > --- a/drivers/gpu/drm/vkms/vkms_crc.c
>> > > > > > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
>> > > > > > @@ -1,36 +1,137 @@
>> > > > > >  // SPDX-License-Identifier: GPL-2.0
>> > > > > >  #include "vkms_drv.h"
>> > > > > >  #include <linux/crc32.h>
>> > > > > > +#include <drm/drm_atomic.h>
>> > > > > > +#include <drm/drm_atomic_helper.h>
>> > > > > >  #include <drm/drm_gem_framebuffer_helper.h>
>> > > > > >
>> > > > > > -static uint32_t _vkms_get_crc(struct vkms_crc_data *crc_data)
>> > > > > > +/**
>> > > > > > + * compute_crc - Compute CRC value on output frame
>> > > > > > + *
>> > > > > > + * @vaddr_out: address to final framebuffer
>> > > > > > + * @crc_out: framebuffer's metadata
>> > > > > > + *
>> > > > > > + * returns CRC value computed using crc32 on the visible portion of
>> > > > > > + * the final framebuffer at vaddr_out
>> > > > > > + */
>> > > > > > +static uint32_t compute_crc(void *vaddr_out, struct vkms_crc_data *crc_out)
>> > > > > >  {
>> > > > > > -   struct drm_framebuffer *fb = &crc_data->fb;
>> > > > > > +   int i, src_offset;
>> > > > > > +   int x_src = crc_out->src.x1 >> 16;
>> > > > > > +   int y_src = crc_out->src.y1 >> 16;
>> > > > > > +   int h_src = drm_rect_height(&crc_out->src) >> 16;
>> > > > > > +   int w_src = drm_rect_width(&crc_out->src) >> 16;
>> > > > > > +   int size_byte = w_src * crc_out->cpp;
>> > > > > > +   u32 crc = 0;
>> > > > > > +
>> > > > > > +   for (i = y_src; i < y_src + h_src; i++) {
>> > > > > > +           src_offset = crc_out->offset
>> > > > > > +                        + (i * crc_out->pitch)
>> > > > > > +                        + (x_src * crc_out->cpp);
>> > > > > > +           crc = crc32_le(crc, vaddr_out + src_offset, size_byte);
>> > > > > > +   }
>> > > > > > +
>> > > > > > +   return crc;
>> > > > > > +}
>> > > > > > +
>> >
>> > Hey Daniel,
>> >
>> > When I change the above function "compute_crc" to compute the CRC for
>> > each pixel so we can clear the Alpha value before computing the CRC as bellow,
>> > the test "nonblocking-crc-pipe-A-frame-sequence" sometimes failes and
>> > sometimes passes. Should I still continue with the change, or leave it
>> > as it is?
>>
>> Hm, how does it fail? Could be that the code becomes too slow, and then
>> the frame counters don't increment nicely anymore. Or there's a race
>> somewhere in the CRC code that gets exposed. But that's just me guessing.
>> -Daniel
>>
>
> Oh sorry I should've elaborated more on how it fails. Yeah I think it
> becomes too slow and that can cause mismatch to frame counters?
>
> -----
> (kms_pipe_crc_basic:1902) CRITICAL: Failed assertion: crcs[j].frame + 1 == crcs[j + 1].frame
> (kms_pipe_crc_basic:1902) CRITICAL: Last errno: 9, Bad file descriptor
> (kms_pipe_crc_basic:1902) CRITICAL: error: 1090928 != 1090929
> -----

Yeah misses the deadline it seems.

> Maybe it's not that bad/slow because I've noticed that this happen
> occasionally when I enable Ftrace.

Yeah ftrace can have some decent overhead. And we're adding a pile of
function calls, because crc32_le isn't inlined. Does it still happen
without ftrace?

Optimizing this will be some real work - we'd need a local cache to
compute an entire block of blended pixels into, then add an entire
block. Block size needs to be large enough to be efficient, but not so
big that we start trashing the cache or have to allocate memory (stack
space in the kernel is very limited). Would be good if we can avoid
all this work just now ...
-Danie

>
>> >
>> > ----------------- vkms_crc.c -----------------
>> >     for (i = y_src; i < y_src + h_src; ++i) {
>> >             for (j = x_src; j < x_src + w_src; ++j) {
>> >                     src_offset = crc_out->offset
>> >                                  + (i * crc_out->pitch)
>> >                                  + (j * crc_out->cpp);
>> >                     memset(vaddr_out + src_offset + 24, 0,  8);
>> >                     crc = crc32_le(crc, vaddr_out + src_offset, sizeof(u32));
>> >             }
>> >     }
>> > ----------------- vkms_crc.c -----------------
>> >
>> > Thank you,
>> > Haneen
>> >
>> > > > > > +/**
>> > > > > > + * blend - belnd value at vaddr_src with value at vaddr_dst
>> > > > > > + * @vaddr_dst: destination address
>> > > > > > + * @vaddr_src: source address
>> > > > > > + * @crc_dst: destination framebuffer's metadata
>> > > > > > + * @crc_src: source framebuffer's metadata
>> > > > > > + *
>> > > > > > + * Blend value at vaddr_src with value at vaddr_dst.
>> > > > > > + * Currently, this function write value at vaddr_src on value
>> > > > > > + * at vaddr_dst using buffer's metadata to locate the new values
>> > > > > > + * from vaddr_src and their distenation at vaddr_dst.
>> > > > > > + *
>> > > > > > + * Todo: Use the alpha value to blend vaddr_src with vaddr_dst
>> > > > > > + *  instead of overwriting it.
>> > > > >
>> > > > > Another todo: We must clear the alpha channel in the result after
>> > > > > blending. This also applies to the primary plane, where the XRGB for the
>> > > > > pixel format mandates that we entirely ignore the alpha channel.
>> > > > >
>> > > > > This is also something we should probably have an igt testcase for.
>> > > > >
>> > > > > Since there's a few open ends: How many weeks are left in your outreachy
>> > > > > internship? We need to make sure that at least all the issues are covered
>> > > > > in a vkms todo file. Would be great to add that in
>> > > > > Documentation/gpu/vkms.rst, like we have for other drivers.
>> > > > > -Daniel
>> > > > >
>> > > >
>> > > > I've one more week! I can use this week to prepare the todo file and
>> > > > finalize this patch?
>> > >
>> > > Yeah sounds like the perfect wrap-up work. Since this wont be enough to
>> > > finish the cursor work it would be good to hide the cursor setup behind a
>> > > module option, perhaps "enable_cursor" or so. That way we wont have
>> > > not-totally-correct features enabled. And enabling/disabling cursor
>> > > support could be useful for testing.
>> > > -Daniel
>> > >
>> > > >
>> > > > Thank you,
>> > > > Haneen
>> > > >
>> > > > > > + */
>> > > > > > +static void blend(void *vaddr_dst, void *vaddr_src,
>> > > > > > +             struct vkms_crc_data *crc_dst,
>> > > > > > +             struct vkms_crc_data *crc_src)
>> > > > > > +{
>> > > > > > +   int i, j, j_dst, i_dst;
>> > > > > > +   int offset_src, offset_dst;
>> > > > > > +
>> > > > > > +   int x_src = crc_src->src.x1 >> 16;
>> > > > > > +   int y_src = crc_src->src.y1 >> 16;
>> > > > > > +
>> > > > > > +   int x_dst = crc_src->dst.x1;
>> > > > > > +   int y_dst = crc_src->dst.y1;
>> > > > > > +   int h_dst = drm_rect_height(&crc_src->dst);
>> > > > > > +   int w_dst = drm_rect_width(&crc_src->dst);
>> > > > > > +
>> > > > > > +   int y_limit = y_src + h_dst;
>> > > > > > +   int x_limit = x_src + w_dst;
>> > > > > > +
>> > > > > > +   for (i = y_src, i_dst = y_dst; i < y_limit; ++i) {
>> > > > > > +           for (j = x_src, j_dst = x_dst; j < x_limit; ++j) {
>> > > > > > +                   offset_dst = crc_dst->offset
>> > > > > > +                                + (i_dst * crc_dst->pitch)
>> > > > > > +                                + (j_dst++ * crc_dst->cpp);
>> > > > > > +                   offset_src = crc_src->offset
>> > > > > > +                                + (i * crc_src->pitch)
>> > > > > > +                                + (j * crc_src->cpp);
>> > > > > > +
>> > > > > > +                   memcpy(vaddr_dst + offset_dst,
>> > > > > > +                          vaddr_src + offset_src, sizeof(u32));
>> > > > > > +           }
>> > > > > > +           i_dst++;
>> > > > > > +   }
>> > > > > > +}
>> > > > > > +
>> > > > > > +static void compose_cursor(struct vkms_crc_data *cursor_crc,
>> > > > > > +                      struct vkms_crc_data *primary_crc, void *vaddr_out)
>> > > > > > +{
>> > > > > > +   struct drm_gem_object *cursor_obj;
>> > > > > > +   struct vkms_gem_object *cursor_vkms_obj;
>> > > > > > +
>> > > > > > +   cursor_obj = drm_gem_fb_get_obj(&cursor_crc->fb, 0);
>> > > > > > +   cursor_vkms_obj = drm_gem_to_vkms_gem(cursor_obj);
>> > > > > > +
>> > > > > > +   mutex_lock(&cursor_vkms_obj->pages_lock);
>> > > > > > +   if (WARN_ON(!cursor_vkms_obj->vaddr))
>> > > > > > +           goto out;
>> > > > > > +
>> > > > > > +   blend(vaddr_out, cursor_vkms_obj->vaddr, primary_crc, cursor_crc);
>> > > > > > +
>> > > > > > +out:
>> > > > > > +   mutex_unlock(&cursor_vkms_obj->pages_lock);
>> > > > > > +}
>> > > > > > +
>> > > > > > +static uint32_t _vkms_get_crc(struct vkms_crc_data *primary_crc,
>> > > > > > +                         struct vkms_crc_data *cursor_crc)
>> > > > > > +{
>> > > > > > +   struct drm_framebuffer *fb = &primary_crc->fb;
>> > > > > >     struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
>> > > > > >     struct vkms_gem_object *vkms_obj = drm_gem_to_vkms_gem(gem_obj);
>> > > > > > +   void *vaddr_out = kzalloc(vkms_obj->gem.size, GFP_KERNEL);
>> > > > > >     u32 crc = 0;
>> > > > > > -   int i = 0;
>> > > > > > -   unsigned int x = crc_data->src.x1 >> 16;
>> > > > > > -   unsigned int y = crc_data->src.y1 >> 16;
>> > > > > > -   unsigned int height = drm_rect_height(&crc_data->src) >> 16;
>> > > > > > -   unsigned int width = drm_rect_width(&crc_data->src) >> 16;
>> > > > > > -   unsigned int cpp = fb->format->cpp[0];
>> > > > > > -   unsigned int src_offset;
>> > > > > > -   unsigned int size_byte = width * cpp;
>> > > > > > -   void *vaddr;
>> > > > > >
>> > > > > > -   mutex_lock(&vkms_obj->pages_lock);
>> > > > > > -   vaddr = vkms_obj->vaddr;
>> > > > > > -   if (WARN_ON(!vaddr))
>> > > > > > -           goto out;
>> > > > > > +   if (!vaddr_out) {
>> > > > > > +           DRM_ERROR("Failed to allocate memory for output frame.");
>> > > > > > +           return 0;
>> > > > > > +   }
>> > > > > >
>> > > > > > -   for (i = y; i < y + height; i++) {
>> > > > > > -           src_offset = fb->offsets[0] + (i * fb->pitches[0]) + (x * cpp);
>> > > > > > -           crc = crc32_le(crc, vaddr + src_offset, size_byte);
>> > > > > > +   mutex_lock(&vkms_obj->pages_lock);
>> > > > > > +   if (WARN_ON(!vkms_obj->vaddr)) {
>> > > > > > +           mutex_lock(&vkms_obj->pages_lock);
>> > > > > > +           return crc;
>> > > > > >     }
>> > > > > >
>> > > > > > -out:
>> > > > > > +   memcpy(vaddr_out, vkms_obj->vaddr, vkms_obj->gem.size);
>> > > > > >     mutex_unlock(&vkms_obj->pages_lock);
>> > > > > > +
>> > > > > > +   if (cursor_crc)
>> > > > > > +           compose_cursor(cursor_crc, primary_crc, vaddr_out);
>> > > > > > +
>> > > > > > +   crc = compute_crc(vaddr_out, primary_crc);
>> > > > > > +
>> > > > > > +   kfree(vaddr_out);
>> > > > > > +
>> > > > > >     return crc;
>> > > > > >  }
>> > > > > >
>> > > > > > @@ -44,8 +145,8 @@ void vkms_crc_work_handle(struct work_struct *work)
>> > > > > >     struct vkms_device *vdev = container_of(out, struct vkms_device,
>> > > > > >                                             output);
>> > > > > >     struct vkms_crc_data *primary_crc = NULL;
>> > > > > > +   struct vkms_crc_data *cursor_crc = NULL;
>> > > > > >     struct drm_plane *plane;
>> > > > > > -
>> > > > > >     u32 crc32 = 0;
>> > > > > >
>> > > > > >     drm_for_each_plane(plane, &vdev->drm) {
>> > > > > > @@ -58,14 +159,14 @@ void vkms_crc_work_handle(struct work_struct *work)
>> > > > > >             if (drm_framebuffer_read_refcount(&crc_data->fb) == 0)
>> > > > > >                     continue;
>> > > > > >
>> > > > > > -           if (plane->type == DRM_PLANE_TYPE_PRIMARY) {
>> > > > > > +           if (plane->type == DRM_PLANE_TYPE_PRIMARY)
>> > > > > >                     primary_crc = crc_data;
>> > > > > > -                   break;
>> > > > > > -           }
>> > > > > > +           else
>> > > > > > +                   cursor_crc = crc_data;
>> > > > > >     }
>> > > > > >
>> > > > > >     if (primary_crc)
>> > > > > > -           crc32 = _vkms_get_crc(primary_crc);
>> > > > > > +           crc32 = _vkms_get_crc(primary_crc, cursor_crc);
>> > > > > >
>> > > > > >     drm_crtc_add_crc_entry(crtc, true, crtc_state->n_frame, &crc32);
>> > > > > >  }
>> > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
>> > > > > > index 36e524f791fe..173875dc361e 100644
>> > > > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
>> > > > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
>> > > > > > @@ -25,8 +25,11 @@ static const u32 vkms_cursor_formats[] = {
>> > > > > >  };
>> > > > > >
>> > > > > >  struct vkms_crc_data {
>> > > > > > -   struct drm_rect src;
>> > > > > >     struct drm_framebuffer fb;
>> > > > > > +   struct drm_rect src, dst;
>> > > > > > +   unsigned int offset;
>> > > > > > +   unsigned int pitch;
>> > > > > > +   unsigned int cpp;
>> > > > > >  };
>> > > > > >
>> > > > > >  /**
>> > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
>> > > > > > index 428247d403dc..7041007396ae 100644
>> > > > > > --- a/drivers/gpu/drm/vkms/vkms_plane.c
>> > > > > > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
>> > > > > > @@ -85,16 +85,22 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>> > > > > >                                  struct drm_plane_state *old_state)
>> > > > > >  {
>> > > > > >     struct vkms_plane_state *vkms_plane_state;
>> > > > > > +   struct drm_framebuffer *fb = plane->state->fb;
>> > > > > >     struct vkms_crc_data *crc_data;
>> > > > > >
>> > > > > > -   if (!plane->state->crtc || !plane->state->fb)
>> > > > > > +   if (!plane->state->crtc || !fb)
>> > > > > >             return;
>> > > > > >
>> > > > > >     vkms_plane_state = to_vkms_plane_state(plane->state);
>> > > > > > +
>> > > > > >     crc_data = vkms_plane_state->crc_data;
>> > > > > >     memcpy(&crc_data->src, &plane->state->src, sizeof(struct drm_rect));
>> > > > > > -   memcpy(&crc_data->fb, plane->state->fb, sizeof(struct drm_framebuffer));
>> > > > > > +   memcpy(&crc_data->dst, &plane->state->dst, sizeof(struct drm_rect));
>> > > > > > +   memcpy(&crc_data->fb, fb, sizeof(struct drm_framebuffer));
>> > > > > >     drm_framebuffer_get(&crc_data->fb);
>> > > > > > +   crc_data->offset = fb->offsets[0];
>> > > > > > +   crc_data->pitch = fb->pitches[0];
>> > > > > > +   crc_data->cpp = fb->format->cpp[0];
>> > > > > >  }
>> > > > > >
>> > > > > >  static int vkms_plane_atomic_check(struct drm_plane *plane,
>> > > > > > --
>> > > > > > 2.17.1
>> > > > > >
>> > > > >
>> > > > > --
>> > > > > Daniel Vetter
>> > > > > Software Engineer, Intel Corporation
>> > > > > http://blog.ffwll.ch
>> > >
>> > > --
>> > > Daniel Vetter
>> > > Software Engineer, Intel Corporation
>> > > http://blog.ffwll.ch
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the dri-devel mailing list