[PATCH 01/10] drm/vkms: Fix crc worker races

Wed Jun 12 13:33:11 UTC 2019

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
>
> The issue we have is that the crc worker might fall behind. We've
> tried to handle this by tracking both the earliest frame for which it
> still needs to compute a crc, and the last one. Plus when the
> crtc_state changes, we have a new work item, which are all run in
> order due to the ordered workqueue we allocate for each vkms crtc.
>
> Trouble is there's been a few small issues in the current code:
> - we need to capture frame_end in the vblank hrtimer, not in the
>   worker. The worker might run much later, and then we generate a lot
>   of crc for which there's already a different worker queued up.
> - frame number might be 0, so create a new crc_pending boolean to
>   track this without confusion.
> - we need to atomically grab frame_start/end and clear it, so do that
>   all in one go. This is not going to create a new race, because if we
>   race with the hrtimer then our work will be re-run.
> - only race that can happen is the following:
>   1. worker starts
>   2. hrtimer runs and updates frame_end
>   3. worker grabs frame_start/end, already reading the new frame_end,
>   and clears crc_pending
>   4. hrtimer calls queue_work()
>   5. worker completes
>   6. worker gets  re-run, crc_pending is false
>   Explain this case a bit better by rewording the comment.
>
> v2: Demote warning level output to debug when we fail to requeue, this
> is expected under high load when the crc worker can't quite keep up.
>
> Cc: Shayenne Moura <shayenneluzmoura at gmail.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo at gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> Cc: Haneen Mohammed <hamohammed.sa at gmail.com>
> Cc: Daniel Vetter <daniel at ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
>  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
>  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
>  3 files changed, 22 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index d7b409a3c0f8..66603da634fe 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
>         struct drm_plane *plane;
>         u32 crc32 = 0;
>         u64 frame_start, frame_end;
> +       bool crc_pending;
>         unsigned long flags;
>
>         spin_lock_irqsave(&out->state_lock, flags);
>         frame_start = crtc_state->frame_start;
>         frame_end = crtc_state->frame_end;
> +       crc_pending = crtc_state->crc_pending;
> +       crtc_state->frame_start = 0;
> +       crtc_state->frame_end = 0;
> +       crtc_state->crc_pending = false;
>         spin_unlock_irqrestore(&out->state_lock, flags);
>
> -       /* _vblank_handle() hasn't updated frame_start yet */
> -       if (!frame_start || frame_start == frame_end)
> -               goto out;
> +       /*
> +        * We raced with the vblank hrtimer and previous work already computed
> +        * the crc, nothing to do.
> +        */
> +       if (!crc_pending)
> +               return;

I think this condition is not reachable because crc_pending will be
filled with true in `vkms_vblank_simulate()` which in turn schedule
the function `vkms_crc_work_handle()`. Just for checking, I tried to
reach this condition by running kms_flip, kms_pipe_crc_basic, and
kms_cursor_crc with two different VM setups[1], but I couldn't reach
it. What do you think?

[1] Qemu parameters
VM1: -m 1G -smp cores=2,cpus=2
VM2: -enable-kvm -m 2G -smp cores=4,cpus=4

>         drm_for_each_plane(plane, &vdev->drm) {
>                 struct vkms_plane_state *vplane_state;
> @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
>         if (primary_crc)
>                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
>
> -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> -
> -       /* queue_work can fail to schedule crc_work; add crc for
> -        * missing frames
> +       /*
> +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
>          */
>         while (frame_start <= frame_end)
>                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);

I want to take this opportunity to ask about this while; It's not
really specific to this patch.

I have to admit that I never fully got the idea behind this 'while';
it looks like that we just fill out the missed frames with a repeated
value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
information for a frame, but in this case, we are adding the same CRC
for a different set of frames. I agree that near frame has a similar
CRC value, but could we rely on this all the time? What could happen
if we have a great difference from the frame_start and frame_end?

> -
> -out:
> -       /* to avoid using the same value for frame number again */
> -       spin_lock_irqsave(&out->state_lock, flags);
> -       crtc_state->frame_end = frame_end;
> -       crtc_state->frame_start = 0;
> -       spin_unlock_irqrestore(&out->state_lock, flags);
>  }
>
>  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 1bbe099b7db8..c727d8486e97 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>                  * has read the data
>                  */
>                 spin_lock(&output->state_lock);
> -               if (!state->frame_start)
> +               if (!state->crc_pending)
>                         state->frame_start = frame;
> +               else
> +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> +                                        state->frame_start, frame);
> +               state->frame_end = frame;
> +               state->crc_pending = true;
>                 spin_unlock(&output->state_lock);
>
>                 ret = queue_work(output->crc_workq, &state->crc_work);
>                 if (!ret)
> -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
>         }
>
>         spin_unlock(&output->lock);
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 81f1cfbeb936..3c7e06b19efd 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -56,6 +56,8 @@ struct vkms_plane_state {
>  struct vkms_crtc_state {
>         struct drm_crtc_state base;
>         struct work_struct crc_work;
> +
> +       bool crc_pending;
>         u64 frame_start;
>         u64 frame_end;
>  };
> --
> 2.20.1
>

-- 

Rodrigo Siqueira
https://siqueira.tech