[PATCH V2] drm/vkms: Avoid extra discount in the timestamp value

Daniel Vetter daniel at ffwll.ch
Wed Jun 5 12:22:57 UTC 2019


On Tue, Jun 04, 2019 at 11:45:43PM -0300, Rodrigo Siqueira wrote:
> After the commit def35e7c5926 ("drm/vkms: Bugfix extra vblank frame")
> some of the crc tests started to fail in the vkms with the following
> error:
> 
>  [drm:drm_crtc_add_crc_entry [drm]] *ERROR* Overflow of CRC buffer,
>     userspace reads too slow.
>  [drm] failed to queue vkms_crc_work_handle
>  ...
> 
> The aforementioned commit fixed the extra vblank added by
> `drm_crtc_arm_vblank_event()` which is invoked inside
> `vkms_crtc_atomic_flush()` if the vblank event count was zero, otherwise
> `drm_crtc_send_vblank_event()` is invoked. The fix was implemented in
> `vkms_get_vblank_timestamp()` by subtracting one period from the current
> timestamp, as the code snippet below illustrates:
> 
>  if (!in_vblank_irq)
>   *vblank_time -= output->period_ns;
> 
> The above fix works well when `drm_crtc_arm_vblank_event()` is invoked.
> However, it does not properly work when `drm_crtc_send_vblank_event()`
> executes since it subtracts the correct timestamp, which it shouldn't.
> In this case, the `drm_crtc_accurate_vblank_count()` function will
> returns the wrong frame number, which generates the aforementioned
> error. Such decrease in `get_vblank_timestamp()` produce a negative
> number in the following calculation within `drm_update_vblank_count()`:
> 
>  u64 diff_ns = ktime_to_ns(ktime_sub(t_vblank, vblank->time));
> 
> After this operation, the DIV_ROUND_CLOSEST_ULL macro is invoked using
> diff_ns with a negative number, which generates an undefined result;
> therefore, the returned frame is a huge and incorrect number. Finally,
> the code below is part of the `vkms_crc_work_handle()`, note that the
> while loop depends on the returned value from
> `drm_crtc_accurate_vblank_count()` which may cause the loop to take a
> long time to finish in case of huge value.
> 
>  frame_end = drm_crtc_accurate_vblank_count(crtc);
>  while (frame_start <= frame_end)
>    drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
> 
> This commit fixes this issue by checking if the vblank timestamp
> corresponding to the current software vblank counter is equal to the
> current vblank; if they are equal, it means that
> `drm_crtc_send_vblank_event()` was invoked and vkms does not need to
> discount the extra vblank, otherwise, `drm_crtc_arm_vblank_event()` was
> executed and vkms have to discount the extra vblank. This fix made the
> CRC tests work again whereas keep all tests from kms_flip working as
> well.
> 
> V2: Update commit message
> 
> Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo at gmail.com>
> Signed-off-by: Shayenne Moura <shayenneluzmoura at gmail.com>

Thanks a lot for typing up this commit message. I'm still not following
what's going on (this stuff is tricky), but now I think I can at least ask
useful scenarios.

For my understanding: Things go wrong when in the function
vkms_crtc_atomic_flush() the call to drm_crtc_vblank_get() returns 0, and
we call drm_crtc_send_vblank_event() directly? I'm not 100% from your
description above whether that's the failure case where everything blows
up.

The other part I'm having a hard time understanding: If we are in the case
where we call drm_crtc_send_vblank_event() directly, how do we end up in
the vkms_get_vblank_timestamp().

>From what I can see there's not caller to where we sample a new vblank ...
I think a full backtrace that hits the new condition you're adding to
understand when exactly we're hitting it would be perfect.

> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 7508815fac11..3ce60e66673e 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -74,9 +74,13 @@ bool vkms_get_vblank_timestamp(struct drm_device *dev, unsigned int pipe,
>  {
>  	struct vkms_device *vkmsdev = drm_device_to_vkms_device(dev);
>  	struct vkms_output *output = &vkmsdev->output;
> +	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>  
>  	*vblank_time = output->vblank_hrtimer.node.expires;
>  
> +	if (*vblank_time == vblank->time)
> +		return true;

Ok, I think with the above I do now have a rough idea what's going wrong.
I think that would be a bug in the drm_vblank.c. Or at least it could be,
I think I first need to better understand still what's going on here to
decided that.

I have a bit a hunch this is fallout from our vblank fudging, but I guess
we'll see.

It's definitely clear that things blow up if the vblank time somehow goes
backwards, but I don't understand yet how that's possible.
-Daniel
> +
>  	if (!in_vblank_irq)
>  		*vblank_time -= output->period_ns;
>  
> -- 
> 2.21.0
> 



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list