drm/vkms: deadlock between dev->event_lock and timer

Linus Torvalds torvalds at linux-foundation.org
Wed Sep 13 16:47:57 UTC 2023


On Wed, 13 Sept 2023 at 07:21, Tetsuo Handa
<penguin-kernel at i-love.sakura.ne.jp> wrote:
>
> Hello. A deadlock was reported in drivers/gpu/drm/vkms/ .
> It looks like this locking pattern remains as of 6.6-rc1. Please fix.
>
> void drm_crtc_vblank_off(struct drm_crtc *crtc) {
>   spin_lock_irq(&dev->event_lock);
>   drm_vblank_disable_and_save(dev, pipe) {
>     __disable_vblank(dev, pipe) {
>       crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
>         hrtimer_cancel(&out->vblank_hrtimer) { // Retries with dev->event_lock held until lock_hrtimer_base() succeeds.
>           ret = hrtimer_try_to_cancel(timer) {
>             base = lock_hrtimer_base(timer, &flags); // Fails forever because vkms_vblank_simulate() is in progress.

Heh. Ok. This is clearly a bug, but it does seem to be limited to just
the vkms driver, and literally only to the "simulate vblank" case.

The worst part about it is that it's so subtle and not obvious.

Some light grepping seems to show that amdgpu has almost the exact
same pattern in its own vkms thing, except it uses

        hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer);

directly, which presumably fixes the livelock, but means that the
cancel will fail if it's currently running.

So just doing the same thing in the vkms driver probably fixes things.

Maybe the vkms people need to add a flag to say "it's canceled" so
that it doesn't then get re-enabled?  Or maybe it doesn't matter
and/or already happens for some reason I didn't look into.

                       Linus


More information about the dri-devel mailing list