regression with mainline kernel

Linus Torvalds torvalds at linux-foundation.org
Sat Nov 13 17:06:57 UTC 2021


[ Hmm. This email was marked as spam for me. I see no obvious reason
for it being marked as spam, but it happens.. ]

On Thu, Nov 11, 2021 at 12:52 PM Sudip Mukherjee
<sudipm.mukherjee at gmail.com> wrote:
>
> # first bad commit: [cd7f5ca33585918febe5e2f6dc090a21cfa775b0]
> drm/virtio: implement context init: add virtio_gpu_fence_event

Hmm. Judging from your automated screenshots, the login never appears.

> And, indeed reverting cd7f5ca33585 on top of debe436e77c7 has fixed
> the problem I was seeing on my qemu test of x86_64. The qemu image is
> based on Ubuntu.

Presumably either that commit is somehow buggy in itself - or it does
exactly what it means to do, and the new poll() semantics just
confuses the heck out of the X server (or wayland or whatever).

And honestly, if I read that thing correctly, the patch is entirely
broken. The new poll function (virtio_gpu_poll()) will unconditionally
remove the first event from the event list, and then report "Yeah, I
had events".

This is completely bogus for a few reasons:

 - poll() really should be idempotent, because the poll function gets
called multiple times

 - it doesn't even seem to check that the event that it removes is the
new VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL kind of event, so it will
unconditionally just remove random events.

 - it does seem to check the "vfpriv->ring_idx_mask" and do the old
thing if that is zero, but I see absolutely no reason for that (and
that check itself has caused problems, see below)

Honestly, my reaction to this all is that that commit is fundamentally
broken and probably should be reverted regardless as "this commit does
bad things".

HOWEVER - it has had a fix for a NULL pointer dereference in the
meantime - can you check whether the current top of tree happens to
work for you? Maybe your problem isn't due to "that commit does
unnatural things", but simply due to the bug fixed in d89c0c8322ec
("drm/virtio: Fix NULL dereference error in virtio_gpu_poll").

And if it's still broken with that commit, I'll happily revert it and
people need to go back to the drawing board.

In fact, I would really suggest that people look at that
virtio_gpu_poll() function regardless. That odd "let's unconditionally
just drop events in the poll function is really REALLY broken
behavior.

              Linus


More information about the dri-devel mailing list