[RFC PATCH 3/3] drm/virtio: drm_gem_plane_helper_prepare_fb for obj synchronization

Kim, Dongwon dongwon.kim at intel.com
Tue Sep 5 21:08:46 UTC 2023


Hi Dmitry,

On 8/31/2023 3:51 PM, Dmitry Osipenko wrote:
> On 8/24/23 20:58, Kim, Dongwon wrote:
> ...
>>> You can do fence-wait in the guest userspace/Mesa after blitting/drawing
>>> to the udmabuf.
>> There is already synchronization between QEMU and virtio-gpu driver on
>> the guest. Upon resource flush, virtio-gpu waits for the response for
>> the message from the QEMU and QEMU sends out the response once rendering
>> is done. The problem we are seeing is not that the rendering part is
>> reusing the buffer before it's displayed by the QEMU. Problem we are
>> facing is more like some frame is often not finished when
>> "resource-flush" is issued. So unless there is a way for QEMU to wait
>> for this fence (guest drm), I think we should have some synchronization
>> point in the guest side.
>>
>> I saw other DRM drivers, omap, tegra, vc4 and so on are doing the
>> similar so I guess this is a generic solution for such cases. But I do
>> understand your concern as the primary use case of virtio-gpu driver is
>> for virgl. So this extra wait would cost some performance drop. But I
>> have a couple of points here as well.
>>
>> 1. Wouldn't this extra wait caused by drm_gem_plane_helper_prepare_fb be
>> minimal as the actual
>> rendering is done in the host?
>>
>> 2. Can we just make this helper called only if virgl is not used as 3D
>> driver?
> The problem you described above shouldn't be resolved by your patch. You
> need to wait for FB to be released by the host's display and not to
> before GPU finished rendering on guest. I.e. you're swapping display
> buffers and your dGPU starts rendering to the buffer that is in active
> use by host's display, correct?

I don't believe the guest will start rendering on the same FB while the host is
consuming it, because the virtio-gpu driver on the guest won't release the FB for the next
frame until it receives the virtio response to the resource-flush command, and the host (QEMU)
holds that response until rendering is finished.

And adding this helper clearly fixes the issue we encountered (a stale frame is
sometimes shown). But you might be right. My understanding of the original problem
may not be entirely correct, since it is based on my assumption about how things were
fixed by adding the helper. In that case, can you help me get to the real problem?
If the helper really does fix it, what would the underlying issue be?
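For context, this is roughly how the patch wires the helper in, following the same
pattern as the other drivers mentioned above (omap, tegra, vc4). A sketch only, with
illustrative callback names rather than the exact virtio-gpu ones:

```c
#include <drm/drm_gem_atomic_helper.h>

/*
 * drm_gem_plane_helper_prepare_fb() extracts the implicit fences from the
 * dma_resv of the framebuffer's GEM objects and attaches them to the plane
 * state, so the atomic commit machinery waits for them before the flip.
 * The virtio_gpu_* callback names here are placeholders for the driver's
 * existing atomic_check/atomic_update hooks.
 */
static const struct drm_plane_helper_funcs virtio_gpu_plane_helper_funcs = {
	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
	.atomic_check	= virtio_gpu_plane_atomic_check,
	.atomic_update	= virtio_gpu_plane_atomic_update,
};
```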

>
> Maybe you need to do glFinish() on host after swapping buffers? But that
> will block host for a long time.

I can try calling glFinish every time a frame is drawn on the host. But...
if that were the issue, wouldn't I expect skipped frames rather than old frames?
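If blocking the host in glFinish() turns out to be too expensive, a fence-based wait
might be a lighter-weight experiment. A sketch, assuming a current GL 3.2+ context on
the host side right after the frame's draw calls are submitted:

```c
/*
 * Instead of draining the whole pipeline with glFinish(), insert a fence
 * after submitting the frame and wait on just that fence (with a timeout)
 * before sending the resource-flush response back to the guest.
 */
GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

/* GL_SYNC_FLUSH_COMMANDS_BIT ensures the fence command is flushed to the
 * GPU before we start waiting; timeout here is ~16 ms (in nanoseconds). */
GLenum status = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
				 16 * 1000 * 1000);
glDeleteSync(fence);

/* status is GL_CONDITION_SATISFIED / GL_ALREADY_SIGNALED on success,
 * GL_TIMEOUT_EXPIRED if rendering is still in flight. */
```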

> For now I don't have solution.
>
>>> Virglrender today supports native contexts. The method you're using for
>>> GPU priming was proven to be slow in comparison to multi-gpu native
>>> contexts. There is ongoing work for supporting fence passing from guest
>>> to host [1] that allows to do fence-syncing on host. You'll find links
>>> to the WIP virtio-intel native context in [1] as well. You won't find
>>> GPU priming support using native context in [1], patches hasn't been
>>> published yet.
>>>
>>> [1]
>>> https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1138
>>>
>>> Note that in general it's not acceptable to upstream patches that serve
>>> downstream only. Yours display sync issue is irrelevant to the upstream
>>> stack unless you're going to upstream all the VMM and guest userspace
>>> patches, and in such case you should always publish all the patches and
>>> provide links.
>>>
>>> So, you need to check the performance impact and publish all the patches
>>> to the relevant upstream projects.
>> QEMU has all patches regarding this (blob scanout support) but guest Mesa
>> patch for KMSRO is still outstanding.
>>
>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9592
>>
>> I understand this specific patch would need more
>> discussion/justification but
>> what about other two, are you generally ok with those in the same series?
> The first patch should be dropped. The second one could be useful,

Yes, the first one will be dropped. I will upload a revised version of the second one
only, with a proper explanation.

> you'll need to provide step-by-step instruction for how to reproduce the
> multi-display issue, please write it in the cover-letter for the next
> patch version.

Thanks. Always appreciate your feedback!!
