[RFC v1 0/4] drm: Add support for DRM_CAP_DEFERRED_OUT_FENCE capability
Zhang, Tina
tina.zhang at intel.com
Mon Aug 2 04:48:47 UTC 2021
> -----Original Message-----
> From: Daniel Vetter <daniel at ffwll.ch>
> Sent: Friday, July 30, 2021 6:26 PM
> To: Kasireddy, Vivek <vivek.kasireddy at intel.com>
> Cc: dri-devel at lists.freedesktop.org; Daniel Vetter <daniel at ffwll.ch>; Gerd
> Hoffmann <kraxel at redhat.com>; Pekka Paalanen <ppaalanen at gmail.com>;
> Simon Ser <contact at emersion.fr>; Michel Dänzer <michel at daenzer.net>;
> Zhang, Tina <tina.zhang at intel.com>; Kim, Dongwon
> <dongwon.kim at intel.com>
> Subject: Re: [RFC v1 0/4] drm: Add support for
> DRM_CAP_DEFERRED_OUT_FENCE capability
>
> On Thu, Jul 29, 2021 at 01:16:55AM -0700, Vivek Kasireddy wrote:
> > Separating the OUT_FENCE signalling from pageflip completion allows
> > a Guest compositor to start a new repaint cycle with a new buffer
> > instead of waiting for the old buffer to be free.
> >
> > This work is based on the idea/suggestion from Simon and Pekka.
> >
> > This capability can be a solution for this issue:
> > https://gitlab.freedesktop.org/wayland/weston/-/issues/514
> >
> > Corresponding Weston MR:
> > https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
>
> Uh, I kinda wanted to discuss this a bit more before we jump into typing
> code, but well, I guess it's not that much work yet.
>
> So maybe I'm not understanding the problem, but I think the fundamental
> underlying issue is that with KMS you can have at most 2 buffers in-flight,
> due to our queue depth limit of 1 pending flip.
>
> Unfortunately that means for virtual hw where it takes a few more
> steps/vblanks until the framebuffer actually shows up on screen and is
> scanned out, we suffer deeply. The usual fix for that is to drop the latency
> and increase throughput, and have more buffers in-flight. Which this patch
> tries to do.
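>
> To make that limit concrete, here is a minimal userspace sketch,
> assuming libdrm's atomic API and its negative-errno return convention
> (the helper name is illustrative, not anything in this series):
>
>     #include <errno.h>
>     #include <xf86drm.h>
>     #include <xf86drmMode.h>
>
>     /* Queue the next frame. With today's queue depth of 1, a second
>      * nonblocking commit on a CRTC with a flip still pending is
>      * rejected with EBUSY, which caps a compositor at two buffers
>      * in flight. */
>     int queue_next_frame(int fd, drmModeAtomicReq *req)
>     {
>         int ret = drmModeAtomicCommit(fd, req,
>                                       DRM_MODE_ATOMIC_NONBLOCK |
>                                       DRM_MODE_PAGE_FLIP_EVENT,
>                                       NULL);
>         if (ret == -EBUSY) {
>             /* Wait for the pending flip's page-flip event on fd
>              * before submitting this frame again. */
>         }
>         return ret;
>     }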
>
> Now I think where we go wrong here is that we're trying to hack this up by
> defining different semantics for the out-fence and for the drm-event. Imo
> that's wrong, they're both meant to show exactly the same thing:
> - when is the new frame actually visible to the user (as in, eyeballs in a
> human head, preferably, not the time when we've handed the buffer off
> to the virtual hw)
> - when is the previous buffer no longer being used by the scanout hw
>
> We do cheat a bit right now insofar as we assume they're both the same, as
> in, panel-side latency is currently the compositor's problem to figure out.
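>
> For reference, the out-fence flow in question as a minimal userspace
> sketch (assuming the CRTC's OUT_FENCE_PTR property id was looked up
> beforehand; the function name is illustrative):
>
>     #include <stdint.h>
>     #include <xf86drm.h>
>     #include <xf86drmMode.h>
>
>     /* Commit a new frame and ask for an out-fence. The kernel writes
>      * a sync_file fd into out_fence_fd when the commit is accepted;
>      * today that fence signals together with the page-flip event,
>      * i.e. when the previous buffer leaves scanout. */
>     int commit_with_out_fence(int fd, drmModeAtomicReq *req,
>                               uint32_t crtc_id,
>                               uint32_t out_fence_ptr_prop_id)
>     {
>         int out_fence_fd = -1;
>
>         drmModeAtomicAddProperty(req, crtc_id, out_fence_ptr_prop_id,
>                                  (uint64_t)(uintptr_t)&out_fence_fd);
>
>         int ret = drmModeAtomicCommit(fd, req,
>                                       DRM_MODE_ATOMIC_NONBLOCK |
>                                       DRM_MODE_PAGE_FLIP_EVENT,
>                                       NULL);
>         return ret ? ret : out_fence_fd;
>     }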
>
> So for virtual hw I think the timestamp and even completion really need to
> happen only when the buffer has been pushed through the entire
> virtualization chain, i.e. ideally we get the timestamp from the kms driver
> from the host side. Currently that's not done, so this is most likely quite
> broken already (virtio relies on the no-vblank auto event sending, which
> definitely doesn't wait for anything, or I'm completely missing something).
Agree. One lesson we learned from previous direct-display work is that
using the host hardware event is pretty much a must; otherwise problems
like flickering, tearing or frame drops show up. Besides, as the
wayland-ui works as a Weston client, it needs more than 2 buffers to
support full-frame redraw. I tried weston-simple-dmabuf-egl with 2
buffers and the result was bad.
BR,
Tina
>
> I think instead of hacking up some ill-defined 1.5 queue depth support, what
> we should do is support queue depth > 1 properly. So:
>
> - Change atomic to support queue depth > 1, this needs to be a per-driver
> thing due to a bunch of issues in driver code. Essentially drivers must
> never look at obj->state pointers, and only ever look up state through
> the passed-in drm_atomic_state * update container (see the first sketch
> after this list).
>
> - Aside: virtio should lose all its empty hooks, there's no point in
> that.
>
> - We fix virtio to send out the completion event at the end of this entire
> pipeline, i.e. virtio code needs to take care of sending out the
> crtc_state->event correctly (see the second sketch after this list).
>
> - We probably also want some kind of (maybe per-crtc) recommended queue
> depth property so compositors know how many buffers to keep in flight.
> Not sure about that.
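>
> On the first point, a kernel-side sketch of the state-lookup rule (the
> hook is hypothetical; the lookup helper is the existing
> drm_atomic_get_new_crtc_state()):
>
>     #include <drm/drm_atomic.h>
>     #include <drm/drm_crtc.h>
>
>     static void example_atomic_flush(struct drm_crtc *crtc,
>                                      struct drm_atomic_state *state)
>     {
>         /* Wrong once queue depth > 1: crtc->state may already belong
>          * to a later in-flight commit:
>          *
>          *     struct drm_crtc_state *cs = crtc->state;
>          */
>
>         /* Right: resolve the state tracked by *this* commit. */
>         struct drm_crtc_state *cs =
>                 drm_atomic_get_new_crtc_state(state, crtc);
>
>         /* ... program the hardware from cs ... */
>     }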
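>
> And on the virtio point, a sketch of sending the event from the end of
> the pipeline (the host-notification hook is hypothetical;
> drm_crtc_send_vblank_event() and the event_lock requirement are the
> existing API):
>
>     #include <linux/spinlock.h>
>     #include <drm/drm_crtc.h>
>     #include <drm/drm_vblank.h>
>
>     /* Called when the host reports that the frame actually reached
>      * the physical scanout; only then complete the stashed
>      * crtc_state->event, so timestamp and completion cover the whole
>      * virtualization chain. */
>     static void virtio_gpu_host_flip_done(struct drm_crtc *crtc,
>                                           struct drm_pending_vblank_event *ev)
>     {
>         unsigned long flags;
>
>         spin_lock_irqsave(&crtc->dev->event_lock, flags);
>         drm_crtc_send_vblank_event(crtc, ev);
>         spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
>     }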
>
> It's a bit more work, but also a lot less hacking around infrastructure in
> dubious ways.
>
> Thoughts?
>
> Cheers, Daniel
>
> >
> > Cc: Daniel Vetter <daniel at ffwll.ch>
> > Cc: Gerd Hoffmann <kraxel at redhat.com>
> > Cc: Pekka Paalanen <ppaalanen at gmail.com>
> > Cc: Simon Ser <contact at emersion.fr>
> > Cc: Michel Dänzer <michel at daenzer.net>
> > Cc: Tina Zhang <tina.zhang at intel.com>
> > Cc: Dongwon Kim <dongwon.kim at intel.com>
> >
> > Vivek Kasireddy (4):
> > drm: Add a capability flag to support deferred out_fence signalling
> > virtio-gpu uapi: Add VIRTIO_GPU_F_OUT_FENCE feature
> > drm/virtio: Add VIRTIO_GPU_CMD_RESOURCE_OUT_FENCE cmd
> > drm/virtio: Probe and implement VIRTIO_GPU_F_OUT_FENCE feature
> >
> > drivers/gpu/drm/drm_file.c | 11 +++---
> > drivers/gpu/drm/drm_ioctl.c | 3 ++
> > drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 +
> > drivers/gpu/drm/virtio/virtgpu_drv.c | 1 +
> > drivers/gpu/drm/virtio/virtgpu_drv.h | 6 ++++
> > drivers/gpu/drm/virtio/virtgpu_fence.c | 9 +++++
> > drivers/gpu/drm/virtio/virtgpu_kms.c | 10 ++++--
> > drivers/gpu/drm/virtio/virtgpu_plane.c | 44 +++++++++++++++++++++++-
> > drivers/gpu/drm/virtio/virtgpu_vq.c | 17 +++++++++
> > include/drm/drm_mode_config.h | 9 +++++
> > include/uapi/drm/drm.h | 1 +
> > include/uapi/linux/virtio_gpu.h | 12 +++++++
> > 12 files changed, 117 insertions(+), 7 deletions(-)
> >
> > --
> > 2.30.2
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch