[git pull] drm for 5.8-rc1
Daniel Vetter
daniel at ffwll.ch
Tue Sep 1 07:13:47 UTC 2020
On Tue, Aug 18, 2020 at 04:37:51PM +0200, Thierry Reding wrote:
> On Fri, Aug 14, 2020 at 07:25:17PM +0200, Daniel Vetter wrote:
> > On Fri, Aug 14, 2020 at 7:17 PM Daniel Stone <daniel at fooishbar.org> wrote:
> > >
> > > Hi,
> > >
> > > On Fri, 14 Aug 2020 at 17:22, Thierry Reding <thierry.reding at gmail.com> wrote:
> > > > I suspect that the reason this works in X but not in Wayland is
> > > > that X passes the right usage flags, whereas Weston may not. But I'll
> > > > have to investigate more in order to be sure.
> > >
> > > Weston allocates its own buffers for displaying the result of
> > > composition through GBM with USE_SCANOUT, which is definitely correct.
> > >
> > > Wayland clients (common to all compositors, in Mesa's
> > > src/egl/drivers/dri2/platform_wayland.c) allocate with USE_SHARED but
> > > _not_ USE_SCANOUT, which is correct in that they are guaranteed to be
> > > shared, but not guaranteed to be scanned out. The expectation is that
> > > non-scanout-compatible buffers would be rejected by gbm_bo_import if
> > > not drmModeAddFB2.
> > >
> > > One difference between Weston and all other compositors (GNOME Shell,
> > > KWin, Sway, etc) is that Weston uses KMS planes for composition when
> > > it can (i.e. when gbm_bo_import from dmabuf + drmModeAddFB2 from
> > > gbm_bo handle + atomic check succeed), but the other compositors only
> > > use the GPU. So if you have different assumptions about the layout of
> > > imported buffers between the GPU and KMS, that would explain a fair
> > > bit.
> >
> > Yeah modifier-less multi-gpu (of any kind) is pretty much hopeless I
> > think. I guess the only option is for the tegra mesa driver to force
> > linear and an extra copy on everything that's USE_SHARED or
> > USE_SCANOUT.
>
> I ended up trying this, but this fails for the X case, unfortunately,
> because there doesn't seem to be a good synchronization point at which
> the de-tiling blit could be done. Weston and kmscube end up calling a
> gallium driver's ->flush_resource() implementation, but that never
> happens for X and glamor.
>
> But after looking into this some more, I don't think that's even the
> problem we're facing here. The root cause of the glxgears crash that
> Karol originally reported is that we end up allocating the glxgears
> pixmaps using the dri3 loader from Mesa. But the dri3 loader will
> unconditionally pass both __DRI_IMAGE_USE_SHARE and
> __DRI_IMAGE_USE_SCANOUT, irrespective of whether the buffer will end up
> being scanned out directly or composited onto the root window.
>
> What exactly happens depends on whether I run glxgears in fullscreen
> mode or windowed mode. In windowed mode, the glxgears buffers will be
> composited onto the root window, so there's no need for the buffers to
> be scanout-capable. If I modify the dri3 loader to not pass those flags
> I can make this work just fine.
>
> When I run glxgears in fullscreen mode, the modesetting driver ends up
> wanting to display the glxgears buffer directly on screen, without
> compositing it onto the root window. This ends up working if I leave
> out the _USE_SHARE and _USE_SCANOUT flags, but the kernel then
> complains about being unable to create a framebuffer. That's because
> those buffers are never exported (the Tegra Mesa driver only
> exports/imports buffers that are meant for scanout, on the assumption
> that those are the only ones KMS will ever need), so Tegra DRM doesn't
> have a valid handle for them.
>
> So I think an ideal solution would probably be for glxgears to somehow
> pass better usage information when allocating buffers, but I suspect
> that that's just not possible, or would be way too much work and require
> additional protocol at the DRI level, so it's not really a good option
> when all we want to fix is backwards-compatibility with pre-modifiers
> userspace.
>
> Given that glamor also doesn't have any synchronization points, I don't
> see how I can implement the de-tiling blit reliably. I was wondering if
> it shouldn't be possible to flush the framebuffer resource (and perform
> the blit) at presentation time, but I couldn't find a good entry point
> to do this.
>
> One other solution that occurred to me was to reintroduce an old IOCTL
> that we used to have in the Tegra DRM driver. That IOCTL was meant to
> attach tiling metadata to an imported buffer and was basically a
> simplified, driver-specific way of doing framebuffer modifiers. That's
> a very ugly solution, but it would allow us to be backwards-compatible
> with pre-modifiers userspace and even use an optimal path for rendering
> and scanning out. The only prerequisite would be that the driver IOCTL
> was implemented and that a recent enough Mesa was used to make use of
> it. I don't like this very much because framebuffer modifiers are a much
> more generic solution, but all of the other options above are pretty
> much just as ugly.
>
> One other idea that I haven't explored yet is to be a little more clever
> about the export/import dance that we do for buffers. Currently we
> export/import at allocation time, and that seems to cause a bit of a
> problem, like the lack of valid GEM handles for some buffers (such as in
> the glxgears fullscreen use-case discussed above). I wonder if perhaps
> deferring the export/import dance until the handles are actually
> required may be a better way to do this. With such a solution, even if a
> buffer is allocated for scanout, it won't actually be imported/exported
> if the client ends up being composited onto the root window. Import and
> export would be limited to buffers that truly are going to be used for
> drmModeAddFB2(). I'll give that a shot and see if that gets me closer to
> my goal.
(back from vacation)
I think the right thing to do is *shrug*: please use modifiers. They're
meant to solve these kinds of problems for real; adding more hacks to
paper over userspace not using modifiers doesn't seem like a good idea.
Wrt dri3: since we do client-side allocations and don't have modifiers,
we have to pessimistically assume the buffer will get scanned out.
Modifiers and the relevant protocol fix this, but for tegra, where we
essentially can't get this right without them, that leaves us in a very
tough spot.
So yeah I think "use modifiers" is the answer.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch