[git pull] drm for 5.8-rc1

James Jones jajones at nvidia.com
Tue Sep 1 14:42:35 UTC 2020


On 9/1/20 3:59 AM, Karol Herbst wrote:
> On Tue, Sep 1, 2020 at 9:13 AM Daniel Vetter <daniel at ffwll.ch> wrote:
>>
>> On Tue, Aug 18, 2020 at 04:37:51PM +0200, Thierry Reding wrote:
>>> On Fri, Aug 14, 2020 at 07:25:17PM +0200, Daniel Vetter wrote:
>>>> On Fri, Aug 14, 2020 at 7:17 PM Daniel Stone <daniel at fooishbar.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Fri, 14 Aug 2020 at 17:22, Thierry Reding <thierry.reding at gmail.com> wrote:
>>>>>> I suspect that the reason why this works in X but not in Wayland is
>>>>>> because X passes the right usage flags, whereas Weston may not. But I'll
>>>>>> have to investigate more in order to be sure.
>>>>>
>>>>> Weston allocates its own buffers for displaying the result of
>>>>> composition through GBM with USE_SCANOUT, which is definitely correct.
>>>>>
>>>>> Wayland clients (common to all compositors, in Mesa's
>>>>> src/egl/drivers/dri2/platform_wayland.c) allocate with USE_SHARED but
>>>>> _not_ USE_SCANOUT, which is correct in that they are guaranteed to be
>>>>> shared, but not guaranteed to be scanned out. The expectation is that
>>>>> non-scanout-compatible buffers would be rejected by gbm_bo_import if
>>>>> not drmModeAddFB2.
>>>>>
>>>>> One difference between Weston and all other compositors (GNOME Shell,
>>>>> KWin, Sway, etc) is that Weston uses KMS planes for composition when
>>>>> it can (i.e. when gbm_bo_import from dmabuf + drmModeAddFB2 from
>>>>> gbm_bo handle + atomic check succeed), but the other compositors only
>>>>> use the GPU. So if you have different assumptions about the layout of
>>>>> imported buffers between the GPU and KMS, that would explain a fair
>>>>> bit.
>>>>
>>>> Yeah non-modifiered multi-gpu (of any kind) is pretty much hopeless I
>>>> think. I guess the only option is if the tegra mesa driver forces
>>>> linear and an extra copy on everything that's USE_SHARED or
>>>> USE_SCANOUT.
>>>
>>> I ended up trying this, but this fails for the X case, unfortunately,
>>> because there doesn't seem to be a good synchronization point at which
>>> the de-tiling blit could be done. Weston and kmscube end up calling a
>>> gallium driver's ->flush_resource() implementation, but that never
>>> happens for X and glamor.
>>>
>>> But after looking into this some more, I don't think that's even the
>>> problem that we're facing here. The root of the problem that causes the
>>> glxgears crash that Karol was originally reporting is because we end up
>>> allocating the glxgears pixmaps using the dri3 loader from Mesa. But the
>>> dri3 loader will unconditionally pass both __DRI_IMAGE_USE_SHARE and
>>> __DRI_IMAGE_USE_SCANOUT, irrespective of whether the buffer will end up
>>> being scanned out directly or whether it will be composited onto the
>>> root window.
>>>
>>> What exactly happens depends on whether I run glxgears in fullscreen
>>> mode or windowed mode. In windowed mode, the glxgears buffers will be
>>> composited onto the root window, so there's no need for the buffers to
>>> be scanout-capable. If I modify the dri3 loader to not pass those flags
>>> I can make this work just fine.
>>>
>>> When I run glxgears in fullscreen mode, the modesetting driver ends up
>>> wanting to display the glxgears buffer directly on screen, without
>>> compositing it onto the root window. This ends up working if I leave out
>>> the _USE_SHARE and _USE_SCANOUT flags, but I notice that the kernel then
>>> complains about being unable to create a framebuffer, which in turn is
>>> caused by the fact that those buffers are not exported (the Tegra Mesa
>>> driver only exports/imports buffers that are meant for scanout, under
>>> the assumption that those are the only ones that will ever need to be
>>> used by KMS) and therefore Tegra DRM doesn't have a valid handle for
>>> them.
>>>
>>> So I think an ideal solution would probably be for glxgears to somehow
>>> pass better usage information when allocating buffers, but I suspect
>>> that that's just not possible, or would be way too much work and require
>>> additional protocol at the DRI level, so it's not really a good option
>>> when all we want to fix is backwards-compatibility with pre-modifiers
>>> userspace.
>>>
>>> Given that glamor also doesn't have any synchronization points, I don't
>>> see how I can implement the de-tiling blit reliably. I was wondering if
>>> it shouldn't be possible to flush the framebuffer resource (and perform
>>> the blit) at presentation time, but I couldn't find a good entry point
>>> to do this.
>>>
>>> One other solution that occurred to me was to reintroduce an old IOCTL
>>> that we used to have in the Tegra DRM driver. That IOCTL was meant to
>>> attach tiling meta data to an imported buffer and was basically a
>>> simplified, driver-specific way of doing framebuffer modifiers. That's
>>> a very ugly solution, but it would allow us to be backwards-compatible
>>> with pre-modifiers userspace and even use an optimal path for rendering
>>> and scanning out. The only prerequisite would be that the driver IOCTL
>>> was implemented and that a recent enough Mesa was used to make use of
>>> it. I don't like this very much because framebuffer modifiers are a much
>>> more generic solution, but all of the other options above are pretty
>>> much just as ugly.
>>>
>>> One other idea that I haven't explored yet is to be a little more clever
>>> about the export/import dance that we do for buffers. Currently we
>>> export/import at allocation time, and that seems to cause a bit of a
>>> problem, like the lack of valid GEM handles for some buffers (such as in
>>> the glxgears fullscreen use-case discussed above). I wonder if perhaps
>>> deferring the export/import dance until the handles are actually
>>> required may be a better way to do this. With such a solution, even if a
>>> buffer is allocated for scanout, it won't actually be imported/exported
>>> if the client ends up being composited onto the root window. Import and
>>> export would be limited to buffers that truly are going to be used for
>>> drmModeAddFB2(). I'll give that a shot and see if that gets me closer to
>>> my goal.
>>
>> (back from vacations)
>>
>> I think right thing to do is *shrug*, please use modifiers. They're meant
>> to solve these kind of problems for real, adding more hacks to paper over
>> userspace not using modifiers doesn't seem like a good idea.
>>
>> Wrt dri3, since we do client-side allocations and don't have modifiers, we
>> have to pessimistically assume we'll get scanned out. Modifiers and
>> relevant protocol is fixing this again, but for tegra where we essentially
>> can't get this right that leaves us in a very tough spot.
>>
>> So yeah I think "use modifiers" is the answer.
>> -Daniel
> 
> Right.. the issue is just that we don't have any X release fixing it
> and some compositors (mutter) don't do the right thing by default
> either :/ I will ask around for mutter, but for X we really need to do
> a release I think, just I've heard about regressions we need to fix
> first.

Yes, I have at least one more fix for modifiers that I need to write up 
on top of whatever's at ToT for X before someone cuts a release, as 
suggested (much) earlier in this thread.

Thanks,
-James

>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>>
> 


More information about the dri-devel mailing list