Internal DRI subsystem locking and contention between connector commits

Pekka Paalanen ppaalanen at gmail.com
Fri Oct 7 09:18:36 UTC 2022


On Thu, 6 Oct 2022 12:03:56 +0000
"Hoosier, Matt" <Matt.Hoosier at garmin.com> wrote:

> I have a DRM master implementing a purpose-built compositor for a
> dedicated use-case. It drives several different connectors, each on
> its own vsync cadence (there's no clone mode happening here).
> 
> The goal is to have commits to each connector occur completely
> without respect to whatever is happening on the other connectors.
> There's a different thread issuing the DRI ioctls for each connector.
> 
> In the compositor, each connector is treated like its own little
> universe; a disjoint set of CRTCs and planes is earmarked for use by
> each of the connectors. One intention for this is to avoid sharing
> resources in a way that would introduce implicit synchronization
> points between the connectors' event loops. So, atomic commits
> made to one connector never attempt to use a resource that's ever
> been used in a commit to a different connector. This may be relevant
> to a question I'll ask a bit later below about resource locking
> contention.
> 
> For some time, I've been noticing that even test-only atomic commits
> done on connector A will sometimes block for many frame-times.
> Analysis with the DRI driver implementor has shown that the atomic
> commits to A--whether DRM_MODE_ATOMIC_TEST_ONLY or
> DRM_MODE_ATOMIC_NONBLOCK--are getting stuck in the ioctl entry code
> waiting for a DRI mutex.
> 
> It turns out that during these unexpected delays, the DRI driver's
> commit thread holds that mutex while servicing a commit to connector
> B. It does this while it waits for the fences to fire for all
> framebuffer IDs referred to by the pending connector B scene. So the
> commit to connector A can't be tested or enqueued until the commit to
> B is completely finished. The driver author reckons that this is
> unavoidable because every DRM_IOCTL_MODE_ATOMIC ioctl needs to
> acquire the same global singleton DRM connection_mutex in order to
> query or manipulate the connector.
> 
> The result is that it's quite difficult to guarantee a framerate on
> connector A, because unrelated activity performed on connector B can
> hold global locks for an unpredictable amount of time.
> 
> The first question would be: does this story sound consistent? If so,
> then a couple more questions follow.
> 
> Is this kind of implicit interlocking expected? Is there any way to
> avoid the pending commits getting serialized like that on the kernel
> side?

Hi Matt,

Ville actually mentioned something very much like that recently, see
the thread at:
https://lore.kernel.org/dri-devel/20220916163331.6849-1-ville.syrjala@linux.intel.com/

If even non-blocking commits can stall test-only commits, that could be
a problem for Weston too. Weston being single-threaded wouldn't help.
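
The stall pattern described in the quoted report can be illustrated with a
small user-space toy model. This is only a sketch under stated assumptions,
not kernel or driver code: `global_lock` stands in for the single shared
mutex, the sleep stands in for waiting on framebuffer fences, and all
function names (`commit_b`, `test_only_commit_a`, `measure_stall`) are
hypothetical.

```python
import threading
import time

# Hypothetical stand-in for the single shared lock described in the report.
# This is a user-space toy, not the kernel's actual locking code.
global_lock = threading.Lock()


def commit_b(fence_delay_s, started):
    """Model the commit to connector B: hold the lock while waiting for fences."""
    with global_lock:
        started.set()              # signal that B now owns the lock
        time.sleep(fence_delay_s)  # stands in for waiting on framebuffer fences


def test_only_commit_a():
    """Model a test-only commit to connector A; return seconds spent blocked."""
    t0 = time.monotonic()
    with global_lock:              # blocks until B releases the lock
        pass                       # the test itself is assumed to be instant
    return time.monotonic() - t0


def measure_stall(fence_delay_s):
    """Run B's commit in a thread and time A's test-only commit against it."""
    started = threading.Event()
    worker = threading.Thread(target=commit_b, args=(fence_delay_s, started))
    worker.start()
    started.wait()                 # make sure B holds the lock before A tries
    stalled = test_only_commit_a()
    worker.join()
    return stalled


if __name__ == "__main__":
    print(f"A stalled for {measure_stall(0.2) * 1000:.1f} ms")
```

In this model, A's stall tracks B's fence wait almost exactly, even though
A and B touch no common resources other than the lock itself.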


Thanks,
pq