[Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal
Christian König
ckoenig.leichtzumerken at gmail.com
Tue Apr 27 12:06:02 UTC 2021
Correct, we wouldn't have synchronization between devices with and
without user queues any more. That could only be a problem for A+I
(AMD + Intel) laptops.
Memory management will just work with preemption fences which pause the
user queues of a process before evicting something. That will still be a
dma_fence, but it is also a well-known approach.
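Roughly, the eviction path could look like the sketch below; every name
in it is a hypothetical stand-in, not real driver API:

    #include <linux/dma-fence.h>

    /* Hypothetical driver objects and helpers, for illustration only. */
    struct user_queue_ctx;
    struct bo;
    struct dma_fence *userq_preempt(struct user_queue_ctx *ctx);
    int userq_restore(struct user_queue_ctx *ctx);
    void move_bo_to_system_memory(struct bo *bo);

    /* Evict a BO by first preempting the owner's user queues. */
    static int evict_bo_with_preempt_fence(struct user_queue_ctx *ctx,
                                           struct bo *bo)
    {
            /* Ask the scheduler to preempt all user queues of this
             * process; returns a dma_fence that signals once the
             * queues are actually off the hardware. */
            struct dma_fence *f = userq_preempt(ctx);

            if (IS_ERR(f))
                    return PTR_ERR(f);

            /* This wait is bounded: preemption does not depend on the
             * preempted work completing, so it cannot deadlock on
             * userspace synchronization. */
            dma_fence_wait(f, false);
            dma_fence_put(f);

            /* Queues are suspended; the BO can be moved safely. */
            move_bo_to_system_memory(bo);

            /* Map the queues back; they resume where they left off. */
            return userq_restore(ctx);
    }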
Christian.
On 27.04.21 at 13:49, Marek Olšák wrote:
> If we don't use future fences for DMA fences at all, e.g. if we don't
> use them for memory management, it can work, right? Memory management
> can suspend user queues at any time. It doesn't need to use DMA fences.
> There might be something that I'm missing here.
>
> What would we lose without DMA fences? Just inter-device
> synchronization? I think that might be acceptable.
>
> The only case when the kernel will wait on a future fence is before a
> page flip. Everything today already depends on userspace not hanging
> the GPU, which makes everything a future fence.
>
> Marek
>
> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter <daniel at ffwll.ch> wrote:
>
> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> > Thanks everybody. The initial proposal is dead. Here are some
> > thoughts on how to do it differently.
> >
> > I think we can have direct command submission from userspace via
> > memory-mapped queues ("user queues") without changing window systems.
> >
> > The memory management doesn't have to use GPU page faults like HMM.
> > Instead, it can wait for the user queues of a specific process to go
> > idle and then unmap the queues, so that userspace can't submit
> > anything. Buffer evictions, pinning, etc. can be executed when all
> > queues are unmapped (suspended). Thus, no BO fences or page faults
> > are needed.
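
To make the "user queues" above concrete, here is a rough sketch of what
direct submission could look like from the userspace side; the names and
the packet layout are invented for illustration:

    #include <stdint.h>

    /* Direct userspace submission to a memory-mapped queue. */
    struct user_queue {
            uint32_t *ring;               /* mmapped ring buffer */
            uint32_t  ring_mask;          /* ring size in dwords - 1 */
            uint64_t  wptr;               /* write pointer, in dwords */
            volatile uint64_t *doorbell;  /* mmapped doorbell page */
    };

    static void submit_packet(struct user_queue *q,
                              const uint32_t *pkt, unsigned int ndw)
    {
            for (unsigned int i = 0; i < ndw; i++)
                    q->ring[(q->wptr + i) & q->ring_mask] = pkt[i];
            q->wptr += ndw;

            /* The doorbell mapping is exactly what the kernel can
             * revoke (or redirect to a dummy page) when it suspends
             * the queue, so new work has no effect until the queue
             * is restored. */
            *q->doorbell = q->wptr;
    }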
> >
> > Inter-process synchronization can use timeline semaphores. Userspace
> > will query the wait and signal value for a shared buffer from the
> > kernel. The kernel will keep a history of those queries to know
> > which process is responsible for signalling which buffer. The only
> > remaining issue is the wait timeout and how to identify the culprit.
> > One of the solutions is to have the GPU send all GPU signal commands
> > and all timed-out wait commands via an interrupt to the kernel
> > driver, to monitor and validate userspace behavior. With that, it
> > can be identified whether the culprit is the waiting process or the
> > signalling process, and which one. Invalid signal/wait parameters
> > can also be detected. The kernel can force-signal only the
> > semaphores that time out, and punish the processes which caused the
> > timeout or used invalid signal/wait parameters.
> >
> > The question is whether this synchronization solution is robust
> > enough for dma_fence and whatever the kernel and window systems
> > need.
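
A toy version of the timeline-semaphore scheme above, with the payload
in shared memory and a CPU-side wait; a real implementation would signal
from GPU commands and wait in the kernel rather than spin, and all names
here are invented:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    /* A timeline semaphore: a monotonically increasing 64-bit value
     * in a mapping shared between the two processes. */
    struct timeline_sem {
            _Atomic uint64_t value;
    };

    /* Signal: publish that all work up to 'point' has completed. */
    static void timeline_signal(struct timeline_sem *sem, uint64_t point)
    {
            atomic_store_explicit(&sem->value, point,
                                  memory_order_release);
    }

    /* Wait: poll until the timeline reaches 'point' or the timeout
     * expires; a timeout is where the kernel would step in to
     * force-signal and identify the culprit. */
    static bool timeline_wait(struct timeline_sem *sem, uint64_t point,
                              int timeout_ms)
    {
            struct timespec ts = { 0, 1000000 }; /* 1 ms */

            while (atomic_load_explicit(&sem->value,
                                        memory_order_acquire) < point) {
                    if (timeout_ms-- <= 0)
                            return false;
                    nanosleep(&ts, NULL);
            }
            return true;
    }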
>
> The proper model here is the preempt-ctx dma_fence that amdkfd uses
> (without page faults). That means dma_fence for synchronization is
> dead on arrival, at least as-is, and we're back to figuring out the
> winsys problem.
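
As a sketch of that preempt-ctx fence idea, loosely modeled on amdkfd's
eviction fence but entirely hypothetical:

    #include <linux/dma-fence.h>
    #include <linux/kernel.h>
    #include <linux/workqueue.h>

    /* A fence attached to a process's BOs: whoever wants to move them
     * waits on it, and enabling signaling kicks off queue preemption. */
    struct preempt_fence {
            struct dma_fence base;
            struct work_struct suspend_work; /* suspends user queues */
    };

    static struct preempt_fence *to_preempt_fence(struct dma_fence *f)
    {
            return container_of(f, struct preempt_fence, base);
    }

    static const char *preempt_fence_name(struct dma_fence *f)
    {
            return "preempt-sketch";
    }

    static bool preempt_fence_enable_signaling(struct dma_fence *f)
    {
            struct preempt_fence *pf = to_preempt_fence(f);

            /* Schedule suspension of the process's user queues; the
             * worker calls dma_fence_signal() once preemption is
             * complete, so the wait is always bounded. */
            queue_work(system_wq, &pf->suspend_work);
            return true;
    }

    static const struct dma_fence_ops preempt_fence_ops = {
            .get_driver_name   = preempt_fence_name,
            .get_timeline_name = preempt_fence_name,
            .enable_signaling  = preempt_fence_enable_signaling,
    };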
>
> "We'll solve it with timeouts" is very tempting, but doesn't work.
> It's
> akin to saying that we're solving deadlock issues in a locking
> design by
> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
> avoids having to reach the reset button, but that's about it.
>
> And the fundamental problem is that once you throw in userspace
> command submission (and syncing, at least within the userspace
> driver, otherwise there's kinda no point if you still need the kernel
> for cross-engine sync), you get deadlocks if you still use dma_fence
> for sync under perfectly legit use-cases. We've discussed that one ad
> nauseam last summer:
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>
> See the silly diagram at the bottom.
>
> Now I think all isn't lost, because imo the first step to getting to
> this brave new world is rebuilding the driver on top of userspace
> fences, with the adjusted cmd submit model. You probably don't want
> to use amdkfd, but port that as a context flag or similar to render
> nodes for gl/vk. Of course that means you can only use this mode
> headless, without glx/wayland winsys support, but it's a start.
> -Daniel
>
> >
> > Marek
> >
> > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone <daniel at fooishbar.org> wrote:
> >
> > > Hi,
> > >
> > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter <daniel at ffwll.ch> wrote:
> > >
> > >> The thing is, you can't do this in drm/scheduler. At least not
> > >> without splitting up the dma_fence in the kernel into separate
> > >> memory fences and sync fences.
> > >
> > >
> > > I'm starting to think this thread needs its own glossary ...
> > >
> > > I propose we use 'residency fence' for execution fences which
> > > enact memory-residency operations, e.g. faulting in a page
> > > ultimately depending on GPU work retiring.
> > >
> > > And 'value fence' for the pure-userspace model suggested by
> > > timeline semaphores, i.e. fences being (*addr == val) rather than
> > > being able to look at a ctx seqno.
> > >
> > > Cheers,
> > > Daniel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>
>