[RFC PATCH 0/6] Common preempt fences and semantics

Matthew Brost matthew.brost at intel.com
Wed Nov 13 02:30:58 UTC 2024


On Tue, Nov 12, 2024 at 06:27:31PM -0800, Matthew Brost wrote:
> On Tue, Nov 12, 2024 at 12:09:52PM +0100, Christian König wrote:
> > Am 12.11.24 um 04:29 schrieb Matthew Brost:
> > > On Mon, Nov 11, 2024 at 02:42:02PM +0100, Christian König wrote:
> > > > Am 09.11.24 um 18:29 schrieb Matthew Brost:
> > > > > The motivation for this series comes from pending UMD submission work by
> > > > > AMD [1], ARM [3], and the Xe team, who are also beginning to look at
> > > > > this. Sima has suggested [4] some common driver preemptive fences and
> > > > > semantics, which we all agree on. This is the first attempt to implement
> > > > > them, based on Xe's existing long-running preemptive fences.
> > > > > 
> > > > > The semantics are fairly simple: preemptive fences attached to a
> > > > > dma-resv must wait to enable signaling (and thus preempt device
> > > > > execution) until all fences in other slots have been signaled. These
> > > > > semantics are necessary to avoid deadlocks when preempting a device
> > > > > while a user submission is pending, and resuming device execution
> > > > > depends on the user submission completing. The semantics align with
> > > > > Christian's view [5], which I also arrived at independently (with a
> > > > > little help from Sima).
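> > > > > 
> > > > > For illustration, a rough sketch of those semantics; every name below
> > > > > (preempt_fence, preempt_fence_ops, the worker) is a placeholder rather
> > > > > than the actual patch code. Signaling is deferred to a worker, which
> > > > > first waits on every other dma-resv usage before preempting and
> > > > > signaling:
> > > > > 
> > > > >   #include <linux/dma-fence.h>
> > > > >   #include <linux/dma-resv.h>
> > > > >   #include <linux/sched.h>
> > > > >   #include <linux/workqueue.h>
> > > > > 
> > > > >   struct preempt_fence;
> > > > > 
> > > > >   struct preempt_fence_ops {
> > > > >           /* Stop device execution, e.g. suspend the user queues. */
> > > > >           void (*preempt)(struct preempt_fence *pf);
> > > > >   };
> > > > > 
> > > > >   struct preempt_fence {
> > > > >           struct dma_fence base;
> > > > >           struct dma_resv *resv;  /* resv the fence is attached to */
> > > > >           struct work_struct work;
> > > > >           const struct preempt_fence_ops *ops;
> > > > >   };
> > > > > 
> > > > >   /* enable_signaling() cannot block, so it only queues this worker. */
> > > > >   static void preempt_fence_work_func(struct work_struct *w)
> > > > >   {
> > > > >           struct preempt_fence *pf = container_of(w, typeof(*pf), work);
> > > > > 
> > > > >           /* Wait for all fences in the other usage slots (kernel,
> > > > >            * write, read, bookkeep) so a pending user submission that
> > > > >            * the resume depends on can complete before the device is
> > > > >            * preempted.
> > > > >            */
> > > > >           dma_resv_wait_timeout(pf->resv, DMA_RESV_USAGE_BOOKKEEP,
> > > > >                                 false, MAX_SCHEDULE_TIMEOUT);
> > > > > 
> > > > >           pf->ops->preempt(pf);
> > > > >           dma_fence_signal(&pf->base);
> > > > >   }
> > > > > 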
> > > > Ah! I was really wondering for a moment why you wanted to add a separate
> > > > dma_resv usage for that. But I strongly think that this approach won't work.
> > > > 
> > > > The assumption that completion fences which depend on a preemption fence are
> > > > always part of the same dma_resv object is most likely false in some cases.
> > > > 
> > > > At least for the AMD design what can easily happen is that we put a
> > > > completion fence only into a drm_syncobj for explicit synchronization and
> > > > then engines or different devices which still use kernel submissions depend
> > > > on it. That can go boom really trivially.
> > > > 
> > > > What we do instead is to wait for the latest issued completion fence while
> > > > holding a mutex to prevent creating new ones before stopping the threads and
> > > wrt a mutex...
> > > 
> > > A current code reference would be nice. I looked at some of the code on
> > > the list and the dma-fencing rules are broken...
> > > 
> > > e.g., this patch [1] takes 'uq_mgr->userq_mutex' in the dma-fencing path.
> > > This patch [2] takes 'uq_mgr->userq_mutex' and then the dma-resv lock /
> > > allocates memory. That is clearly not allowed.
> > 
> > No, that is unproblematic. The dma_resv is only locked after the preemption
> > fence is already signaled.
> > 
> 
> That's kinda cheating, fragile, and a pretty poor practice IMO. Your
> driver though.
> 
> > The problem is currently that we have separated the signaling worker from
> > the resume worker. E.g. the mutex is dropped in between.
> > 
> 
> Again, can you share any code refs? See below, will share work shortly.
> 
> > > 
> > > Perhaps this is fixed in your current code base though.
> > > 
> > > [1] https://patchwork.freedesktop.org/patch/593210/?series=133345&rev=1
> > > [2] https://patchwork.freedesktop.org/patch/593211/?series=133345&rev=1
> > > 
> > > > signaling the preemption fence.
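> > > > 
> > > > Roughly like this, as a sketch with made-up names (last_completion_fence,
> > > > stop_user_queues), not the actual amdgpu code:
> > > > 
> > > >   mutex_lock(&uq_mgr->userq_mutex);
> > > >   /* No new completion fences can be created past this point. */
> > > >   dma_fence_wait(uq_mgr->last_completion_fence, false);
> > > >   stop_user_queues(uq_mgr);               /* preempt the device */
> > > >   dma_fence_signal(eviction_fence);
> > > >   mutex_unlock(&uq_mgr->userq_mutex);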
> > > > 
> > > Right, that works and is functionally equivalent to what is intended
> > > here.
> > > 
> > > I left out a key detail: when a user fence is converted into a dma-fence
> > > and exported in a syncobj or syncfile, it must also be installed in all
> > > of the VM's dma-resv bookkeeping slots (i.e., in VM's dma-resv and all
> > > extobj dma-resv mapped in the VM).
> > 
> > I don't think that this is something we should do.
> > 
> > > Before exporting a dma-fence, the VM must be validated + resumed, and
> > > you must hold all dma-resv locks, so the above is trivial.
> > 
> > Hui? Why should we hold all the dma-resv locks for that?
> > 
> > We only hold the userq_mutex to make sure that no user fence is created
> > while we are in the signaling path of the eviction fence.
> > 
> 
> I reason that the user fence -> DMA fence IOCTL (and vice versa) is
> essentially another version of the rebind worker, which also performs
> the fence transformation and installs fences in the preempt slots to
> guard against unwanted preemption while a DMA fence has been exported.
> It seems to work quite nicely.
> 

Ugh, typos. s/installs fences in the preempt slots/installs dma-fences in the bookkeep slots/

Matt

> > > If you're using gpuvm, just call drm_gpuvm_resv_add_fence. I assume AMD has a
> > > similarly simple call.
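> > > 
> > > Something like this, as a sketch only; validation/resume and error
> > > handling are omitted, and 'fence' stands for the dma-fence produced by
> > > the conversion:
> > > 
> > >   struct drm_gpuvm_exec vm_exec = {
> > >           .vm = gpuvm,
> > >           .flags = DRM_EXEC_INTERRUPTIBLE_WAIT,
> > >           .num_fences = 1,
> > >   };
> > > 
> > >   /* Locks the VM's resv and every extobj resv mapped in the VM. */
> > >   ret = drm_gpuvm_exec_lock(&vm_exec);
> > >   if (!ret) {
> > >           /* Install the exported dma-fence in all bookkeeping slots. */
> > >           drm_gpuvm_resv_add_fence(gpuvm, &vm_exec.exec, fence,
> > >                                    DMA_RESV_USAGE_BOOKKEEP,
> > >                                    DMA_RESV_USAGE_BOOKKEEP);
> > >           drm_gpuvm_exec_unlock(&vm_exec);
> > >   }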
> > 
> > Nope, we try to avoid locking all BOs in the VM as hard as we can.
> > 
> 
> Why? Calling in to perform the fence conversion shouldn't be all that
> frequent, and it simplifies things.
> 
> Also, it's likely that only a few locks are involved, as not too many
> external BOs are mapped within a VM (i.e., most BOs share the VM's
> dma-resv lock).
> 
> > > Now the ordering works inherently in dma-resv and the scheduler, e.g.,
> > > there is no need to track the last completion fences explicitly in
> > > preempt fences.
> > 
> > I really don't think that this is a good approach. Explicitly keeping the
> > last completion fence in the preemption fence is basically a must-have as
> > far as I can see.
> > 
> > The approach you take here looks like a really ugly hack to me.
> > 
> 
> Well, I have to disagree; it seems like a pretty solid, common design.
> 
> Anyway, I think I have this more or less working. I want to run this by
> the Mesa team a bit to ensure I haven't missed anything, and will likely
> post something shortly after. 
> 
> We can discuss this more after I post and perhaps solicit other
> opinions, weighing the pros and cons of the approaches here. I do think
> they function roughly the same, so something commonly agreed upon would
> be good. Sharing a bit of code, if possible, is always a plus too.
> 
> Matt
> 
> > Regards,
> > Christian.
> > 
> > > 
> > > I'm pretty sure that if you converted your code to this solution it
> > > would work; there are, however, a couple of bugs in this post which I
> > > have fixed.
> > > 
> > > This is a common solution based on existing components which I'd lean
> > > towards rather than some new component.
> > > 
> > > Matt
> > > 
> > > > That code could of course be made some driver independent component.
> > > > 
> > > > Regards,
> > > > Christian.
> > > > 
> > > > > Implemented via the new dma-resv slot DMA_RESV_USAGE_PREEMPT, a common
> > > > > embedded base preemptive fence class with driver operations, and updates
> > > > > to the scheduler to adhere to these semantics.
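> > > > > 
> > > > > For illustration, the usage ordering that gives these semantics (a
> > > > > sketch only; the exact definition lives in the dma-resv patch):
> > > > > 
> > > > >   enum dma_resv_usage {
> > > > >           DMA_RESV_USAGE_KERNEL,
> > > > >           DMA_RESV_USAGE_WRITE,
> > > > >           DMA_RESV_USAGE_READ,
> > > > >           DMA_RESV_USAGE_BOOKKEEP,
> > > > >           /* New: preempt fences only; iterating or waiting with
> > > > >            * DMA_RESV_USAGE_BOOKKEEP returns every fence except these.
> > > > >            */
> > > > >           DMA_RESV_USAGE_PREEMPT,
> > > > >   };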
> > > > > 
> > > > > The current Xe long-running preemptive fences have been converted to the
> > > > > new common code, and UMD submission is expected to follow (hopefully)
> > > > > shortly thereafter in a subsequent series.
> > > > > 
> > > > > Not intended to be presented as the final solution, but rather to
> > > > > kickstart serious discussions on this topic.
> > > > > 
> > > > > Matt
> > > > > 
> > > > > [1] https://patchwork.freedesktop.org/series/113675/
> > > > > [2] https://patchwork.freedesktop.org/series/114385/
> > > > > [3] https://patchwork.freedesktop.org/series/137924/
> > > > > [4] https://patchwork.kernel.org/project/dri-devel/cover/20240828172605.19176-1-mihail.atanassov@arm.com/#26011577
> > > > > [5] https://patchwork.kernel.org/project/dri-devel/cover/20240828172605.19176-1-mihail.atanassov@arm.com/#26011853
> > > > > 
> > > > > Matthew Brost (6):
> > > > >     dma-resv: Add DMA_RESV_USAGE_PREEMPT
> > > > >     drm/sched: Teach scheduler about DMA_RESV_USAGE_PREEMPT
> > > > >     dma-fence: Add dma_fence_preempt base class
> > > > >     drm/sched: Teach scheduler about dma_fence_preempt type
> > > > >     drm/xe: Use DMA_RESV_USAGE_PREEMPT for preempt fences
> > > > >     drm/xe: Use dma_fence_preempt base class
> > > > > 
> > > > >    drivers/dma-buf/Makefile                    |   2 +-
> > > > >    drivers/dma-buf/dma-fence-preempt.c         | 102 ++++++++++++++++++++
> > > > >    drivers/dma-buf/dma-resv.c                  |  43 ++++++---
> > > > >    drivers/dma-buf/st-dma-resv.c               |   2 +-
> > > > >    drivers/gpu/drm/scheduler/sched_entity.c    |  29 +++++-
> > > > >    drivers/gpu/drm/scheduler/sched_main.c      |  50 +++++++---
> > > > >    drivers/gpu/drm/xe/xe_bo.c                  |  22 +----
> > > > >    drivers/gpu/drm/xe/xe_guc_submit.c          |   3 +
> > > > >    drivers/gpu/drm/xe/xe_hw_engine_group.c     |   4 +-
> > > > >    drivers/gpu/drm/xe/xe_migrate.c             |   4 +-
> > > > >    drivers/gpu/drm/xe/xe_preempt_fence.c       |  81 +++++-----------
> > > > >    drivers/gpu/drm/xe/xe_preempt_fence.h       |   2 +-
> > > > >    drivers/gpu/drm/xe/xe_preempt_fence_types.h |  11 +--
> > > > >    drivers/gpu/drm/xe/xe_pt.c                  |  12 +--
> > > > >    drivers/gpu/drm/xe/xe_vm.c                  |  22 +----
> > > > >    include/drm/gpu_scheduler.h                 |  15 +++
> > > > >    include/linux/dma-fence-preempt.h           |  54 +++++++++++
> > > > >    include/linux/dma-fence.h                   |   1 +
> > > > >    include/linux/dma-resv.h                    |  24 +++--
> > > > >    19 files changed, 326 insertions(+), 157 deletions(-)
> > > > >    create mode 100644 drivers/dma-buf/dma-fence-preempt.c
> > > > >    create mode 100644 include/linux/dma-fence-preempt.h
> > > > > 
> > 

