[Intel-gfx] [Linaro-mm-sig] [RFC] Cross-driver ww transaction lock lists
Daniel Vetter
daniel at ffwll.ch
Thu Apr 15 14:40:47 UTC 2021
On Thu, Apr 15, 2021 at 4:02 PM Thomas Hellström (Intel)
<thomas_os at shipmail.org> wrote:
>
>
> On 4/15/21 3:37 PM, Daniel Vetter wrote:
> > On Tue, Apr 13, 2021 at 09:57:06AM +0200, Christian König wrote:
> >>
> >> Am 13.04.21 um 09:50 schrieb Thomas Hellström:
> >>> Hi!
> >>>
> >>> During the dma_resv conversion of the i915 driver, we've been using ww
> >>> transaction lock lists to keep track of ww_mutexes that are locked
> >>> during the transaction so that they can be batch unlocked at suitable
> >>> locations. Including also the LMEM/VRAM eviction we've ended up with
> >>> two static lists per transaction context; one typically unlocked at the
> >>> end of transaction and one initialized before and unlocked after each
> >>> buffer object validate. This enables us to do sleeping locking at
> >>> eviction and keep objects locked on the eviction list until we
> >>> eventually succeed allocating memory (modulo minor flaws of course).
> >>>
> >>> It would be beneficial with the i915 TTM conversion to be able to
> >>> introduce a similar functionality that would work in ttm but also
> >>> cross-driver in, for example move_notify. It would also be beneficial
> >>> to be able to put any dma_resv ww mutex on the lists, and not require
> >>> it to be embedded in a particular object type.
> >>>
> >>> I started scetching on some utilities for this. For TTM, for example,
> >>> the idea would be to pass a list head for the ww transaction lock list
> >>> in the ttm_operation_ctx. A function taking a ww_mutex could then
> >>> either attach a grabbed lock to the list for batch unlocking, or be
> >>> responsible for unlocking when the lock's scope is exited. It's also
> >>> possible to create sublists if so desired. I believe the below would be
> >>> sufficient to cover the i915 functionality.
> >>>
> >>> Any comments and suggestions appreciated!
> >> ah yes Daniel and I haven been discussing something like this for years.
> >>
> >> I also came up with rough implementation, but we always ran into lifetime
> >> issues.
> >>
> >> In other words the ww_mutexes which are on the list would need to be kept
> >> alive until unlocked.
> >>
> >> Because of this we kind of backed up and said we would need this on the GEM
> >> level instead of working with dma_resv objects.
> > Yeah there's a few funny concerns here that make this awkward:
> > - For simplicity doing these helpers at the gem level should make things a
> > bit easier, because then we have a standard way to drop the reference.
> > It does mean that the only thing you can lock like this are gem objects,
> > but I think that's fine. At least for a first cut.
> >
> > - This is a bit awkward for vmwgfx, but a) Zack has mentioned he's looking
> > into adopting gem bo internally to be able to drop a pile of code and
> > stop making vmwgfx the only special-case we have b) drivers which don't
> > need this won't need this, so should be fine.
> >
> > The other awkward thing I guess is that ttm would need to use the
> > embedded kref from the gem bo, but that should be transparent I think.
> >
> > - Next up is dma-buf: For i915 we'd like to do the same eviction trick
> > also through p2p dma-buf callbacks, so that this works the same as
> > eviction/reservation within a gpu. But for these internal bo you might
> > not have a dma-buf, so we can't just lift the trick to the dma-buf
> > level. But I think if we pass e.g. a struct list_head and a callback to
> > unreference/unlock all the buffers in there to the exporter, plus
> > similar for the slowpath lock, then that should be doable without
> > glorious layering inversions between dma-buf and gem.
> >
> > I think for dma-buf it should even be ok if this requires that we
> > allocate an entire structure with kmalloc or something, allocating
> > memory while holding dma_resv is ok.
>
> Yes, the thing here with the suggested helpers is that you would just
> embed a trans_lockitem struct in the gem object (and defines the gem
> object ops). Otherwise and for passing to dma-buf this is pretty much
> exactly what you are suggesting, but the huge benefit of encapsulating
> the needed members like this is that when we need to change something we
> change it in just one place.
>
> For anything that doesn't have a gem object (dma-buf, vmwgfx or
> whatever) you have the choice of either allocating a struct
> trans_lockitem or embed it wherever you prefer. In particular, this is
> beneficial where you have a single dma-resv class ww-mutex sitting
> somewhere in the way and you don't want to needlessly have a gem object
> that embeds it.
The thing is, everyone who actually uses dma_resv_lock has a
gem_buffer_object underneath. So it feels a bit like flexibility for
no real need, and I think it would make it slightly more awkard for
gem drivers to neatly integrate into their cs path. The lockitem
struct works, but it is a bit cumbersome.
Also if we add some wrappers to e.g. add a gem_bo to the ctx, then if
we decide to slip the lockitem in there, we still only need to touch
the helper code, and not all drivers.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list