[Nouveau] [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects

Tue Oct 3 18:57:29 UTC 2023

On 10/3/23 18:55, Danilo Krummrich wrote:
> It seems like we're mostly aligned on this series, except for the key
> controversy we're discussing for a few versions now: locking of the internal
> lists. Hence, let's just re-iterate the options we have to get this out of the
> way.
>
> (1) The spinlock dance. This basically works for every use case, updating the VA
>      space from the IOCTL, from the fence signaling path or anywhere else.
>      However, it has the downside of requiring spin_lock() / spin_unlock() for
>      *each* list element when locking all external objects and validating all
>      evicted objects. Typically, the amount of extobjs and evicted objects
>      shouldn't be excessive, but there might be exceptions, e.g. Xe.
>
> (2) The dma-resv lock dance. This is convinient for drivers updating the VA
>      space from a VM_BIND ioctl() and is especially efficient if such drivers
>      have a huge amount of external and/or evicted objects to manage. However,
>      the downsides are that it requires a few tricks in drivers updating the VA
>      space from the fence signaling path (e.g. job_run()). Design wise, I'm still
>      skeptical that it is a good idea to protect internal data structures with
>      external locks in a way that it's not clear to callers that a certain
>      function would access one of those resources and hence needs protection.
>      E.g. it is counter intuitive that drm_gpuvm_bo_put() would require both the
>      dma-resv lock of the corresponding object and the VM's dma-resv lock held.
>      (Additionally, there were some concerns from amdgpu regarding flexibility in
>      terms of using GPUVM for non-VM_BIND uAPIs and compute, however, AFAICS
>      those discussions did not complete and to me it's still unclear why it
>      wouldn't work.)
>
> (3) Simply use an internal mutex per list. This adds a tiny (IMHO negligible)
>      overhead for drivers updating the VA space from a VM_BIND ioctl(), namely
>      a *single* mutex_lock()/mutex_unlock() when locking all external objects
>      and validating all evicted objects. And it still requires some tricks for
>      drivers updating the VA space from the fence signaling path. However, it's
>      as simple as it can be and hence way less error prone as well as
>      self-contained and hence easy to use. Additionally, it's flexible in a way
>      that we don't have any expections on drivers to already hold certain locks
>      that the driver in some situation might not be able to acquire in the first
>      place.
>
> (4) Arbitrary combinations of the above. For instance, the current V5 implements
>      both (1) and (2) (as either one or the other). But also (1) and (3) (as in
>      (1) additionally to (3)) would be an option, where a driver could opt-in for
>      the spinlock dance in case it updates the VA space from the fence signaling
>      path.
>
> I also considered a few other options as well, however, they don't seem to be
> flexible enough. For instance, as by now we could use SRCU for the external
> object list. However, this falls apart once a driver wants to remove and re-add
> extobjs for the same VM_BO instance. (For the same reason it wouldn't work for
> evicted objects.)
>
> Personally, after seeing the weird implications of (1), (2) and a combination of
> both, I tend to go with (3). Optionally, with an opt-in for (1). The reason for
> the latter is that with (3) the weirdness of (1) by its own mostly disappears.
>
> Please let me know what you think, and, of course, other ideas than the
> mentioned ones above are still welcome.
>
> - Danilo
>
Here are the locking principles Daniel put together and Dave once called 
out for us to be applying when reviewing DRM code. These were prompted 
by very fragile and hard to understand locking patterns in the i915 
driver and I think the xe vm_bind locking design was made with these in 
mind, (not sure exactly who wrote what, though so can't say for sure).

https://blog.ffwll.ch/2022/07/locking-engineering.html
https://blog.ffwll.ch/2022/08/locking-hierarchy.html

At least to me, this motivates using the resv design unless we strictly 
need lower level locks that are taken in the eviction paths or userptr 
invalidation paths, but doesn't rule out spinlocks or lock dropping 
tricks where these are really necessary. But pretty much rules out RCU / 
SRCU from what I can tell.

It also calls for documenting how individual members of structs are 
protected when ever possible.

Thanks,
Thomas