[Intel-gfx] [PATCH 00/37] Replace obj->mm.lock with reservation_ww_class

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Sep 3 14:25:05 UTC 2020


On 10/08/2020 10:51, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-08-06 10:21:38)
>>
>> On 05/08/2020 17:22, Thomas Hellström (Intel) wrote:
>>> Hi, Chris,
>>>
>>>
>>> On 8/5/20 2:21 PM, Chris Wilson wrote:
>>>> Long story short, we need to manage evictions using dma_resv & dma_fence
>>>> tracking. The backing storage will then be managed using the ww_mutex
>>>> borrowed from (and shared via) obj->base.resv, rather than the current
>>>> obj->mm.lock.
>>>>
>>>> Skipping over the breadcrumbs,
>>>
>>> While perhaps needed fixes, could we submit them as a separate series,
>>> since they, from what I can tell, are not a direct part of the locking
>>> rework, and some of them were actually part of a series that Dave NaK'ed
>>> and may require additional justification?
>>>
>>>
>>>>    the first step is to remove the final
>>>> crutches of struct_mutex from execbuf and to broaden the hold for the
>>>> dma-resv to guard not just publishing the dma-fences, but for the
>>>> duration of the execbuf submission (holding all objects and their
>>>> backing store from the point of acquisition to publishing of the final
>>>> GPU work, after which the guard is delegated to the dma-fences).
>>>>
>>>> This is of course made complicated by our history. On top of the user's
>>>> objects, we also have the HW/kernel objects with their own lifetimes,
>>>> and a bunch of auxiliary objects used for working around unhappy HW and
>>>> for providing the legacy relocation mechanism. We add every auxiliary
>>>> object to the list of user objects required, and attempt to acquire them
>>>> en masse. Since all the objects can be known a priori, we can build a
>>>> list of those objects and pass that to a routine that can resolve the
>>>> -EDEADLK (and evictions). [To avoid relocations imposing a penalty on
>>>> sane userspace that avoids them, we do not touch any relocations until
>>>> necessary, at will point we have to unroll the state, and rebuild a new
>>>> list with more auxiliary buffers to accommodate the extra
>>>> copy_from_user].
>>>> More examples are included as to how we can break down operations
>>>> involving multiple objects into an acquire phase prior to those
>>>> operations, keeping the -EDEADLK handling under control.
>>>>
>>>> execbuf is the unique interface in that it deals with multiple user
>>>> and kernel buffers. After that, we have callers that in principle care
>>>> about accessing a single buffer, and so can be migrated over to a helper
>>>> that permits only holding one such buffer at a time. That enables us to
>>>> swap out obj->mm.lock for obj->base.resv->lock, and use lockdep to spot
>>>> illegal nesting, and to throw away the temporary pins by replacing them
>>>> with holding the ww_mutex for the duration instead.
>>>>
>>>> What's changed? Some patch splitting and we need to pull in Matthew's
>>>> patch to map the page directories under the ww_mutex.
>>>
>>> I would still like to see a justification for the newly introduced async
>>> work, as opposed to add it as an optimizing / regression fixing series
>>> follow the locking rework. That async work introduces a bunch of code
>>> complexity and it would be beneficial to see a discussion of the
>>> tradeoffs and how it alignes with the upstream proposed dma-fence
>>> annotations
>>
>> On the topic of annotations, maybe do a trybot run with them enabled
>> with the latest series and then see what pops up.
> 
> It didn't change since the run Daniel did. In that run there were two
> splats I found,
> 
> vm->mutex -> dma_fence_is_signaled/dma_fence_signal -> async work + vm->mutex

Coming back to this after a few weeks - was there any further discussion 
on the dma_fence_is_signaled problem? I don't think Daniel you ever 
replied, or at least I did not see it. Can the signaling section in 
there possibly remain and not be obviously incorrect?

Regards,

Tvrtko

> 
> This is the overconstraint Daniel mentioned, it's an entirely false
> coupling from the assumption that the async work runs inside the
> parent's signaling context. Gut feeling says that should be resolvable
> by using lockdep's dependency analysis between local execution/signal
> contexts.
> 
> The other was
> 
> userptr mmu_notifier -> i915_vma_unbind + i915_vma_revoke -> mmu_notifier
> [with the interplay between the two global lockmap's, two strikes?]
> 
> And that is the well known false positive for userptr that we break by
> never allowing the userptr to in be the dev->mapping.
> 
> A path not exercised but it should find is synchronous fence eviction.
> While binding a fresh fence ensures the vma is idle before starting, we
> do not do the same for the fence we steal. That's why care was taken for
> execbuf, and it just means teaching the synchronous path to similarly
> move the wait outside of the mutex (the simplest will be to wrap the
> async handling).
> -Chris
> 


More information about the Intel-gfx mailing list