[Intel-gfx] [PATCH 28/66] drm/i915/gem: Replace i915_gem_object.mm.mutex with reservation_ww_class
Thomas Hellström (Intel)
thomas_os at shipmail.org
Tue Jul 28 11:17:15 UTC 2020
On 7/16/20 5:53 PM, Tvrtko Ursulin wrote:
>
> On 15/07/2020 16:43, Maarten Lankhorst wrote:
>> On 15-07-2020 at 13:51, Chris Wilson wrote:
>>> Our goal is to pull all memory reservations (next iteration
>>> obj->ops->get_pages()) under a ww_mutex, and to align those
>>> reservations with other drivers, i.e. control all such allocations
>>> with the reservation_ww_class. Currently, this is under the purview
>>> of the obj->mm.mutex, and while obj->mm remains an embedded struct
>>> we can "simply" switch to using the reservation_ww_class
>>> obj->base.resv->lock.
>>>
>>> The major consequence is the impact on the shrinker paths, as the
>>> reservation_ww_class is used to wrap allocations, and a ww_mutex
>>> does not support subclassing, so we cannot do our usual trick of
>>> knowing that we never recurse inside the shrinker and instead have
>>> to finish the reclaim with a trylock. This may result in us failing
>>> to release the pages after having released the vma. This will have
>>> to do until a better idea comes along.
>>>
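For illustration, the reclaim shape being described reduces to roughly
this. It is only a sketch against the generic dma_resv API;
try_shrink_object() and its glue are invented for the example and are
not part of the patch:

static unsigned long try_shrink_object(struct drm_i915_gem_object *obj)
{
    unsigned long freed = 0;

    /* ww_mutex has no mutex_lock_nested() equivalent, so a reclaim
     * path that may already sit inside another reservation can only
     * trylock and skip the object on contention.
     */
    if (!dma_resv_trylock(obj->base.resv))
        return 0;

    if (i915_gem_object_unbind(obj, 0) == 0) {
        __i915_gem_object_put_pages(obj);
        if (!i915_gem_object_has_pages(obj))
            freed = obj->base.size >> PAGE_SHIFT;
    }

    dma_resv_unlock(obj->base.resv);
    return freed;
}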
>>> However, this step only converts the mutex over and continues to
>>> treat everything as a single allocation, pinning the pages. With the
>>> ww_mutex in place we can remove the temporary pinning, as we can
>>> then reserve all storage en masse.
>>>
>>> One last thing to do: kill the implicit page pinning for active vma.
>>> This will require us to invalidate the vma->pages when the backing
>>> store is removed (and we expect that while the vma is active, we
>>> mark the backing store as active so that it cannot be removed while
>>> the HW is busy.)
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> [snip]
>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> index dc8f052a0ffe..4e928103a38f 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> @@ -47,10 +47,7 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
>>>     if (!(shrink & I915_SHRINK_BOUND))
>>>         flags = I915_GEM_OBJECT_UNBIND_TEST;
>>> -   if (i915_gem_object_unbind(obj, flags) == 0)
>>> -       __i915_gem_object_put_pages(obj);
>>> -
>>> -   return !i915_gem_object_has_pages(obj);
>>> +   return i915_gem_object_unbind(obj, flags) == 0;
>>> }
>>> static void try_to_writeback(struct drm_i915_gem_object *obj,
>>> @@ -199,14 +196,14 @@ i915_gem_shrink(struct drm_i915_private *i915,
>>>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
>>> -       if (unsafe_drop_pages(obj, shrink)) {
>>> -           /* May arrive from get_pages on another bo */
>>> -           mutex_lock(&obj->mm.lock);
>>> +       if (unsafe_drop_pages(obj, shrink) &&
>>> +           i915_gem_object_trylock(obj)) {
>
>> Why trylock? Because of the nesting? In that case, still use the ww
>> ctx if provided, please.
>
> By "if provided" you mean for code paths where we are calling the
> shrinker ourselves, as opposed to reclaim, like shmem_get_pages?
>
> That indeed sounds like the right thing to do: since all the
> get_pages from execbuf are in the reservation phase, collecting a
> list of GEM objects to lock, the ones to shrink sound like they
> should be on that list.
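Hypothetical plumbing for that idea (shrinker_lock() is invented here,
it is not in the patch): direct callers pass their acquire context and
may block, while reclaim passes NULL and degrades to trylock.

static int shrinker_lock(struct drm_i915_gem_object *obj,
                         struct ww_acquire_ctx *ww)
{
    if (!ww) /* reclaim: no context, must not block */
        return dma_resv_trylock(obj->base.resv) ? 0 : -EBUSY;

    /* may return -EDEADLK, which the caller's ww loop backs off on */
    return dma_resv_lock(obj->base.resv, ww);
}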
>
>>> +           __i915_gem_object_put_pages(obj);
>>>             if (!i915_gem_object_has_pages(obj)) {
>>>                 try_to_writeback(obj, shrink);
>>>                 count += obj->base.size >> PAGE_SHIFT;
>>>             }
>>> -           mutex_unlock(&obj->mm.lock);
>>> +           i915_gem_object_unlock(obj);
>>>         }
>>>         scanned += obj->base.size >> PAGE_SHIFT;
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
>>> index ff72ee2fd9cd..ac12e1c20e66 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
>>> @@ -265,7 +265,6 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
>>>      * pages to prevent them being swapped out and causing corruption
>>>      * due to the change in swizzling.
>>>      */
>>> -   mutex_lock(&obj->mm.lock);
>>>     if (i915_gem_object_has_pages(obj) &&
>>>         obj->mm.madv == I915_MADV_WILLNEED &&
>>>         i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
>>> @@ -280,7 +279,6 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
>>>             obj->mm.quirked = true;
>>>         }
>>>     }
>>> -   mutex_unlock(&obj->mm.lock);
>>>     spin_lock(&obj->vma.lock);
>>>     for_each_ggtt_vma(vma, obj) {
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>>> index e946032b13e4..80907c00c6fd 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
>>> @@ -129,8 +129,15 @@ userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
>>>         ret = i915_gem_object_unbind(obj,
>>>                                      I915_GEM_OBJECT_UNBIND_ACTIVE |
>>>                                      I915_GEM_OBJECT_UNBIND_BARRIER);
>>> -       if (ret == 0)
>>> -           ret = __i915_gem_object_put_pages(obj);
>>> +       if (ret == 0) {
>>> +           /* ww_mutex and mmu_notifier is fs_reclaim tainted */
>>> +           if (i915_gem_object_trylock(obj)) {
>>> +               ret = __i915_gem_object_put_pages(obj);
>>> +               i915_gem_object_unlock(obj);
>>> +           } else {
>>> +               ret = -EAGAIN;
>>> +           }
>>> +       }
>>
>> I'm not sure upstream will agree with this kind of API:
>>
>> 1. It will deadlock when RT tasks are used.
>
> It will or it can? Which part? It will break out of the loop if
> trylock fails.
>
>>
>> 2. You start throwing -EAGAIN because you don't have the correct
>> ordering of locking; this needs fixing first.
>
> Is it about correct ordering of locks or something else? If memory
> allocation is allowed under dma_resv.lock, then the opposite order
> cannot be taken in any case.
>
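Spelled out as a sketch (not driver code; the two functions only name
the two sides of the inversion):

static int thread_a_get_pages(struct dma_resv *resv)
{
    void *mem;
    int err;

    err = dma_resv_lock(resv, NULL);
    if (err)
        return err;

    /* A GFP_KERNEL allocation under the reservation may enter
     * fs_reclaim and, from there, an mmu notifier...
     */
    mem = kmalloc(SZ_4K, GFP_KERNEL);
    kfree(mem);

    dma_resv_unlock(resv);
    return 0;
}

static void thread_b_notifier(struct dma_resv *resv)
{
    /* ...so blocking on the same reservation here can deadlock
     * against thread A; only trylock (plus -EAGAIN and retry) is
     * safe in this direction.
     */
    if (dma_resv_trylock(resv)) {
        /* drop the pages */
        dma_resv_unlock(resv);
    }
}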
> I've had a brief look at the amdgpu solution and maybe I
> misunderstood something, but it looks like a BKL approach with the
> device-level notifier_lock. Their userptr notifier blocks on that
> one, not on the dma_resv lock, but that also means their command
> submission (amdgpu_cs_submit) blocks on the same lock while obtaining
> the backing store.
If I read Christian right, it blocks on that lock only just before
command submission, to validate the sequence number. If there is a
mismatch, it needs to rerun the CS. I'm not sure how common userptr
buffers are, but if a device-wide mutex hurts too much, there are
perhaps more fine-grained solutions. (Like an rw semaphore, unlocked
before the fence wait in the notifier: CSes that are unaffected
shouldn't need to wait...)
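Roughly the shape I have in mind, as a sketch: my_device and my_bo are
made up, and the mmu_interval_* helpers are the generic kernel ones.

struct my_device {
    struct rw_semaphore notifier_lock;
};

struct my_bo {
    struct mmu_interval_notifier notifier;
    struct my_device *dev;
    struct dma_resv *resv;
};

static int cs_commit(struct my_bo *bo)
{
    unsigned long seq;

retry:
    seq = mmu_interval_read_begin(&bo->notifier);

    /* ...re-pin the userptr pages, build the job, all unlocked... */

    down_read(&bo->dev->notifier_lock);
    if (mmu_interval_read_retry(&bo->notifier, seq)) {
        up_read(&bo->dev->notifier_lock);
        goto retry; /* the range changed under us: rerun the CS */
    }

    /* publish the job / install fences while the read side is held */
    up_read(&bo->dev->notifier_lock);
    return 0;
}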
>
> So it looks like a big-hammer approach not directly related to the
> story of dma_resv locking. Maybe we could do the same big-hammer
> approach, although I am not sure how it is deadlock-free.
>
> What happens, for instance, if someone submits a userptr batch which
> gets unmapped while amdgpu_cs_submit is holding the notifier_lock?
My understanding is that the unmapping operation blocks on the
notifier_lock in the mmu notifier?
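I.e., reusing the made-up structures from the sketch above, something
like:

static bool my_bo_invalidate(struct mmu_interval_notifier *mni,
                             const struct mmu_notifier_range *range,
                             unsigned long cur_seq)
{
    struct my_bo *bo = container_of(mni, struct my_bo, notifier);

    if (!mmu_notifier_range_blockable(range))
        return false;

    /* The unmap blocks here until a concurrent cs_commit() has
     * either published its job or noticed the new sequence number.
     */
    down_write(&bo->dev->notifier_lock);
    mmu_interval_set_seq(mni, cur_seq);
    up_write(&bo->dev->notifier_lock);

    /* Fence wait outside the lock, so unaffected CS need not wait */
    dma_resv_wait_timeout_rcu(bo->resv, true, false,
                              MAX_SCHEDULE_TIMEOUT);
    return true;
}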
/Thomas
>
> If you understand amdgpu better, please share some insights. I
> certainly only looked at it briefly today, so I may be wrong.
>
> Regards,
>
> Tvrtko