[Intel-gfx] [PATCH] drm/i915: Switch obj->mm.lock lockdep annotations on its head
Tang, CQ
cq.tang at intel.com
Tue Nov 5 18:38:08 UTC 2019
> -----Original Message-----
> From: Daniel Vetter <daniel.vetter at ffwll.ch>
> Sent: Tuesday, November 5, 2019 2:02 AM
> To: Intel Graphics Development <intel-gfx at lists.freedesktop.org>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>; Auld, Matthew
> <matthew.auld at intel.com>; Chris Wilson <chris at chris-wilson.co.uk>; Tang,
> CQ <cq.tang at intel.com>; Ursulin, Tvrtko <tvrtko.ursulin at intel.com>; Joonas
> Lahtinen <joonas.lahtinen at linux.intel.com>; Vetter, Daniel
> <daniel.vetter at intel.com>
> Subject: [PATCH] drm/i915: Switch obj->mm.lock lockdep annotations on its
> head
>
> The trouble with having a plain nesting flag for locks which do not naturally
> nest (unlike block devices and their partitions, which is the original motivation
> for nesting levels) is that lockdep will never spot a true deadlock if you screw
> up.
>
> This patch is an attempt at trying better, by highlighting a bit more the actual
> nature of the nesting that's going on. Essentially we have two kinds of
> objects:
>
> - objects without pages allocated, which cannot be on any lru and are
> hence inaccessible to the shrinker.
>
> - objects which have pages allocated, which are on an lru, and which
> the shrinker can decide to throw out.
>
> For the former type of object, memory allcoations while holding
> obj->mm.lock are permissible. For the latter they are not. And
> get/put_pages transitions between the two types of objects.
>
> This is still not entirely fool-proof since the rules might chance.
> But as long as we run such a code ever at runtime lockdep should be able to
> observe the inconsistency and complain (like with any other lockdep class
> that we've split up in multiple classes). But there are a few clear benefits:
>
> - We can drop the nesting flag parameter from
> __i915_gem_object_put_pages, because that function by definition is
> never going allocate memory, and calling it on an object which
> doesn't have its pages allocated would be a bug.
>
> - We strictly catch more bugs, since there's not only one place in the
> entire tree which is annotated with the special class. All the
> other places that had explicit lockdep nesting annotations we're now
> going to leave up to lockdep again.
>
> - Specifically this catches stuff like calling get_pages from
> put_pages (which isn't really a good idea, if we can call get_pages
> so could the shrinker). I've seen patches do exactly that.
If we don't allow get_pages from put_pages, then we need to think a new way to swap the pages freed by put_pages.
In the lmem swapping case, put_pages can't just free the pages, it needs to save the pages to somewhere else.
The saving operation requires to call get_pages because we need temp objects for blitter engine to do the copying.
Can we use another thread to do the async copying?
--CQ
>
> Of course I fully expect CI will show me for the fool I am with this one here :-)
>
> v2: There can only be one (lockdep only has a cache for the first subclass, not
> for deeper ones, and we don't want to make these locks even slower). Still
> separate enums for better documentation.
>
> Real fix: don forget about phys objs and pin_map(), and fix the shrinker to
> have the right annotations ... silly me.
>
> v3: Forgot usertptr too ...
>
> v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
> and instead prime lockdep (Chris).
>
> v5: Appease checkpatch, no double empty lines (Chris)
>
> v6: More rebasing over selftest changes. Also somehow I forgot to push this
> patch :-/
>
> Also format comments consistently while at it.
>
> v7: Fix typo in commit message (Joonas)
>
> Also drop the priming, with the lmem merge we now have allocations while
> holding the lmem lock, which wreaks the generic priming I've done in earlier
> patches. Should probably be resurrected when lmem is fixed. See
>
> commit 232a6ebae419193f5b8da4fa869ae5089ab105c2
> Author: Matthew Auld <matthew.auld at intel.com>
> Date: Tue Oct 8 17:01:14 2019 +0100
>
> drm/i915: introduce intel_memory_region
>
> I'm keeping the priming patch locally so it wont get lost.
>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: "Tang, CQ" <cq.tang at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk> (v5)
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com> (v6)
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> ---
> drivers/gpu/drm/i915/gem/i915_gem_object.c | 4 +++-
> drivers/gpu/drm/i915/gem/i915_gem_object.h | 17 ++++++++++++++---
> .../gpu/drm/i915/gem/i915_gem_object_types.h | 6 +++++-
> drivers/gpu/drm/i915/gem/i915_gem_pages.c | 9 ++++-----
> drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +-
> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 5 ++---
> drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 4 ++--
> drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 14 +++++++-------
> .../drm/i915/selftests/intel_memory_region.c | 4 ++--
> 9 files changed, 40 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index a50296cce0d8..db103d3c8760 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -22,6 +22,8 @@
> *
> */
>
> +#include <linux/sched/mm.h>
> +
> #include "display/intel_frontbuffer.h"
> #include "gt/intel_gt.h"
> #include "i915_drv.h"
> @@ -186,7 +188,7 @@ static void __i915_gem_free_objects(struct
> drm_i915_private *i915,
> GEM_BUG_ON(!list_empty(&obj->lut_list));
>
> atomic_set(&obj->mm.pages_pin_count, 0);
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> GEM_BUG_ON(i915_gem_object_has_pages(obj));
> bitmap_free(obj->bit_17);
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 458cd51331f1..edaf7126a84d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -319,11 +319,22 @@ i915_gem_object_unpin_pages(struct
> drm_i915_gem_object *obj)
>
> enum i915_mm_subclass { /* lockdep subclass for obj-
> >mm.lock/struct_mutex */
> I915_MM_NORMAL = 0,
> - I915_MM_SHRINKER /* called "recursively" from direct-reclaim-
> esque */
> + /*
> + * Only used by struct_mutex, when called "recursively" from
> + * direct-reclaim-esque. Safe because there is only every one
> + * struct_mutex in the entire system.
> + */
> + I915_MM_SHRINKER = 1,
> + /*
> + * Used for obj->mm.lock when allocating pages. Safe because the
> object
> + * isn't yet on any LRU, and therefore the shrinker can't deadlock on
> + * it. As soon as the object has pages, obj->mm.lock nests within
> + * fs_reclaim.
> + */
> + I915_MM_GET_PAGES = 1,
> };
>
> -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> - enum i915_mm_subclass subclass);
> +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> void i915_gem_object_truncate(struct drm_i915_gem_object *obj); void
> i915_gem_object_writeback(struct drm_i915_gem_object *obj);
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 96008374a412..15f8297dc34e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -162,7 +162,11 @@ struct drm_i915_gem_object {
> atomic_t bind_count;
>
> struct {
> - struct mutex lock; /* protects the pages and their use */
> + /*
> + * Protects the pages and their use. Do not use directly, but
> + * instead go through the pin/unpin interfaces.
> + */
> + struct mutex lock;
> atomic_t pages_pin_count;
> atomic_t shrink_pin;
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index 29f4c2850745..f402c2c415c2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -106,7 +106,7 @@ int __i915_gem_object_get_pages(struct
> drm_i915_gem_object *obj) {
> int err;
>
> - err = mutex_lock_interruptible(&obj->mm.lock);
> + err = mutex_lock_interruptible_nested(&obj->mm.lock,
> +I915_MM_GET_PAGES);
> if (err)
> return err;
>
> @@ -190,8 +190,7 @@ __i915_gem_object_unset_pages(struct
> drm_i915_gem_object *obj)
> return pages;
> }
>
> -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> - enum i915_mm_subclass subclass)
> +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> {
> struct sg_table *pages;
> int err;
> @@ -202,7 +201,7 @@ int __i915_gem_object_put_pages(struct
> drm_i915_gem_object *obj,
> GEM_BUG_ON(atomic_read(&obj->bind_count));
>
> /* May be called by shrinker from within get_pages() (on another bo)
> */
> - mutex_lock_nested(&obj->mm.lock, subclass);
> + mutex_lock(&obj->mm.lock);
> if (unlikely(atomic_read(&obj->mm.pages_pin_count))) {
> err = -EBUSY;
> goto unlock;
> @@ -308,7 +307,7 @@ void *i915_gem_object_pin_map(struct
> drm_i915_gem_object *obj,
> if (!i915_gem_object_type_has(obj, flags))
> return ERR_PTR(-ENXIO);
>
> - err = mutex_lock_interruptible(&obj->mm.lock);
> + err = mutex_lock_interruptible_nested(&obj->mm.lock,
> +I915_MM_GET_PAGES);
> if (err)
> return ERR_PTR(err);
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> index 8043ff63d73f..b1b7c1b3038a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> @@ -164,7 +164,7 @@ int i915_gem_object_attach_phys(struct
> drm_i915_gem_object *obj, int align)
> if (err)
> return err;
>
> - mutex_lock(&obj->mm.lock);
> + mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
>
> if (obj->mm.madv != I915_MADV_WILLNEED) {
> err = -EFAULT;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index fd3ce6da8497..066b3df677e8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -57,7 +57,7 @@ static bool unsafe_drop_pages(struct
> drm_i915_gem_object *obj,
> flags = I915_GEM_OBJECT_UNBIND_ACTIVE;
>
> if (i915_gem_object_unbind(obj, flags) == 0)
> - __i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
> + __i915_gem_object_put_pages(obj);
>
> return !i915_gem_object_has_pages(obj); } @@ -209,8 +209,7 @@
> i915_gem_shrink(struct drm_i915_private *i915,
>
> if (unsafe_drop_pages(obj, shrink)) {
> /* May arrive from get_pages on another bo
> */
> - mutex_lock_nested(&obj->mm.lock,
> - I915_MM_SHRINKER);
> + mutex_lock(&obj->mm.lock);
> if (!i915_gem_object_has_pages(obj)) {
> try_to_writeback(obj, shrink);
> count += obj->base.size >>
> PAGE_SHIFT; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> index 1e045c337044..ee65c6acf0e2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> @@ -131,7 +131,7 @@ userptr_mn_invalidate_range_start(struct
> mmu_notifier *_mn,
> ret = i915_gem_object_unbind(obj,
>
> I915_GEM_OBJECT_UNBIND_ACTIVE);
> if (ret == 0)
> - ret = __i915_gem_object_put_pages(obj,
> I915_MM_SHRINKER);
> + ret = __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> if (ret)
> return ret;
> @@ -483,7 +483,7 @@ __i915_gem_userptr_get_pages_worker(struct
> work_struct *_work)
> }
> }
>
> - mutex_lock(&obj->mm.lock);
> + mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
> if (obj->userptr.work == &work->work) {
> struct sg_table *pages = ERR_PTR(ret);
>
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index 688c49a24f32..5c9583349077 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -517,7 +517,7 @@ static int
> igt_mock_memory_region_huge_pages(void *arg)
> i915_vma_unpin(vma);
> i915_vma_close(vma);
>
> - __i915_gem_object_put_pages(obj,
> I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> }
> }
> @@ -650,7 +650,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
> i915_vma_close(vma);
>
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> }
>
> @@ -678,7 +678,7 @@ static void close_object_list(struct list_head *objects,
>
> list_del(&obj->st_link);
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> }
> }
> @@ -948,7 +948,7 @@ static int igt_mock_ppgtt_64K(void *arg)
> i915_vma_close(vma);
>
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj,
> I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> }
> }
> @@ -1301,7 +1301,7 @@ static int igt_ppgtt_exhaust_huge(void *arg)
> }
>
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj,
> I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
> }
> }
> @@ -1442,7 +1442,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
> }
> out_unpin:
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> out_put:
> i915_gem_object_put(obj);
>
> @@ -1530,7 +1530,7 @@ static int igt_ppgtt_sanity_check(void *arg)
> err = igt_write_huge(ctx, obj);
>
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj,
> I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> i915_gem_object_put(obj);
>
> if (err) {
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index 19e1cca8f143..95d609abd39b 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -32,7 +32,7 @@ static void close_objects(struct intel_memory_region
> *mem,
> if (i915_gem_object_has_pinned_pages(obj))
> i915_gem_object_unpin_pages(obj);
> /* No polluting the memory region between tests */
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> list_del(&obj->st_link);
> i915_gem_object_put(obj);
> }
> @@ -122,7 +122,7 @@ igt_object_create(struct intel_memory_region *mem,
> static void igt_object_release(struct drm_i915_gem_object *obj) {
> i915_gem_object_unpin_pages(obj);
> - __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> + __i915_gem_object_put_pages(obj);
> list_del(&obj->st_link);
> i915_gem_object_put(obj);
> }
> --
> 2.24.0.rc2
More information about the Intel-gfx
mailing list