[Intel-gfx] [PATCH] drm/i915: Hide unshrinkable context objects from the shrinker
Chris Wilson
chris at chris-wilson.co.uk
Mon Jul 22 21:51:54 UTC 2019
Quoting Tvrtko Ursulin (2019-07-22 13:08:42)
>
> On 19/07/2019 18:21, Chris Wilson wrote:
> > The shrinker cannot touch objects used by the contexts (logical state
> > and ring). Currently we mark those as "pin_global" to let the shrinker
> > skip over them, however, if we remove them from the shrinker lists
> > entirely, we don't even have to include them in our shrink accounting.
> >
> > By keeping the unshrinkable objects in our shrinker tracking, we report
> > a large number of objects available to be shrunk, and leave the shrinker
> > deeply unsatisfied when we fail to reclaim those. The shrinker will
> > persist in trying to reclaim the unavailable objects, forcing the system
> > into a livelock (not even hitting the dreaded oomkiller).
> >
> > v2: Extend unshrinkable protection for perma-pinned scratch and guc
> > allocations (Tvrtko)
> > v3: Notice that we should be pinned when marking unshrinkable and so the
> > link cannot be empty; merge duplicate paths.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> > drivers/gpu/drm/i915/gem/i915_gem_object.c | 11 +---
> > drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 ++
> > drivers/gpu/drm/i915/gem/i915_gem_pages.c | 13 +----
> > drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 58 ++++++++++++++++++++
> > drivers/gpu/drm/i915/gt/intel_context.c | 4 +-
> > drivers/gpu/drm/i915/gt/intel_gt.c | 3 +-
> > drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 17 +++---
> > drivers/gpu/drm/i915/gt/uc/intel_guc.c | 2 +-
> > drivers/gpu/drm/i915/i915_debugfs.c | 3 +-
> > drivers/gpu/drm/i915/i915_vma.c | 16 ++++++
> > drivers/gpu/drm/i915/i915_vma.h | 4 ++
> > 11 files changed, 102 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> > index d5197a2a106f..4ea97fca9c35 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> > @@ -63,6 +63,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > spin_lock_init(&obj->vma.lock);
> > INIT_LIST_HEAD(&obj->vma.list);
> >
> > + INIT_LIST_HEAD(&obj->mm.link);
> > +
> > INIT_LIST_HEAD(&obj->lut_list);
> > INIT_LIST_HEAD(&obj->batch_pool_link);
> >
> > @@ -273,14 +275,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > * or else we may oom whilst there are plenty of deferred
> > * freed objects.
> > */
> > - if (i915_gem_object_has_pages(obj) &&
> > - i915_gem_object_is_shrinkable(obj)) {
> > - unsigned long flags;
> > -
> > - spin_lock_irqsave(&i915->mm.obj_lock, flags);
> > - list_del_init(&obj->mm.link);
> > - spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> > - }
> > + i915_gem_object_make_unshrinkable(obj);
> >
> > /*
> > * Since we require blocking on struct_mutex to unbind the freed
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > index 67aea07ea019..3714cf234d64 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > @@ -394,6 +394,10 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > unsigned int flags);
> > void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
> >
> > +void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj);
> > +void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj);
> > +void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj);
> > +
> > static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> > {
> > if (obj->cache_dirty)
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> > index b36ad269f4ea..92ad3cc220e3 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> > @@ -153,24 +153,13 @@ static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
> > struct sg_table *
> > __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
> > {
> > - struct drm_i915_private *i915 = to_i915(obj->base.dev);
> > struct sg_table *pages;
> >
> > pages = fetch_and_zero(&obj->mm.pages);
> > if (IS_ERR_OR_NULL(pages))
> > return pages;
> >
> > - if (i915_gem_object_is_shrinkable(obj)) {
> > - unsigned long flags;
> > -
> > - spin_lock_irqsave(&i915->mm.obj_lock, flags);
> > -
> > - list_del(&obj->mm.link);
> > - i915->mm.shrink_count--;
> > - i915->mm.shrink_memory -= obj->base.size;
> > -
> > - spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> > - }
> > + i915_gem_object_make_unshrinkable(obj);
> >
> > if (obj->mm.mapping) {
> > void *ptr;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > index 3f4c6bdcc3c3..5ab7df53c2a0 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > @@ -530,3 +530,61 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
> > if (unlock)
> > mutex_release(&i915->drm.struct_mutex.dep_map, 0, _RET_IP_);
> > }
> > +
> > +#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
> > +
> > +void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj)
> > +{
> > + /*
> > + * We can only be called while the pages are pinned or when
> > + * the pages are released. If pinned, we should only be called
> > + * from a single caller under controlled conditions; and on release
> > + * only one caller may release us. The two paths must never cross.
> > + */
> > + if (!list_empty(&obj->mm.link)) { /* pinned by caller */
>
> It's making me nervous. Are you avoiding the check under the lock just
> as an optimisation? It's not on any hot paths, or at least none that
> are very hot? Ring/context pin and the like...
Because it's called from inside obj->mm.lock with pin_count==0, and from
outside the lock when pinned by the caller. This portion is easy as an
atomic read; it's the later check inside the lock, now elided, that
requires the thought. I consider i915->mm.obj_lock to be taken frequently
enough to be a concern, especially in the context of this patch, where we
hit a livelock in the shrinker that leaves it running at 100%.
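
For illustration, the shape I have in mind is roughly the following. This
is only a sketch pieced together from the hunks quoted above, not the
literal patch: the helper name make_unshrinkable_sketch is invented for
the example, and the GEM_BUG_ON stands in for the check that has been
elided. The unlocked list_empty() read is the cheap filter; the caller's
pin (or the single release path) guarantees the link cannot change
underneath us, so under i915->mm.obj_lock the only remaining work is the
removal and the shrink accounting.

static void make_unshrinkable_sketch(struct drm_i915_gem_object *obj)
{
	/*
	 * Unlocked read: safe because obj->mm.link only transitions while
	 * the caller holds a pin (single caller) or along the one release
	 * path, and the two never overlap.
	 */
	if (!list_empty(&obj->mm.link)) {
		struct drm_i915_private *i915 = obj_to_i915(obj);
		unsigned long flags;

		spin_lock_irqsave(&i915->mm.obj_lock, flags);
		GEM_BUG_ON(list_empty(&obj->mm.link)); /* the elided check */

		/* Drop the object from the shrinker list and accounting. */
		list_del_init(&obj->mm.link);
		i915->mm.shrink_count--;
		i915->mm.shrink_memory -= obj->base.size;

		spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
	}
}

The obj_lock critical section stays tiny and is skipped entirely for
objects that were never on the shrinker lists, which is the point of
keeping the unlocked read on the common path.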
-Chris