[Intel-gfx] [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request

Fri Nov 22 20:50:06 UTC 2019

Quoting Ruhl, Michael J (2019-11-22 20:37:59)
> >-----Original Message-----
> >From: Intel-gfx <intel-gfx-bounces at lists.freedesktop.org> On Behalf Of Chris
> >Wilson
> >Sent: Friday, November 22, 2019 2:03 AM
> >To: intel-gfx at lists.freedesktop.org
> >Subject: [Intel-gfx] [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU
> >i915_request
> >
> >As we start peeking into requests for longer and longer, e.g.
> >incorporating use of spinlocks when only protected by an
> >rcu_read_lock(), we need to be careful in how we reset the request when
> >recycling and need to preserve any barriers that may still be in use as
> >the request is reset for reuse.
> >
> >Quoting Linus Torvalds:
> >
> >> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU?
> >
> >  .. because the object can be accessed (by RCU) after the refcount has
> >  gone down to zero, and the thing has been released.
> >
> >  That's the whole and only point of SLAB_TYPESAFE_BY_RCU.
> >
> >  That flag basically says:
> >
> >  "I may end up accessing this object *after* it has been free'd,
> >  because there may be RCU lookups in flight"
> >
> >  This has nothing to do with constructors. It's ok if the object gets
> >  reused as an object of the same type and does *not* get
> >  re-initialized, because we're perfectly fine seeing old stale data.
> >
> >  What it guarantees is that the slab isn't shared with any other kind
> >  of object, _and_ that the underlying pages are free'd after an RCU
> >  quiescent period (so the pages aren't shared with another kind of
> >  object either during an RCU walk).
> >
> >  And it doesn't necessarily have to have a constructor, because the
> >  thing that a RCU walk will care about is
> >
> >    (a) guaranteed to be an object that *has* been on some RCU list (so
> >    it's not a "new" object)
> >
> >    (b) the RCU walk needs to have logic to verify that it's still the
> >    *same* object and hasn't been re-used as something else.
> >
> >  In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used
> >  immediately, but because it gets reused as the same kind of object,
> >  the RCU walker can "know" what parts have meaning for re-use, in a way
> >  it couidn't if the re-use was random.
> >
> >  That said, it *is* subtle, and people should be careful.
> >
> >> So the re-use might initialize the fields lazily, not necessarily using a ctor.
> >
> >  If you have a well-defined refcount, and use "atomic_inc_not_zero()"
> >  to guard the speculative RCU access section, and use
> >  "atomic_dec_and_test()" in the freeing section, then you should be
> >  safe wrt new allocations.
> >
> >  If you have a completely new allocation that has "random stale
> >  content", you know that it cannot be on the RCU list, so there is no
> >  speculative access that can ever see that random content.
> >
> >  So the only case you need to worry about is a re-use allocation, and
> >  you know that the refcount will start out as zero even if you don't
> >  have a constructor.
> >
> >  So you can think of the refcount itself as always having a zero
> >  constructor, *BUT* you need to be careful with ordering.
> >
> >  In particular, whoever does the allocation needs to then set the
> >  refcount to a non-zero value *after* it has initialized all the other
> >  fields. And in particular, it needs to make sure that it uses the
> >  proper memory ordering to do so.
> >
> >  NOTE! One thing to be very worried about is that re-initializing
> >  whatever RCU lists means that now the RCU walker may be walking on the
> >  wrong list so the walker may do the right thing for this particular
> >  entry, but it may miss walking *other* entries. So then you can get
> >  spurious lookup failures, because the RCU walker never walked all the
> >  way to the end of the right list. That ends up being a much more
> >  subtle bug.
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >---
> > drivers/gpu/drm/i915/i915_request.c   | 49 ++++++++++++++++++---------
> > drivers/gpu/drm/i915/i915_scheduler.c |  6 ++++
> > drivers/gpu/drm/i915/i915_scheduler.h |  1 +
> > drivers/gpu/drm/i915/i915_sw_fence.c  |  8 +++++
> > drivers/gpu/drm/i915/i915_sw_fence.h  |  2 ++
> > 5 files changed, 50 insertions(+), 16 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_request.c
> >b/drivers/gpu/drm/i915/i915_request.c
> >index 00011f9533b6..a558f64186fa 100644
> >--- a/drivers/gpu/drm/i915/i915_request.c
> >+++ b/drivers/gpu/drm/i915/i915_request.c
> >@@ -188,7 +188,7 @@ static void free_capture_list(struct i915_request
> >*request)
> > {
> >       struct i915_capture_list *capture;
> >
> >-      capture = request->capture_list;
> >+      capture = fetch_and_zero(&request->capture_list);
> >       while (capture) {
> >               struct i915_capture_list *next = capture->next;
> >
> >@@ -214,7 +214,7 @@ static void remove_from_engine(struct i915_request
> >*rq)
> >               spin_lock(&engine->active.lock);
> >               locked = engine;
> >       }
> >-      list_del(&rq->sched.link);
> >+      list_del_init(&rq->sched.link);
> >       spin_unlock_irq(&locked->active.lock);
> > }
> >
> >@@ -586,6 +586,21 @@ request_alloc_slow(struct intel_timeline *tl, gfp_t
> >gfp)
> >       return kmem_cache_alloc(global.slab_requests, gfp);
> > }
> >
> >+static void __i915_request_ctor(void *arg)
> >+{
> >+      struct i915_request *rq = arg;
> >+
> >+      spin_lock_init(&rq->lock);
> >+      i915_sched_node_init(&rq->sched);
> >+      i915_sw_fence_init(&rq->submit, submit_notify);
> >+      i915_sw_fence_init(&rq->semaphore, semaphore_notify);
> >+
> >+      rq->file_priv = NULL;
> >+      rq->capture_list = NULL;
> >+
> >+      INIT_LIST_HEAD(&rq->execute_cb);
> >+}
> 
> My understanding of the cache ctor is that it will only get invoked when
> the cache is expanded.  It does not get called on each kmem_cache_alloc().
> 
> I.e. if you do an _alloc() and there is nothing available in the cache, a new
> slab is created, and the ctor is called.

Correct.

> Is that the subtly that you are referring to?
> 
> Is doing this ctor only when new cache slabs are created sufficient?

That is precisely the point. As we only protect some accesses with RCU,
the request we are peeking at may be reused while we access it. The
challenge is to keep all such access valid. E.g. consider that the whole
of i915_gem_busy_ioctl() is performed only protected by the
rcu_read_lock(), and yet we need the results to be accurate even though
the requests be reallocated in the middle of it.

The situation that spurred the ctor was taking a spinlock in an rcu path
and finding it was re-initialised in the middle of that.
-Chris