[Intel-gfx] [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request

Fri Nov 22 21:08:42 UTC 2019

>-----Original Message-----
>From: Chris Wilson <chris at chris-wilson.co.uk>
>Sent: Friday, November 22, 2019 3:50 PM
>To: Ruhl, Michael J <michael.j.ruhl at intel.com>; intel-
>gfx at lists.freedesktop.org
>Subject: RE: [Intel-gfx] [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU
>i915_request
>
>Quoting Ruhl, Michael J (2019-11-22 20:37:59)
>> >-----Original Message-----
>> >From: Intel-gfx <intel-gfx-bounces at lists.freedesktop.org> On Behalf Of
>Chris
>> >Wilson
>> >Sent: Friday, November 22, 2019 2:03 AM
>> >To: intel-gfx at lists.freedesktop.org
>> >Subject: [Intel-gfx] [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU
>> >i915_request
>> >
>> >As we start peeking into requests for longer and longer, e.g.
>> >incorporating use of spinlocks when only protected by an
>> >rcu_read_lock(), we need to be careful in how we reset the request when
>> >recycling and need to preserve any barriers that may still be in use as
>> >the request is reset for reuse.
>> >
>> >Quoting Linus Torvalds:
>> >
>> >> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU?
>> >
>> >  .. because the object can be accessed (by RCU) after the refcount has
>> >  gone down to zero, and the thing has been released.
>> >
>> >  That's the whole and only point of SLAB_TYPESAFE_BY_RCU.
>> >
>> >  That flag basically says:
>> >
>> >  "I may end up accessing this object *after* it has been free'd,
>> >  because there may be RCU lookups in flight"
>> >
>> >  This has nothing to do with constructors. It's ok if the object gets
>> >  reused as an object of the same type and does *not* get
>> >  re-initialized, because we're perfectly fine seeing old stale data.
>> >
>> >  What it guarantees is that the slab isn't shared with any other kind
>> >  of object, _and_ that the underlying pages are free'd after an RCU
>> >  quiescent period (so the pages aren't shared with another kind of
>> >  object either during an RCU walk).
>> >
>> >  And it doesn't necessarily have to have a constructor, because the
>> >  thing that a RCU walk will care about is
>> >
>> >    (a) guaranteed to be an object that *has* been on some RCU list (so
>> >    it's not a "new" object)
>> >
>> >    (b) the RCU walk needs to have logic to verify that it's still the
>> >    *same* object and hasn't been re-used as something else.
>> >
>> >  In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used
>> >  immediately, but because it gets reused as the same kind of object,
>> >  the RCU walker can "know" what parts have meaning for re-use, in a way
>> >  it couidn't if the re-use was random.
>> >
>> >  That said, it *is* subtle, and people should be careful.
>> >
>> >> So the re-use might initialize the fields lazily, not necessarily using a ctor.
>> >
>> >  If you have a well-defined refcount, and use "atomic_inc_not_zero()"
>> >  to guard the speculative RCU access section, and use
>> >  "atomic_dec_and_test()" in the freeing section, then you should be
>> >  safe wrt new allocations.
>> >
>> >  If you have a completely new allocation that has "random stale
>> >  content", you know that it cannot be on the RCU list, so there is no
>> >  speculative access that can ever see that random content.
>> >
>> >  So the only case you need to worry about is a re-use allocation, and
>> >  you know that the refcount will start out as zero even if you don't
>> >  have a constructor.
>> >
>> >  So you can think of the refcount itself as always having a zero
>> >  constructor, *BUT* you need to be careful with ordering.
>> >
>> >  In particular, whoever does the allocation needs to then set the
>> >  refcount to a non-zero value *after* it has initialized all the other
>> >  fields. And in particular, it needs to make sure that it uses the
>> >  proper memory ordering to do so.
>> >
>> >  NOTE! One thing to be very worried about is that re-initializing
>> >  whatever RCU lists means that now the RCU walker may be walking on
>the
>> >  wrong list so the walker may do the right thing for this particular
>> >  entry, but it may miss walking *other* entries. So then you can get
>> >  spurious lookup failures, because the RCU walker never walked all the
>> >  way to the end of the right list. That ends up being a much more
>> >  subtle bug.
>> >
>> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> >Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> >---
>> > drivers/gpu/drm/i915/i915_request.c   | 49 ++++++++++++++++++---------
>> > drivers/gpu/drm/i915/i915_scheduler.c |  6 ++++
>> > drivers/gpu/drm/i915/i915_scheduler.h |  1 +
>> > drivers/gpu/drm/i915/i915_sw_fence.c  |  8 +++++
>> > drivers/gpu/drm/i915/i915_sw_fence.h  |  2 ++
>> > 5 files changed, 50 insertions(+), 16 deletions(-)
>> >
>> >diff --git a/drivers/gpu/drm/i915/i915_request.c
>> >b/drivers/gpu/drm/i915/i915_request.c
>> >index 00011f9533b6..a558f64186fa 100644
>> >--- a/drivers/gpu/drm/i915/i915_request.c
>> >+++ b/drivers/gpu/drm/i915/i915_request.c
>> >@@ -188,7 +188,7 @@ static void free_capture_list(struct i915_request
>> >*request)
>> > {
>> >       struct i915_capture_list *capture;
>> >
>> >-      capture = request->capture_list;
>> >+      capture = fetch_and_zero(&request->capture_list);
>> >       while (capture) {
>> >               struct i915_capture_list *next = capture->next;
>> >
>> >@@ -214,7 +214,7 @@ static void remove_from_engine(struct
>i915_request
>> >*rq)
>> >               spin_lock(&engine->active.lock);
>> >               locked = engine;
>> >       }
>> >-      list_del(&rq->sched.link);
>> >+      list_del_init(&rq->sched.link);
>> >       spin_unlock_irq(&locked->active.lock);
>> > }
>> >
>> >@@ -586,6 +586,21 @@ request_alloc_slow(struct intel_timeline *tl, gfp_t
>> >gfp)
>> >       return kmem_cache_alloc(global.slab_requests, gfp);
>> > }
>> >
>> >+static void __i915_request_ctor(void *arg)
>> >+{
>> >+      struct i915_request *rq = arg;
>> >+
>> >+      spin_lock_init(&rq->lock);
>> >+      i915_sched_node_init(&rq->sched);
>> >+      i915_sw_fence_init(&rq->submit, submit_notify);
>> >+      i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>> >+
>> >+      rq->file_priv = NULL;
>> >+      rq->capture_list = NULL;
>> >+
>> >+      INIT_LIST_HEAD(&rq->execute_cb);
>> >+}
>>
>> My understanding of the cache ctor is that it will only get invoked when
>> the cache is expanded.  It does not get called on each kmem_cache_alloc().
>>
>> I.e. if you do an _alloc() and there is nothing available in the cache, a new
>> slab is created, and the ctor is called.
>
>Correct.
>
>> Is that the subtly that you are referring to?
>>
>> Is doing this ctor only when new cache slabs are created sufficient?
>
>That is precisely the point. As we only protect some accesses with RCU,
>the request we are peeking at may be reused while we access it. The
>challenge is to keep all such access valid. E.g. consider that the whole
>of i915_gem_busy_ioctl() is performed only protected by the
>rcu_read_lock(), and yet we need the results to be accurate even though
>the requests be reallocated in the middle of it.
>
>The situation that spurred the ctor was taking a spinlock in an rcu path
>and finding it was re-initialised in the middle of that.

To requote:

> > This has nothing to do with constructors. It's ok if the object gets
> >  reused as an object of the same type and does *not* get
> >  re-initialized, because we're perfectly fine seeing old stale data.

Ok.  I think I am following this.  You are using the side effect of the ctor
to protect against old data from being re-initialized until after RCU is
done with it.

Thanks for the clarification.

Mike

>-Chris