[Intel-gfx] [PATCH 12/18] drm/i915: Allow userspace to clone contexts on creation
Chris Wilson
chris at chris-wilson.co.uk
Thu Mar 21 14:38:33 UTC 2019
Quoting Tvrtko Ursulin (2019-03-20 13:13:45)
>
> On 19/03/2019 11:57, Chris Wilson wrote:
> > A usecase arose out of handling context recovery in mesa, whereby they
> > wish to recreate a context with fresh logical state but preserving all
> > other details of the original. Currently, they create a new context and
> > iterate over which bits they want to copy across, but it would be much
> > more convenient if they were able to just pass in a target context to
> > clone during creation. This essentially extends the setparam during
> > creation to pull the details from a target context instead of the user
> > supplied parameters.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_gem_context.c | 154 ++++++++++++++++++++++++
> > include/uapi/drm/i915_drm.h | 14 +++
> > 2 files changed, 168 insertions(+)
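(Aside: the intended userspace flow is along the lines of the sketch
below; the extension and flag names follow this series' uapi additions
and may still change before merging.)

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Sketch: create a new context cloned from 'parent', preserving
 * everything this patch knows how to clone, while the logical state
 * starts fresh as it always does for a newly created context.
 */
static int clone_context(int fd, uint32_t parent, uint32_t *out)
{
        struct drm_i915_gem_context_create_ext_clone clone = {
                .base = { .name = I915_CONTEXT_CREATE_EXT_CLONE },
                .clone_id = parent,
                .flags = I915_CONTEXT_CLONE_FLAGS |
                         I915_CONTEXT_CLONE_SCHEDATTR |
                         I915_CONTEXT_CLONE_SSEU |
                         I915_CONTEXT_CLONE_TIMELINE |
                         I915_CONTEXT_CLONE_VM,
        };
        struct drm_i915_gem_context_create_ext create = {
                .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
                .extensions = (uintptr_t)&clone,
        };

        if (ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create))
                return -errno;

        *out = create.ctx_id;
        return 0;
}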
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index fc1f64e19507..f36648329074 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -1500,8 +1500,162 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
> > return ctx_setparam(arg->ctx, &local.param);
> > }
> >
> > +static int clone_flags(struct i915_gem_context *dst,
> > + struct i915_gem_context *src)
> > +{
> > + dst->user_flags = src->user_flags;
> > + return 0;
> > +}
> > +
> > +static int clone_schedattr(struct i915_gem_context *dst,
> > + struct i915_gem_context *src)
> > +{
> > + dst->sched = src->sched;
> > + return 0;
> > +}
> > +
> > +static int clone_sseu(struct i915_gem_context *dst,
> > + struct i915_gem_context *src)
> > +{
> > + const struct intel_sseu default_sseu =
> > + intel_device_default_sseu(dst->i915);
> > + struct intel_engine_cs *engine;
> > + enum intel_engine_id id;
> > +
> > + for_each_engine(engine, dst->i915, id) {
>
> Hm, in the load balancing patch this needs to be extended so the veng
> (virtual engine) ce is also handled here.
>
> Possibly even when adding the engine map, the loop needs to iterate the
> map and not for_each_engine?
One problem is that it is hard to match a veng in one context with its
counterpart in another context; there may even be several :|
And then in clone_engines, we create a fresh virtual engine, so there is
a nasty interaction between clone_sseu and clone_engines.
Bleugh.
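A rough sketch of what iterating the map could look like, purely for
illustration (src->engines[] and ->num_engines are placeholder names for
whatever shape the engine-map patch settles on):

        unsigned int n;

        for (n = 0; n < src->num_engines; n++) {
                struct intel_context *ce = src->engines[n];

                if (!ce) /* hole in the map */
                        continue;

                /* copy ce->sseu across to the matching slot
                 * dst->engines[n], under the pin locks as in
                 * clone_sseu()
                 */
        }

Which still leaves the ordering problem of clone_engines recreating
fresh virtual engines underneath it.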
> > + struct intel_context *ce;
> > + struct intel_sseu sseu;
> > +
> > + ce = intel_context_lookup(src, engine);
> > + if (!ce)
> > + continue;
> > +
> > + sseu = ce->sseu;
> > + if (!memcmp(&sseu, &default_sseu, sizeof(sseu)))
>
> Could memcmp against &ce->sseu directly and keep src_ce and dst_ce so
> you can copy over without a temporary copy on stack?
At one point, the locking favoured making a local sseu to avoid
overlapping locks. Hmm, sseu = ce->sseu could still tear. Pedantically,
that copy should be locked.
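Something along these lines would be the pedantic version (sketch only;
assumes intel_context_pin_lock() can be applied to src just as the patch
applies it to dst, with the local copy keeping the two pin locks
disjoint):

        struct intel_context *ce;
        struct intel_sseu sseu;

        /* Copy under src's pin lock so a concurrent set_sseu cannot
         * tear our read of the multi-word struct.
         */
        ce = intel_context_pin_lock(src, engine);
        if (IS_ERR(ce))
                return PTR_ERR(ce);
        sseu = ce->sseu;
        intel_context_pin_unlock(ce);

        /* Apply under dst's pin lock, as the patch already does. */
        ce = intel_context_pin_lock(dst, engine);
        if (IS_ERR(ce))
                return PTR_ERR(ce);
        ce->sseu = sseu;
        intel_context_pin_unlock(ce);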
> > + continue;
> > +
> > + ce = intel_context_pin_lock(dst, engine);
> > + if (IS_ERR(ce))
> > + return PTR_ERR(ce);
> > +
> > + ce->sseu = sseu;
> > + intel_context_pin_unlock(ce);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int clone_timeline(struct i915_gem_context *dst,
> > + struct i915_gem_context *src)
> > +{
> > + if (src->timeline) {
> > + GEM_BUG_ON(src->timeline == dst->timeline);
> > +
> > + if (dst->timeline)
> > + i915_timeline_put(dst->timeline);
> > + dst->timeline = i915_timeline_get(src->timeline);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int clone_vm(struct i915_gem_context *dst,
> > + struct i915_gem_context *src)
> > +{
> > + struct i915_hw_ppgtt *ppgtt;
> > +
> > + rcu_read_lock();
> > + do {
> > + ppgtt = READ_ONCE(src->ppgtt);
> > + if (!ppgtt)
> > + break;
> > +
> > + if (!kref_get_unless_zero(&ppgtt->ref))
> > + continue;
> > +
> > + /*
> > + * This ppgtt may have been reallocated between
> > + * the read and the kref, and reassigned to a third
> > + * context. In order to avoid inadvertent sharing
> > + * of this ppgtt with that third context (and not
> > + * src), we have to confirm that we have the same
> > + * ppgtt after passing through the strong memory
> > + * barrier implied by a successful
> > + * kref_get_unless_zero().
> > + *
> > + * Once we have acquired the current ppgtt of src,
> > + * we no longer care if it is released from src, as
> > + * it cannot be reallocated elsewhere.
> > + */
> > +
> > + if (ppgtt == READ_ONCE(src->ppgtt))
> > + break;
> > +
> > + i915_ppgtt_put(ppgtt);
> > + } while (1);
> > + rcu_read_unlock();
>
> I still have the same problem. What if you added here:
>
> GEM_BUG_ON(ppgtt != READ_ONCE(src->ppgtt));
>
> Could it trigger? If so what is the point in the last check in the loop
> above?
Yes, it can trigger, as there is no outer mutex guarding the assignment
of src->ppgtt against our read. And that is why the check has to exist --
because src->ppgtt can be reassigned between the first read and our
acquiring the kref, so the ppgtt may have been freed, then reallocated
and assigned to a new ctx in that interval. We don't care that
src->ppgtt gets updated after we have taken a copy; we care that the
ppgtt may get reused on another ctx entirely.
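To spell out the window with generic names (sketch only; the pattern
presumes the object's memory can be recycled while readers are inside
rcu_read_lock(), e.g. a SLAB_TYPESAFE_BY_RCU slab, which is what makes
the reallocation possible in the first place):

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct obj {
        struct kref ref;
        /* payload; allocated from a SLAB_TYPESAFE_BY_RCU cache */
};

static struct kmem_cache *obj_cache;

static void obj_release(struct kref *ref)
{
        kmem_cache_free(obj_cache, container_of(ref, struct obj, ref));
}

static struct obj *get_live(struct obj __rcu **slot)
{
        struct obj *p;

        rcu_read_lock();
        do {
                p = rcu_dereference(*slot); /* 1: sample the pointer */
                if (!p)
                        break;

                if (!kref_get_unless_zero(&p->ref))
                        continue; /* 2: already dying, resample */

                /*
                 * Between 1 and 2 the object may have been freed and
                 * its memory recycled as a different object of the
                 * same type. The kref we now hold only proves *an*
                 * object is alive at that address; re-reading the
                 * slot proves it is still the one published there.
                 */
                if (p == rcu_access_pointer(*slot))
                        break;

                kref_put(&p->ref, obj_release); /* wrong object, retry */
        } while (1);
        rcu_read_unlock();

        return p;
}

(In clone_vm the slot is src->ppgtt and the release path is
i915_ppgtt_put().)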
-Chris