[Intel-gfx] [PATCH 37/40] drm/i915: Allow a context to define its set of engines
Chris Wilson
chris at chris-wilson.co.uk
Fri Sep 28 20:22:45 UTC 2018
Quoting Tvrtko Ursulin (2018-09-27 12:28:47)
>
> On 19/09/2018 20:55, Chris Wilson wrote:
> > Over the last few years, we have debated how to extend the user API to
> > support an increase in the number of engines, that may be sparse and
> > even be heterogeneous within a class (not all video decoders created
> > equal). We settled on using (class, instance) tuples to identify a
> > specific engine, with an API for the user to construct a map of engines
> > to capabilities. Into this picture, we then add a challenge of virtual
> > engines; one user engine that maps behind the scenes to any number of
> > physical engines. To keep it general, we want the user to have full
> > control over that mapping. To that end, we allow the user to constrain a
> > context to define the set of engines that it can access, order fully
> > controlled by the user via (class, instance). With such precise control
> > in context setup, we can continue to use the existing execbuf uABI of
> > specifying a single index; only now it doesn't automagically map onto
> > the engines, it uses the user defined engine map from the context.
> >
> > The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
> > execbuf. It's use will be revealed in the next patch.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_gem_context.c | 88 ++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_context.h | 4 +
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 22 ++++--
> > include/uapi/drm/i915_drm.h | 27 +++++++
> > 4 files changed, 135 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index a8570a07b3b7..313471253f51 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -90,6 +90,7 @@
> > #include <drm/i915_drm.h>
> > #include "i915_drv.h"
> > #include "i915_trace.h"
> > +#include "i915_user_extensions.h"
> > #include "intel_workarounds.h"
> >
> > #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> > @@ -223,6 +224,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
> > ce->ops->destroy(ce);
> > }
> >
> > + kfree(ctx->engines);
> > +
> > if (ctx->timeline)
> > i915_timeline_put(ctx->timeline);
> >
> > @@ -948,6 +951,87 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> > return ret;
> > }
> >
> > +struct set_engines {
> > + struct i915_gem_context *ctx;
> > + struct intel_engine_cs **engines;
> > + unsigned int nengine;
> > +};
> > +
> > +static const i915_user_extension_fn set_engines__extensions[] = {
> > +};
>
> This is OK unless someone one day gets the desire to make the extension
> namespace sparse. I was thinking on how to put some warnings in the code
> to warn about it, but I think that's for later. Namespace is also per
> parent ioctl, another thing which would perhaps need enforcing.
Unless you intend to go extremely sparse, I'd just leave it with the
usual [NAME1] = func1, and skipping over NULLs. We can always add
alternatives if need be (I'd actually been meaning to convert execbuf3
over to this scheme to flesh it out a bit more).
> > +static int set_engines(struct i915_gem_context *ctx,
> > + struct drm_i915_gem_context_param *args)
> > +{
> > + struct i915_context_param_engines __user *user;
> > + struct set_engines set = {
> > + .ctx = ctx,
> > + .engines = NULL,
> > + .nengine = -1,
> > + };
> > + unsigned int n;
> > + u64 size, extensions;
>
> Size is u32 in the uAPI, so either that or unsigned int would do here.
>
> > + int err;
> > +
> > + user = u64_to_user_ptr(args->value);
> > + size = args->size;
> > + if (!size)
>
> args->sizes = sizeof(*user);
>
> ... if you want to allow size probing via set param, or we add a
> get_param (too much work for nothing?) and only allow probing from there?
It's a variable array, so a little trickier.
> > + goto out;
> > +
> > + if (size < sizeof(struct i915_context_param_engines))
> > + return -EINVAL;
> > +
> > + size -= sizeof(struct i915_context_param_engines);
> > + if (size % sizeof(*user->class_instance))
> > + return -EINVAL;
> > +
> > + set.nengine = size / sizeof(*user->class_instance);
> > + if (set.nengine == 0 || set.nengine >= I915_EXEC_RING_MASK)
> > + return -EINVAL;
> > +
> > + set.engines = kmalloc_array(set.nengine + 1,
> > + sizeof(*set.engines),
> > + GFP_KERNEL);
> > + if (!set.engines)
> > + return -ENOMEM;
> > +
> > + set.engines[0] = NULL;
>
> /* Reserve the I915_EXEC_DEFAULT slot. */
>
> > + for (n = 0; n < set.nengine; n++) {
> > + u32 class, instance;
>
> I will later recommend we use u16 for class/instance. I think we settled
> for that in the meantime.
>
> > +
> > + if (get_user(class, &user->class_instance[n].class) ||
> > + get_user(instance, &user->class_instance[n].instance)) {
> > + kfree(set.engines);
> > + return -EFAULT;
> > + }
> > +
> > + set.engines[n + 1] =
> > + intel_engine_lookup_user(ctx->i915, class, instance);
> > + if (!set.engines[n + 1]) {
> > + kfree(set.engines);
> > + return -ENOENT;
> > + }
> > + }
> > +
> > + err = -EFAULT;
> > + if (!get_user(extensions, &user->extensions))
> > + err = i915_user_extensions(u64_to_user_ptr(extensions),
> > + set_engines__extensions,
> > + ARRAY_SIZE(set_engines__extensions),
> > + &set);
> > + if (err) {
>
> Do we need an user extensions destructor table for a more
> compartmentalized design? This actually applies to the
> i915_user_extensions() implementation.. it would need to take two
> function tables I think and unwind until the one that failed.
Yeah, dtor for unwind is intriguing.
> > + kfree(set.engines);
> > + return err;
> > + }
> > +
> > +out:
> > + kfree(ctx->engines);
> > + ctx->engines = set.engines;
> > + ctx->nengine = set.nengine + 1;
> > +
> > + return 0;
> > +}
> > +
> > int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file)
> > {
> > @@ -1011,6 +1095,10 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> > }
> > break;
> >
> > + case I915_CONTEXT_PARAM_ENGINES:
> > + ret = set_engines(ctx, args);
> > + break;
> > +
> > default:
> > ret = -EINVAL;
> > break;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 770043449ba6..9f89119a6566 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -67,6 +67,8 @@ struct i915_gem_context {
> > /** file_priv: owning file descriptor */
> > struct drm_i915_file_private *file_priv;
> >
> > + struct intel_engine_cs **engines;
> > +
> > struct i915_timeline *timeline;
> >
> > /**
> > @@ -135,6 +137,8 @@ struct i915_gem_context {
> > #define CONTEXT_CLOSED 1
> > #define CONTEXT_FORCE_SINGLE_SUBMISSION 2
> >
> > + unsigned int nengine;
>
> Why put related fields apart?
Packing.
>
> > +
> > /**
> > * @hw_id: - unique identifier for the context
> > *
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 109486a6ef07..faedfdca875d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2014,13 +2014,23 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
> > };
> >
> > static struct intel_engine_cs *
> > -eb_select_engine(struct drm_i915_private *dev_priv,
> > +eb_select_engine(struct i915_execbuffer *eb,
> > struct drm_file *file,
> > struct drm_i915_gem_execbuffer2 *args)
> > {
> > unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
> > struct intel_engine_cs *engine;
> >
> > + if (eb->ctx->engines) {
> > + if (user_ring_id >= eb->ctx->nengine) {
> > + DRM_DEBUG("execbuf with unknown ring: %u\n",
> > + user_ring_id);
> > + return NULL;
> > + }
> > +
> > + return eb->ctx->engines[user_ring_id];
> > + }
> > +
> > if (user_ring_id > I915_USER_RINGS) {
> > DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
> > return NULL;
> > @@ -2033,11 +2043,11 @@ eb_select_engine(struct drm_i915_private *dev_priv,
> > return NULL;
> > }
> >
> > - if (user_ring_id == I915_EXEC_BSD && HAS_BSD2(dev_priv)) {
> > + if (user_ring_id == I915_EXEC_BSD && HAS_BSD2(eb->i915)) {
> > unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
> >
> > if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
> > - bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
> > + bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
> > } else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
> > bsd_idx <= I915_EXEC_BSD_RING2) {
> > bsd_idx >>= I915_EXEC_BSD_SHIFT;
> > @@ -2048,9 +2058,9 @@ eb_select_engine(struct drm_i915_private *dev_priv,
> > return NULL;
> > }
> >
> > - engine = dev_priv->engine[_VCS(bsd_idx)];
> > + engine = eb->i915->engine[_VCS(bsd_idx)];
> > } else {
> > - engine = dev_priv->engine[user_ring_map[user_ring_id]];
> > + engine = eb->i915->engine[user_ring_map[user_ring_id]];
> > }
> >
> > if (!engine) {
> > @@ -2260,7 +2270,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > if (unlikely(err))
> > goto err_destroy;
> >
> > - eb.engine = eb_select_engine(eb.i915, file, args);
> > + eb.engine = eb_select_engine(&eb, file, args);
> > if (!eb.engine) {
> > err = -EINVAL;
> > goto err_engine;
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 65b0b84419f3..d41b4c673af4 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1508,9 +1508,36 @@ struct drm_i915_gem_context_param {
> > #define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */
> > #define I915_CONTEXT_DEFAULT_PRIORITY 0
> > #define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */
> > +
> > +/*
> > + * I915_CONTEXT_PARAM_ENGINES:
> > + *
> > + * Bind this context to operate on this subset of available engines. Henceforth,
> > + * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
> > + * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
> > + * and upwards. The array created is offset by 1, such that by default
> > + * I915_EXEC_DEFAULT is left empty, to be filled in as directed. Slots 1...N
> > + * are then filled in using the specified (class, instance).
>
> Should we allow specifying the default index? I am wondering if that
> would make life easier to any userspace.
Could do, certainly gets rid of the clumsy nengines+1, but the skip for
unknown class may make up for the clumsyness :)
-Chris
More information about the Intel-gfx
mailing list