[Intel-gfx] [RFC v4 1/2] drm/i915: Engine discovery uAPI

Chris Wilson chris at chris-wilson.co.uk
Wed Jun 28 13:21:54 UTC 2017


Quoting Tvrtko Ursulin (2017-06-28 14:15:05)
> 
> On 28/06/2017 12:27, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-06-26 16:47:41)
> >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >> Engine discovery uAPI allows userspace to probe for engine
> >> configuration and features without needing to maintain the
> >> internal PCI id based database.
> >>
> >> This enables removal of code duplications across userspace
> >> components.
> >>
> >> Probing is done via the new DRM_IOCTL_I915_GEM_ENGINE_INFO ioctl
> >> which returns the number and information on the specified engine
> >> class.
> >>
> >> Currently only general engine configuration and HEVC feature of
> >> the VCS engine can be probed but the uAPI is designed to be
> >> generic and extensible.
> >>
> >> Code is based almost exactly on the earlier proposal on the
> >> topic by Jon Bloomfield. Engine class and instance refactoring
> >> made recently by Daniele Ceraolo Spurio enabled this to be
> >> implemented in an elegant fashion.
> >>
> >> To probe configuration userspace sets the engine class it wants
> >> to query (struct drm_i915_gem_engine_info) and provides an array
> >> of drm_i915_engine_info structs which will be filled in by the
> >> driver. Userspace also has to tell i915 how many elements are in
> >> the array, and the driver will report back the total number of
> >> engine instances in any case.
> >>
> >> v2:
> >>   * Add a version field and consolidate to one engine count.
> >>     (Chris Wilson)
> >>   * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
> >>     (Gong Zhipeng)
> >>
> >> v3:
> >>   * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
> >>   * Only reserve 8-bits for the engine info for time being.
> >>
> >> v4:
> >>   * Made user_class_map static.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> Cc: Ben Widawsky <ben at bwidawsk.net>
> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Daniel Vetter <daniel.vetter at intel.com>
> >> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
> >> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin at intel.com>
> >> Cc: Oscar Mateo <oscar.mateo at intel.com>
> >> Cc: "Gong, Zhipeng" <zhipeng.gong at intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_drv.c        |  1 +
> >>   drivers/gpu/drm/i915/i915_drv.h        |  3 ++
> >>   drivers/gpu/drm/i915/intel_engine_cs.c | 64 ++++++++++++++++++++++++++++++++++
> >>   include/uapi/drm/i915_drm.h            | 41 ++++++++++++++++++++++
> >>   4 files changed, 109 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> >> index e9d8e9ee51b2..44dd2f37937f 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.c
> >> +++ b/drivers/gpu/drm/i915/i915_drv.c
> >> @@ -2724,6 +2724,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
> >>          DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
> >>          DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
> >>          DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
> >> +       DRM_IOCTL_DEF_DRV(I915_GEM_ENGINE_INFO, i915_gem_engine_info_ioctl, DRM_RENDER_ALLOW),
> >>   };
> >>   
> >>   static struct drm_driver driver = {
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 427d10c7eca5..496565e1753f 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -3598,6 +3598,9 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
> >>                              struct i915_gem_context *ctx,
> >>                              uint32_t *reg_state);
> >>   
> >> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
> >> +                              struct drm_file *file);
> >> +
> >>   /* i915_gem_evict.c */
> >>   int __must_check i915_gem_evict_something(struct i915_address_space *vm,
> >>                                            u64 min_size, u64 alignment,
> >> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> >> index 3b46c1f7b88b..a98669f6ad85 100644
> >> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> >> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> >> @@ -25,6 +25,7 @@
> >>   #include "i915_drv.h"
> >>   #include "intel_ringbuffer.h"
> >>   #include "intel_lrc.h"
> >> +#include <uapi/drm/i915_drm.h>
> >>   
> >>   /* Haswell does have the CXT_SIZE register however it does not appear to be
> >>    * valid. Now, docs explain in dwords what is in the context object. The full
> >> @@ -1332,6 +1333,69 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
> >>          }
> >>   }
> >>   
> >> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
> >> +       [I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
> >> +       [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
> >> +       [I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
> >> +       [I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
> >> +       [I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
> >> +};
> >> +
> >> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
> >> +                              struct drm_file *file)
> >> +{
> >> +       struct drm_i915_private *i915 = to_i915(dev);
> >> +       struct drm_i915_gem_engine_info *args = data;
> >> +       struct drm_i915_engine_info __user *user_info =
> >> +                                       u64_to_user_ptr(args->info_ptr);
> >> +       unsigned int info_size = args->num_engines;
> >> +       struct drm_i915_engine_info info;
> >> +       struct intel_engine_cs *engine;
> >> +       enum intel_engine_id id;
> >> +       u8 class;
> >> +
> >> +       if (args->rsvd)
> >> +               return -EINVAL;
> >> +
> >> +       switch (args->engine_class) {
> >> +       case I915_ENGINE_CLASS_OTHER:
> >> +       case I915_ENGINE_CLASS_RENDER:
> >> +       case I915_ENGINE_CLASS_COPY:
> >> +       case I915_ENGINE_CLASS_VIDEO:
> >> +       case I915_ENGINE_CLASS_VIDEO_ENHANCE:
> >> +               class = user_class_map[args->engine_class];
> >> +               break;
> >> +       case I915_ENGINE_CLASS_MAX:
> >> +       default:
> >> +               return -EINVAL;
> >> +       };
> >> +
> >> +       args->num_engines = 0;
> >> +
> >> +       if (args->version != 1) {
> >> +               args->version = 1;
> >> +               return 0;
> > 
> > My gut feeling says we want to report both the version and count in the
> > first pass. Otherwise we have to do a version check, followed by a count
> > query, followed by the actual retrieval. If we can combine the version +
> > count check into one pass, it may be a little less hassle?
> 
> Could do, I don't see any downsides to it at the moment.
> 
> But the only benefit seems it would be in cases where userspace would 
> want to allocate the info array to the exact size, rather than just 
> using a large enough number.
> 
> In both cases userspace has to check both return code from the ioctl and 
> that the version number matches the requested one. And in both cases the 
> whole operation can be done with just one ioctl. Or if you want to 
> allocate the exact size info array in both cases you need two ioctls.
> 
> info.version = 1;
> info.class = <some class>;
> info.num_engines = <a large number should be enough for everyone>;
> ret = ioctl(&info);
> if (ret || info.version != 1)
>         goto try_a_lower_version;       
> for_each_engine(info) {
>         ...
> }

Even if you do preallocate, you need to be prepared for the kernel to
report something is off. I agree that my recommended approach would be
to go with a static allocation that is good enough and fallback to the
multi-pass query with dynamic allocation. If you design with that in
mind, we get the required behaviour correct and then include the
optimistic preallocation as an optimisation. But spell out how userspace
is meant to work, and remember the forward-compatibility problem :)
-Chris


More information about the Intel-gfx mailing list