[Intel-gfx] [RFC v4 1/2] drm/i915: Engine discovery uAPI

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Jun 28 13:19:15 UTC 2017


On 28/06/2017 14:15, Tvrtko Ursulin wrote:
> On 28/06/2017 12:27, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2017-06-26 16:47:41)
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> Engine discovery uAPI allows userspace to probe for engine
>>> configuration and features without needing to maintain the
>>> internal PCI id based database.
>>>
>>> This enables removal of code duplications across userspace
>>> components.
>>>
>>> Probing is done via the new DRM_IOCTL_I915_GEM_ENGINE_INFO ioctl
>>> which returns the number and information on the specified engine
>>> class.
>>>
>>> Currently only general engine configuration and HEVC feature of
>>> the VCS engine can be probed but the uAPI is designed to be
>>> generic and extensible.
>>>
>>> Code is based almost exactly on the earlier proposal on the
>>> topic by Jon Bloomfield. Engine class and instance refactoring
>>> made recently by Daniele Ceraolo Spurio enabled this to be
>>> implemented in an elegant fashion.
>>>
>>> To probe configuration userspace sets the engine class it wants
>>> to query (struct drm_i915_gem_engine_info) and provides an array
>>> of drm_i915_engine_info structs which will be filled in by the
>>> driver. Userspace also has to tell i915 how many elements are in
>>> the array, and the driver will report back the total number of
>>> engine instances in any case.
>>>
>>> v2:
>>>   * Add a version field and consolidate to one engine count.
>>>     (Chris Wilson)
>>>   * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
>>>     (Gong Zhipeng)
>>>
>>> v3:
>>>   * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
>>>   * Only reserve 8-bits for the engine info for time being.
>>>
>>> v4:
>>>   * Made user_class_map static.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Ben Widawsky <ben at bwidawsk.net>
>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter at intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
>>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin at intel.com>
>>> Cc: Oscar Mateo <oscar.mateo at intel.com>
>>> Cc: "Gong, Zhipeng" <zhipeng.gong at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.c        |  1 +
>>>   drivers/gpu/drm/i915/i915_drv.h        |  3 ++
>>>   drivers/gpu/drm/i915/intel_engine_cs.c | 64 
>>> ++++++++++++++++++++++++++++++++++
>>>   include/uapi/drm/i915_drm.h            | 41 ++++++++++++++++++++++
>>>   4 files changed, 109 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c 
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index e9d8e9ee51b2..44dd2f37937f 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -2724,6 +2724,7 @@ static const struct drm_ioctl_desc 
>>> i915_ioctls[] = {
>>>          DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, 
>>> i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, 
>>> i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, 
>>> DRM_RENDER_ALLOW),
>>> +       DRM_IOCTL_DEF_DRV(I915_GEM_ENGINE_INFO, 
>>> i915_gem_engine_info_ioctl, DRM_RENDER_ALLOW),
>>>   };
>>>   static struct drm_driver driver = {
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index 427d10c7eca5..496565e1753f 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3598,6 +3598,9 @@ void i915_oa_init_reg_state(struct 
>>> intel_engine_cs *engine,
>>>                              struct i915_gem_context *ctx,
>>>                              uint32_t *reg_state);
>>> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
>>> +                              struct drm_file *file);
>>> +
>>>   /* i915_gem_evict.c */
>>>   int __must_check i915_gem_evict_something(struct i915_address_space 
>>> *vm,
>>>                                            u64 min_size, u64 alignment,
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index 3b46c1f7b88b..a98669f6ad85 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -25,6 +25,7 @@
>>>   #include "i915_drv.h"
>>>   #include "intel_ringbuffer.h"
>>>   #include "intel_lrc.h"
>>> +#include <uapi/drm/i915_drm.h>
>>>   /* Haswell does have the CXT_SIZE register however it does not 
>>> appear to be
>>>    * valid. Now, docs explain in dwords what is in the context 
>>> object. The full
>>> @@ -1332,6 +1333,69 @@ void intel_engines_mark_idle(struct 
>>> drm_i915_private *i915)
>>>          }
>>>   }
>>> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
>>> +       [I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
>>> +       [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
>>> +       [I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
>>> +       [I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
>>> +       [I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
>>> +};
>>> +
>>> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
>>> +                              struct drm_file *file)
>>> +{
>>> +       struct drm_i915_private *i915 = to_i915(dev);
>>> +       struct drm_i915_gem_engine_info *args = data;
>>> +       struct drm_i915_engine_info __user *user_info =
>>> +                                       u64_to_user_ptr(args->info_ptr);
>>> +       unsigned int info_size = args->num_engines;
>>> +       struct drm_i915_engine_info info;
>>> +       struct intel_engine_cs *engine;
>>> +       enum intel_engine_id id;
>>> +       u8 class;
>>> +
>>> +       if (args->rsvd)
>>> +               return -EINVAL;
>>> +
>>> +       switch (args->engine_class) {
>>> +       case I915_ENGINE_CLASS_OTHER:
>>> +       case I915_ENGINE_CLASS_RENDER:
>>> +       case I915_ENGINE_CLASS_COPY:
>>> +       case I915_ENGINE_CLASS_VIDEO:
>>> +       case I915_ENGINE_CLASS_VIDEO_ENHANCE:
>>> +               class = user_class_map[args->engine_class];
>>> +               break;
>>> +       case I915_ENGINE_CLASS_MAX:
>>> +       default:
>>> +               return -EINVAL;
>>> +       };
>>> +
>>> +       args->num_engines = 0;
>>> +
>>> +       if (args->version != 1) {
>>> +               args->version = 1;
>>> +               return 0;
>>
>> My gut feeling says we want to report both the version and count in the
>> first pass. Otherwise we have to do a version check, followed by a count
>> query, followed by the actual retrieval. If we can combine the version +
>> count check into one pass, it may be a little less hassle?
> 
> Could do, I don't see any downsides to it at the moment.
> 
> But the only benefit seems it would be in cases where userspace would 
> want to allocate the info array to the exact size, rather than just 
> using a large enough number.

Sorry, that was one outdated paragraph. From the one below it follows 
that the only benefit is offering a little bit more info on a version 
mismatch. But since we can do the whole query with one ioctl as below, I 
don't see that it would be useful. But could implement it just as well 
since as I said I don't see a downside to it.

Regards,

Tvrtko

> In both cases userspace has to check both return code from the ioctl and 
> that the version number matches the requested one. And in both cases the 
> whole operation can be done with just one ioctl. Or if you want to 
> allocate the exact size info array in both cases you need two ioctls.
> 
> info.version = 1;
> info.class = <some class>;
> info.num_engines = <a large number should be enough for everyone>;
> ret = ioctl(&info);
> if (ret || info.version != 1)
>      goto try_a_lower_version;
> for_each_engine(info) {
>      ...
> }
> 
> Regards,
> 
> Tvrtko


More information about the Intel-gfx mailing list