[Intel-gfx] [RFC v4 1/2] drm/i915: Engine discovery uAPI
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Jun 28 13:19:15 UTC 2017
On 28/06/2017 14:15, Tvrtko Ursulin wrote:
> On 28/06/2017 12:27, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2017-06-26 16:47:41)
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> Engine discovery uAPI allows userspace to probe for engine
>>> configuration and features without needing to maintain the
>>> internal PCI id based database.
>>>
>>> This enables removal of code duplications across userspace
>>> components.
>>>
>>> Probing is done via the new DRM_IOCTL_I915_GEM_ENGINE_INFO ioctl
>>> which returns the number and information on the specified engine
>>> class.
>>>
>>> Currently only general engine configuration and HEVC feature of
>>> the VCS engine can be probed but the uAPI is designed to be
>>> generic and extensible.
>>>
>>> Code is based almost exactly on the earlier proposal on the
>>> topic by Jon Bloomfield. Engine class and instance refactoring
>>> made recently by Daniele Ceraolo Spurio enabled this to be
>>> implemented in an elegant fashion.
>>>
>>> To probe configuration userspace sets the engine class it wants
>>> to query (struct drm_i915_gem_engine_info) and provides an array
>>> of drm_i915_engine_info structs which will be filled in by the
>>> driver. Userspace also has to tell i915 how many elements are in
>>> the array, and the driver will report back the total number of
>>> engine instances in any case.
>>>
>>> v2:
>>> * Add a version field and consolidate to one engine count.
>>> (Chris Wilson)
>>> * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
>>> (Gong Zhipeng)
>>>
>>> v3:
>>> * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
>>> * Only reserve 8-bits for the engine info for time being.
>>>
>>> v4:
>>> * Made user_class_map static.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Ben Widawsky <ben at bwidawsk.net>
>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter at intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
>>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin at intel.com>
>>> Cc: Oscar Mateo <oscar.mateo at intel.com>
>>> Cc: "Gong, Zhipeng" <zhipeng.gong at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_drv.c | 1 +
>>> drivers/gpu/drm/i915/i915_drv.h | 3 ++
>>> drivers/gpu/drm/i915/intel_engine_cs.c | 64
>>> ++++++++++++++++++++++++++++++++++
>>> include/uapi/drm/i915_drm.h | 41 ++++++++++++++++++++++
>>> 4 files changed, 109 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index e9d8e9ee51b2..44dd2f37937f 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -2724,6 +2724,7 @@ static const struct drm_ioctl_desc
>>> i915_ioctls[] = {
>>> DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM,
>>> i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
>>> DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM,
>>> i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
>>> DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl,
>>> DRM_RENDER_ALLOW),
>>> + DRM_IOCTL_DEF_DRV(I915_GEM_ENGINE_INFO,
>>> i915_gem_engine_info_ioctl, DRM_RENDER_ALLOW),
>>> };
>>> static struct drm_driver driver = {
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index 427d10c7eca5..496565e1753f 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3598,6 +3598,9 @@ void i915_oa_init_reg_state(struct
>>> intel_engine_cs *engine,
>>> struct i915_gem_context *ctx,
>>> uint32_t *reg_state);
>>> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
>>> + struct drm_file *file);
>>> +
>>> /* i915_gem_evict.c */
>>> int __must_check i915_gem_evict_something(struct i915_address_space
>>> *vm,
>>> u64 min_size, u64 alignment,
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index 3b46c1f7b88b..a98669f6ad85 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -25,6 +25,7 @@
>>> #include "i915_drv.h"
>>> #include "intel_ringbuffer.h"
>>> #include "intel_lrc.h"
>>> +#include <uapi/drm/i915_drm.h>
>>> /* Haswell does have the CXT_SIZE register however it does not
>>> appear to be
>>> * valid. Now, docs explain in dwords what is in the context
>>> object. The full
>>> @@ -1332,6 +1333,69 @@ void intel_engines_mark_idle(struct
>>> drm_i915_private *i915)
>>> }
>>> }
>>> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
>>> + [I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
>>> + [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
>>> + [I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
>>> + [I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
>>> + [I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
>>> +};
>>> +
>>> +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
>>> + struct drm_file *file)
>>> +{
>>> + struct drm_i915_private *i915 = to_i915(dev);
>>> + struct drm_i915_gem_engine_info *args = data;
>>> + struct drm_i915_engine_info __user *user_info =
>>> + u64_to_user_ptr(args->info_ptr);
>>> + unsigned int info_size = args->num_engines;
>>> + struct drm_i915_engine_info info;
>>> + struct intel_engine_cs *engine;
>>> + enum intel_engine_id id;
>>> + u8 class;
>>> +
>>> + if (args->rsvd)
>>> + return -EINVAL;
>>> +
>>> + switch (args->engine_class) {
>>> + case I915_ENGINE_CLASS_OTHER:
>>> + case I915_ENGINE_CLASS_RENDER:
>>> + case I915_ENGINE_CLASS_COPY:
>>> + case I915_ENGINE_CLASS_VIDEO:
>>> + case I915_ENGINE_CLASS_VIDEO_ENHANCE:
>>> + class = user_class_map[args->engine_class];
>>> + break;
>>> + case I915_ENGINE_CLASS_MAX:
>>> + default:
>>> + return -EINVAL;
>>> + };
>>> +
>>> + args->num_engines = 0;
>>> +
>>> + if (args->version != 1) {
>>> + args->version = 1;
>>> + return 0;
>>
>> My gut feeling says we want to report both the version and count in the
>> first pass. Otherwise we have to do a version check, followed by a count
>> query, followed by the actual retrieval. If we can combine the version +
>> count check into one pass, it may be a little less hassle?
>
> Could do, I don't see any downsides to it at the moment.
>
> But the only benefit seems it would be in cases where userspace would
> want to allocate the info array to the exact size, rather than just
> using a large enough number.
Sorry, that was one outdated paragraph. From the one below it follows
that the only benefit is offering a little bit more info on a version
mismatch. But since we can do the whole query with one ioctl as below, I
don't see that it would be useful. But could implement it just as well
since as I said I don't see a downside to it.
Regards,
Tvrtko
> In both cases userspace has to check both return code from the ioctl and
> that the version number matches the requested one. And in both cases the
> whole operation can be done with just one ioctl. Or if you want to
> allocate the exact size info array in both cases you need two ioctls.
>
> info.version = 1;
> info.class = <some class>;
> info.num_engines = <a large number should be enough for everyone>;
> ret = ioctl(&info);
> if (ret || info.version != 1)
> goto try_a_lower_version;
> for_each_engine(info) {
> ...
> }
>
> Regards,
>
> Tvrtko
More information about the Intel-gfx
mailing list