[Intel-gfx] [RFC 2/2] drm/i915: Select engines via class and instance in execbuffer2
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Apr 24 08:36:40 UTC 2017
On 18/04/2017 22:10, Chris Wilson wrote:
> On Tue, Apr 18, 2017 at 05:56:15PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Building on top of the previous patch which exported the concept
>> of engine classes and instances, we can also use this instead of
>> the current awkward engine selection uAPI.
>>
>> This is primarily interesting for the VCS engine selection which
>> is a) currently done via a disjoint set of flags, and b) the
>> current I915_EXEC_BSD flag has different semantics depending on
>> the underlying hardware, which is bad.
>>
>> The proposed idea here is to reserve 16 bits of flags to pass in
>> the engine class and instance (8 bits each), and a new flag
>> named I915_EXEC_CLASS_INSTANCE to tell the kernel this new engine
>> selection API is in use.
>>
>> The new uAPI also removes access to the weak VCS engine
>> balancing that currently exists in the driver.
>>
>> Example usage to send a command to VCS0:
>>
>> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0);
>>
>> Or to send a command to VCS1:
>>
>> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1);
>
> To save a bit of space, we can use the ring selector as a class selector
> if bit18 is set, with 19-27 as instance. That limits us to 64 classes -
> hopefully not a problem for the near future. At least I might have sold
> you on a flexible execbuf3 by then.
I was considering re-using those bits, yes. I was weighing the benefit
of keeping it completely separate, but I suppose there is not much value
in that. So I can re-use the ring selector just as well and have a
smaller impact on the number of flag bits left over.
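For illustration only, the compact encoding discussed above could be sketched roughly as below. Only the bit positions come from this thread (bit 18 as the opt-in flag, the low six bits of the ring selector as the class, bits 19-27 as the instance). Every EXAMPLE_* name and both helper names are hypothetical; none of this is taken from a posted patch or from i915_drm.h.

```c
#include <stdint.h>

/* Hypothetical names; only the bit layout is taken from the discussion. */
#define EXAMPLE_EXEC_CLASS_INSTANCE  (1ull << 18)      /* opt-in flag, bit 18 */
#define EXAMPLE_EXEC_CLASS_MASK      0x3full           /* bits 0-5: 64 classes */
#define EXAMPLE_EXEC_INSTANCE_SHIFT  19
#define EXAMPLE_EXEC_INSTANCE_MASK   (0x1ffull << EXAMPLE_EXEC_INSTANCE_SHIFT) /* bits 19-27 */

/* Pack an engine class and instance into execbuffer2-style flags. */
static uint64_t example_execbuffer2_engine(unsigned int engine_class,
                                           unsigned int instance)
{
	return EXAMPLE_EXEC_CLASS_INSTANCE |
	       ((uint64_t)engine_class & EXAMPLE_EXEC_CLASS_MASK) |
	       (((uint64_t)instance << EXAMPLE_EXEC_INSTANCE_SHIFT) &
		EXAMPLE_EXEC_INSTANCE_MASK);
}

/*
 * Unpack class and instance again. Returns 1 when the opt-in flag is set,
 * or 0 when the flags use the legacy ring selector (outputs are untouched).
 */
static int example_decode_engine(uint64_t flags, unsigned int *engine_class,
				 unsigned int *instance)
{
	if (!(flags & EXAMPLE_EXEC_CLASS_INSTANCE))
		return 0;

	*engine_class = (unsigned int)(flags & EXAMPLE_EXEC_CLASS_MASK);
	*instance = (unsigned int)((flags & EXAMPLE_EXEC_INSTANCE_MASK) >>
				   EXAMPLE_EXEC_INSTANCE_SHIFT);
	return 1;
}
```

With a made-up class value of 2 and instance 1, example_execbuffer2_engine(2, 1) sets bit 18, bit 19 and the low bits to 2. Flags without bit 18 would keep their existing ring selector meaning, which is what lets the two schemes share the same bits.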
> (As a digression, some cryptic notes for an implementation I did over Easter:
> /*
> * Execbuf3!
> *
> * ringbuffer
> * - per context
> * - per engine
We have this already so I am missing something I guess.
> * - PAGE_SIZE ctl [ro head, rw tail] + user pot
> * - kthread [i915/$ctx-$engine] (optional?)
No idea what these two are. :)
> * - assumes NO_RELOC-esque awareness
Ok ok NO_RELOC. :)
> *
> * SYNC flags [wait/signal], handle [semaphore/fence]
Sync fence in/out just as today, but probably more?
> *
> * BIND handle, offset [user provided]
> * ALLOC[32,64] handle, flags, *offset [kernel provided, need RELOC]
> * RELOC[32,64] handle, target_handle, offset, delta
> * CLEAR flags, handle
> * UNBIND handle
Explicit VMA management? Separate ioctl maybe would be better?
> *
> * BATCH flags, handle, offset
> * [or SVM flags, address]
> * PIN flags (MAY_RELOC), count, handle[count]
> * FENCE flags, count, handle[count]
> * SUBMIT handle [fence/NULL with error]
> */
No idea again. :)
> At the moment it is just trying to do execbuf2, but more compactly and
> with fewer ioctls. But one of the main selling points is that we can
> extend the information passed around more freely than execbuf2.)
I have nothing against a better eb, since I trust you know much better
whether it is needed and when. But I don't know how long it will take to
get there. This class/instance idea could be implemented quickly to solve
the sore point of VCS/VCS2 engine selection. But yeah, it is another uABI
to keep in that case.
Regards,
Tvrtko