[Intel-gfx] [RFC 2/2] drm/i915: Select engines via class and instance in execbuffer2

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Apr 24 08:36:40 UTC 2017


On 18/04/2017 22:10, Chris Wilson wrote:
> On Tue, Apr 18, 2017 at 05:56:15PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Building on top of the previous patch which exported the concept
>> of engine classes and instances, we can also use this instead of
>> the current awkward engine selection uAPI.
>>
>> This is primarily interesting for the VCS engine selection which
>> is a) currently done via a disjoint set of flags, and b) the
>> current I915_EXEC_BSD flag has different semantics depending on
>> the underlying hardware, which is bad.
>>
>> Proposed idea here is to reserve 16-bits of flags, to pass in
>> the engine class and instance (8 bits each), and a new flag
>> named I915_EXEC_CLASS_INSTANCE to tell the kernel this new engine
>> selection API is in use.
>>
>> The new uAPI also removes access to the weak VCS engine
>> balancing that currently exists in the driver.
>>
>> Example usage to send a command to VCS0:
>>
>>   eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0);
>>
>> Or to send a command to VCS1:
>>
>>   eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1);
>
> To save a bit of space, we can use the ring selector as a class selector
> if bit 18 is set, with bits 19-27 as the instance. That limits us to 64
> classes, hopefully not a problem for the near future. At least I might
> have sold you on a flexible execbuf3 by then.

I was considering re-using those bits, yes. I was weighing the benefit
of keeping it completely separate, but I suppose there is not much value
in that. So I can re-use the ring selector just as well and consume
fewer of the remaining flag bits.
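
To make the layout concrete, here is a rough sketch of how that packing might look. All macro and function names below are made up for illustration, and it assumes the ring selector occupies the low 6 bits of the flags (which is what the 64 class limit implies). It is not the actual uAPI, only the bit arithmetic:

```c
#include <stdint.h>

/* Hypothetical names and bit positions, for illustration only. */
#define EB_CLASS_MASK      0x3fULL        /* assumed ring selector, bits 0-5 */
#define EB_CLASS_INSTANCE  (1ULL << 18)   /* opt-in: selector holds a class */
#define EB_INSTANCE_SHIFT  19
#define EB_INSTANCE_MASK   (0x1ffULL << EB_INSTANCE_SHIFT) /* bits 19-27 */

/* Pack an engine class and instance into execbuf-style flags. */
static inline uint64_t eb_engine_flags(unsigned int engine_class,
                                       unsigned int instance)
{
	return ((uint64_t)engine_class & EB_CLASS_MASK) |
	       EB_CLASS_INSTANCE |
	       (((uint64_t)instance << EB_INSTANCE_SHIFT) & EB_INSTANCE_MASK);
}

/* Non-zero if the flags use the class/instance interpretation. */
static inline int eb_uses_class_instance(uint64_t flags)
{
	return (flags & EB_CLASS_INSTANCE) != 0;
}

/* Extract the engine class from the ring selector bits. */
static inline unsigned int eb_engine_class(uint64_t flags)
{
	return (unsigned int)(flags & EB_CLASS_MASK);
}

/* Extract the engine instance from bits 19-27. */
static inline unsigned int eb_engine_instance(uint64_t flags)
{
	return (unsigned int)((flags & EB_INSTANCE_MASK) >> EB_INSTANCE_SHIFT);
}
```

With something like this, legacy userspace that never sets bit 18 keeps the old ring selector meaning, while new userspace would call eb_engine_flags(class, instance) and the kernel side would check eb_uses_class_instance() before decoding.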

> (As a digression, some cryptic notes for an implementation I did over Easter:
> /*
>  * Execbuf3!
>  *
>  * ringbuffer
>  *  - per context
>  *  - per engine

We have this already so I am missing something I guess.

>  *  - PAGE_SIZE ctl [ro head, rw tail] + user pot
>  *  - kthread [i915/$ctx-$engine] (optional?)

No idea what these two are. :)

>  *  - assumes NO_RELOC-esque awareness

Ok ok NO_RELOC. :)

>  *
>  * SYNC flags [wait/signal], handle [semaphore/fence]

Sync fence in/out just as today, but probably more?

>  *
>  * BIND handle, offset [user provided]
>  * ALLOC[32,64] handle, flags, *offset [kernel provided, need RELOC]
>  * RELOC[32,64] handle, target_handle, offset, delta
>  * CLEAR flags, handle
>  * UNBIND handle

Explicit VMA management? Maybe a separate ioctl would be better?

>  *
>  * BATCH flags, handle, offset
>  * [or SVM flags, address]
>  *   PIN flags (MAY_RELOC), count, handle[count]
>  *   FENCE flags, count, handle[count]
>  * SUBMIT handle [fence/NULL with error]
>  */

No idea again. :)

> At the moment it is just trying to do execbuf2, but more compactly and
> with fewer ioctls. But one of the main selling points is that we can
> extend the information passed around more freely than execbuf2.)

I have nothing against a better execbuf, since I trust you know much
better than I do whether and when it is needed. But I don't know how
long it will take to get there. This class/instance idea could be
implemented quickly to fix the sore point of VCS/VCS2 engine selection.
But yes, in that case it is another uABI to maintain.

Regards,

Tvrtko

