[PATCH] drm/sched: Convert the GPU scheduler to variable number of run-queues

Luben Tuikov ltuikov89 at gmail.com
Wed Nov 1 01:46:11 UTC 2023


On 2023-10-31 09:33, Danilo Krummrich wrote:
> 
> On 10/26/23 19:25, Luben Tuikov wrote:
>> On 2023-10-26 12:39, Danilo Krummrich wrote:
>>> On 10/23/23 05:22, Luben Tuikov wrote:
>>>> The GPU scheduler has now a variable number of run-queues, which are set up at
>>>> drm_sched_init() time. This way, each driver announces how many run-queues it
>>>> requires (supports) per each GPU scheduler it creates. Note, that run-queues
>>>> correspond to scheduler "priorities", thus if the number of run-queues is set
>>>> to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
>>>> i.e. single "priority". If a driver further sets a single entity per
>>>> run-queue, then this creates a 1-to-1 correspondence between a scheduler and
>>>> a scheduled entity.
>>>
>>> Generally, I'm fine with this patch and how it replaces / generalizes the single
>>> entity approach.
>>
>> Great!
>>
>>> However, I'm not quite sure how to properly use this. What is a driver supposed to
>>> do, which previously took advantage of DRM_SCHED_POLICY_SINGLE_ENTITY?
>>>
>>> Is it supposed to call drm_sched_init() with num_rqs=1? If so, what's the correct way
>>
>> Yes, you call drm_sched_init() with num_rqs set to 1.
>>
>>> to initialize the drm_sched_entity then? Calling drm_sched_entity_init() with priority=0?
>>
>> Yes, with priority set to 0.
>>
>> One unfortunate fact I noticed when doing this patch is that the numerical values
>> assigned to enum drm_sched_priority is that the names to values are upside down.
>> Instead of min being 0, normal:1, high:2, kernel:3, it should've been kernel:0 (highest),
>> high:1, normal:2, low:4, and so on.
>>
>> The reason for this is absolutely clear: if you had a single priority, it would be
>> 0, the kernel, one, highest one. This is similar to how lanes in a highway are counted:
>> you always have lane 1. Similarly to nice(1) and kernel priorities...
>>
>>> Any other priority consequently faults in drm_sched_job_arm().
>>
>> drm_sched_job_arm() faults on !ENTITY, but the "priority" is just
>> assigned to s_priority:
>> 	job->s_priority = entity->priority;
>>
>>
>>> While I might sound like a broken record (sorry for that), I really think everything
>>> related to Matt's series needs documentation, as in:
>>
>> Yes, I agree.
> 
> Great! Do you plan to send a subsequent patch adding some documentation for this one? I
> think it'd be good to get all the above documented.

A lot of this would be the magic sauce of drivers and hardware--as we've seen with Xe,
and it would be presumptuous of me to write down to the detail of what and how this
and that should be used.

So long as things are dynamic--as we've seen with the latest change in sched_rq--we let
drivers and hardware set the numbers and do their magic in their drivers and hardware.

Having said this, if something fundamental comes up to mind, I'd be sure to add a comment
there in--this applies to anyone else guys--don't be shy to post a patch adding comments
where you think there should be some.
-- 
Regards,
Luben
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0x4C15479431A334AF.asc
Type: application/pgp-keys
Size: 664 bytes
Desc: OpenPGP public key
URL: <https://lists.freedesktop.org/archives/lima/attachments/20231101/5d702832/attachment.key>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 236 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/lima/attachments/20231101/5d702832/attachment.sig>


More information about the lima mailing list