[PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

Luben Tuikov ltuikov89 at gmail.com
Sat Oct 28 03:51:47 UTC 2023


On 2023-10-27 12:41, Boris Brezillon wrote:
> On Fri, 27 Oct 2023 10:32:52 -0400
> Luben Tuikov <ltuikov89 at gmail.com> wrote:
> 
>> On 2023-10-27 04:25, Boris Brezillon wrote:
>>> Hi Danilo,
>>>
>>> On Thu, 26 Oct 2023 18:13:00 +0200
>>> Danilo Krummrich <dakr at redhat.com> wrote:
>>>   
>>>> Currently, job flow control is implemented simply by limiting the number
>>>> of jobs in flight. Therefore, a scheduler is initialized with a credit
>>>> limit that corresponds to the number of jobs which can be sent to the
>>>> hardware.
>>>>
>>>> This implies that for each job, drivers need to account for the maximum
>>>> job size possible in order to not overflow the ring buffer.
>>>>
>>>> However, there are drivers, such as Nouveau, where the job size has a
>>>> rather large range. For such drivers it can easily happen that job
>>>> submissions not even filling the ring by 1% can block subsequent
>>>> submissions, which, in the worst case, can lead to the ring run dry.
>>>>
>>>> In order to overcome this issue, allow for tracking the actual job size
>>>> instead of the number of jobs. Therefore, add a field to track a job's
>>>> credit count, which represents the number of credits a job contributes
>>>> to the scheduler's credit limit.
>>>>
>>>> Signed-off-by: Danilo Krummrich <dakr at redhat.com>
>>>> ---
>>>> Changes in V2:
>>>> ==============
>>>>   - fixed up influence on scheduling fairness due to consideration of a job's
>>>>     size
>>>>     - If we reach a ready entity in drm_sched_select_entity() but can't actually
>>>>       queue a job from it due to size limitations, just give up and go to sleep
>>>>       until woken up due to a pending job finishing, rather than continue to try
>>>>       other entities.
>>>>   - added a callback to dynamically update a job's credits (Boris)  
>>>
>>> This callback seems controversial. I'd suggest dropping it, so the
>>> patch can be merged.  
>>
>> Sorry, why is it controversial? (I did read back-and-forth above, but it wasn't clear
>> why it is /controversial/.)
> 
> That's a question for Christian, I guess. I didn't quite get what he
> was worried about, other than this hook introducing a new way for
> drivers to screw things up by returning funky/invalid credits (which we

It's up to the driver--they can test, test, test, and fix their code and so on.
We can only do so much and shouldn't be baby-sitting drivers ad nauseam. Drivers
can also not define this callback. :-)

> can report with WARN_ON()s). But let's be honest, there's probably a
> hundred different ways (if not more) drivers can shoot themselves in the
> foot with drm_sched already...

Yes, that's true. So there's no worries with this hook.

> 
>>
>> I believe only drivers are privy to changes in the credit availability as their
>> firmware and hardware executes new jobs and finishes others, and so this "update"
>> here is essential--leaving it only to prepare_job() wouldn't quite fulfill the vision
>> of why the credit mechanism introduced by this patch in the first place.
> 
> I kinda agree with you, even if I wouldn't so pessimistic as to how
> useful this patch would be without the ->update_job_credits() hook
> (it already makes the situation a lot better for Nouveau and probably
> other drivers too).

Sure, and that's a good thing.

The heart of the dynamic credit scheme this patch is introducing *is* update_job_credits()
callback. Without it, it's just about like the current job flow-control scheme we have with
varied job weights (credits). Remember, it is an optional callback and driver can choose NOT
to define it--simple. :-)

So, I'm very excited about this, and see a wide range of applications and tricks drivers
can do with the credit scheme (albeit had it been an "int" bwha-ha-ha-ha ]:-> ).

Have a good weekend everyone!
-- 
Regards,
Luben
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0x4C15479431A334AF.asc
Type: application/pgp-keys
Size: 664 bytes
Desc: OpenPGP public key
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20231027/f685e9a3/attachment-0001.key>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 236 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20231027/f685e9a3/attachment-0001.sig>


More information about the dri-devel mailing list