[RFC PATCH 00/10] Xe DRM scheduler and long running workload plans

Wed Apr 5 09:57:04 UTC 2023

Am 05.04.23 um 11:07 schrieb Daniel Vetter:
> [SNIP]
>> I would approach it from the complete other side. This component here is a
>> tool to decide what job should run next.
>>
>> How that is then signaled and run should not be part of the scheduler, but
>> another more higher level component.
>>
>> This way you also don't have a problem with not using DMA-fences as
>> dependencies as well as constrains for running more jobs.
> I think we're talking about two things here and mixing them up.
>
> For the dependencies I agree with you, and imo that higher level tool
> should probably just be an on-demand submit thread in userspace for the
> rare case where the kernel would need to sort out a dependency otherwise
> (due to running out of ringspace in the per-ctx ringbuffer).
>
> The other thing is the message passing stuff, and this is what I was
> talking about above. This has nothing to do with handling dependencies,
> but with talking to the gpu fw. Here the intel design issue is that the fw
> only provides a single queue, and it's in-order. Which means it
> fundamentally has the stalling issue you describe as a point against a
> message passing design. And fundamentally we need to be able to talk to
> the fw in the scheduler ->run_job callback.
>
> The proposal here for the message passing part is that since it has the
> stalling issue already anyway, and the scheduler needs to be involved
> anyway, it makes sense to integrated this (as an optional thing, only for
> drivers which have this kind of fw interface) into the scheduler.
> Otherwise you just end up with two layers for no reason and more ping-pong
> delay because the ->run_job needs to kick off the subordinate driver layer
> first. Note that for this case the optional message passing support in the
> drm/scheduler actually makes things better, because it allows you to cut
> out one layer.
>
> Of course if a driver with better fw interface uses this message passing
> support, then that's bad. Hence the big warning in the kerneldoc.

Well what I wanted to say is that if you design the dependency handling 
/ scheduler properly you don't need the message passing through it.

For example if the GPU scheduler component uses a work item to do it's 
handling instead of a kthread you could also let the driver specify the 
work queue where this work item is executed on.

When you design it like this the driver specifies the thread context of 
execution for it's job. In other words it can specify a single threaded 
firmware work queue as well.

When you then have other messages which needs to be passed to the 
firmware you can also use the same single threaded workqueue for this.

Drivers which have a different firmware interface would just use one of 
the system work queues instead.

This approach basically decouples the GPU scheduler component from the 
message passing functionality.

Regards,
Christian.

> -Daniel
>