[RFC 00/14] Deadline scheduler and other ideas

Tvrtko Ursulin tvrtko.ursulin at igalia.com
Wed Jan 8 15:13:39 UTC 2025


On 08/01/2025 08:31, Danilo Krummrich wrote:
> On Mon, Dec 30, 2024 at 04:52:45PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> 
> "Deadline scheduler and other ideas"
> 
> There's a few patches that could be sent outside the scope of this series, e.g.
> the first one.
> 
> I think it would make sense to do so.

For now I'll keep them at the head of this RFC and as they get acked or 
r-b-ed I can easily send them standalone or re-ordered. Until then 
having the series separate would make the RFC not standalone.

>> <tldr>
>> Replacing FIFO with a flavour of deadline driven scheduling and removing round-
>> robin. Connecting the scheduler with dma-fence deadlines. First draft and
>> testing by different drivers and feedback would be nice. I was only able to test
>> it with amdgpu. Other drivers may not even compile.
> 
> What are the results from your tests with amdgpu? Do you have some measurements?

We already covered this in the thread with Philipp to a degree. Tl;dr; 
the main idea is whether we simplify the code and at least not regress.

I don't expect improvements on the amdgpu side with the workloads like 
games and benchmarks. I did not measure anything significant apart that 
priorities seem to work with the run queues removed.

Where something could show is if someone is aware of a workload where 
normal prio starves low. Since one part of the idea is that with the 
"deadline" scheme those should work a little bit more balanced.

Also again, feedback (including testing feedback from other drivers) 
would be great, and ideas of which workloads to test.

Btw I will send a respin in a day or so which will clean up some things 
and adds some more tiny bits.

Regards,

Tvrtko

>> </tldr>
>>
>> If I remember correctly Christian mentioned recently (give or take) that maybe
>> round-robin could be removed. That got me thinking how and what could be
>> improved and simplified. So I played a bit in the scheduler code and came up
>> with something which appears to not crash at least. Whether or not there are
>> significant advantages apart from maybe code consolidation and reduction is the
>> main thing to be determined.
>>
>> One big question is whether round-robin can really be removed. Does anyone use
>> it, rely on it, or what are even use cases where it is much better than FIFO.
>>
>> See "drm/sched: Add deadline policy" commit message for a short description on
>> what flavour of deadline scheduling it is. But in essence it should a more fair
>> FIFO where higher priority can not forever starve lower priorities.
>>
>> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence deadlines to
>> the scheduler because it is easy and makes logical sense with this. And I
>> noticed userspace already uses it so why not wire it up fully.
>>
>> Otherwise the series is a bit of progression from consolidating RR into FIFO
>> code paths and going from there to deadline and then to a change in how
>> dependencies are handled. And code simplification to 1:1 run queue to scheduler
>> relationship, because deadline does not need per priority run queues.
>>
>> There is quite a bit of code to go throught here so I think it could be even
>> better if other drivers could give it a spin as is and see if some improvements
>> can be detected. Or at least no regressions.
> 
> Are there improvements with amdgpu?
> 
>>
>> Cc: Christian König <christian.koenig at amd.com>
>> Cc: Danilo Krummrich <dakr at redhat.com>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> Cc: Philipp Stanner <pstanner at redhat.com>
>>
>> Tvrtko Ursulin (14):
>>    drm/sched: Delete unused update_job_credits
>>    drm/sched: Remove idle entity from tree
>>    drm/sched: Implement RR via FIFO
>>    drm/sched: Consolidate entity run queue management
>>    drm/sched: Move run queue related code into a separate file
>>    drm/sched: Ignore own fence earlier
>>    drm/sched: Resolve same scheduler dependencies earlier
>>    drm/sched: Add deadline policy
>>    drm/sched: Remove FIFO and RR and simplify to a single run queue
>>    drm/sched: Queue all free credits in one worker invocation
>>    drm/sched: Connect with dma-fence deadlines
>>    drm/sched: Embed run queue singleton into the scheduler
>>    dma-fence: Add helper for custom fence context when merging fences
>>    drm/sched: Resolve all job dependencies in one go
>>
>>   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>>   drivers/gpu/drm/scheduler/Makefile          |   2 +-
>>   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>>   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>>   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++---------------
>>   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>>   include/drm/gpu_scheduler.h                 |  74 ++-
>>   include/linux/dma-fence-unwrap.h            |  31 +-
>>   14 files changed, 606 insertions(+), 678 deletions(-)
>>   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>>
>> -- 
>> 2.47.1
>>
> 


More information about the dri-devel mailing list