[PATCH v6 00/15] Fair DRM scheduler

Philipp Stanner phasta at mailbox.org
Tue Jul 8 10:40:58 UTC 2025


You keep forgetting the "RFC" in your recent series. Please add it
next time.

P.

On Tue, 2025-07-08 at 10:51 +0100, Tvrtko Ursulin wrote:
> As a summary, the new scheduling algorithm is inspired by the original
> Linux CFS and so far no scheduling regressions have been found. There
> are improvements in fairness and in the scheduling of interactive
> clients running in parallel with a heavy GPU load (for example
> Pierre-Eric has one viewperf medical test which shows a nice
> improvement with amdgpu).
> 
> At a high level, the main advantages of the series are:
> 
>  1. Scheduling quality - schedules better than FIFO and solves priority
>     starvation.
>  2. Code simplification - no more multiple run queues and multiple
>     algorithms.
>  3. Virtual GPU time based scheduling enables relatively simple
>     addition of a scheduling cgroup controller in the future (see the
>     sketch below).
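> 
> To give a feel for the virtual GPU time idea, here is a rough sketch.
> All names below are illustrative only and do not match the actual
> patches:
> 
>   #include <linux/math64.h>  /* div_u64() */
>   #include <linux/rbtree.h>  /* rb_root_cached, rb_entry_safe() */
> 
>   #define EXAMPLE_WEIGHT_NORMAL 1024 /* hypothetical normal prio weight */
> 
>   struct example_entity {
>           struct rb_node  node;        /* sorted by vruntime_ns */
>           u64             vruntime_ns; /* accumulated virtual GPU time */
>           u32             weight;      /* derived from entity priority */
>   };
> 
>   static void example_account(struct example_entity *e, u64 job_ns)
>   {
>           /* A lower weight makes virtual time advance faster. */
>           e->vruntime_ns += div_u64(job_ns * EXAMPLE_WEIGHT_NORMAL,
>                                     e->weight);
>   }
> 
>   static struct example_entity *example_pick(struct rb_root_cached *tree)
>   {
>           /* Always run the entity with the least virtual GPU time. */
>           return rb_entry_safe(rb_first_cached(tree),
>                                struct example_entity, node);
>   }
> 
> Always picking the smallest virtual GPU time is what gives the CFS-like
> fairness, and per-entity virtual time is also the natural hook for a
> future cgroup controller.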
> 
> There is a somewhat more detailed write-up of the motivation and
> results in the form of a blog post, which may be easier to read:
> https://blogs.igalia.com/tursulin/fair-er-drm-gpu-scheduler/
> 
> The first patches add some unit tests which allow for easy evaluation
> of scheduling behaviour against different client submission patterns.
> From there onwards it is hopefully a natural progression of cleanups,
> enablers, adding the fair policy, and finally removing FIFO and RR and
> simplifying the code base, since multiple run queues are no longer
> needed.
> 
> As a headline result I have tested three simultaneous clients on the
> Steam Deck:
> 
> One instance of a deferredmultisampling Vulkan demo running with low
> priority,
> one normal priority instance of the same demo, and the Unigine Heaven
> benchmark.
> 
> With the FIFO scheduler we can see that the low priority client is
> completely
> starved and the GPU time distribution between the other two clients
> is uneven:
> 
> https://people.igalia.com/tursulin/drm-sched-fair/fifo-starvation.png
> 
> Switching to the fair scheduler, GPU time distribution is almost
> equal and the
> low priority client does get a small share of the GPU:
> 
> https://people.igalia.com/tursulin/drm-sched-fair/fair-no-starvation.png
> 
> Moving on to the synthetic submission patterns, each test runs two
> simultaneous clients which broadly fall into the following categories:
> 
>  * Deep queue clients
>  * Hogs versus interactive
>  * Priority handling
> 
> Let's look at the results:
> 
> 1. Two normal priority deep queue clients.
> 
> These submit one second's worth of 8ms jobs, as fast as they can, with
> no dependencies between them. There is no difference in runtime between
> FIFO and fair, but the latter allows both clients to progress with
> their work more evenly:
> 
> https://people.igalia.com/tursulin/drm-sched-fair/normal-normal.png
> 
> (X axis is time, Y is submitted queue depth, hence a lowering of the
> queue depth corresponds with work progress for both clients; tested
> with both schedulers separately.)
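> 
> In rough pseudo-C, a deep queue client from these tests behaves like
> the following. This is a simplified sketch, not the actual KUnit code,
> and the example_* helpers are hypothetical:
> 
>   /* Rough sketch of a deep queue test client. */
>   static void example_deep_queue_client(struct example_sched *sched)
>   {
>           unsigned int i;
> 
>           /* One second worth of 8ms jobs, queued up front with no
>            * dependencies between them. */
>           for (i = 0; i < 1000 / 8; i++)
>                   example_submit_job(sched, 8 * NSEC_PER_MSEC);
> 
>           example_wait_all_jobs(sched);
>   }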
> 
> 2. Same two clients but one is now low priority.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/normal-low.png
> 
> The normal priority client is a solid line, the low priority one is
> dotted. We can see how FIFO completely starves the low priority client
> until the normal priority one is fully done. Only then does the low
> priority client get any GPU time.
> 
> In contrast, the fair scheduler allows some GPU time to the low
> priority client throughout.
> 
> 3. Same clients but now high versus normal priority.
> 
> Behaviour is similar to the previous case, with normal de-prioritised
> relative to high somewhat less than low was relative to normal.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/high-normal.png
> 
> 4. Heavy load vs interactive client.
> 
> The heavy client generates a 75% GPU load in the form of 3x 2.5ms jobs
> followed by a 2.5ms wait. The interactive client generates a 10% GPU
> load in the form of one 1ms job followed by a 9ms wait.
> 
> This simulates an interactive graphical client running on top of a
> relatively heavy background load, but with no GPU oversubscription.
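> 
> The duty cycle clients used here and below can be thought of roughly
> like this sketch (again with hypothetical example_* helpers, not the
> actual test code):
> 
>   /* Rough sketch of a duty cycle test client. */
>   static void example_duty_cycle_client(struct example_sched *sched,
>                                         unsigned int jobs,
>                                         u64 job_us, u64 wait_us)
>   {
>           for (;;) {
>                   unsigned int i;
> 
>                   /* Submit the burst and wait for it to complete... */
>                   for (i = 0; i < jobs; i++)
>                           example_submit_job(sched,
>                                              job_us * NSEC_PER_USEC);
>                   example_wait_all_jobs(sched);
> 
>                   /* ...then stay idle for the rest of the cycle. */
>                   example_sleep_us(wait_us);
>           }
>   }
> 
> The heavy client above is then (3, 2500, 2500), ie. 7.5ms busy in every
> 10ms cycle = 75% load, and the interactive one is (1, 1000, 9000), ie.
> 1ms in every 10ms = 10% load.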
> 
> From now on the graphs show the interactive client only and, instead of
> looking at its queue depth, we look at its "fps".
> 
> https://people.igalia.com/tursulin/drm-sched-fair/heavy-interactive.png
> 
> We can see that the fair scheduler allows a higher fps for the
> interactive client, which is good.
> 
> 5. An even heavier load vs interactive client.
> 
> This one oversubscribes the GPU by submitting 4x 50ms jobs and waiting
> for only one microsecond before repeating the cycle (effectively a full
> GPU load with a deep queue). The interactive client is the same 10%
> client as above.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/veryheavy-interactive.png
> 
> Here the difference is even more dramatic, with the fair scheduler
> enabling ~3x the framerate for the interactive client.
> 
> 6. Low priority GPU hog versus heavy-interactive.
> 
> Low priority client: 3x 2.5ms jobs followed by a 0.5ms wait.
> Interactive client: 1x 0.5ms job followed by a 10ms wait.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/lowhog-interactive.png
> 
> A slight win for the fair scheduler, but it could be just noise.
> 
> 7. The last set of test scenarios has three subgroups.
> 
> In all cases we have two interactive (synchronous, single job at a
> time) clients with a 50% "duty cycle" GPU time usage:
> 
> Client 1: 1.5ms job + 1.5ms wait (aka short bursty)
> Client 2: 2.5ms job + 2.5ms wait (aka long bursty)
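> 
> (In terms of the duty cycle sketch above these would roughly be
> (1, 1500, 1500) and (1, 2500, 2500).)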
> 
> a) Both normal priority.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/5050-short.png
> https://people.igalia.com/tursulin/drm-sched-fair/5050-long.png
> 
> Both schedulers favour the higher frequency duty cycle, with fair
> giving it a little bit more, which should be good for interactivity.
> 
> b) Normal vs low priority.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/5050-normal-low-normal.png
> https://people.igalia.com/tursulin/drm-sched-fair/5050-normal-low-low.png
> 
> The fair scheduler gives a bit more GPU time to the normal priority
> client, which is again good.
> 
> c) High vs normal priority.
> 
> https://people.igalia.com/tursulin/drm-sched-fair/5050-high-normal-high.png
> https://people.igalia.com/tursulin/drm-sched-fair/5050-high-normal-normal.png
> 
> Again, the fair scheduler gives a bit more of the share to the higher
> priority client.
> 
> As before, I am looking for feedback, ideas for what other kinds of
> submission
> scenarios to test, testing on different GPUs and of course reviews.
> 
> v2:
>  * Fixed many rebase errors.
>  * Added some new patches.
>  * Dropped single shot dependency handling.
> 
> v3:
>  * Added scheduling quality unit tests.
>  * Refined a tiny bit by adding some fairness.
>  * Dropped a few patches for now.
> 
> v4:
>  * Replaced deadline with fair!
>  * Refined scheduling quality unit tests.
>  * Pulled one cleanup patch earlier.
>  * Fixed "drm/sched: Avoid double re-lock on the job free path".
> 
> v5:
>  * Rebase on top of latest upstream DRM scheduler changes.
>  * Kerneldoc fixup.
>  * Improve commit message justification for one patch. (Philipp)
>  * Add comment in drm_sched_alloc_wq. (Christian)
> 
> v6:
>  * Rebase for "drm/sched: De-clutter drm_sched_init" getting merged.
>  * Avoid NULL rq dereference from a bad rebase. (Maira)
>  * Added some kerneldoc throughout. (Maira)
>  * Removed some lockdep annotations not belonging to one patch. (Maira)
>  * Use dma_fence_is_signaled in "drm/sched: Avoid double re-lock on
>    the job free path". (Maira, Philipp)
> 
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Danilo Krummrich <dakr at kernel.org>
> Cc: Leo Liu <Leo.Liu at amd.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Philipp Stanner <phasta at kernel.org>
> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer at amd.com>
> Cc: Michel Dänzer <michel.daenzer at mailbox.org>
> 
> Tvrtko Ursulin (15):
>   drm/sched: Add some scheduling quality unit tests
>   drm/sched: Add some more scheduling quality unit tests
>   drm/sched: Avoid double re-lock on the job free path
>   drm/sched: Consolidate drm_sched_job_timedout
>   drm/sched: Consolidate drm_sched_rq_select_entity_rr
>   drm/sched: Implement RR via FIFO
>   drm/sched: Consolidate entity run queue management
>   drm/sched: Move run queue related code into a separate file
>   drm/sched: Free all finished jobs at once
>   drm/sched: Account entity GPU time
>   drm/sched: Remove idle entity from tree
>   drm/sched: Add fair scheduling policy
>   drm/sched: Remove FIFO and RR and simplify to a single run queue
>   drm/sched: Queue all free credits in one worker invocation
>   drm/sched: Embed run queue singleton into the scheduler
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |  27 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h       |   5 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h     |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c       |   8 +-
>  drivers/gpu/drm/scheduler/Makefile            |   2 +-
>  drivers/gpu/drm/scheduler/sched_entity.c      | 131 ++-
>  drivers/gpu/drm/scheduler/sched_fence.c       |   2 +-
>  drivers/gpu/drm/scheduler/sched_internal.h    | 146 +++-
>  drivers/gpu/drm/scheduler/sched_main.c        | 548 +++---------
>  drivers/gpu/drm/scheduler/sched_rq.c          | 221 +++++
>  drivers/gpu/drm/scheduler/tests/Makefile      |   3 +-
>  .../gpu/drm/scheduler/tests/tests_scheduler.c | 815 ++++++++++++++++++
>  include/drm/gpu_scheduler.h                   |  31 +-
>  15 files changed, 1396 insertions(+), 565 deletions(-)
>  create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>  create mode 100644 drivers/gpu/drm/scheduler/tests/tests_scheduler.c
> 


