[RFC PATCH] drm/scheduler: Set the FIFO schedulig policy as the default
Luben Tuikov
luben.tuikov at amd.com
Sat Oct 22 01:43:30 UTC 2022
Please comment on the v1 of this patch, which I sent right after.
Regards,
Luben
On 2022-10-21 21:09, Luben Tuikov wrote:
> The currently default Round-Robin GPU scheduling can result in starvation
> of entities which have a large number of jobs, over entities which have
> a very small number of jobs (single digit).
>
> This can be illustrated in the following diagram, where jobs are
> alphabetized to show their chronological order of arrival, where job A is
> the oldest, B is the second oldest, and so on, to J, the most recent job to
> arrive.
>
> ---> entities
> j | H-F-----A--E--I--
> o | --G-----B-----J--
> b | --------C--------
> s\/ --------D--------
>
> WLOG, asuming all jobs are "ready", then a R-R scheduling will execute them
> in the following order (a slice off of the top of the entities' list),
>
> H, F, A, E, I, G, B, J, C, D.
>
> However, to mitigate job starvation, we'd rather execute C and D before E,
> and so on, given, of course, that they're all ready to be executed.
>
> So, if all jobs are ready at this instant, the order of execution for this
> and the next 9 instances of picking the next job to execute, should really
> be,
>
> A, B, C, D, E, F, G, H, I, J,
>
> which is their chronological order. The only reason for this order to be
> broken, is if an older job is not yet ready, but a younger job is ready, at
> an instant of picking a new job to execute. For instance if job C wasn't
> ready at time 2, but job D was ready, then we'd pick job D, like this:
>
> 0 +1 +2 ...
> A, B, D, ...
>
> And from then on, C would be preferred before all other jobs, if it is ready
> at the time when a new job for execution is picked. So, if C became ready
> two steps later, the execution order would look like this:
>
> ......0 +1 +2 ...
> A, B, D, E, C, F, G, H, I, J
>
> This is what the FIFO GPU scheduling algorithm achieves. It uses a
> Red-Black tree to keep jobs sorted in chronological order, where picking
> the oldest job is O(1) (we use the "cached" structure), and balancing the
> tree is O(log n). IOW, it picks the *oldest ready* job to execute now.
>
> The implemntation is already in the kernel, and this commit only changes
> the default GPU scheduling algorithm to use.
>
> This was tested and achieves about 1% faster performance over the Round
> Robin algorithm.
>
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Alex Deucher <Alexander.Deucher at amd.com>
> Cc: Direct Rendering Infrastructure - Development <dri-devel at lists.freedesktop.org>
> Signed-off-by: Luben Tuikov <luben.tuikov at amd.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 2fab218d708279..d0ff9e11cb69fa 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -62,13 +62,13 @@
> #define to_drm_sched_job(sched_job) \
> container_of((sched_job), struct drm_sched_job, queue_node)
>
> -int drm_sched_policy = DRM_SCHED_POLICY_RR;
> +int drm_sched_policy = DRM_SCHED_POLICY_FIFO;
>
> /**
> * DOC: sched_policy (int)
> * Used to override default entities scheduling policy in a run queue.
> */
> -MODULE_PARM_DESC(sched_policy, "Specify schedule policy for entities on a runqueue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin (default), " __stringify(DRM_SCHED_POLICY_FIFO) " = use FIFO.");
> +MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default).");
> module_param_named(sched_policy, drm_sched_policy, int, 0444);
>
> static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
>
> base-commit: 16d2a3f2ad1d2b95bf9122c910c63b0efe74179d
More information about the dri-devel
mailing list