[PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Boris Brezillon
boris.brezillon at collabora.com
Tue Sep 12 14:28:38 UTC 2023
On Thu, 17 Aug 2023 13:13:31 +0200
Danilo Krummrich <dakr at redhat.com> wrote:
> I think that's a misunderstanding. I'm not trying to say that it is
> *always* beneficial to fill up the ring as much as possible. But I think
> it is under certain circumstances, exactly those circumstances I
> described for Nouveau.
>
> As mentioned, in Nouveau the size of a job is only really limited by the
> ring size, which means that one job can (but does not necessarily) fill
> up the whole ring. We both agree that this is inefficient, because it
> potentially results into the HW run dry due to hw_submission_limit == 1.
>
> I recognize you said that one should define hw_submission_limit and
> adjust the other parts of the equation accordingly, the options I see are:
>
> (1) Increase the ring size while keeping the maximum job size.
> (2) Decrease the maximum job size while keeping the ring size.
> (3) Let the scheduler track the actual job size rather than the maximum
> job size.
>
> (1) results into potentially wasted ring memory, because we're not
> always reaching the maximum job size, but the scheduler assumes so.
>
> (2) results into more IOCTLs from userspace for the same amount of IBs
> and more jobs result into more memory allocations and more work being
> submitted to the workqueue (with Matt's patches).
>
> (3) doesn't seem to have any of those draw backs.
>
> What would be your take on that?
>
> Actually, if none of the other drivers is interested into a more precise
> way of keeping track of the ring utilization, I'd be totally fine to do
> it in a driver specific way. However, unfortunately I don't see how this
> would be possible.
I'm not entirely sure, but I think PowerVR is pretty close to your
description: jobs size is dynamic size, and the ring buffer size is
picked by the driver at queue initialization time. What we did was to
set hw_submission_limit to an arbitrarily high value of 64k (we could
have used something like ringbuf_size/min_job_size instead), and then
have the control flow implemented with ->prepare_job() [1] (CCCB is the
PowerVR ring buffer). This allows us to maximize ring buffer utilization
while still allowing dynamic-size jobs.
>
> My proposal would be to just keep the hw_submission_limit (maybe rename
> it to submission_unit_limit) and add a submission_units field to struct
> drm_sched_job. By default a jobs submission_units field would be 0 and
> the scheduler would behave the exact same way as it does now.
>
> Accordingly, jobs with submission_units > 1 would contribute more than
> one unit to the submission_unit_limit.
>
> What do you think about that?
>
> Besides all that, you said that filling up the ring just enough to not
> let the HW run dry rather than filling it up entirely is desirable. Why
> do you think so? I tend to think that in most cases it shouldn't make
> difference.
[1]https://gitlab.freedesktop.org/frankbinns/powervr/-/blob/powervr-next/drivers/gpu/drm/imagination/pvr_queue.c#L502
More information about the dri-devel
mailing list