[PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Danilo Krummrich dakr at redhat.com
Tue Sep 12 16:52:36 UTC 2023


On 9/12/23 16:49, Boris Brezillon wrote:
> On Tue, 12 Sep 2023 16:33:01 +0200
> Danilo Krummrich <dakr at redhat.com> wrote:
> 
>> On 9/12/23 16:28, Boris Brezillon wrote:
>>> On Thu, 17 Aug 2023 13:13:31 +0200
>>> Danilo Krummrich <dakr at redhat.com> wrote:
>>>    
>>>> I think that's a misunderstanding. I'm not trying to say that it is
>>>> *always* beneficial to fill up the ring as much as possible. But I think
>>>> it is under certain circumstances, exactly those circumstances I
>>>> described for Nouveau.
>>>>
>>>> As mentioned, in Nouveau the size of a job is only really limited by the
>>>> ring size, which means that one job can (but does not necessarily) fill
>>>> up the whole ring. We both agree that this is inefficient, because it
>>>> potentially results in the HW running dry due to hw_submission_limit == 1.
>>>>
>>>> I recognize you said that one should define hw_submission_limit and
>>>> adjust the other parts of the equation accordingly; the options I see are:
>>>>
>>>> (1) Increase the ring size while keeping the maximum job size.
>>>> (2) Decrease the maximum job size while keeping the ring size.
>>>> (3) Let the scheduler track the actual job size rather than the maximum
>>>> job size.
>>>>
>>>> (1) results in potentially wasted ring memory, because we don't
>>>> always reach the maximum job size, but the scheduler assumes we do.
>>>>
>>>> (2) results in more IOCTLs from userspace for the same number of IBs;
>>>> more jobs result in more memory allocations and more work being
>>>> submitted to the workqueue (with Matt's patches).
>>>>
>>>> (3) doesn't seem to have any of those drawbacks.
>>>>
>>>> What would be your take on that?
>>>>
>>>> Actually, if none of the other drivers is interested in a more precise
>>>> way of keeping track of the ring utilization, I'd be totally fine with doing
>>>> it in a driver-specific way. However, unfortunately I don't see how this
>>>> would be possible.
>>>
>>> I'm not entirely sure, but I think PowerVR is pretty close to your
>>> description: job size is dynamic, and the ring buffer size is
>>> picked by the driver at queue initialization time. What we did was to
>>> set hw_submission_limit to an arbitrarily high value of 64k (we could
>>> have used something like ringbuf_size/min_job_size instead), and then
>>> have the flow control implemented with ->prepare_job() [1] (CCCB is the
>>> PowerVR ring buffer). This allows us to maximize ring buffer utilization
>>> while still allowing dynamic-size jobs.
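
Just to illustrate what you describe above, a ->prepare_job() based throttle
could look roughly like this; the ring helpers and the per-job unit count
below are made up for illustration and are not the actual PowerVR code:

static struct dma_fence *
my_sched_prepare_job(struct drm_sched_job *sched_job,
                     struct drm_sched_entity *s_entity)
{
        struct my_job *job = to_my_job(sched_job);
        struct my_ring *ring = job->ring;

        /* Enough free space in the ring buffer? Let the job run right away. */
        if (my_ring_free_units(ring) >= job->ring_units)
                return NULL;

        /*
         * Otherwise return a fence for the scheduler to wait on; it signals
         * once enough previously queued jobs have retired to free up
         * ring_units worth of space.
         */
        return my_ring_space_fence(ring, job->ring_units);
}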
>>
>> I guess this would work, but I think it would be better to bake this in,
>> especially if more drivers do have this need. I already have an
>> implementation [1] for doing that in the scheduler. My plan was to push
>> that as soon as Matt sends out V3.
>>
>> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commit/269f05d6a2255384badff8b008b3c32d640d2d95
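
The rough idea there is that each job carries the number of ring units it
actually occupies and the scheduler compares the sum of in-flight units
against hw_submission_limit instead of counting jobs. Field and function
names below are illustrative only and not necessarily what the linked
commit uses:

/* Illustrative only; not necessarily the API of the linked commit. */
static bool my_sched_can_queue(struct drm_gpu_scheduler *sched,
                               struct drm_sched_job *job)
{
        u32 units = job->units ?: 1;    /* unset job size counts as one unit */

        /* Compare in-flight ring units, not the number of jobs. */
        return atomic_read(&sched->units_in_flight) + units <=
               sched->hw_submission_limit;
}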
> 
> PowerVR's ->can_fit_in_ringbuf() logic is a bit more involved in that
> native fence waits are passed to the FW, and those add to the job size.
> When we know our job is ready for execution (all non-native deps are
> signaled), we evict already signaled native-deps (or native fences) to
> shrink the job size further, but that's something we need to
> calculate late if we want the job size to be minimal. Of course, we can
> always over-estimate the job size, but if we go for a full-blown
> drm_sched integration, I wonder if it wouldn't be preferable to have a
> ->get_job_size() callback returning the number of units needed by the job,
> and have the core pick 1 when the hook is not implemented.
> 

Sure, why not. Sounds reasonable to me.
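
Something along these lines, I suppose; purely a sketch of the proposed
shape, nothing below is existing drm_sched API:

/*
 * Hypothetical shape of such a hook. A new optional backend op, roughly:
 *
 *      u32 (*get_job_size)(struct drm_sched_job *sched_job);
 *
 * and a core helper that falls back to a single unit:
 */
static u32 my_sched_job_units(struct drm_gpu_scheduler *sched,
                              struct drm_sched_job *job)
{
        /* Drivers with fixed-size jobs simply don't implement the hook. */
        if (sched->ops->get_job_size)
                return sched->ops->get_job_size(job);

        return 1;
}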


