[PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

Boris Brezillon boris.brezillon at collabora.com
Fri Oct 27 16:31:58 UTC 2023


On Fri, 27 Oct 2023 16:23:24 +0200
Danilo Krummrich <dakr at redhat.com> wrote:

> On 10/27/23 10:25, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich <dakr at redhat.com> wrote:
> >   
> >> Currently, job flow control is implemented simply by limiting the number
> >> of jobs in flight. Therefore, a scheduler is initialized with a credit
> >> limit that corresponds to the number of jobs which can be sent to the
> >> hardware.
> >>
> >> This implies that for each job, drivers need to account for the maximum
> >> job size possible in order to not overflow the ring buffer.
> >>
> >> However, there are drivers, such as Nouveau, where the job size has a
> >> rather large range. For such drivers it can easily happen that job
> >> submissions not even filling the ring by 1% can block subsequent
> >> submissions, which, in the worst case, can lead to the ring run dry.
> >>
> >> In order to overcome this issue, allow for tracking the actual job size
> >> instead of the number of jobs. Therefore, add a field to track a job's
> >> credit count, which represents the number of credits a job contributes
> >> to the scheduler's credit limit.
> >>
> >> Signed-off-by: Danilo Krummrich <dakr at redhat.com>
> >> ---
> >> Changes in V2:
> >> ==============
> >>    - fixed up influence on scheduling fairness due to consideration of a job's
> >>      size
> >>      - If we reach a ready entity in drm_sched_select_entity() but can't actually
> >>        queue a job from it due to size limitations, just give up and go to sleep
> >>        until woken up due to a pending job finishing, rather than continue to try
> >>        other entities.
> >>    - added a callback to dynamically update a job's credits (Boris)  
> > 
> > This callback seems controversial. I'd suggest dropping it, so the
> > patch can be merged.  
> 
> I don't think we should drop it just for the sake of moving forward. If there are objections
> we'll discuss them. I've seen good reasons why the drivers you are working on require this,
> while, following the discussion, I have *not* seen any reasons to drop it. It helps simplifying
> driver code and doesn't introduce any complexity or overhead to existing drivers.

Up to you. I'm just saying, moving one step in the right direction is
better than being stuck, and it's not like adding this callback in a
follow-up patch is super complicated either. If you're confident that
we can get all parties to agree on keeping this hook, fine by me.


More information about the dri-devel mailing list