drm/sched: Replacement for drm_sched_resubmit_jobs() is deprecated

Christian König christian.koenig at amd.com
Tue May 2 11:36:07 UTC 2023


Hi Boris,

Am 02.05.23 um 13:19 schrieb Boris Brezillon:
> Hello Christian, Alex,
>
> As part of our transition to drm_sched for the powervr GPU driver, we
> realized drm_sched_resubmit_jobs(), which is used by all drivers
> relying on drm_sched right now, except amdgpu, has been deprecated.
> Unfortunately, commit 5efbe6aa7a0e ("drm/scheduler: deprecate
> drm_sched_resubmit_jobs") doesn't describe what drivers should do or use
> as an alternative.
>
> At the very least, for our implementation, we need to restore the
> drm_sched_job::parent pointers that were set to NULL in
> drm_sched_stop(), such that jobs submitted before the GPU recovery are
> considered active when drm_sched_start() is called. That could be done
> with a custom pending_list iteration restoring the drm_sched_job::parent
> pointers, but it seems odd to let the scheduler backend manipulate
> this list directly, and I suspect we need to do other checks, like the
> karma vs hang-limit thing, so we can flag the entity dirty and cancel
> all jobs being queued there if the entity has caused too many hangs.
>
> Now that drm_sched_resubmit_jobs() has been deprecated, that would be
> great if you could help us write a piece of documentation describing
> what should be done between drm_sched_stop() and drm_sched_start(), so
> new drivers don't come up with their own slightly different/broken
> version of the same thing.

Yeah, really good point! The solution is to not use drm_sched_stop() and 
drm_sched_start() either.

The general idea that Daniel, the other Intel guys and I seem to have 
agreed on is to convert the scheduler thread into a work item.

This work item for pushing jobs to the hw can then be queued to the same 
workqueue we use for the timeout work item.

If your driver now configures this workqueue as single threaded (i.e. an 
ordered workqueue), you have a guarantee that only one of the scheduler 
work item and the timeout work item is running at any given time. That 
in turn makes starting/stopping the scheduler for a reset completely 
superfluous.

Patches for this have already been floating on the mailing list, but 
haven't been committed yet since this is all still WIP.

In general it's not really a good idea to change the scheduler and hw 
fences during GPU reset/recovery. The dma_fence implementation has a 
pretty strict set of allowed state transitions, which clearly says that 
a dma_fence should never go back from signaled to unsignaled, and when 
you start messing with the fences during a reset that is exactly what 
might happen.

What you can do instead is save your hw state and re-start at the same 
location after handling the timeout.
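
For the timeout handling itself that could then look roughly like this. 
The pvr_hw_*() helpers and to_pvr_job() are made-up placeholders for 
whatever state save/restore mechanism your hardware actually provides:

static enum drm_gpu_sched_stat
pvr_timedout_job(struct drm_sched_job *sched_job)
{
	struct pvr_job *job = to_pvr_job(sched_job);
	struct pvr_device *pvr = job->pvr;

	/*
	 * Save where the hardware stopped. The scheduler and hw fences
	 * are left alone, they keep their normal signaling path.
	 */
	pvr_hw_save_context(pvr, job);

	pvr_hw_reset(pvr);

	/*
	 * Re-start execution at the saved location. The hw fences of
	 * the pending jobs simply signal once the hardware catches up
	 * again; no job is resubmitted and no fence changes state
	 * behind the scheduler's back.
	 */
	pvr_hw_resume_context(pvr, job);

	return DRM_GPU_SCHED_STAT_NOMINAL;
}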

Regards,
Christian.

>
> Thanks in advance for your help.
>
> Regards,
>
> Boris


