[PATCH v6 1/2] drm/sched: Refactor ring mirror list handling.

Tomeu Vizoso tomeu at tomeuvizoso.net
Tue Mar 12 07:43:53 UTC 2019


On Thu, 27 Dec 2018 at 20:28, Andrey Grodzovsky
<andrey.grodzovsky at amd.com> wrote:
>
> Decouple sched thread stop and start and ring mirror
> list handling from the policy of what to do about the
> guilty jobs.
> When stopping the sched thread and detaching sched fences
> from non-signaled HW fences, wait for all signaled HW fences
> to complete before rerunning the jobs.
>
> v2: Fix resubmission of guilty job into HW after refactoring.
>
> v4:
> Full restart for all the jobs, not only from guilty ring.
> Extract karma increase into standalone function.
>
> v5:
> Rework waiting for signaled jobs without relying on the job
> struct itself as those might already be freed for non 'guilty'
> job's schedulers.
> Expose karma increase to drivers.
>
> v6:
> Use list_for_each_entry_safe_continue and drm_sched_process_job
> in case fence already signaled.
> Call drm_sched_increase_karma only once for amdgpu and add documentation.
>
> Suggested-by: Christian Koenig <Christian.Koenig at amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  20 ++-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c    |  11 +-
>  drivers/gpu/drm/scheduler/sched_main.c     | 195 +++++++++++++++++++----------
>  drivers/gpu/drm/v3d/v3d_sched.c            |  12 +-
>  include/drm/gpu_scheduler.h                |   8 +-
>  5 files changed, 157 insertions(+), 89 deletions(-)
>
[snip]
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 445b2ef..f76d9ed 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -178,18 +178,22 @@ v3d_job_timedout(struct drm_sched_job *sched_job)
>         for (q = 0; q < V3D_MAX_QUEUES; q++) {
>                 struct drm_gpu_scheduler *sched = &v3d->queue[q].sched;
>
> -               kthread_park(sched->thread);
> -               drm_sched_hw_job_reset(sched, (sched_job->sched == sched ?
> +               drm_sched_stop(sched, (sched_job->sched == sched ?
>                                                sched_job : NULL));
> +
> +               if(sched_job)
> +                       drm_sched_increase_karma(sched_job);
>         }
>
>         /* get the GPU back into the init state */
>         v3d_reset(v3d);
>
> +       for (q = 0; q < V3D_MAX_QUEUES; q++)
> +               drm_sched_resubmit_jobs(sched_job->sched);

Hi Andrey,

I'm not sure what the original intent was, but I guess it wasn't to
repeatedly call resubmit_jobs on that one specific job's queue?
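
If the intent was to resubmit the jobs of every queue's scheduler
(mirroring the stop loop above), I'd have expected something like
the following instead (untested, just a guess at the intent):

	for (q = 0; q < V3D_MAX_QUEUES; q++)
		drm_sched_resubmit_jobs(&v3d->queue[q].sched);

As written, the loop calls drm_sched_resubmit_jobs() on
sched_job->sched V3D_MAX_QUEUES times.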

Regards,

Tomeu

