[PATCH v3 1/5] drm/scheduler: rework job destruction

Eric Anholt eric at anholt.net
Mon Apr 15 21:17:37 UTC 2019


Andrey Grodzovsky <andrey.grodzovsky at amd.com> writes:

> From: Christian König <christian.koenig at amd.com>
>
> We now destroy finished jobs from the worker thread to make sure that
> we never destroy a job currently in timeout processing.
> By this we avoid holding lock around ring mirror list in drm_sched_stop
> which should solve a deadlock reported by a user.
>
> v2: Remove unused variable.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692
>
> Signed-off-by: Christian König <christian.koenig at amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  17 ++--
>  drivers/gpu/drm/etnaviv/etnaviv_dump.c     |   4 -
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c    |   9 +-
>  drivers/gpu/drm/scheduler/sched_main.c     | 138 +++++++++++++++++------------
>  drivers/gpu/drm/v3d/v3d_sched.c            |   9 +-

This is missing the corresponding panfrost and lima updates.  You should
probably pull in drm-misc when hacking on the scheduler, so that those
drivers are covered.

> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index ce7c737b..8efb091 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -232,11 +232,18 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>  
>  	/* block scheduler */
>  	for (q = 0; q < V3D_MAX_QUEUES; q++)
> -		drm_sched_stop(&v3d->queue[q].sched);
> +		drm_sched_stop(&v3d->queue[q].sched, sched_job);
>  
>  	if(sched_job)
>  		drm_sched_increase_karma(sched_job);
>  
> +	/*
> +	 * Guilty job did complete and hence needs to be manually removed
> +	 * See drm_sched_stop doc.
> +	 */
> +	if (list_empty(&sched_job->node))
> +		sched_job->sched->ops->free_job(sched_job);

If the "if (sched_job)" check above is necessary, then this block should
clearly be under it as well, since it dereferences sched_job.

But can we please have a core scheduler helper to call here, instead of
every driver replicating this?
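
A minimal sketch of what such a helper might look like, written as a
userspace mock so it compiles on its own.  The helper name
drm_sched_free_guilty_job(), the list helpers, and the cut-down structs
are assumptions made up for illustration; they are not existing
scheduler API and not the real kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

struct drm_sched_job;

/* Cut-down mock: only the callback this sketch needs. */
struct drm_sched_backend_ops {
	void (*free_job)(struct drm_sched_job *job);
};

struct drm_gpu_scheduler {
	const struct drm_sched_backend_ops *ops;
};

struct drm_sched_job {
	struct list_head node;           /* link on the ring mirror list */
	struct drm_gpu_scheduler *sched;
};

/*
 * HYPOTHETICAL core helper: free the guilty job if it already completed,
 * i.e. it is no longer linked on the ring mirror list.  Keeps the NULL
 * check and the list_empty() test in one place instead of in each driver.
 * Returns true if the job was freed.
 */
static bool drm_sched_free_guilty_job(struct drm_sched_job *bad)
{
	if (!bad)
		return false;

	if (!list_is_empty(&bad->node))
		return false;	/* still pending; the scheduler owns it */

	assert(bad->sched && bad->sched->ops && bad->sched->ops->free_job);
	bad->sched->ops->free_job(bad);
	return true;
}

/* Mock driver callback used to observe whether free_job was invoked. */
static int freed_count;
static void count_free(struct drm_sched_job *job) { (void)job; freed_count++; }
```

With something along these lines, the v3d hunk above would reduce to a
single call placed after drm_sched_increase_karma(), and the other
drivers would not need their own copy of the check.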

> +
>  	/* get the GPU back into the init state */
>  	v3d_reset(v3d);
>  