[PATCH] drm/scheduler: put killed job cleanup to worker
Grodzovsky, Andrey
Andrey.Grodzovsky at amd.com
Wed Jul 3 14:23:30 UTC 2019
On 7/3/19 6:28 AM, Lucas Stach wrote:
> drm_sched_entity_kill_jobs_cb() is called right from the last scheduled
> job finished fence signaling. As this might happen from IRQ context we
> now end up calling the GPU driver free_job callback in IRQ context, while
> all other paths call it from normal process context.
>
> Etnaviv in particular calls core kernel functions that are only valid to
> be called from process context when freeing the job. Other drivers might
> have similar issues, but I did not validate this. Fix this by punting
> the cleanup work into a work item, so the driver expectations are met.
>
> Signed-off-by: Lucas Stach <l.stach at pengutronix.de>
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 28 ++++++++++++++----------
> 1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 35ddbec1375a..ba4eb66784b9 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -202,23 +202,23 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> }
> EXPORT_SYMBOL(drm_sched_entity_flush);
>
> -/**
> - * drm_sched_entity_kill_jobs - helper for drm_sched_entity_kill_jobs
> - *
> - * @f: signaled fence
> - * @cb: our callback structure
> - *
> - * Signal the scheduler finished fence when the entity in question is killed.
> - */
> +static void drm_sched_entity_kill_work(struct work_struct *work)
> +{
> + struct drm_sched_job *job = container_of(work, struct drm_sched_job,
> + finish_work);
> +
> + drm_sched_fence_finished(job->s_fence);
> + WARN_ON(job->s_fence->parent);
> + job->sched->ops->free_job(job);
> +}
> +
> static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> struct dma_fence_cb *cb)
> {
> struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
> finish_cb);
>
> - drm_sched_fence_finished(job->s_fence);
> - WARN_ON(job->s_fence->parent);
> - job->sched->ops->free_job(job);
> + schedule_work(&job->finish_work);
> }
>
> /**
> @@ -240,6 +240,12 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
> drm_sched_fence_scheduled(s_fence);
> dma_fence_set_error(&s_fence->finished, -ESRCH);
>
> + /*
> + * Replace regular finish work function with one that just
> + * kills the job.
> + */
> + job->finish_work.func = drm_sched_entity_kill_work;
I rechecked the latest code: finish_work was removed in commit ffae3e5
('drm/scheduler: rework job destruction').
Andrey
> +
> /*
> * When pipe is hanged by older entity, new entity might
> * not even have chance to submit it's first job to HW
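
For readers following the thread, here is a minimal userspace sketch of the
deferral pattern the patch describes: the fence callback (which may run in
IRQ context) only queues a work item, and the cleanup itself runs later from
a worker. This is an illustration under stated assumptions, not the scheduler
code. All names in it (struct work, struct job, queue_work_item,
run_pending_work, job_fence_cb, job_kill_work) are hypothetical stand-ins,
and since finish_work no longer exists upstream, the member of that name here
is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct work_struct. */
struct work {
	void (*func)(struct work *w);
	struct work *next;
};

/* Simplified container_of(): recover the enclosing struct from a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical job with an embedded work item, loosely like drm_sched_job. */
struct job {
	int id;
	int freed;		/* set by the deferred cleanup */
	struct work finish_work;
};

/* Single-threaded FIFO of pending work, standing in for a workqueue. */
static struct work *pending_head;
static struct work **pending_tail = &pending_head;

/* Analogue of schedule_work(): only queues the item, runs nothing. */
static void queue_work_item(struct work *w)
{
	w->next = NULL;
	*pending_tail = w;
	pending_tail = &w->next;
}

/* Analogue of the worker: runs queued items, returns how many ran. */
static int run_pending_work(void)
{
	int ran = 0;

	while (pending_head) {
		struct work *w = pending_head;

		/* Unlink before calling: func may release the container. */
		pending_head = w->next;
		if (!pending_head)
			pending_tail = &pending_head;

		assert(w->func != NULL);
		w->func(w);
		ran++;
	}
	return ran;
}

/* Deferred cleanup: the place where a driver's free_job would be called. */
static void job_kill_work(struct work *w)
{
	struct job *j = container_of(w, struct job, finish_work);

	j->freed = 1;
}

/* Fence callback: may run in "IRQ context", so it only defers the work. */
static void job_fence_cb(struct job *j)
{
	j->finish_work.func = job_kill_work;
	queue_work_item(&j->finish_work);
}
```

Calling job_fence_cb() on a job leaves its freed flag at 0 until
run_pending_work() is called, which mirrors the intent of the patch: nothing
that needs process context happens inside the callback itself.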