[PATCH] drm/amdgpu: print process info when job timeout

Christian König ckoenig.leichtzumerken at gmail.com
Tue Dec 18 12:14:34 UTC 2018


On 18.12.18 at 02:42, Trigger Huang wrote:
> When a job times out, try to print the related process information
> for debugging.
>
> Signed-off-by: Trigger Huang <Trigger.Huang at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index e0af44f..915310d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -32,6 +32,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> +	struct amdgpu_task_info ti = { 0 };

Please use memset for the initialization.

Apart from that, the patch is Reviewed-by: Christian König <christian.koenig at amd.com>.
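
For illustration, here is a minimal userspace sketch of the memset-style initialization requested above. It is not the driver code: struct task_info is a hypothetical stand-in whose fields are taken from the names used in the patch's DRM_ERROR call, COMM_LEN is an assumed buffer size, and both helper functions are invented for this example. In the patch itself, the change would presumably amount to declaring ti without an initializer and calling memset(&ti, 0, sizeof(ti)) before amdgpu_vm_get_task_info().

```c
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16	/* assumed size of the name buffers */

/* Hypothetical userspace stand-in for struct amdgpu_task_info. */
struct task_info {
	char process_name[COMM_LEN];
	char task_name[COMM_LEN];
	int pid;
	int tgid;
};

/* Zero the whole object with memset instead of using "= { 0 }". */
static void task_info_clear(struct task_info *ti)
{
	memset(ti, 0, sizeof(*ti));
}

/* Build the same message text that the patch passes to DRM_ERROR. */
static int task_info_format(char *buf, size_t len, const struct task_info *ti)
{
	return snprintf(buf, len,
			"Process information: process %s pid %d thread %s pid %d",
			ti->process_name, ti->tgid, ti->task_name, ti->pid);
}
```

With a cleared struct, the formatted message contains empty names and zero IDs. That is likely what the log would show if the task lookup leaves ti untouched, assuming the lookup only writes to ti on success.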

>   
>   	if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
> @@ -39,9 +40,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   		return;
>   	}
>   
> +	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>   	DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
>   		  job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
>   		  ring->fence_drv.sync_seq);
> +	DRM_ERROR("Process information: process %s pid %d thread %s pid %d\n",
> +		  ti.process_name, ti.tgid, ti.task_name, ti.pid);
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev))
>   		amdgpu_device_gpu_recover(ring->adev, job);


