[PATCH 07/10] drm/amdgpu: recovery hw jobs when gpu reset

Christian König deathsimple at vodafone.de
Thu Jun 30 08:17:36 UTC 2016


On 30.06.2016 at 09:09, Chunming Zhou wrote:
> Change-Id: If10da1e224d81a12fd4f8d760c48178adb9e82d0
> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 4 ++--
>   2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 60b6dd0..dc2fdac 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2001,8 +2001,9 @@ retry:
>   		struct amdgpu_ring *ring = adev->rings[i];
>   		if (!ring)
>   			continue;
> -		amdgpu_ring_restore(ring, ring_sizes[i], ring_data[i]);
> +		amd_sched_job_recovery(&ring->sched);
>   		kthread_unpark(ring->sched.thread);
> +		kfree(ring_data[i]);
>   		ring_sizes[i] = 0;
>   		ring_data[i] = NULL;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index cced2f6..7393473 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -384,11 +384,11 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring,
>   		amdgpu_ring_emit_pipeline_sync(ring);
>   
>   	if (ring->funcs->emit_vm_flush &&
> -	    pd_addr != AMDGPU_VM_NO_FLUSH) {
> +	    (pd_addr != AMDGPU_VM_NO_FLUSH || amdgpu_vm_is_gpu_reset(adev, id))) {

Ah, here it is; I thought you had dropped that change. Now that the
hardware fences are properly finished after a GPU reset, it shouldn't be
necessary any more.

This is especially important since we could have cases where we need a 
pipeline sync after the GPU reset on the restarted jobs.
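
For reference, the two flush conditions under discussion can be compared in
a standalone sketch. This is not driver code: SKETCH_VM_NO_FLUSH, struct
sketch_vm_id and both helpers are hypothetical stand-ins, and the way the
sketch decides that an ID is stale (comparing a recorded reset counter with
the current one) is only an assumption about what amdgpu_vm_is_gpu_reset()
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AMDGPU_VM_NO_FLUSH; the real value is defined by the driver. */
#define SKETCH_VM_NO_FLUSH (~0ULL)

/* Hypothetical, heavily simplified VM ID: only what this sketch needs. */
struct sketch_vm_id {
	unsigned reset_seen;	/* reset counter value recorded at the last flush */
};

/* Condition before the patch: flush only when a new page directory address is given. */
static bool flush_before_patch(bool has_emit_vm_flush, uint64_t pd_addr)
{
	return has_emit_vm_flush && pd_addr != SKETCH_VM_NO_FLUSH;
}

/*
 * Condition after the patch: additionally flush when the ID predates the
 * most recent GPU reset (assumed semantics of amdgpu_vm_is_gpu_reset()).
 */
static bool flush_after_patch(bool has_emit_vm_flush, uint64_t pd_addr,
			      const struct sketch_vm_id *id,
			      unsigned current_reset_count)
{
	bool stale;

	assert(id != NULL);
	stale = id->reset_seen != current_reset_count;
	return has_emit_vm_flush && (pd_addr != SKETCH_VM_NO_FLUSH || stale);
}
```

Under these assumptions the two conditions differ only when pd_addr is the
no-flush value and the ID was last flushed before the reset.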

>   		struct fence *fence;
>   
>   		trace_amdgpu_vm_flush(pd_addr, ring->idx, vm_id);
> -		amdgpu_ring_emit_vm_flush(ring, vm_id, pd_addr);
> +		amdgpu_ring_emit_vm_flush(ring, vm_id, id->pd_gpu_addr);

That should work, but why should we do this?

Regards,
Christian.

>   
>   		r = amdgpu_fence_emit(ring, &fence);
>   		if (r)


