[PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

Christian König ckoenig.leichtzumerken at gmail.com
Thu Apr 23 14:55:30 UTC 2020


Yeah, we certainly could try this again. But maybe split that up into 
individual patches for gfx7/8/9.

In other words, make it easy to revert if this still doesn't work well on 
gfx7 or some other generation.

Christian.

Am 23.04.20 um 15:43 schrieb Zhang, Hawking:
> [AMD Official Use Only - Internal Distribution Only]
>
> Would you mind enabling this and trying it again? The recent GPU reset testing on Vega20 looks very positive.
>
> Regards,
> Hawking
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Thursday, April 23, 2020 20:31
> To: Zhang, Hawking <Hawking.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed
>
> Am 23.04.20 um 11:01 schrieb Hawking Zhang:
>> The driver should stop CP resume once a compute ring test fails.
> Mhm, we intentionally ignored those errors because the compute rings sometimes don't come up again after a GPU reset.
>
> We even have the necessary logic in the SW scheduler to redirect the jobs to another compute ring when one fails to come up again.
>
> Christian.
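
To make the trade-off concrete, here is a minimal standalone sketch of the two policies being discussed. It does not use the real amdgpu API: struct demo_ring, demo_ring_test() and the other demo_* names are hypothetical stand-ins, and the "redirect" is a toy round-robin search rather than the actual SW scheduler logic.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_COMPUTE_RINGS 4

/* Hypothetical stand-in for a compute ring; not the real struct amdgpu_ring. */
struct demo_ring {
	bool hw_alive;	/* simulated: did the queue come back after the reset? */
	bool ready;	/* recorded by the ring test; only ready rings get jobs */
};

/* Simulated ring test: returns 0 on success, -1 on failure. */
static int demo_ring_test(struct demo_ring *ring)
{
	ring->ready = ring->hw_alive;
	return ring->hw_alive ? 0 : -1;
}

/* Policy proposed in the patch: abort the resume on the first failing ring. */
static int demo_resume_fail_fast(struct demo_ring *rings, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int r = demo_ring_test(&rings[i]);

		if (r)
			return r;
	}
	return 0;
}

/* Policy before the patch: test every ring and ignore individual failures. */
static int demo_resume_tolerant(struct demo_ring *rings, size_t n)
{
	for (size_t i = 0; i < n; i++)
		(void)demo_ring_test(&rings[i]);
	return 0;
}

/*
 * Toy redirect: return the index of the first ready ring, starting at the
 * preferred one and wrapping around, or -1 if no ring is ready.
 */
static int demo_pick_ring(const struct demo_ring *rings, size_t n,
			  size_t preferred)
{
	for (size_t k = 0; k < n; k++) {
		size_t idx = (preferred + k) % n;

		if (rings[idx].ready)
			return (int)idx;
	}
	return -1;
}
```

With one dead ring out of four, the fail-fast variant returns an error and leaves the later rings untested, while the tolerant variant finishes the resume and a job aimed at the dead ring lands on the next ready one.
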
>
>> Change-Id: I4cd3328f38e0755d0c877484936132d204c9fe50
>> Signed-off-by: Hawking Zhang <Hawking.Zhang at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 4 +++-
>>    drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++-
>>    drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +++-
>>    3 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> index b2f10e3..fcee758 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> @@ -3132,7 +3132,9 @@ static int gfx_v7_0_cp_compute_resume(struct amdgpu_device *adev)
>>    
>>    	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>>    		ring = &adev->gfx.compute_ring[i];
>> -		amdgpu_ring_test_helper(ring);
>> +		r = amdgpu_ring_test_helper(ring);
>> +		if (r)
>> +			return r;
>>    	}
>>    
>>    	return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> index 6c56ced..8dc8e90 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> @@ -4781,7 +4781,9 @@ static int gfx_v8_0_cp_test_all_rings(struct amdgpu_device *adev)
>>    
>>    	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>>    		ring = &adev->gfx.compute_ring[i];
>> -		amdgpu_ring_test_helper(ring);
>> +		r = amdgpu_ring_test_helper(ring);
>> +		if (r)
>> +			return r;
>>    	}
>>    
>>    	return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index 09aa5f5..20937059 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -3846,7 +3846,9 @@ static int gfx_v9_0_cp_resume(struct amdgpu_device *adev)
>>    
>>    	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>>    		ring = &adev->gfx.compute_ring[i];
>> -		amdgpu_ring_test_helper(ring);
>> +		r = amdgpu_ring_test_helper(ring);
>> +		if (r)
>> +			return r;
>>    	}
>>    
>>    	gfx_v9_0_enable_gui_idle_interrupt(adev, true);