[PATCH 09/10] drm/amdgpu: ib test first after gpu reset

Christian König deathsimple at vodafone.de
Thu Jun 30 08:26:14 UTC 2016


On 30.06.2016 at 10:14, zhoucm1 wrote:
>
>
> On 2016-06-30 16:20, Christian König wrote:
>> On 30.06.2016 at 09:09, Chunming Zhou wrote:
>>> Change-Id: I5f88ed641b85822b8b76684ac623117756cc0295
>>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>>
>> Again, we should only do this when the GPU reset was successful.
>> Apart from that, the change looks good to me.
> Yeah, I even want to remove it entirely, since it takes a bit of time.
> What do you think?

Well, first of all we clearly need to add a timeout to all of the IB tests.

I have seen it a couple of times now that an engine won't come up on an
engineering sample, and so the driver load waits forever on the IB
tests. The same could probably happen after a GPU reset.
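
To illustrate the kind of bounded wait I mean, here is a minimal user-space
sketch. It is not amdgpu code: struct fake_engine, fake_engine_signaled(),
now_ms() and ib_test_with_timeout() are all hypothetical names made up for
this example, and the busy-poll stands in for whatever fence wait the real
IB test would use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for an engine under test; not an amdgpu structure. */
struct fake_engine {
    int polls_until_done; /* negative means the engine never signals */
};

/* Poll once; returns true when the simulated test IB has completed. */
static bool fake_engine_signaled(struct fake_engine *e)
{
    if (e->polls_until_done < 0)
        return false;
    if (e->polls_until_done == 0)
        return true;
    e->polls_until_done--;
    return false;
}

/*
 * Current time in milliseconds. Wall-clock time is used only to keep the
 * sketch portable; a driver would use a monotonic time source instead.
 */
static long long now_ms(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Wait for the engine to signal, but give up once timeout_ms has elapsed.
 * Returns 0 on success and -ETIMEDOUT if the engine never came up.
 * This sketch spins; real code would sleep or block on a wait primitive.
 */
static int ib_test_with_timeout(struct fake_engine *e, long timeout_ms)
{
    long long deadline;

    assert(e != NULL && timeout_ms >= 0);
    deadline = now_ms() + timeout_ms;

    do {
        if (fake_engine_signaled(e))
            return 0;
    } while (now_ms() < deadline);

    return -ETIMEDOUT;
}
```

With this shape, an engine that never signals makes the test return
-ETIMEDOUT after the deadline instead of hanging the driver load, and the
caller can then decide whether to retry the reset or give up on that ring.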

Apart from that, we should still do some kind of test to figure out whether
an engine works correctly again, but I'm not sure the IB test is
sufficient for that.

Regards,
Christian.

>
>
> Regards,
> David Zhou
>>
>> Christian.
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 ++++++++++----------
>>>   1 file changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index dc2fdac..35cc529 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -1997,6 +1997,16 @@ retry:
>>>       /* restore scratch */
>>>       amdgpu_atombios_scratch_regs_restore(adev);
>>> +    r = amdgpu_ib_ring_tests(adev);
>>> +    if (r) {
>>> +        dev_err(adev->dev, "ib ring test failed (%d).\n", r);
>>> +        if (saved) {
>>> +            saved = false;
>>> +            r = amdgpu_suspend(adev);
>>> +            goto retry;
>>> +        }
>>> +    }
>>> +
>>>       for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>           struct amdgpu_ring *ring = adev->rings[i];
>>>           if (!ring)
>>> @@ -2008,16 +2018,6 @@ retry:
>>>           ring_data[i] = NULL;
>>>       }
>>> -    r = amdgpu_ib_ring_tests(adev);
>>> -    if (r) {
>>> -        dev_err(adev->dev, "ib ring test failed (%d).\n", r);
>>> -        if (saved) {
>>> -            saved = false;
>>> -            r = amdgpu_suspend(adev);
>>> -            goto retry;
>>> -        }
>>> -    }
>>> -
>>>       if (amdgpu_device_has_dal_support(adev)) {
>>>           r = drm_atomic_helper_resume(adev->ddev, state);
>>>           amdgpu_dm_display_resume(adev);
>>
>


