[PATCH] drm/amdgpu: Move reset domain locking in DPC handler

Andrey Grodzovsky andrey.grodzovsky at amd.com
Thu Apr 14 14:31:26 UTC 2022


On 2022-04-14 02:40, Christian König wrote:
>
>
> Am 13.04.22 um 21:31 schrieb Andrey Grodzovsky:
>> Lock reset domain unconditionally because on resume
>> we unlock it unconditionally.
>> This solved mutex deadlock when handling both FATAL
>> and non FATAL PCI errors one after another.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++-------
>>   1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 1cc488a767d8..c65f25e3a0fc 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -5531,18 +5531,18 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>>         adev->pci_channel_state = state;
>> +    /*
>> +     * Locking adev->reset_domain->sem will prevent any external access
>> +     * to GPU during PCI error recovery
>> +     */
>> +    amdgpu_device_lock_reset_domain(adev->reset_domain);
>> +    amdgpu_device_set_mp1_state(adev);
>> +
>>       switch (state) {
>>       case pci_channel_io_normal:
>>           return PCI_ERS_RESULT_CAN_RECOVER;
>
> BTW: Where are we unlocking that again?


In amdgpu_pci_resume, but you made me realize I can do this better.
I will be back with V2.
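
To make the imbalance concrete, here is a minimal toy sketch of what the
commit message describes: a lock taken only on the frozen path, paired with
an unconditional unlock on resume. Everything named toy_* is hypothetical
and stands in for the real driver code; the depth counter replaces the real
reset-domain semaphore only so that the mismatch is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reset-domain lock: a depth counter
 * instead of a real semaphore, so an unbalanced unlock shows up as
 * a non-zero depth rather than a hang. */
struct toy_reset_domain {
	int lock_depth;
};

enum toy_channel_state { TOY_IO_NORMAL, TOY_IO_FROZEN };

static void toy_lock(struct toy_reset_domain *d)   { d->lock_depth++; }
static void toy_unlock(struct toy_reset_domain *d) { d->lock_depth--; }

/* Pre-patch shape: the lock is taken only for the fatal (frozen) case. */
static void toy_error_detected_conditional(struct toy_reset_domain *d,
					   enum toy_channel_state s)
{
	if (s == TOY_IO_FROZEN)
		toy_lock(d);
}

/* Patched shape: the lock is taken before looking at the state. */
static void toy_error_detected_unconditional(struct toy_reset_domain *d,
					     enum toy_channel_state s)
{
	(void)s;
	toy_lock(d);
}

/* Resume side, modelled as unlocking unconditionally (per the commit message). */
static void toy_resume(struct toy_reset_domain *d)
{
	toy_unlock(d);
}

/* Runs one error -> resume cycle and returns the final lock depth.
 * 0 means lock and unlock were balanced. */
static int toy_cycle(bool unconditional, enum toy_channel_state s)
{
	struct toy_reset_domain d = { 0 };

	if (unconditional)
		toy_error_detected_unconditional(&d, s);
	else
		toy_error_detected_conditional(&d, s);
	toy_resume(&d);
	return d.lock_depth;
}
```

In this model only the conditional variant with a non-fatal state ends with
a non-zero depth, i.e. an unlock without a matching lock. Whether resume is
really reached on every path in the driver is exactly what needs checking
for V2.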

Andrey


>
>>       /* Fatal error, prepare for slot reset */
>>       case pci_channel_io_frozen:
>> -        /*
>> -         * Locking adev->reset_domain->sem will prevent any external access
>> -         * to GPU during PCI error recovery
>> -         */
>> -        amdgpu_device_lock_reset_domain(adev->reset_domain);
>> -        amdgpu_device_set_mp1_state(adev);
>> -
>>           /*
>>            * Block any work scheduling as we do for regular GPU reset
>>            * for the duration of the recovery
>

