[PATCH v4 8/8] Revert "PCI/ERR: Update error status after reset_link()"

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Wed Sep 2 19:54:00 UTC 2020


Yes, that works as well.

Can you provide a formal patch that I can commit into our local AMD staging
tree along with my patch set?

Alex - is that how we want to do it? Without this patch, or without reverting
the original patch, the feature is broken.

Andrey

On 9/2/20 3:00 PM, Kuppuswamy, Sathyanarayanan wrote:
>
>
> On 9/2/20 11:42 AM, Andrey Grodzovsky wrote:
>> This reverts commit 6d2c89441571ea534d6240f7724f518936c44f8d.
>>
>> In the code below
>>
>>                  pci_walk_bus(bus, report_frozen_detected, &status);
>> -               if (reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
>> +               status = reset_link(dev, service);
>>
>> status returned from report_frozen_detected is unconditionally masked
>> by status returned from reset_link which is wrong.
>>
>> This breaks error recovery implementation for AMDGPU driver
>> by masking PCI_ERS_RESULT_NEED_RESET returned from amdgpu_pci_error_detected
>> and hence skipping slot reset callback which is necessary for proper
>> ASIC recovery. Effectively no other callback besides resume callback will
>> be called after link reset the way it is implemented now regardless of what
>> value error_detected callback returns.
>>
>     }
>
> Instead of reverting this change, can you try following patch ?
> https://lore.kernel.org/linux-pci/56ad4901-725f-7b88-2117-b124b28b027f@linux.intel.com/T/#me8029c04f63c21f9d1cb3b1ba2aeffbca3a60df5
>
>
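
For readers following the thread, the masking described in the quoted commit
message can be modeled with a small standalone sketch. This is not the kernel
code: the enum and every function name below are hypothetical stand-ins, the
driver callback is assumed to return "need reset" (as the thread says
amdgpu_pci_error_detected does), and the link reset is assumed to succeed.

```c
/* Simplified stand-ins for the kernel's pci_ers_result_t values (hypothetical names). */
enum ers_result {
    ERS_NONE,
    ERS_CAN_RECOVER,
    ERS_NEED_RESET,
    ERS_DISCONNECT,
    ERS_RECOVERED
};

/* Hypothetical driver error_detected callback: asks for a slot reset. */
enum ers_result model_error_detected(void)
{
    return ERS_NEED_RESET;
}

/* Hypothetical link reset that succeeds. */
enum ers_result model_reset_link(void)
{
    return ERS_RECOVERED;
}

/* Models "status = reset_link(dev, service);": the driver's answer is overwritten. */
enum ers_result status_after_overwrite(void)
{
    enum ers_result status = model_error_detected();

    status = model_reset_link();
    return status;
}

/*
 * Models "if (reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)":
 * the driver's answer is kept unless the link reset fails.
 * Assumption: the failure path is represented here as ERS_DISCONNECT.
 */
enum ers_result status_after_revert(void)
{
    enum ers_result status = model_error_detected();

    if (model_reset_link() != ERS_RECOVERED)
        status = ERS_DISCONNECT;
    return status;
}

/* In this model the slot reset callback runs only when status asks for a reset. */
int slot_reset_would_run(enum ers_result status)
{
    return status == ERS_NEED_RESET;
}
```

In this model status_after_overwrite() yields ERS_RECOVERED, so the slot reset
step is skipped, while status_after_revert() preserves ERS_NEED_RESET and the
slot reset step runs.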

