[PATCH] drm/amdgpu: Move reset domain locking in DPC handler
Christian König
ckoenig.leichtzumerken at gmail.com
Thu Apr 14 06:40:18 UTC 2022
On 13.04.22 at 21:31, Andrey Grodzovsky wrote:
> Lock reset domain unconditionally because on resume
> we unlock it unconditionally.
> This solves a mutex deadlock when handling both FATAL
> and non-FATAL PCI errors one after another.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 1cc488a767d8..c65f25e3a0fc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5531,18 +5531,18 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>
> adev->pci_channel_state = state;
>
> + /*
> + * Locking adev->reset_domain->sem will prevent any external access
> + * to GPU during PCI error recovery
> + */
> + amdgpu_device_lock_reset_domain(adev->reset_domain);
> + amdgpu_device_set_mp1_state(adev);
> +
> switch (state) {
> case pci_channel_io_normal:
> return PCI_ERS_RESULT_CAN_RECOVER;
BTW: Where are we unlocking that again?
> /* Fatal error, prepare for slot reset */
> case pci_channel_io_frozen:
> - /*
> - * Locking adev->reset_domain->sem will prevent any external access
> - * to GPU during PCI error recovery
> - */
> - amdgpu_device_lock_reset_domain(adev->reset_domain);
> - amdgpu_device_set_mp1_state(adev);
> -
> /*
> * Block any work scheduling as we do for regular GPU reset
> * for the duration of the recovery
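
To make the imbalance described in the commit message concrete, here is a minimal user-space sketch. It is not the driver code: the toy lock, the enum, and every function name below are hypothetical stand-ins, and it assumes the resume callback runs after every error_detected call, which is exactly what the question above is asking about.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PCI channel states named in the patch. */
enum channel_state { CH_IO_NORMAL, CH_IO_FROZEN };

/* Toy lock: depth counts outstanding acquisitions; it is not a real mutex. */
struct toy_lock {
	int depth;
	bool unbalanced_unlock;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	if (l->depth == 0) {
		/* Release without a matching acquire: record it and bail out. */
		l->unbalanced_unlock = true;
		return;
	}
	l->depth--;
}

/* Pre-patch shape: the lock is taken only in the frozen (fatal) case. */
static void error_detected_old(struct toy_lock *l, enum channel_state s)
{
	if (s == CH_IO_FROZEN)
		toy_lock_acquire(l);
}

/* Post-patch shape: the lock is taken before the switch, for every state. */
static void error_detected_new(struct toy_lock *l, enum channel_state s)
{
	(void)s;
	toy_lock_acquire(l);
}

/* Resume as described in the commit message: it unlocks unconditionally. */
static void resume(struct toy_lock *l)
{
	toy_lock_release(l);
}

/* Run one detect/resume cycle and report whether the lock stayed balanced. */
static bool cycle_is_balanced(void (*detect)(struct toy_lock *,
					     enum channel_state),
			      enum channel_state s)
{
	struct toy_lock l = { 0, false };

	detect(&l, s);
	resume(&l);
	return l.depth == 0 && !l.unbalanced_unlock;
}
```

Under these assumptions, cycle_is_balanced(error_detected_old, CH_IO_NORMAL) reports an imbalance, while error_detected_new stays balanced for both states. If resume is not reached on some path, the unconditional lock would instead be left held there.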