[PATCH] drm/amd/amdgpu: move inc gpu_reset_counter after drm_sched_stop
Christian König
christian.koenig at amd.com
Thu Feb 25 09:18:03 UTC 2021
On 25.02.21 at 10:16, Jingwen Chen wrote:
> Move the gpu_reset_counter increment after drm_sched_stop() to avoid
> a race condition in which a job is submitted between the counter
> increment and drm_sched_stop().
>
> Signed-off-by: Jingwen Chen <Jingwen.Chen2 at amd.com>
Reviewed-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f0f7ed42ee7f..703b96cf3560 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4447,7 +4447,6 @@ static bool amdgpu_device_lock_adev(struct amdgpu_device *adev,
>  		down_write(&adev->reset_sem);
>  	}
>
> -	atomic_inc(&adev->gpu_reset_counter);
>  	switch (amdgpu_asic_reset_method(adev)) {
>  	case AMD_RESET_METHOD_MODE1:
>  		adev->mp1_state = PP_MP1_STATE_SHUTDOWN;
> @@ -4708,6 +4707,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>  			if (need_emergency_restart)
>  				amdgpu_job_stop_all_jobs_on_sched(&ring->sched);
>  		}
> +		atomic_inc(&tmp_adev->gpu_reset_counter);
>  	}
>
>  	if (need_emergency_restart)
> @@ -5050,6 +5050,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>
>  			drm_sched_stop(&ring->sched, NULL);
>  		}
> +		atomic_inc(&adev->gpu_reset_counter);
>  		return PCI_ERS_RESULT_NEED_RESET;
>  	case pci_channel_io_perm_failure:
>  		/* Permanent error, prepare for device removal */
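
For readers following the ordering argument, here is a minimal userspace
sketch of the window the patch closes. It is a toy model, not driver code:
every toy_* name is hypothetical, the scheduler is reduced to a single
flag, and it assumes that a submitter snapshots the counter and that a
snapshot equal to the final counter value is later read as "submitted
after the reset". How amdgpu actually consumes gpu_reset_counter is not
shown in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the real amdgpu types. */
struct toy_dev {
	atomic_int reset_counter;	/* models adev->gpu_reset_counter */
	bool sched_running;		/* false once the scheduler is stopped */
};

struct toy_job {
	int reset_snapshot;		/* counter value seen at submit time */
	bool accepted;			/* scheduler was still running */
};

/* Submission: snapshot the counter; accepted only while the scheduler runs. */
static struct toy_job toy_submit(struct toy_dev *dev)
{
	struct toy_job job = {
		.reset_snapshot = atomic_load(&dev->reset_counter),
		.accepted = dev->sched_running,
	};
	return job;
}

/*
 * Assumption of this model: a job is mislabelled when it was accepted
 * before the scheduler stopped, yet its snapshot already equals the
 * post-reset counter value.
 */
static bool toy_job_mislabelled(const struct toy_job *job, int final_counter)
{
	return job->accepted && job->reset_snapshot == final_counter;
}

/* Old order: increment, then the window, then stop. */
static bool toy_old_order(void)
{
	struct toy_dev dev = { .sched_running = true };
	struct toy_job job;

	atomic_init(&dev.reset_counter, 0);
	atomic_fetch_add(&dev.reset_counter, 1);
	job = toy_submit(&dev);		/* lands inside the window */
	dev.sched_running = false;	/* models drm_sched_stop() */
	return toy_job_mislabelled(&job, atomic_load(&dev.reset_counter));
}

/* New order: stop first, increment afterwards. */
static bool toy_new_order(void)
{
	struct toy_dev dev = { .sched_running = true };
	struct toy_job job;

	atomic_init(&dev.reset_counter, 0);
	job = toy_submit(&dev);		/* last job before the stop */
	dev.sched_running = false;	/* models drm_sched_stop() */
	atomic_fetch_add(&dev.reset_counter, 1);
	return toy_job_mislabelled(&job, atomic_load(&dev.reset_counter));
}
```

In this model toy_old_order() reports a mislabelled job and
toy_new_order() does not: once the increment happens only after the stop,
any job that sees the new counter value is also one the stopped scheduler
no longer accepts.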