[PATCH v2] drm/amd/amdgpu: consider kernel job always not guilty
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Jul 21 06:26:12 UTC 2021
On 21.07.21 at 04:05, Jingwen Chen wrote:
> [Why]
> Currently every timed-out job is considered guilty. In the SRIOV
> multi-VF use case, the VF FLR happens first and the job timeouts are
> detected afterwards. Several jobs can time out within a very small time
> slice. If an innocent SDMA job's timeout is detected before that of the
> real bad job, the innocent SDMA job is marked guilty. This leads to a
> page fault after the job is resubmitted.
>
> [How]
> If the job is a kernel job (job->vm is NULL), always consider it not guilty.
>
> Signed-off-by: Jingwen Chen <Jingwen.Chen2 at amd.com>
Reviewed-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 37fa199be8b3..40461547701a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4410,7 +4410,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> amdgpu_fence_driver_force_completion(ring);
> }
>
> - if(job)
> + if (job && job->vm)
> drm_sched_increase_karma(&job->base);
>
> r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
> @@ -4874,7 +4874,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
> job ? job->base.id : -1, hive->hive_id);
> amdgpu_put_xgmi_hive(hive);
> - if (job)
> + if (job && job->vm)
> drm_sched_increase_karma(&job->base);
> return 0;
> }
> @@ -4898,7 +4898,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> job ? job->base.id : -1);
>
> /* even we skipped this reset, still need to set the job to guilty */
> - if (job)
> + if (job && job->vm)
> drm_sched_increase_karma(&job->base);
> goto skip_recovery;
> }
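
For readers who want to see the guard in isolation, here is a minimal
user-space sketch of the check all three hunks introduce. It assumes, as
the patch does, that a kernel job is one whose vm pointer is NULL. The
mock_* types and functions are hypothetical stand-ins written for this
illustration; they are not the real amdgpu or DRM scheduler definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real ones. */
struct mock_vm {
	int id;
};

struct mock_job {
	struct mock_vm *vm;	/* NULL models a kernel job (assumption) */
	int karma;		/* models the scheduler's guilt counter */
};

/* Stand-in for drm_sched_increase_karma(): only bumps the counter. */
static void mock_increase_karma(struct mock_job *job)
{
	job->karma++;
}

/*
 * The guard from the patch: blame the job only if it exists and is
 * bound to a VM. Returns true when the job was marked guilty.
 */
static bool mock_blame_if_user_job(struct mock_job *job)
{
	if (job && job->vm) {
		mock_increase_karma(job);
		return true;
	}
	return false;
}
```

Calling mock_blame_if_user_job() on a job with a non-NULL vm raises its
karma by one, while a job with vm == NULL, or a NULL job pointer, is left
untouched.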