[PATCH] drm/amdgpu/job: fix is_guilty logic change (v2)
Alex Deucher
alexander.deucher at amd.com
Fri Feb 21 15:39:01 UTC 2025
The gpu_reset_counter increment needs to be
in the is_guilty block. Also move the fence
error before the reset to keep the original ordering.
Fixes: f447ba2bbd48 ("drm/amdgpu: Update amdgpu_job_timedout to check if the ring is guilty")
Cc: Jesse Zhang <jesse.zhang at amd.com>
Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
---
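The ordering this patch restores can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not driver
code: struct model_state, model_ring_timeout() and the reset_rc parameter
are hypothetical stand-ins for the adev/fence state and for the return
value of amdgpu_ring_reset().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state touched by the patch. */
struct model_state {
	int gpu_reset_counter;             /* models adev->gpu_reset_counter */
	int fence_error;                   /* models the finished fence's error */
	bool fence_error_set_before_reset; /* records the observed ordering */
	bool reset_attempted;              /* true once the fake reset has run */
};

/*
 * Models the per-ring reset branch of amdgpu_job_timedout() after this
 * patch. reset_rc fakes the result of the ring reset (0 means success).
 */
void model_ring_timeout(struct model_state *st, bool is_guilty, int reset_rc)
{
	/* The fence error is set first, so it is in place before the reset. */
	if (is_guilty) {
		st->fence_error = -ETIME;
		st->fence_error_set_before_reset = !st->reset_attempted;
	}

	/* Stands in for the call to amdgpu_ring_reset(). */
	st->reset_attempted = true;

	if (reset_rc == 0) {
		/* Only a guilty job is counted as a GPU reset. */
		if (is_guilty)
			st->gpu_reset_counter++;
	}
}
```

In this model a guilty job with a successful reset ends with
fence_error == -ETIME and gpu_reset_counter == 1, while a job that is
not guilty leaves both fields untouched.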
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index efba509e2b5d1..c37bc683253a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -151,14 +151,16 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 		else
 			is_guilty = true;
 
+		if (is_guilty)
+			dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
+
 		r = amdgpu_ring_reset(ring, job->vmid);
 		if (!r) {
 			if (amdgpu_ring_sched_ready(ring))
 				drm_sched_stop(&ring->sched, s_job);
-			atomic_inc(&ring->adev->gpu_reset_counter);
 			if (is_guilty) {
+				atomic_inc(&ring->adev->gpu_reset_counter);
 				amdgpu_fence_driver_force_completion(ring);
-				dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
 			}
 			if (amdgpu_ring_sched_ready(ring))
 				drm_sched_start(&ring->sched, 0);
--
2.48.1