[PATCH v2 2/2] drm/amdgpu: Process fences on IH overflow
Friedrich Vock
friedrich.vock at gmx.de
Thu Jan 18 18:54:02 UTC 2024
If the IH ring buffer overflows, it's possible that fence signal events
were lost. Check each ring for progress to prevent job timeouts/GPU
hangs due to the fences staying unsignaled despite the work being done.
Cc: Joshua Ashton <joshua at froggi.es>
Cc: Alex Deucher <alexander.deucher at amd.com>
Cc: Christian König <christian.koenig at amd.com>
Cc: stable at vger.kernel.org
Signed-off-by: Friedrich Vock <friedrich.vock at gmx.de>
---
v2: Set ih->overflow to false after processing fences
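
The recovery step can be modeled outside the kernel. What follows is a
minimal userspace sketch, not driver code. All model_* names and
MODEL_MAX_RINGS are hypothetical stand-ins for the amdgpu structures. The
sequence-number bookkeeping is an assumption made only so the effect is
observable, and the overflow flag is assumed to be set elsewhere when the
IH ring overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_RINGS 4 /* hypothetical stand-in for AMDGPU_MAX_RINGS */

struct model_ring {
	bool fence_initialized; /* models ring->fence_drv.initialized */
	uint32_t hw_seq;        /* last sequence number completed by hardware */
	uint32_t signaled_seq;  /* last sequence number signaled to waiters */
};

struct model_ih {
	bool overflow;          /* models ih->overflow */
};

struct model_dev {
	struct model_ring *rings[MODEL_MAX_RINGS];
	struct model_ih ih;
};

/*
 * Models amdgpu_fence_process(): signal every fence the hardware has
 * already completed. Returns how many fences were newly signaled.
 */
static uint32_t model_fence_process(struct model_ring *ring)
{
	uint32_t newly_signaled = ring->hw_seq - ring->signaled_seq;

	ring->signaled_seq = ring->hw_seq;
	return newly_signaled;
}

/*
 * Models the block added by this patch: after an overflow, poll every
 * initialized ring for progress instead of relying on interrupts that
 * may have been dropped, then clear the flag.
 */
static uint32_t model_ih_recover(struct model_dev *dev)
{
	uint32_t total = 0;
	size_t i;

	assert(dev != NULL);

	if (!dev->ih.overflow)
		return 0;

	for (i = 0; i < MODEL_MAX_RINGS; ++i) {
		struct model_ring *ring = dev->rings[i];

		/* Skip empty slots and rings without a fence driver. */
		if (!ring || !ring->fence_initialized)
			continue;
		total += model_fence_process(ring);
	}
	dev->ih.overflow = false;
	return total;
}
```

In this model, clearing the flag after the loop (the v2 change) means a
second call without a new overflow does no work.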
drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index f3b0aaf3ebc6..4e061f7741d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -209,6 +209,7 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
 {
 	unsigned int count;
 	u32 wptr;
+	int i;
 
 	if (!ih->enabled || adev->shutdown)
 		return IRQ_NONE;
@@ -227,6 +228,21 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
 		ih->rptr &= ih->ptr_mask;
 	}
 
+	/* If the ring buffer overflowed, we might have lost some fence
+	 * signal interrupts. Check if there was any activity so the signal
+	 * doesn't get lost.
+	 */
+	if (ih->overflow) {
+		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
+			struct amdgpu_ring *ring = adev->rings[i];
+
+			if (!ring || !ring->fence_drv.initialized)
+				continue;
+			amdgpu_fence_process(ring);
+		}
+		ih->overflow = false;
+	}
+
 	amdgpu_ih_set_rptr(adev, ih);
 	wake_up_all(&ih->wait_process);
 
--
2.43.0