[PATCH 2/2] drm/amdgpu: Fix SDMA queue reset array out-of-bounds access
Jesse Zhang
jesse.zhang at amd.com
Wed Jun 11 05:56:04 UTC 2025
The SDMA v4.4.2 queue reset logic incorrectly applies the GET_INST
macro to the instance ID used for queue operations, leading to array
index out-of-bounds accesses when harvesting is enabled. This shows up
as a UBSAN warning when a queue is stopped during a reset:
[ 306.871518] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:118:38
[ 306.871538] index 4294967295 is out of range for type 'uint32_t *[44]'
[ 306.871929] amdgpu_sdma_reset_engine+0xe4/0x320 [amdgpu]
[ 306.872115] reset_queues_on_hws_hang+0x2dc/0x4d0 [amdgpu]
Fix this by passing ring->me (the logical instance ID) directly to the
queue operations instead of the GET_INST-translated ID, while keeping
the harvest-aware mapping for register access.
Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
---
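Note (illustration only, not part of the patch): a minimal standalone
sketch of how translating an instance ID twice could yield the index
4294967295 seen in the UBSAN report. The table contents, the harvest
layout and every name below (get_inst, reg_table_index, ...) are
hypothetical stand-ins rather than the driver's actual code; the sketch
assumes that unmapped entries are stored as -1 and that the
register-access helper performs its own logical-to-physical translation.

```c
#include <stdint.h>

#define NUM_INST     4     /* hypothetical number of SDMA instances */
#define INVALID_INST (-1)  /* assumed marker for an unmapped slot */

/* Hypothetical harvest layout: physical instances 0 and 1 are fused
 * off, so logical 0 -> physical 2 and logical 1 -> physical 3. */
static const int8_t logical_to_dev[NUM_INST] = {
	2, 3, INVALID_INST, INVALID_INST
};

/* Stand-in for a GET_INST-style logical -> physical translation. */
static int8_t get_inst(uint32_t logical)
{
	if (logical >= NUM_INST)
		return INVALID_INST;
	return logical_to_dev[logical];
}

/* Models a register-access helper that translates the ID itself and
 * would then index a per-instance table with the result. */
static uint32_t reg_table_index(uint32_t logical)
{
	return (uint32_t)get_inst(logical);
}

/* Caller before the fix: translates first, so the helper ends up
 * translating an already-physical ID a second time. */
static uint32_t index_before_fix(uint32_t ring_me)
{
	return reg_table_index((uint32_t)get_inst(ring_me));
}

/* Caller after the fix: hands the logical ID through unchanged. */
static uint32_t index_after_fix(uint32_t ring_me)
{
	return reg_table_index(ring_me);
}
```

With this layout, index_before_fix(0) looks up slot 2, which is
unmapped, and the -1 converted to an unsigned 32-bit value is
4294967295. index_after_fix(0) returns 2, the intended physical
instance.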
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 9c169112a5e7..3de125062ee3 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -1670,7 +1670,7 @@ static bool sdma_v4_4_2_page_ring_is_guilty(struct amdgpu_ring *ring)
 static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
 {
 	struct amdgpu_device *adev = ring->adev;
-	u32 id = GET_INST(SDMA0, ring->me);
+	u32 id = ring->me;
 	int r;
 
 	if (!(adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
@@ -1686,7 +1686,7 @@ static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
 static int sdma_v4_4_2_stop_queue(struct amdgpu_ring *ring)
 {
 	struct amdgpu_device *adev = ring->adev;
-	u32 instance_id = GET_INST(SDMA0, ring->me);
+	u32 instance_id = ring->me;
 	u32 inst_mask;
 	uint64_t rptr;
 
--
2.34.1