[PATCH i-g-t] lib/amdgpu: fix sdma linear copy command

Jesse.zhang@amd.com jesse.zhang at amd.com
Thu Sep 26 07:35:33 UTC 2024


Fix a page fault when using SDMA linear copy:
[ 4606.313448] amdgpu 0000:1a:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:1 pasid:32772)
[ 4606.313463] amdgpu 0000:1a:00.0: amdgpu:  for process amd_deadlock pid 4440 thread amd_deadlock pid 4440)
[ 4606.313475] amdgpu 0000:1a:00.0: amdgpu:   in page starting at address 0x0000000000001000 from IH client 0x12 (VMC)
[ 4606.313490] amdgpu 0000:1a:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00120231
[ 4606.313501] amdgpu 0000:1a:00.0: amdgpu:      Faulty UTCL2 client ID: SDMA1 (0x101)
[ 4606.313511] amdgpu 0000:1a:00.0: amdgpu:      MORE_FAULTS: 0x1
[ 4606.313519] amdgpu 0000:1a:00.0: amdgpu:      WALKER_ERROR: 0x0
[ 4606.313527] amdgpu 0000:1a:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[ 4606.313535] amdgpu 0000:1a:00.0: amdgpu:      MAPPING_ERROR: 0x0
[ 4606.313543] amdgpu 0000:1a:00.0: amdgpu:      RW: 0x0

On older AI-family ASICs, the maximum SDMA copy count is smaller than on
newer ones. Add a check on the count so that the maximum copy range (4MB)
is not exceeded.
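
For illustration only, the count selection can be sketched as a standalone
helper. This is a minimal sketch and not the code in amd_ip_blocks.c: the
helper name, the boolean parameter, and the macro are hypothetical, and the
0x3fffff limit is simply the value used in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit, copied from the patch: largest count value for FAMILY AI. */
#define SDMA_AI_MAX_COUNT 0x3fffffu

/*
 * Hypothetical helper: returns the dword that would go into the count slot
 * of the linear copy packet.
 *  - before FAMILY AI: the raw byte count
 *  - FAMILY AI and newer: byte count minus one, clamped to SDMA_AI_MAX_COUNT
 */
static uint32_t sdma_linear_copy_count(bool family_ai_or_newer,
				       uint32_t write_length)
{
	if (!family_ai_or_newer)
		return write_length;

	/* A zero length would underflow in the "minus one" encoding. */
	assert(write_length > 0);

	if (write_length > SDMA_AI_MAX_COUNT)
		return SDMA_AI_MAX_COUNT;

	return write_length - 1;
}
```

With this sketch, a 0x1000-byte copy on FAMILY AI yields a count of 0xfff,
while any length above 0x3fffff yields 0x3fffff.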

Cc: Vitaly Prosyak <vitaly.prosyak at amd.com>
Cc: Alex Deucher <alexander.deucher at amd.com>
Cc: Christian Koenig <christian.koenig at amd.com>

Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
---
 lib/amdgpu/amd_ip_blocks.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/amdgpu/amd_ip_blocks.c b/lib/amdgpu/amd_ip_blocks.c
index 3f8f28483..f22a322e5 100644
--- a/lib/amdgpu/amd_ip_blocks.c
+++ b/lib/amdgpu/amd_ip_blocks.c
@@ -189,10 +189,17 @@ sdma_ring_copy_linear(const struct amdgpu_ip_funcs *func,
 		context->pm4[i++] = SDMA_PACKET(SDMA_OPCODE_COPY,
 				       SDMA_COPY_SUB_OPCODE_LINEAR,
 					context->secure ? 0x4 : 0);
-		if (func->family_id >= AMDGPU_FAMILY_AI)
-			context->pm4[i++] = context->write_length - 1;
-		else
+		if (func->family_id >= AMDGPU_FAMILY_AI) {
+			/* For FAMILY AI, the maximum copy range supported by sdma is 4MB */
+			if (context->write_length > 0x3fffff) {
+				context->pm4[i++] = 0x3fffff;
+				igt_warn("sdma copy count exceeds the maximum limit of 4MB\n");
+			} else {
+				context->pm4[i++] = context->write_length - 1;
+			}
+		} else {
 			context->pm4[i++] = context->write_length;
+		}
 		context->pm4[i++] = 0;
 		context->pm4[i++] = lower_32_bits(context->bo_mc);
 		context->pm4[i++] = upper_32_bits(context->bo_mc);
-- 
2.25.1


