[PATCH i-g-t] lib/amdgpu: fix sdma linear copy command

vitaly prosyak vprosyak at amd.com
Thu Sep 26 21:59:27 UTC 2024


The change looks good to me.

Reviewed-by: Vitaly Prosyak <vitaly.prosyak at amd.com>

On 2024-09-26 03:35, Jesse.zhang at amd.com wrote:
> Fix page fault when using sdma linear copy:
> [ 4606.313448] amdgpu 0000:1a:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:1 pasid:32772)
> [ 4606.313463] amdgpu 0000:1a:00.0: amdgpu:  for process amd_deadlock pid 4440 thread amd_deadlock pid 4440)
> [ 4606.313475] amdgpu 0000:1a:00.0: amdgpu:   in page starting at address 0x0000000000001000 from IH client 0x12 (VMC)
> [ 4606.313490] amdgpu 0000:1a:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00120231
> [ 4606.313501] amdgpu 0000:1a:00.0: amdgpu:      Faulty UTCL2 client ID: SDMA1 (0x101)
> [ 4606.313511] amdgpu 0000:1a:00.0: amdgpu:      MORE_FAULTS: 0x1
> [ 4606.313519] amdgpu 0000:1a:00.0: amdgpu:      WALKER_ERROR: 0x0
> [ 4606.313527] amdgpu 0000:1a:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
> [ 4606.313535] amdgpu 0000:1a:00.0: amdgpu:      MAPPING_ERROR: 0x0
> [ 4606.313543] amdgpu 0000:1a:00.0: amdgpu:      RW: 0x0
>
> On older AI ASICs, the SDMA copy count field covers a smaller range than on newer ones,
> so add a count check and clamp the value when the maximum range is exceeded.
>
> Cc: Vitaly Prosyak <vitaly.prosyak at amd.com>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Cc: Christian Koenig <christian.koenig at amd.com>
>
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> ---
>  lib/amdgpu/amd_ip_blocks.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/lib/amdgpu/amd_ip_blocks.c b/lib/amdgpu/amd_ip_blocks.c
> index 3f8f28483..f22a322e5 100644
> --- a/lib/amdgpu/amd_ip_blocks.c
> +++ b/lib/amdgpu/amd_ip_blocks.c
> @@ -189,10 +189,17 @@ sdma_ring_copy_linear(const struct amdgpu_ip_funcs *func,
>  		context->pm4[i++] = SDMA_PACKET(SDMA_OPCODE_COPY,
>  				       SDMA_COPY_SUB_OPCODE_LINEAR,
>  					context->secure ? 0x4 : 0);
> -		if (func->family_id >= AMDGPU_FAMILY_AI)
> -			context->pm4[i++] = context->write_length - 1;
> -		else
> +		if (func->family_id >= AMDGPU_FAMILY_AI) {
> +			/* For FAMILY AI, the maximum copy range supported by sdma is 4MB */
> +			if (func->family_id >= AMDGPU_FAMILY_AI && context->write_length > 0x3fffff) {
> +				context->pm4[i++] = 0x3fffff;
> +				igt_warn("sdma copy count exceeds the maximum limit of 4MB\n");
> +			} else {
> +				context->pm4[i++] = context->write_length - 1;
> +			}
> +		} else {
>  			context->pm4[i++] = context->write_length;
> +		}
>  		context->pm4[i++] = 0;
>  		context->pm4[i++] = lower_32_bits(context->bo_mc);
>  		context->pm4[i++] = upper_32_bits(context->bo_mc);
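
For reference, the count selection in the hunk above can be read as a small standalone helper. This is only a sketch under stated assumptions: the function name and the SKETCH_* constants are hypothetical and not part of the patch, the family value is a placeholder for AMDGPU_FAMILY_AI from the amdgpu headers, and fprintf stands in for igt_warn so the snippet builds without IGT.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical placeholders; the real code uses AMDGPU_FAMILY_AI
 * from the amdgpu headers and the literal 0x3fffff. */
#define SKETCH_FAMILY_AI      141
#define SKETCH_SDMA_MAX_COUNT 0x3fffffu

/* Return the value written to the count dword of the linear copy packet.
 * Mirrors the patched branch: AI and newer encode "length - 1" and are
 * clamped to 0x3fffff; older families encode the length as-is.
 * As in the patch, a write_length of 0 is not handled specially. */
static uint32_t sdma_linear_copy_count(int family_id, uint32_t write_length)
{
	if (family_id < SKETCH_FAMILY_AI)
		return write_length;

	if (write_length > SKETCH_SDMA_MAX_COUNT) {
		/* Stand-in for igt_warn() in the actual patch. */
		fprintf(stderr, "sdma copy count exceeds the maximum limit of 4MB\n");
		return SKETCH_SDMA_MAX_COUNT;
	}

	return write_length - 1;
}
```

With this helper, a 4096-byte copy on an AI-family device yields a count of 4095, while any length above 0x3fffff yields 0x3fffff and a warning.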

