[PATCH 2/2] drm/amdgpu: Enable per-queue reset support

Zhang, Jesse(Jie) Jesse.Zhang at amd.com
Fri Feb 14 06:44:57 UTC 2025



Hi Lijo,
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar at amd.com>
Sent: Friday, February 14, 2025 2:10 PM
To: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Kim, Jonathan <Jonathan.Kim at amd.com>; Zhu, Jiadong <Jiadong.Zhu at amd.com>; Prosyak, Vitaly <Vitaly.Prosyak at amd.com>
Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support



On 2/14/2025 11:25 AM, jesse.zhang at amd.com wrote:
> From: "Jesse.zhang at amd.com" <Jesse.zhang at amd.com>
>
> This patch updates the SDMA v4.4.2 software initialization to enable
> per-queue reset support when the MEC firmware version is 0xb0 or
> higher and the PMFW supports SDMA reset.
>
> The following changes are included:
> - Added a condition to check if the MEC firmware version is at least 0xb0 and if
>   the PMFW supports SDMA reset using `amdgpu_dpm_reset_sdma_is_supported`.
> - If both conditions are met, the `AMDGPU_RESET_TYPE_PER_QUEUE` flag is set in
>   `adev->sdma.supported_reset`.
>
> Suggested-by: Jonathan Kim <Jonathan.Kim at amd.com>
> Signed-off-by: Vitaly Prosyak <vitaly.prosyak at amd.com>
> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> index b24a1ff5d743..e01d97b96655 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> @@ -1481,9 +1481,10 @@ static int sdma_v4_4_2_sw_init(struct amdgpu_ip_block *ip_block)
>               }
>       }
>
> -     /* TODO: Add queue reset mask when FW fully supports it */
>       adev->sdma.supported_reset =
>               amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring);
> +     if (adev->gfx.mec_fw_version >= 0xb0 && amdgpu_dpm_reset_sdma_is_supported(adev))
> +             adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;

This function is reused across multiple IP versions. MEC fw versions aren't the same across those IP versions.

In fact, a user queue relies on both the MEC fw and the PMFW when an SDMA queue is reset.
So we need to check both of them here in order to skip old MEC fw and PMFW versions.
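
As a rough illustration of how both checks could be combined while keeping the MEC threshold tied to a specific IP version (the concern raised above), here is a minimal standalone sketch. Everything in it is a hypothetical stand-in rather than the amdgpu API: struct fw_state, IP_VER(), RESET_TYPE_PER_QUEUE and both helper functions are made up for this example, and pairing the 0xb0 threshold with IP version 4.4.2 is an assumption. IP versions without a known threshold simply never advertise per-queue reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real amdgpu definitions. */
#define RESET_TYPE_PER_QUEUE (1u << 2)
#define IP_VER(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

struct fw_state {
	uint32_t sdma_ip_version; /* packed with IP_VER() */
	uint32_t mec_fw_version;  /* MEC firmware version as reported by the device */
	bool pmfw_sdma_reset;     /* models the result of the PMFW SDMA reset query */
};

/*
 * Minimum MEC fw version that allows per-queue reset for a given IP version.
 * Returns UINT32_MAX when no threshold is known for that IP version.
 */
static uint32_t min_mec_fw_for_queue_reset(uint32_t ip_version)
{
	switch (ip_version) {
	case IP_VER(4, 4, 2):
		return 0xb0; /* threshold from the patch; the IP pairing is assumed */
	default:
		return UINT32_MAX;
	}
}

/* Add the per-queue bit to base_mask only when both firmwares qualify. */
static uint32_t sdma_queue_reset_mask(const struct fw_state *fw,
				      uint32_t base_mask)
{
	uint32_t min_mec = min_mec_fw_for_queue_reset(fw->sdma_ip_version);

	if (min_mec != UINT32_MAX &&
	    fw->mec_fw_version >= min_mec &&
	    fw->pmfw_sdma_reset)
		base_mask |= RESET_TYPE_PER_QUEUE;

	return base_mask;
}
```

With this shape, adding another IP version would only mean adding a case with its own threshold; the combined MEC fw and PMFW check stays in one place.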

Thanks
Jesse

Thanks,
Lijo

>
>       if (amdgpu_sdma_ras_sw_init(adev)) {
>               dev_err(adev->dev, "fail to initialize sdma ras block\n");


