[PATCH 2/2] drm/amdgpu: Enable per-queue reset support
Lazar, Lijo
lijo.lazar at amd.com
Fri Feb 14 06:53:52 UTC 2025
On 2/14/2025 12:14 PM, Zhang, Jesse(Jie) wrote:
> Hi Lijo,
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Friday, February 14, 2025 2:10 PM
> To: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Kim, Jonathan <Jonathan.Kim at amd.com>; Zhu, Jiadong <Jiadong.Zhu at amd.com>; Prosyak, Vitaly <Vitaly.Prosyak at amd.com>
> Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support
>
>
>
> On 2/14/2025 11:25 AM, jesse.zhang at amd.com wrote:
>> From: "Jesse.zhang at amd.com" <Jesse.zhang at amd.com>
>>
>> This patch updates the SDMA v4.4.2 software initialization to enable
>> per-queue reset support when the MEC firmware version is 0xb0 or
>> higher and the PMFW supports SDMA reset.
>>
>> The following changes are included:
>> - Added a condition to check if the MEC firmware version is at least 0xb0 and if
>> the PMFW supports SDMA reset using `amdgpu_dpm_reset_sdma_is_supported`.
>> - If both conditions are met, the `AMDGPU_RESET_TYPE_PER_QUEUE` flag is set in
>> `adev->sdma.supported_reset`.
>>
>> Suggested-by: Jonathan Kim <Jonathan.Kim at amd.com>
>> Signed-off-by: Vitaly Prosyak <vitaly.prosyak at amd.com>
>> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> index b24a1ff5d743..e01d97b96655 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> @@ -1481,9 +1481,10 @@ static int sdma_v4_4_2_sw_init(struct amdgpu_ip_block *ip_block)
>> }
>> }
>>
>> - /* TODO: Add queue reset mask when FW fully supports it */
>> adev->sdma.supported_reset =
>> amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring);
>> + if (adev->gfx.mec_fw_version >= 0xb0 && amdgpu_dpm_reset_sdma_is_supported(adev))
>> + adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
>
> This function is reused across multiple IP versions. MEC fw versions aren't the same across those IP versions.
>
> In fact, the user queue relies on both the MEC FW and the PMFW when an SDMA queue is reset.
> So we need to check both of them here to skip old MEC FW and PMFW versions.
>
To make it clear: MEC FW >= 0xb0 has reset support for, say, GC 9.4.3.
With GC 9.5.0, MEC FW 0x20 may have the same support, so a single
threshold cannot be shared across the IP versions that reuse this function.
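
One possible shape for this, as a rough standalone sketch rather than actual driver code: key the minimum MEC FW version on the GC IP version, and only then combine it with the PMFW check. Every name below (SKETCH_IP_VERSION, sketch_min_mec_fw_for_queue_reset, sketch_sdma_queue_reset_supported) is hypothetical and defined locally so the snippet compiles on its own. The 0xb0 value for GC 9.4.3 is taken from the patch; the 0x20 value for GC 9.5.0 is only the "may have" example above, not a confirmed number.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-in for an IP version packing macro, defined here only so
 * the sketch is self-contained. Hypothetical name.
 */
#define SKETCH_IP_VERSION(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/*
 * Hypothetical helper: minimum MEC FW version that supports SDMA queue
 * reset for a given GC IP version. Returns 0 when no threshold is known,
 * which the caller treats as "not supported".
 */
static uint32_t sketch_min_mec_fw_for_queue_reset(uint32_t gc_ip_version)
{
	switch (gc_ip_version) {
	case SKETCH_IP_VERSION(9, 4, 3):
		return 0xb0; /* value from the patch under review */
	case SKETCH_IP_VERSION(9, 5, 0):
		return 0x20; /* illustrative only, not a confirmed value */
	default:
		return 0;
	}
}

/*
 * Per-queue reset is advertised only if the MEC FW meets the threshold
 * for this GC IP version and the PMFW also reports SDMA reset support.
 */
static bool sketch_sdma_queue_reset_supported(uint32_t gc_ip_version,
					      uint32_t mec_fw_version,
					      bool pmfw_supports_sdma_reset)
{
	uint32_t min_fw = sketch_min_mec_fw_for_queue_reset(gc_ip_version);

	if (!min_fw)
		return false;

	return mec_fw_version >= min_fw && pmfw_supports_sdma_reset;
}
```

In the driver, the inputs would presumably come from the GC IP version, adev->gfx.mec_fw_version and amdgpu_dpm_reset_sdma_is_supported(adev); the sketch takes them as plain parameters to stay independent of kernel headers.
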
Thanks,
Lijo
> Thanks
> Jesse
>
> Thanks,
> Lijo
>
>>
>> if (amdgpu_sdma_ras_sw_init(adev)) {
>> dev_err(adev->dev, "fail to initialize sdma ras block\n");
>