[v3 2/5] drm/amdgpu: Add ring reset support for VCN v5.0.1
Lijo Lazar
lijo.lazar at amd.com
Thu Aug 21 15:38:49 UTC 2025
On 8/20/2025 8:33 AM, Jesse.Zhang wrote:
> Implement the ring reset callback for VCN v5.0.1 to properly handle
> hardware recovery when encountering GPU hangs. The new functionality:
>
> 1. Adds vcn_v5_0_1_ring_reset() function that:
> - Prepares for reset using amdgpu_ring_reset_helper_begin()
> - Performs VCN instance reset via amdgpu_dpm_reset_vcn()
> - Re-initializes hardware through vcn_v5_0_1_hw_init_inst()
> - Restarts DPG mode with vcn_v5_0_1_start_dpg_mode()
> - Completes reset with amdgpu_ring_reset_helper_end()
>
> 2. Hooks the reset function into the unified ring functions via:
> - Adding .reset = vcn_v5_0_1_ring_reset to vcn_v5_0_1_unified_ring_vm_funcs
>
> 3. Maintains existing behavior for SR-IOV VF cases by checking RRMT status
>
> This provides proper hardware recovery capabilities for VCN 5.0.1 IP block
> during fault conditions, matching functionality available in other VCN versions.
>
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> Signed-off-by: Ruili Ji <ruiliji2 at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c | 29 +++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c
> index 1b5d44fa2b57..779043eac827 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c
> @@ -1284,6 +1284,34 @@ static void vcn_v5_0_1_unified_ring_set_wptr(struct amdgpu_ring *ring)
> }
> }
>
> +static int vcn_v5_0_1_ring_reset(struct amdgpu_ring *ring,
> +				 unsigned int vmid,
> +				 struct amdgpu_fence *timedout_fence)
> +{
> +	int r = 0;
> +	int vcn_inst;
> +	struct amdgpu_device *adev = ring->adev;
> +	struct amdgpu_vcn_inst *vinst = &adev->vcn.inst[ring->me];
> +
> +	amdgpu_ring_reset_helper_begin(ring, timedout_fence);
> +
> +	vcn_inst = GET_INST(VCN, ring->me);
> +	r = amdgpu_dpm_reset_vcn(adev, 1 << vcn_inst);
> +
> +	if (r) {
> +		DRM_DEV_ERROR(adev->dev, "VCN reset fail : %d\n", r);
> +		return r;
> +	}
> +
> +	/* This flag is not set for VF, assumed to be disabled always */
> +	if (RREG32_SOC15(VCN, GET_INST(VCN, 0), regVCN_RRMT_CNTL) & 0x100)
> +		adev->vcn.caps |= AMDGPU_VCN_CAPS(RRMT_ENABLED);
This is not required. The setting is assumed to be common across all
instances, hence only the first instance's setting is read. So if VCN
instance 2 or 3 is reset, this check makes no difference.
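To illustrate the point, here is a minimal standalone sketch, not driver code. All mock_* and MOCK_* names are hypothetical stand-ins (for adev->vcn.caps, regVCN_RRMT_CNTL and AMDGPU_VCN_CAPS(RRMT_ENABLED)), and it assumes the capability is sampled once from instance 0 before any reset happens. Under that assumption, a per-instance reset has nothing to re-derive:

```c
#include <stdint.h>

#define MOCK_NUM_INST          4
#define MOCK_RRMT_ENABLED_BIT  0x100u     /* same mask value as in the patch */
#define MOCK_CAP_RRMT_ENABLED  (1u << 0)  /* hypothetical capability bit */

/* Hypothetical model: per-instance register, one device-wide caps word. */
struct mock_vcn {
	uint32_t rrmt_cntl[MOCK_NUM_INST];
	uint32_t caps;
};

/* Sample the setting once, from instance 0 only. */
static void mock_vcn_init_caps(struct mock_vcn *vcn)
{
	if (vcn->rrmt_cntl[0] & MOCK_RRMT_ENABLED_BIT)
		vcn->caps |= MOCK_CAP_RRMT_ENABLED;
}

/* Reset a single instance; the device-wide caps word is left alone. */
static int mock_vcn_reset_inst(struct mock_vcn *vcn, unsigned int inst)
{
	if (inst >= MOCK_NUM_INST)
		return -1;
	(void)vcn; /* per-instance re-init would go here */
	return 0;
}
```

Resetting instance 2 in this model leaves caps exactly as the initial sampling set it, which is why repeating the register check in the reset path adds nothing.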
> +	vcn_v5_0_1_hw_init_inst(adev, ring->me);
> +	vcn_v5_0_1_start_dpg_mode(vinst, adev->vcn.inst[ring->me].indirect_sram);
You could use vinst->indirect_sram. That said, there seems to be no need
to pass this as an extra parameter, since the function already receives
vinst.
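A rough standalone sketch of the two shapes, for comparison only. The mock_* names are hypothetical and model just the one field under discussion; the real vcn_v5_0_1_start_dpg_mode() does far more than record a flag:

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_vcn_inst. */
struct mock_vcn_inst {
	bool indirect_sram;  /* mirrors vinst->indirect_sram */
	bool dpg_indirect;   /* records which mode the mock start routine used */
};

/* Shape used in the patch: the flag is passed alongside vinst. */
static int mock_start_dpg_mode_arg(struct mock_vcn_inst *vinst, bool indirect)
{
	vinst->dpg_indirect = indirect;
	return 0;
}

/* Suggested shape: the callee reads the flag from the instance itself. */
static int mock_start_dpg_mode(struct mock_vcn_inst *vinst)
{
	vinst->dpg_indirect = vinst->indirect_sram;
	return 0;
}
```

Both variants end up in the same state when the caller passes vinst->indirect_sram, so the second parameter carries no extra information in that case.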
Thanks,
Lijo
> +
> +	return amdgpu_ring_reset_helper_end(ring, timedout_fence);
> +}
> +
> static const struct amdgpu_ring_funcs vcn_v5_0_1_unified_ring_vm_funcs = {
> 	.type = AMDGPU_RING_TYPE_VCN_ENC,
> 	.align_mask = 0x3f,
> @@ -1312,6 +1340,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_1_unified_ring_vm_funcs = {
> 	.emit_wreg = vcn_v4_0_3_enc_ring_emit_wreg,
> 	.emit_reg_wait = vcn_v4_0_3_enc_ring_emit_reg_wait,
> 	.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
> +	.reset = vcn_v5_0_1_ring_reset,
> };
>
> /**