[PATCH V2 2/2] drm/amdgpu: read back register after written for VCN v4.0.0 and v5.0.0

Tue May 13 21:01:13 UTC 2025

On Tue, May 13, 2025 at 4:33 PM David Wu <davidwu2 at amd.com> wrote:
>
>
> On 2025-05-13 15:29, Alex Deucher wrote:
> > On Tue, May 13, 2025 at 3:01 PM David Wu <davidwu2 at amd.com> wrote:
> >>
> >> On 2025-05-13 14:40, Alex Deucher wrote:
> >>
> >> On Tue, May 13, 2025 at 2:23 PM David (Ming Qiang) Wu <David.Wu3 at amd.com> wrote:
> >>
> >> V2: not to add extra read-back in vcn_v4_0_start and vcn_v5_0_0_start as
> >>      there are read-back calls already. New comments for better understanding.
> >>
> >> Similar to the previous changes made for VCN v4.0.5, the addition of
> >> register read-back support in VCN v4.0.0 and v5.0.0 is intended to
> >> prevent potential race conditions, even though such issues have not
> >> been observed yet. This change ensures consistency across different
> >> VCN variants and helps avoid similar issues on newer or closely
> >> related GPUs. The overhead introduced by this read-back is negligible.
> >>
> >> Signed-off-by: David (Ming Qiang) Wu <David.Wu3 at amd.com>
> >> Reviewed-by: Mario Limonciello <mario.limonciello at amd.com>
> >>
> >> Maybe split this into two patches, one for vcn 4 and one for vcn 5.
> >> That will make it easier to backport to stable.  What about other
> >> VCNs?
> >>
> >> will split.
> >>
> >> This applies to those VCNs where regVCN_RB1_DB_CTRL is used for setting the doorbell index, which
> >> means VCN 4 and up - all of them are covered (similar code is already there for those not in this patch).
> > Sure that prevents the doorbell from getting missed, but what about
> > other registers setup in the VCN start() functions?  What if some of
> > those are still pending when the doorbell is rung for other VCNs?
> I think adding a read-back is needed if there is any concern about a race
> condition.
> If we are only concerned about start(), it should be easy to add. The question
> is how we will know there is a race condition. Adding a read-back after every
> write that lacks one is not a solution, I think.
> For any VCN function we need to check carefully
> (e.g. vcn_v4_0_unified_ring_set_wptr()) in case it adds too much
> overhead and is not actually needed (at least we haven't seen the issue).
> Any suggestion as to where we should add it, or only _start() for the moment?
> I can work on it for sure, or leave it for future improvement.

I think _start() makes the most sense because it will only be called
when we power on the VCN instance.  As long as work keeps coming in,
it will stay on.  The race is theoretically possible on any VCN
instance.  E.g., when the first VCN job comes in, VCN gets powered on,
and then we call _start() to program what we need.  After that, we
ring the doorbell to kick off the job.  The programming sequence in
_start() could still be in flight on the PCIe bus when the doorbell
gets rung, which is apparently exactly what happens when we hit this
issue on VCN 4.x and 5.x.  On those VCNs the doorbell gets ignored
because it races with the write to the doorbell id register.  On other
VCNs the doorbell may not get missed, but something else could race,
e.g., the ring size.
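
To make the ordering argument above concrete, here is a minimal user-space
sketch of the race.  It is a toy model, not driver code: every sim_* name,
the register indices, and the FIFO are hypothetical.  It assumes that
register writes are posted (queued, not yet visible to the device), that a
read on the same path forces the queued writes to land first, and that the
doorbell takes a path that can overtake the queued writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register indices for this toy model; not real VCN offsets. */
enum sim_reg { SIM_REG_RING_SIZE, SIM_REG_DB_CTRL, SIM_NUM_REGS };

#define SIM_DB_CTRL_EN 0x1u
#define SIM_FIFO_DEPTH 8

struct sim_posted_write {
	enum sim_reg reg;
	uint32_t val;
};

struct sim_device {
	uint32_t regs[SIM_NUM_REGS];                 /* state the device sees */
	struct sim_posted_write fifo[SIM_FIFO_DEPTH]; /* writes still in flight */
	unsigned int pending;
};

/* A posted write returns right away; the value is only queued here. */
static void sim_write(struct sim_device *dev, enum sim_reg reg, uint32_t val)
{
	assert(dev->pending < SIM_FIFO_DEPTH);
	dev->fifo[dev->pending].reg = reg;
	dev->fifo[dev->pending].val = val;
	dev->pending++;
}

/* A read cannot complete until earlier writes on the same path have landed. */
static uint32_t sim_read(struct sim_device *dev, enum sim_reg reg)
{
	for (unsigned int i = 0; i < dev->pending; i++)
		dev->regs[dev->fifo[i].reg] = dev->fifo[i].val;
	dev->pending = 0;
	return dev->regs[reg];
}

/*
 * Assumption of the model: the doorbell overtakes queued register writes,
 * so the device only honors it if the enable bit has already landed.
 */
static bool sim_ring_doorbell(const struct sim_device *dev)
{
	return (dev->regs[SIM_REG_DB_CTRL] & SIM_DB_CTRL_EN) != 0;
}

/* Stand-in for a _start() sequence, with an optional trailing read-back. */
static void sim_start(struct sim_device *dev, bool read_back)
{
	sim_write(dev, SIM_REG_RING_SIZE, 0x1000);
	sim_write(dev, SIM_REG_DB_CTRL, SIM_DB_CTRL_EN);
	if (read_back)
		(void)sim_read(dev, SIM_REG_DB_CTRL);
}

/* Power on, program, ring the doorbell; true if the doorbell was honored. */
static bool sim_first_job(bool read_back)
{
	struct sim_device dev;

	memset(&dev, 0, sizeof(dev));
	sim_start(&dev, read_back);
	return sim_ring_doorbell(&dev);
}
```

In this model sim_first_job(false) reports a missed doorbell because the
enable bit is still queued, while sim_first_job(true) reports that the
doorbell was honored.  The same flush would also cover the ring size write,
which is why a single read-back at the end of _start() is enough here.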

Alex

>
> David
>
> >
> > Alex
> >
> >> David
> >>
> >> Alex
> >>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c   | 4 ++++
> >>   drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 4 ++++
> >>   2 files changed, 8 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >> index 8fff470bce873..070a2a8cdf6f4 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >> @@ -1122,6 +1122,10 @@ static int vcn_v4_0_start_dpg_mode(struct amdgpu_vcn_inst *vinst, bool indirect)
> >>                          ring->doorbell_index << VCN_RB1_DB_CTRL__OFFSET__SHIFT |
> >>                          VCN_RB1_DB_CTRL__EN_MASK);
> >>
> >> +       /* Keeping one read-back to ensure all register writes are done, otherwise
> >> +        * it may introduce race conditions */
> >> +       RREG32_SOC15(VCN, inst_idx, regVCN_RB1_DB_CTRL);
> >> +
> >>          return 0;
> >>   }
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c
> >> index 27dcc6f37a730..77c27a317e4c8 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c
> >> @@ -794,6 +794,10 @@ static int vcn_v5_0_0_start_dpg_mode(struct amdgpu_vcn_inst *vinst,
> >>                  ring->doorbell_index << VCN_RB1_DB_CTRL__OFFSET__SHIFT |
> >>                  VCN_RB1_DB_CTRL__EN_MASK);
> >>
> >> +       /* Keeping one read-back to ensure all register writes are done, otherwise
> >> +        * it may introduce race conditions */
> >> +       RREG32_SOC15(VCN, inst_idx, regVCN_RB1_DB_CTRL);
> >> +
> >>          return 0;
> >>   }
> >>
> >> --
> >> 2.34.1
> >>