[PATCH] drm/amdgpu/vcn: fix video profile race condition (v2)
Alex Deucher
alexdeucher at gmail.com
Wed Aug 13 14:03:33 UTC 2025
On Wed, Aug 13, 2025 at 9:55 AM Sundararaju, Sathishkumar
<sathishkumar.sundararaju at amd.com> wrote:
>
>
> On 8/13/2025 7:15 PM, Alex Deucher wrote:
> > On Tue, Aug 12, 2025 at 7:08 PM David Wu <davidwu2 at amd.com> wrote:
> >> Hi Alex,
> >>
> >> still have a concern - the fence or submission_cnt could increase in
> >> begin_use but idle handler
> >> has finished counting it (as 0) so it could also power off vcn.
> > Ok, I think that is possible. Will send an updated patch to handle
> > that case as well.
>
> cancel_delayed_work_sync(&vcn_inst->idle_work) at the top of begin_use
> covers this situation.
>
> So there is no idle handler for this instance anymore after this call
> completes, but the additional checks are okay to have.
There are separate work handers for each instance. What could happen
is that the instance 0 work handler is running when begin_use() is
running on instance 1. The instance 1 begin_use() sees that the video
profile is already enabled. The instance 0 work handler sees that
total_fences is 0 and disables the video profile because the fence for
instance 1 has not been emitted yet. It won't be emitted until after
begin_use() completes. The total_submission_cnt covers that period of
time (between begin_ring and end_ring until the actual fence is
emitted).
Alex
>
> Regards,
>
> Sathish
>
> >
> > Alex
> >
> >> David
> >>
> >> On 2025-08-12 16:58, Alex Deucher wrote:
> >>> If there are multiple instances of the VCN running,
> >>> we may end up switching the video profile while another
> >>> instance is active because we only take into account
> >>> the current instance's submissions. Look at all
> >>> outstanding fences for the video profile.
> >>>
> >>> v2: drop early exit in begin_use()
> >>>
> >>> Fixes: 3b669df92c85 ("drm/amdgpu/vcn: adjust workload profile handling")
> >>> Reviewed-by: Sathishkumar S <sathishkumar.sundararaju at amd.com> (v1)
> >>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 37 ++++++++++++-------------
> >>> 1 file changed, 17 insertions(+), 20 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> >>> index 9a76e11d1c184..fd127e9461c89 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> >>> @@ -415,19 +415,25 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
> >>> struct amdgpu_vcn_inst *vcn_inst =
> >>> container_of(work, struct amdgpu_vcn_inst, idle_work.work);
> >>> struct amdgpu_device *adev = vcn_inst->adev;
> >>> - unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
> >>> - unsigned int i = vcn_inst->inst, j;
> >>> + unsigned int total_fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
> >>> + unsigned int i, j;
> >>> int r = 0;
> >>>
> >>> - if (adev->vcn.harvest_config & (1 << i))
> >>> + if (adev->vcn.harvest_config & (1 << vcn_inst->inst))
> >>> return;
> >>>
> >>> - for (j = 0; j < adev->vcn.inst[i].num_enc_rings; ++j)
> >>> - fence[i] += amdgpu_fence_count_emitted(&vcn_inst->ring_enc[j]);
> >>> + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
> >>> + struct amdgpu_vcn_inst *v = &adev->vcn.inst[i];
> >>> +
> >>> + for (j = 0; j < v->num_enc_rings; ++j)
> >>> + fence[i] += amdgpu_fence_count_emitted(&v->ring_enc[j]);
> >>> + fence[i] += amdgpu_fence_count_emitted(&v->ring_dec);
> >>> + total_fences += fence[i];
> >>> + }
> >>>
> >>> /* Only set DPG pause for VCN3 or below, VCN4 and above will be handled by FW */
> >>> if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG &&
> >>> - !adev->vcn.inst[i].using_unified_queue) {
> >>> + !vcn_inst->using_unified_queue) {
> >>> struct dpg_pause_state new_state;
> >>>
> >>> if (fence[i] ||
> >>> @@ -436,18 +442,17 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
> >>> else
> >>> new_state.fw_based = VCN_DPG_STATE__UNPAUSE;
> >>>
> >>> - adev->vcn.inst[i].pause_dpg_mode(vcn_inst, &new_state);
> >>> + vcn_inst->pause_dpg_mode(vcn_inst, &new_state);
> >>> }
> >>>
> >>> - fence[i] += amdgpu_fence_count_emitted(&vcn_inst->ring_dec);
> >>> - fences += fence[i];
> >>> -
> >>> - if (!fences && !atomic_read(&vcn_inst->total_submission_cnt)) {
> >>> + if (!fence[vcn_inst->inst] && !atomic_read(&vcn_inst->total_submission_cnt)) {
> >>> + /* This is specific to this instance */
> >>> mutex_lock(&vcn_inst->vcn_pg_lock);
> >>> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_GATE);
> >>> mutex_unlock(&vcn_inst->vcn_pg_lock);
> >>> mutex_lock(&adev->vcn.workload_profile_mutex);
> >>> - if (adev->vcn.workload_profile_active) {
> >>> + /* This is global and depends on all VCN instances */
> >>> + if (adev->vcn.workload_profile_active && !total_fences) {
> >>> r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
> >>> false);
> >>> if (r)
> >>> @@ -470,13 +475,6 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
> >>>
> >>> cancel_delayed_work_sync(&vcn_inst->idle_work);
> >>>
> >>> - /* We can safely return early here because we've cancelled the
> >>> - * the delayed work so there is no one else to set it to false
> >>> - * and we don't care if someone else sets it to true.
> >>> - */
> >>> - if (adev->vcn.workload_profile_active)
> >>> - goto pg_lock;
> >>> -
> >>> mutex_lock(&adev->vcn.workload_profile_mutex);
> >>> if (!adev->vcn.workload_profile_active) {
> >>> r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
> >>> @@ -487,7 +485,6 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
> >>> }
> >>> mutex_unlock(&adev->vcn.workload_profile_mutex);
> >>>
> >>> -pg_lock:
> >>> mutex_lock(&vcn_inst->vcn_pg_lock);
> >>> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
> >>>
More information about the amd-gfx
mailing list