[PATCH] drm/amdgpu/vcn: fix video profile race condition (v2)
Sundararaju, Sathishkumar
sathishkumar.sundararaju at amd.com
Wed Aug 13 14:05:47 UTC 2025
Hi Alex,
On 8/13/2025 7:33 PM, Alex Deucher wrote:
> On Wed, Aug 13, 2025 at 9:55 AM Sundararaju, Sathishkumar
> <sathishkumar.sundararaju at amd.com> wrote:
>>
>> On 8/13/2025 7:15 PM, Alex Deucher wrote:
>>> On Tue, Aug 12, 2025 at 7:08 PM David Wu <davidwu2 at amd.com> wrote:
>>>> Hi Alex,
>>>>
>>>> still have a concern - the fence count or submission_cnt could increase
>>>> in begin_use after the idle handler
>>>> has already counted it as 0, so the handler could still power off VCN.
>>> Ok, I think that is possible. Will send an updated patch to handle
>>> that case as well.
>> cancel_delayed_work_sync(&vcn_inst->idle_work) at the top of begin_use
>> covers this situation.
>>
>> So there is no idle handler for this instance anymore after this call
>> completes, but the additional checks are okay to have.
> There are separate work handlers for each instance. What could happen
> is that the instance 0 work handler is running when begin_use() is
> running on instance 1. The instance 1 begin_use() sees that the video
> profile is already enabled. The instance 0 work handler sees that
> total_fences is 0 and disables the video profile because the fence for
> instance 1 has not been emitted yet. It won't be emitted until after
> begin_use() completes. The total_submission_cnt covers that window of
> time (it is held from begin_use() until the actual fence has been
> emitted).
Ah! I get it now, thanks for clarifying, sorry I missed this.
Regards,
Sathish
>
> Alex
>
>> Regards,
>>
>> Sathish
>>
>>> Alex
>>>
>>>> David
>>>>
>>>> On 2025-08-12 16:58, Alex Deucher wrote:
>>>>> If there are multiple instances of the VCN running,
>>>>> we may end up switching the video profile while another
>>>>> instance is active because we only take into account
>>>>> the current instance's submissions. Look at all
>>>>> outstanding fences for the video profile.
>>>>>
>>>>> v2: drop early exit in begin_use()
>>>>>
>>>>> Fixes: 3b669df92c85 ("drm/amdgpu/vcn: adjust workload profile handling")
>>>>> Reviewed-by: Sathishkumar S <sathishkumar.sundararaju at amd.com> (v1)
>>>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 37 ++++++++++++-------------
>>>>> 1 file changed, 17 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> index 9a76e11d1c184..fd127e9461c89 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> @@ -415,19 +415,25 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
>>>>> struct amdgpu_vcn_inst *vcn_inst =
>>>>> container_of(work, struct amdgpu_vcn_inst, idle_work.work);
>>>>> struct amdgpu_device *adev = vcn_inst->adev;
>>>>> - unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
>>>>> - unsigned int i = vcn_inst->inst, j;
>>>>> + unsigned int total_fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
>>>>> + unsigned int i, j;
>>>>> int r = 0;
>>>>>
>>>>> - if (adev->vcn.harvest_config & (1 << i))
>>>>> + if (adev->vcn.harvest_config & (1 << vcn_inst->inst))
>>>>> return;
>>>>>
>>>>> - for (j = 0; j < adev->vcn.inst[i].num_enc_rings; ++j)
>>>>> - fence[i] += amdgpu_fence_count_emitted(&vcn_inst->ring_enc[j]);
>>>>> + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>> + struct amdgpu_vcn_inst *v = &adev->vcn.inst[i];
>>>>> +
>>>>> + for (j = 0; j < v->num_enc_rings; ++j)
>>>>> + fence[i] += amdgpu_fence_count_emitted(&v->ring_enc[j]);
>>>>> + fence[i] += amdgpu_fence_count_emitted(&v->ring_dec);
>>>>> + total_fences += fence[i];
>>>>> + }
>>>>>
>>>>> /* Only set DPG pause for VCN3 or below, VCN4 and above will be handled by FW */
>>>>> if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG &&
>>>>> - !adev->vcn.inst[i].using_unified_queue) {
>>>>> + !vcn_inst->using_unified_queue) {
>>>>> struct dpg_pause_state new_state;
>>>>>
>>>>> if (fence[i] ||
>>>>> @@ -436,18 +442,17 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
>>>>> else
>>>>> new_state.fw_based = VCN_DPG_STATE__UNPAUSE;
>>>>>
>>>>> - adev->vcn.inst[i].pause_dpg_mode(vcn_inst, &new_state);
>>>>> + vcn_inst->pause_dpg_mode(vcn_inst, &new_state);
>>>>> }
>>>>>
>>>>> - fence[i] += amdgpu_fence_count_emitted(&vcn_inst->ring_dec);
>>>>> - fences += fence[i];
>>>>> -
>>>>> - if (!fences && !atomic_read(&vcn_inst->total_submission_cnt)) {
>>>>> + if (!fence[vcn_inst->inst] && !atomic_read(&vcn_inst->total_submission_cnt)) {
>>>>> + /* This is specific to this instance */
>>>>> mutex_lock(&vcn_inst->vcn_pg_lock);
>>>>> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_GATE);
>>>>> mutex_unlock(&vcn_inst->vcn_pg_lock);
>>>>> mutex_lock(&adev->vcn.workload_profile_mutex);
>>>>> - if (adev->vcn.workload_profile_active) {
>>>>> + /* This is global and depends on all VCN instances */
>>>>> + if (adev->vcn.workload_profile_active && !total_fences) {
>>>>> r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
>>>>> false);
>>>>> if (r)
>>>>> @@ -470,13 +475,6 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>>>>>
>>>>> cancel_delayed_work_sync(&vcn_inst->idle_work);
>>>>>
>>>>> - /* We can safely return early here because we've cancelled the
>>>>> - * the delayed work so there is no one else to set it to false
>>>>> - * and we don't care if someone else sets it to true.
>>>>> - */
>>>>> - if (adev->vcn.workload_profile_active)
>>>>> - goto pg_lock;
>>>>> -
>>>>> mutex_lock(&adev->vcn.workload_profile_mutex);
>>>>> if (!adev->vcn.workload_profile_active) {
>>>>> r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
>>>>> @@ -487,7 +485,6 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>>>>> }
>>>>> mutex_unlock(&adev->vcn.workload_profile_mutex);
>>>>>
>>>>> -pg_lock:
>>>>> mutex_lock(&vcn_inst->vcn_pg_lock);
>>>>> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
>>>>>