[PATCH 1/4] drm/amdgpu/vcn: fix race condition issue for vcn start

James Zhu jamesz at amd.com
Tue Mar 3 19:03:30 UTC 2020


On 2020-03-03 1:44 p.m., Christian König wrote:
> Am 03.03.20 um 19:16 schrieb James Zhu:
>> Fix race condition issue when multiple vcn starts are called.
>>
>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 ++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h | 1 +
>>   2 files changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index f96464e..aa7663f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -63,6 +63,7 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
>>       int i, r;
>>         INIT_DELAYED_WORK(&adev->vcn.idle_work, amdgpu_vcn_idle_work_handler);
>> +    mutex_init(&adev->vcn.vcn_pg_lock);
>>         switch (adev->asic_type) {
>>       case CHIP_RAVEN:
>> @@ -210,6 +211,7 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev)
>>       }
>>         release_firmware(adev->vcn.fw);
>> +    mutex_destroy(&adev->vcn.vcn_pg_lock);
>>         return 0;
>>   }
>> @@ -321,6 +323,7 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>>       struct amdgpu_device *adev = ring->adev;
>>       bool set_clocks = !cancel_delayed_work_sync(&adev->vcn.idle_work);
>> +    mutex_lock(&adev->vcn.vcn_pg_lock);
>
> That still won't work correctly here.
>
> The whole idea of the cancel_delayed_work_sync() and 
> schedule_delayed_work() dance is that you have exactly one user of 
> that. If you have multiple rings that whole thing won't work correctly.
>
> To fix this you need to call mutex_lock() before 
> cancel_delayed_work_sync() and schedule_delayed_work() before 
> mutex_unlock().
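
One way to read the suggestion above is that the lock is taken before the
cancel in begin_use and released only after the schedule in end_use, so
that the cancel/schedule pair has exactly one user at a time. The sketch
below is a userspace model of that ordering using pthreads, not driver
code: every name in it (vcn_model, model_ring_begin_use, model_run, and
so on) is a hypothetical stand-in, and it assumes the idle handler never
fires while the rings are in use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model only: all names are hypothetical stand-ins,
 * not symbols from the amdgpu driver. */
struct vcn_model {
	pthread_mutex_t pg_lock;   /* plays the role of vcn_pg_lock */
	bool idle_work_pending;    /* true while the idle work is queued */
	int ungate_calls;          /* how often the power-up path ran */
};

/* Stand-in for cancel_delayed_work_sync(): true if work was pending. */
static bool model_cancel_idle_work(struct vcn_model *v)
{
	bool was_pending = v->idle_work_pending;

	v->idle_work_pending = false;
	return was_pending;
}

/* Stand-in for schedule_delayed_work(); the handler itself never runs. */
static void model_schedule_idle_work(struct vcn_model *v)
{
	v->idle_work_pending = true;
}

static void model_ring_begin_use(struct vcn_model *v)
{
	pthread_mutex_lock(&v->pg_lock);       /* lock before the cancel */
	if (!model_cancel_idle_work(v))
		v->ungate_calls++;             /* models the ungate path */
	/* The lock stays held until model_ring_end_use(). */
}

static void model_ring_end_use(struct vcn_model *v)
{
	model_schedule_idle_work(v);           /* schedule before the unlock */
	pthread_mutex_unlock(&v->pg_lock);
}

/* One ring user: a single begin_use/end_use pair. */
static void *model_ring_user(void *arg)
{
	struct vcn_model *v = arg;

	model_ring_begin_use(v);
	model_ring_end_use(v);
	return NULL;
}

/* Runs nthreads concurrent ring users and returns the ungate count. */
int model_run(int nthreads)
{
	struct vcn_model v = { .idle_work_pending = false, .ungate_calls = 0 };
	pthread_t tid[16];
	int i, rc;

	assert(nthreads > 0 && nthreads <= 16);
	pthread_mutex_init(&v.pg_lock, NULL);
	for (i = 0; i < nthreads; i++) {
		rc = pthread_create(&tid[i], NULL, model_ring_user, &v);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	pthread_mutex_destroy(&v.pg_lock);
	return v.ungate_calls;
}
```

In this model only the first ring user takes the power-up path, because
every later user finds the idle work already queued when it cancels. It
says nothing about whether a lock that is held across the whole
begin_use/end_use window is acceptable for the real driver.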

A big lock definitely works. I am trying to use as small a lock as
possible here. The shared resources that need protection here are the
power gating process and the DPG mode switch process.

If we move mutex_unlock() before schedule_delayed_work(), I am wondering
what other resources would still need protection.

Thanks!

James

>
> Regards,
> Christian.
>
>>       if (set_clocks) {
>>           amdgpu_gfx_off_ctrl(adev, false);
>>           amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
>> @@ -345,6 +348,7 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>>             adev->vcn.pause_dpg_mode(adev, ring->me, &new_state);
>>       }
>> +    mutex_unlock(&adev->vcn.vcn_pg_lock);
>>   }
>>     void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
>> index 6fe0573..2ae110d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
>> @@ -200,6 +200,7 @@ struct amdgpu_vcn {
>>       struct drm_gpu_scheduler *vcn_dec_sched[AMDGPU_MAX_VCN_INSTANCES];
>>       uint32_t         num_vcn_enc_sched;
>>       uint32_t         num_vcn_dec_sched;
>> +    struct mutex         vcn_pg_lock;
>>         unsigned    harvest_config;
>>       int (*pause_dpg_mode)(struct amdgpu_device *adev,
>

