[PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks
Michel Dänzer
michel at daenzer.net
Thu Aug 12 16:54:50 UTC 2021
On 2021-08-12 1:33 p.m., Lazar, Lijo wrote:
> On 8/12/2021 1:41 PM, Michel Dänzer wrote:
>> On 2021-08-12 7:55 a.m., Koenig, Christian wrote:
>>> Hi James,
>>>
>>> Evan seems to have understood how this all works together.
>>>
>>> See while any begin/end use critical section is active the work should not be active.
>>>
>>> When you handle only one ring you can just call cancel in begin use and schedule in end use. But when you have more than one ring you need a lock or counter to prevent concurrent work items to be started.
>>>
>>> Michelle's idea to use mod_delayed_work is a bad one because it assumes that the delayed work is still running.
>>
>> It merely assumes that the work may already have been scheduled before.
>>
>> Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I think it can still have some effect when there's a single work item for multiple rings, as described by James, it's probably negligible, since presumably the time intervals between ring_begin_use and ring_end_use are normally much shorter than a second.
>>
>> So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with dropping it.
>>
>>
>>> Something similar applies to the first patch I think,
>>
>> There are no cancel work calls in that case, so the commit log is accurate TTBOMK.
>
> Curious -
>
> For patch 1, does it make a difference if any delayed work scheduled is cancelled in the else part before proceeding?
>
> } else if (!enable && adev->gfx.gfx_off_state) {
> cancel_delayed_work();
I tried the patch below.
While this does seem to fix the problem as well, I see a potential issue:
1. amdgpu_gfx_off_ctrl locks adev->gfx.gfx_off_mutex
2. amdgpu_device_delay_enable_gfx_off runs, blocks in mutex_lock
3. amdgpu_gfx_off_ctrl calls cancel_delayed_work_sync
I'm afraid this would deadlock? (CONFIG_PROVE_LOCKING doesn't complain though)
Maybe it's possible to fix it with cancel_delayed_work_sync somehow, but I'm not sure how offhand. (With cancel_delayed_work instead, I'm worried amdgpu_device_delay_enable_gfx_off might still enable GFXOFF in the HW immediately after amdgpu_gfx_off_ctrl unlocks the mutex. Then again, that might happen with mod_delayed_work as well...)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..3e4585ffb9af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -570,8 +570,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
- } else if (!enable && adev->gfx.gfx_off_state) {
- if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
+ } else if (!enable) {
+ cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+ if (adev->gfx.gfx_off_state &&
+ !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
adev->gfx.gfx_off_state = false;
if (adev->gfx.funcs->init_spm_golden) {
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer
More information about the amd-gfx
mailing list