[PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

Lazar, Lijo lijo.lazar at amd.com
Mon Aug 16 11:33:43 UTC 2021



On 8/16/2021 4:05 PM, Michel Dänzer wrote:
> From: Michel Dänzer <mdaenzer at redhat.com>
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when the disable count
> transitions from 0 to 1, and only schedule the delayed work on the
> reverse transition, not if the disable count was already 0. This makes
> sure the delayed work doesn't run at unexpected times, and allows it to
> be lock-free.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>    mod_delayed_work.
> v3:
> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> 
> Cc: stable at vger.kernel.org
> Signed-off-by: Michel Dänzer <mdaenzer at redhat.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 22 +++++++++++++++++-----
>   2 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..f944ed858f3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   	struct amdgpu_device *adev =
>   		container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work);
>   
> -	mutex_lock(&adev->gfx.gfx_off_mutex);
> -	if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> -		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
> -			adev->gfx.gfx_off_state = true;
> -	}
> -	mutex_unlock(&adev->gfx.gfx_off_mutex);
> +	WARN_ON_ONCE(adev->gfx.gfx_off_state);

Don't see any case for this. It's not expected to be scheduled in this 
case, right?

> +	WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> +

Thinking about ON_ONCE here - this may happen more than once if it's 
completed as part of cancel_ call. Is the warning needed?

Anyway,
	Reviewed-by: Lijo Lazar <lijo.lazar at amd.com>

> +	if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
> +		adev->gfx.gfx_off_state = true;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..ca91aafcb32b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
>   
>   	mutex_lock(&adev->gfx.gfx_off_mutex);
>   
> -	if (!enable)
> -		adev->gfx.gfx_off_req_count++;
> -	else if (adev->gfx.gfx_off_req_count > 0)
> +	if (enable) {
> +		/* If the count is already 0, it means there's an imbalance bug somewhere.
> +		 * Note that the bug may be in a different caller than the one which triggers the
> +		 * WARN_ON_ONCE.
> +		 */
> +		if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
> +			goto unlock;
> +
>   		adev->gfx.gfx_off_req_count--;
> +	} else {
> +		adev->gfx.gfx_off_req_count++;
> +	}
>   
>   	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>   		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
> -	} else if (!enable && adev->gfx.gfx_off_state) {
> -		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
> +	} else if (!enable && adev->gfx.gfx_off_req_count == 1) {
> +		cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
> +
> +		if (adev->gfx.gfx_off_state &&
> +		    !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
>   			adev->gfx.gfx_off_state = false;
>   
>   			if (adev->gfx.funcs->init_spm_golden) {
> @@ -581,6 +592,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
>   		}
>   	}
>   
> +unlock:
>   	mutex_unlock(&adev->gfx.gfx_off_mutex);
>   }
>   
> 


More information about the amd-gfx mailing list