[PATCH] drm/xe/pm: Move xe_rpm_lockmap_acquire

Imre Deak imre.deak at intel.com
Wed Sep 11 11:34:54 UTC 2024


On Wed, Sep 11, 2024 at 03:00:25PM +0530, Suraj Kandpal wrote:
> Move xe_rpm_lockmap_acquire after the display_pm_suspend and resume
> functions to avoid a circular locking dependency caused by the locks
> taken in the intel_fbdev and intel_dp_mst_mgr suspend and resume
> functions.
> 
> Signed-off-by: Suraj Kandpal <suraj.kandpal at intel.com>

The actual problem is that MST is being suspended during runtime
suspend. This is not only unnecessary (it adds overhead), but also
incorrect, since MST suspend involves AUX transfers, which themselves
depend on the device being runtime resumed. This is what lockdep is
trying to say as well.
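
Spelled out, the circular dependency lockdep is flagging is, as I read
it (simplified):

  runtime suspend in progress
    -> xe_display_pm_runtime_suspend()
      -> intel_dp_mst_mgr suspend
        -> AUX transfer
          -> needs the device runtime resumed
            -> waits on the runtime suspend already in progress above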

So the solution would be not to suspend/resume MST during runtime
suspend/resume.
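
I.e. roughly something like the below (only a minimal sketch, assuming
the runtime vs. system suspend distinction can be passed down into the
display helpers; the signatures and the exact split are illustrative,
not a reviewed interface):

	/*
	 * Sketch: skip MST on the runtime suspend path. MST suspend
	 * issues AUX transfers, which themselves require the device to
	 * be runtime resumed, so doing it here is both unnecessary and
	 * the source of the lock inversion lockdep reports.
	 */
	void xe_display_pm_suspend(struct xe_device *xe, bool runtime)
	{
		if (!runtime)
			intel_dp_mst_suspend(xe);

		/* ... rest of the display suspend sequence ... */
	}

with the matching check on the resume side. With that in place the
runtime path never takes the MST locks, and moving
xe_rpm_lockmap_acquire() around should no longer be needed.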

> ---
>  drivers/gpu/drm/xe/xe_pm.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index a3d1509066f7..7f33e553728a 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -363,6 +363,18 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>  	/* Disable access_ongoing asserts and prevent recursive pm calls */
>  	xe_pm_write_callback_task(xe, current);
>  
> +	/*
> +	 * Apply the lock for the entire list op, as xe_ttm_bo_destroy and
> +	 * xe_bo_move_notify also check and delete the bo entry from the user fault list.
> +	 */
> +	mutex_lock(&xe->mem_access.vram_userfault.lock);
> +	list_for_each_entry_safe(bo, on,
> +				 &xe->mem_access.vram_userfault.list, vram_userfault_link)
> +		xe_bo_runtime_pm_release_mmap_offset(bo);
> +	mutex_unlock(&xe->mem_access.vram_userfault.lock);
> +
> +	xe_display_pm_runtime_suspend(xe);
> +
>  	/*
>  	 * The actual xe_pm_runtime_put() is always async underneath, so
>  	 * exactly where that is called should make no difference to us. However
> @@ -386,18 +398,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>  	 */
>  	xe_rpm_lockmap_acquire(xe);
>  
> -	/*
> -	 * Apply the lock for the entire list op, as xe_ttm_bo_destroy and
> -	 * xe_bo_move_notify also check and delete the bo entry from the user fault list.
> -	 */
> -	mutex_lock(&xe->mem_access.vram_userfault.lock);
> -	list_for_each_entry_safe(bo, on,
> -				 &xe->mem_access.vram_userfault.list, vram_userfault_link)
> -		xe_bo_runtime_pm_release_mmap_offset(bo);
> -	mutex_unlock(&xe->mem_access.vram_userfault.lock);
> -
> -	xe_display_pm_runtime_suspend(xe);
> -
>  	if (xe->d3cold.allowed) {
>  		err = xe_bo_evict_all(xe);
>  		if (err)
> @@ -438,8 +438,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>  	/* Disable access_ongoing asserts and prevent recursive pm calls */
>  	xe_pm_write_callback_task(xe, current);
>  
> -	xe_rpm_lockmap_acquire(xe);
> -
>  	if (xe->d3cold.allowed) {
>  		err = xe_pcode_ready(xe, true);
>  		if (err)
> @@ -463,6 +461,8 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>  
>  	xe_display_pm_runtime_resume(xe);
>  
> +	xe_rpm_lockmap_acquire(xe);
> +
>  	if (xe->d3cold.allowed) {
>  		err = xe_bo_restore_user(xe);
>  		if (err)
> -- 
> 2.43.2
> 

