[Intel-gfx] [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx

Thomas Hellström thomas.hellstrom at linux.intel.com
Thu Dec 16 13:27:29 UTC 2021


Hi, Rafael,

On 11/4/21 18:26, Rafael J. Wysocki wrote:
> It is generally unsafe to call put_device() with dpm_list_mtx held,
> because the given device's release routine may carry out an action
> depending on that lock which then may deadlock, so modify the
> system-wide suspend and resume of devices to always drop dpm_list_mtx
> before calling put_device() (and adjust white space somewhat while
> at it).
>
> For instance, this prevents the following splat from showing up in
> the kernel log after a system resume in certain configurations:


<snip>


> @@ -1748,21 +1769,27 @@ int dpm_suspend(pm_message_t state)
>   		struct device *dev = to_device(dpm_prepared_list.prev);
>   
>   		get_device(dev);
> +
>   		mutex_unlock(&dpm_list_mtx);
>   
>   		error = device_suspend(dev);
>   
>   		mutex_lock(&dpm_list_mtx);
> +
>   		if (error) {
>   			pm_dev_err(dev, state, "", error);
>   			dpm_save_failed_dev(dev_name(dev));
> -			put_device(dev);
> -			break;
> -		}
> -		if (!list_empty(&dev->power.entry))
> +		} else if (!list_empty(&dev->power.entry)) {
>   			list_move(&dev->power.entry, &dpm_suspended_list);
> +		}
> +
> +		mutex_unlock(&dpm_list_mtx);
> +
>   		put_device(dev);
> -		if (async_error)
> +
> +		mutex_lock(&dpm_list_mtx);
> +
> +		if (error || async_error)
>   			break;
>   	}
>   	mutex_unlock(&dpm_list_mtx);
> @@ -1879,6 +1906,7 @@ int dpm_prepare(pm_message_t state)
>   		struct device *dev = to_device(dpm_list.next);
>   
>   		get_device(dev);
> +
>   		mutex_unlock(&dpm_list_mtx);
>   
>   		trace_device_pm_callback_start(dev, "", state.event);
> @@ -1886,21 +1914,23 @@ int dpm_prepare(pm_message_t state)
>   		trace_device_pm_callback_end(dev, error);
>   
>   		mutex_lock(&dpm_list_mtx);
> -		if (error) {
> -			if (error == -EAGAIN) {
> -				put_device(dev);
> -				error = 0;
> -				continue;
> -			}
> +
> +		if (!error) {
> +			dev->power.is_prepared = true;
> +			if (!list_empty(&dev->power.entry))
> +				list_move_tail(&dev->power.entry, &dpm_prepared_list);
> +		} else if (error == -EAGAIN) {
> +			error = 0;
> +		} else {
>   			dev_info(dev, "not prepared for power transition: code %d\n",
>   				 error);
> -			put_device(dev);
> -			break;

It appears the above break disappeared.


>   		}
> -		dev->power.is_prepared = true;
> -		if (!list_empty(&dev->power.entry))
> -			list_move_tail(&dev->power.entry, &dpm_prepared_list);
> +
> +		mutex_unlock(&dpm_list_mtx);
> +
>   		put_device(dev);

Should be

                  if (error)

                         break;

Here?

Symptoms is if we return an error from the device prepare callback, we 
end up spinning forever with little clue what's going on.


> +
> +		mutex_lock(&dpm_list_mtx);
>   	}
>   	mutex_unlock(&dpm_list_mtx);
>   	trace_suspend_resume(TPS("dpm_prepare"), state.event, false);

Thanks,

Thomas




More information about the Intel-gfx mailing list