[Intel-gfx] [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx

Rafael J. Wysocki rafael.j.wysocki at intel.com
Thu Dec 16 18:24:13 UTC 2021


On 12/16/2021 2:27 PM, Thomas Hellström wrote:
> Hi, Rafael,
>
> On 11/4/21 18:26, Rafael J. Wysocki wrote:
>> It is generally unsafe to call put_device() with dpm_list_mtx held,
>> because the given device's release routine may carry out an action
>> depending on that lock which then may deadlock, so modify the
>> system-wide suspend and resume of devices to always drop dpm_list_mtx
>> before calling put_device() (and adjust white space somewhat while
>> at it).
>>
>> For instance, this prevents the following splat from showing up in
>> the kernel log after a system resume in certain configurations:
>
>
> <snip>
>
>
>> @@ -1748,21 +1769,27 @@ int dpm_suspend(pm_message_t state)
>>           struct device *dev = to_device(dpm_prepared_list.prev);
>>             get_device(dev);
>> +
>>           mutex_unlock(&dpm_list_mtx);
>>             error = device_suspend(dev);
>>             mutex_lock(&dpm_list_mtx);
>> +
>>           if (error) {
>>               pm_dev_err(dev, state, "", error);
>>               dpm_save_failed_dev(dev_name(dev));
>> -            put_device(dev);
>> -            break;
>> -        }
>> -        if (!list_empty(&dev->power.entry))
>> +        } else if (!list_empty(&dev->power.entry)) {
>>               list_move(&dev->power.entry, &dpm_suspended_list);
>> +        }
>> +
>> +        mutex_unlock(&dpm_list_mtx);
>> +
>>           put_device(dev);
>> -        if (async_error)
>> +
>> +        mutex_lock(&dpm_list_mtx);
>> +
>> +        if (error || async_error)
>>               break;
>>       }
>>       mutex_unlock(&dpm_list_mtx);
>> @@ -1879,6 +1906,7 @@ int dpm_prepare(pm_message_t state)
>>           struct device *dev = to_device(dpm_list.next);
>>             get_device(dev);
>> +
>>           mutex_unlock(&dpm_list_mtx);
>>             trace_device_pm_callback_start(dev, "", state.event);
>> @@ -1886,21 +1914,23 @@ int dpm_prepare(pm_message_t state)
>>           trace_device_pm_callback_end(dev, error);
>>             mutex_lock(&dpm_list_mtx);
>> -        if (error) {
>> -            if (error == -EAGAIN) {
>> -                put_device(dev);
>> -                error = 0;
>> -                continue;
>> -            }
>> +
>> +        if (!error) {
>> +            dev->power.is_prepared = true;
>> +            if (!list_empty(&dev->power.entry))
>> +                list_move_tail(&dev->power.entry, &dpm_prepared_list);
>> +        } else if (error == -EAGAIN) {
>> +            error = 0;
>> +        } else {
>>               dev_info(dev, "not prepared for power transition: code 
>> %d\n",
>>                    error);
>> -            put_device(dev);
>> -            break;
>
> It appears the above break disappeared.
>
>
>>           }
>> -        dev->power.is_prepared = true;
>> -        if (!list_empty(&dev->power.entry))
>> -            list_move_tail(&dev->power.entry, &dpm_prepared_list);
>> +
>> +        mutex_unlock(&dpm_list_mtx);
>> +
>>           put_device(dev);
>
> Should be
>
>                  if (error)
>
>                         break;
>
> Here?
>
Looks like that.


> Symptoms is if we return an error from the device prepare callback, we 
> end up spinning forever with little clue what's going on.
>
>
>> +
>> +        mutex_lock(&dpm_list_mtx);
>>       }
>>       mutex_unlock(&dpm_list_mtx);
>>       trace_suspend_resume(TPS("dpm_prepare"), state.event, false);
>
I'll have a look, thanks for the report!




More information about the Intel-gfx mailing list