[Nouveau] [PATCH 1/5] drm/nouveau: Prevent RPM callback recursion in suspend/resume paths

Rafael J. Wysocki rafael at kernel.org
Wed Jul 18 08:35:18 UTC 2018


On Wed, Jul 18, 2018 at 10:25 AM, Lukas Wunner <lukas at wunner.de> wrote:
> On Wed, Jul 18, 2018 at 09:38:41AM +0200, Rafael J. Wysocki wrote:
>> On Tue, Jul 17, 2018 at 8:20 PM, Lukas Wunner <lukas at wunner.de> wrote:
>> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
>> > wants it in resumed state, so is waiting forever for the device to
>> > runtime suspend in order to resume it again immediately afterwards.
>> >
>> > The deadlock in the stack trace you've posted could be resolved using
>> > the technique I used in d61a5c106351 by adding the following to
>> > include/linux/pm_runtime.h:
>> >
>> > static inline bool pm_runtime_status_suspending(struct device *dev)
>> > {
>> >         return dev->power.runtime_status == RPM_SUSPENDING;
>> > }
>> >
>> > static inline bool is_pm_work(struct device *dev)
>> > {
>> >         struct work_struct *work = current_work();
>> >
>> >         return work && work == &dev->power.work;
>> > }
>> >
>> > Then adding this to nvkm_i2c_aux_acquire():
>> >
>> >         struct device *dev = pad->i2c->subdev.device->dev;
>> >
>> >         if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
>> >                 ret = pm_runtime_get_sync(dev);
>> >                 if (ret < 0 && ret != -EACCES)
>> >                         return ret;
>> >         }
> [snip]
>>
>> For the record, I don't quite like this approach as it seems to be
>> working around a broken dependency graph.
>>
>> If you need to resume device A from within the runtime resume callback
>> of device B, then clearly B depends on A and there should be a link
>> between them.
>>
>> That said, I do realize that it may be the path of least resistance,
>> but then I wonder if we can do better than this.
>
> The GPU contains an i2c subdevice for each connector with DDC lines.
> I believe those are modelled as children of the GPU's PCI device as
> they're accessed via mmio of the PCI device.
>
> The problem here is that when the GPU's PCI device runtime suspends,
> its i2c child device needs to be runtime active to suspend the MST
> topology.  Catch-22.

I see.

This sounds like a case for the ignore_children flag, maybe in a
slightly modified form, that will allow the parent to be suspended
regardless of the state of the children.
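
To illustrate what the flag does today, here is a rough user-space toy
model, not kernel code: struct toy_dev and toy_runtime_suspend() are
made-up names for this sketch.  It mirrors the check the runtime PM core
makes before suspending a device: active children block the parent unless
ignore_children is set, which drivers do with
pm_suspend_ignore_children(dev, true).  What the "slightly modified form"
would look like is left open here.

```c
#include <stdbool.h>
#include <errno.h>

/* Toy stand-in for the runtime PM fields of struct device (hypothetical). */
struct toy_dev {
	bool ignore_children;	/* models dev->power.ignore_children */
	int active_children;	/* models dev->power.child_count */
	bool suspended;
};

/*
 * Models the parent/child check done before a runtime suspend:
 * returns -EBUSY if active children block the suspend, 0 otherwise.
 */
int toy_runtime_suspend(struct toy_dev *dev)
{
	if (!dev->ignore_children && dev->active_children > 0)
		return -EBUSY;

	dev->suspended = true;
	return 0;
}
```

With one active child and ignore_children clear, the call fails with
-EBUSY; once the flag is set, the same parent suspends even though the
child is still counted as active.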

I wonder what happens to the I2C subdevices when the PCI device goes
into D3.  They are no longer accessible through MMIO at that point, so
how can they be suspended?  Or do they need to be suspended at all?

> I don't know whether or not it's necessary to suspend the MST topology.
> I'm not an expert on DisplayPort MultiStream transport.

Me neither. :-)

