[Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
Gupta, Anshuman
anshuman.gupta at intel.com
Thu Sep 22 11:09:44 UTC 2022
On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
> On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
>>
>> On 21/09/2022 18:39, Rodrigo Vivi wrote:
>>> The force_probe protection actively avoids the probe of i915 to
>>> manage a device that is currently under development. It is a nice
>>> protection for future users when getting a new platform but using
>>> some older kernel.
>>>
>>> However, when we avoid the probe we don't take back the registration
>>> of the device. We cannot give up the registration anyway since we can
>>> have multiple devices present. For instance an integrated and a discrete
>>> one.
>>>
>>> When this scenario occurs, the user will not be able to change any
>>> of the runtime pm configuration of the unmanaged device. So, it will
>>> be blocked in D0 state wasting power. This is especially bad in the
>>> case where we have a discrete platform attached, but the user is
>>> able to fully use the integrated one for everything else.
>>>
>>> So, let's put the protected and unmanaged device in D3 so that we can
>>> save some power.
>>>
>>> Reported-by: Daniel J Blueman <daniel at quora.org>
>>> Cc: stable at vger.kernel.org
>>> Cc: Daniel J Blueman <daniel at quora.org>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Anshuman Gupta <anshuman.gupta at intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>>> 1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>>> index 77e7df21f539..fc3e7c69af2a 100644
>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>> @@ -25,6 +25,7 @@
>>> #include <drm/drm_color_mgmt.h>
>>> #include <drm/drm_drv.h>
>>> #include <drm/i915_pciids.h>
>>> +#include <linux/pm_runtime.h>
>>> #include "gt/intel_gt_regs.h"
>>> #include "gt/intel_sa_media.h"
>>> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>> {
>>> struct intel_device_info *intel_info =
>>> (struct intel_device_info *) ent->driver_data;
>>> + struct device *kdev = &pdev->dev;
>>> int err;
>>> if (intel_info->require_force_probe &&
>>> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>> "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>>> "or (recommended) check for kernel updates.\n",
>>> pdev->device, pdev->device, pdev->device);
>>> +
>>> + /* Let's not waste power if we are not managing the device */
>>> + pm_runtime_use_autosuspend(kdev);
>>> + pm_runtime_allow(kdev);
>>> + pm_runtime_put_autosuspend(kdev);
AFAIK we don't need to enable autosuspend here.
pm_runtime_put_autosuspend() will cause a NULL pointer dereference:
because we haven't called pm_runtime_mark_last_busy(), it will
immediately call intel_runtime_suspend() without i915 being initialized.
That said, to let the PCI core keep the PCI device in D3 we only need
the call below:
pm_runtime_put_noidle()
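
As a rough illustration of the distinction above, here is a toy user-space
model, not kernel code. Every toy_* name is made up for this sketch. It
assumes, as described above, that the autosuspend-style put ends up running
the driver's suspend callback once the usage count reaches zero. It ignores
autosuspend timers, pm_runtime_allow()/forbid() and everything else the real
runtime PM core does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the runtime PM state of a device (hypothetical). */
struct toy_dev {
	int usage_count;   /* loosely models the runtime PM usage counter */
	bool suspended;    /* true once the toy device has been suspended */
	int suspend_calls; /* how often the driver's suspend callback ran */
};

/* Models pm_runtime_get_sync(): take a reference, device is active. */
static void toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Models pm_runtime_put_noidle(): drop a reference, never call the driver. */
static void toy_put_noidle(struct toy_dev *d)
{
	assert(d->usage_count > 0); /* toy-only check: puts must be balanced */
	d->usage_count--;
}

/*
 * Models a put that may suspend: drop a reference and, if it was the
 * last one, run the driver's suspend callback. In the scenario above
 * that callback would touch driver state that was never initialized.
 */
static void toy_put_and_suspend(struct toy_dev *d)
{
	assert(d->usage_count > 0); /* toy-only check: puts must be balanced */
	d->usage_count--;
	if (d->usage_count == 0) {
		d->suspend_calls++;
		d->suspended = true;
	}
}
```

In this model a get followed by toy_put_noidle() leaves the count at zero
with no driver callback invoked, while toy_put_and_suspend() reaches the
callback, which is the path that would be unsafe for an uninitialized i915.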
Br,
Anshuman Gupta
>>
>> This sequence is black magic to me, so I can't really comment on the specifics. But in general, what I think I've figured out is that the PCI core calls our runtime resume callback before probe:
>>
>> local_pci_probe:
>> ...
>> 	/*
>> 	 * Unbound PCI devices are always put in D0, regardless of
>> 	 * runtime PM status. During probe, the device is set to
>> 	 * active and the usage count is incremented. If the driver
>> 	 * supports runtime PM, it should call pm_runtime_put_noidle(),
>> 	 * or any other runtime PM helper function decrementing the usage
>> 	 * count, in its probe routine and pm_runtime_get_noresume() in
>> 	 * its remove routine.
>> 	 */
>> 	pm_runtime_get_sync(dev);
>> 	pci_dev->driver = pci_drv;
>> 	rc = pci_drv->probe(pci_dev, ddi->id);
>> 	if (!rc)
>> 		return rc;
>> 	if (rc < 0) {
>> 		pci_dev->driver = NULL;
>> 		pm_runtime_put_sync(dev);
>> 		return rc;
>> 	}
>>
>
> Yes, in Linux the default is D0 for any unmanaged device. But then the
> user can go there in the sysfs and change the power/control to 'auto'
> and get the device to D3.
>
>> And if probe fails it calls pm_runtime_put_sync which presumably does not provide the symmetry we need?
>
> The main problem I see is that when the probe fails in our case we don't
> unregister and i915 is still listed as controlling that device as we could
> see with lspci -nnv.
>
> And any attempt to change the control to 'auto' fails. So we are forever
> stuck in D0.
>
> So, I really believe it is better to bring the device to D3 than to leave
> it there blocked in D0 forever.
>
> Or than to force users to use another parameter to keep i915 from claiming
> this device in the first place.
>
>>
>> Anyway, since I can't provide a meaningful review I'll copy Imre, since I think he worked in this area in the past. More eyes are better.
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>> +
>>> return -ENODEV;
>>> }