[Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.

Gupta, Anshuman anshuman.gupta at intel.com
Thu Sep 22 11:09:44 UTC 2022



On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
> On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
>>
>> On 21/09/2022 18:39, Rodrigo Vivi wrote:
>>> The force_probe protection actively avoids the probe of i915 to
>>> manage a device that is currently under development. It is a nice
>>> protection for future users when getting a new platform but using
>>> some older kernel.
>>>
>>> However, when we avoid the probe we don't take back the registration
>>> of the device. We cannot give up the registration anyway since we can
>>> have multiple devices present. For instance an integrated and a discrete
>>> one.
>>>
>>> When this scenario occurs, the user will not be able to change any
>>> of the runtime pm configuration of the unmanaged device. So, it will
>>> be blocked in D0 state wasting power. This is especially bad in the
>>> case where we have a discrete platform attached, but the user is
>>> able to fully use the integrated one for everything else.
>>>
>>> So, let's put the protected and unmanaged device in D3 so that we can
>>> save some power.
>>>
>>> Reported-by: Daniel J Blueman <daniel at quora.org>
>>> Cc: stable at vger.kernel.org
>>> Cc: Daniel J Blueman <daniel at quora.org>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Anshuman Gupta <anshuman.gupta at intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>>>    1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>>> index 77e7df21f539..fc3e7c69af2a 100644
>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>> @@ -25,6 +25,7 @@
>>>    #include <drm/drm_color_mgmt.h>
>>>    #include <drm/drm_drv.h>
>>>    #include <drm/i915_pciids.h>
>>> +#include <linux/pm_runtime.h>
>>>    #include "gt/intel_gt_regs.h"
>>>    #include "gt/intel_sa_media.h"
>>> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>    {
>>>    	struct intel_device_info *intel_info =
>>>    		(struct intel_device_info *) ent->driver_data;
>>> +	struct device *kdev = &pdev->dev;
>>>    	int err;
>>>    	if (intel_info->require_force_probe &&
>>> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>    			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>>>    			 "or (recommended) check for kernel updates.\n",
>>>    			 pdev->device, pdev->device, pdev->device);
>>> +
>>> +		/* Let's not waste power if we are not managing the device */
>>> +		pm_runtime_use_autosuspend(kdev);
>>> +		pm_runtime_allow(kdev);
>>> +		pm_runtime_put_autosuspend(kdev);
AFAIK we don't need to enable autosuspend here.
pm_runtime_put_autosuspend() will cause a NULL pointer dereference:
because we haven't called pm_runtime_mark_last_busy(), it will
immediately call intel_runtime_suspend() before i915 has been initialized.

Having said that, we only need the call below in order to let the PCI
core keep the PCI device in D3.

pm_runtime_put_noidle()
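
For anyone not familiar with these helpers, here is a minimal userspace
toy model of the usage-count bookkeeping involved. It is not kernel code:
toy_dev and the toy_* functions are made-up names for illustration, and
the model assumes that dropping the last reference with the "sync" variant
suspends immediately, which skips the idle checks the real core performs.

```c
#include <assert.h>

/* Toy userspace model of runtime-PM usage counting; NOT kernel code. */
enum toy_state { TOY_SUSPENDED, TOY_ACTIVE };

struct toy_dev {
	int usage;            /* outstanding references keeping the device awake */
	enum toy_state state; /* modelled runtime-PM status */
};

/* Modelled on pm_runtime_get_sync(): take a reference, make the device active. */
static void toy_get_sync(struct toy_dev *d)
{
	d->usage++;
	d->state = TOY_ACTIVE;
}

/* Modelled on pm_runtime_put_noidle(): drop a reference, leave the state alone. */
static void toy_put_noidle(struct toy_dev *d)
{
	assert(d->usage > 0); /* catch unbalanced puts in the model */
	d->usage--;
}

/*
 * Modelled on pm_runtime_put_sync(): drop a reference and, in this
 * simplified model, suspend as soon as nobody holds the device.
 */
static void toy_put_sync(struct toy_dev *d)
{
	assert(d->usage > 0); /* catch unbalanced puts in the model */
	d->usage--;
	if (d->usage == 0)
		d->state = TOY_SUSPENDED;
}
```

In this model, toy_get_sync() followed by toy_put_noidle() leaves the
count at 0 with the device still marked active, while toy_get_sync()
followed by toy_put_sync() ends with the device marked suspended. The
forbid/allow handling behind power/control is deliberately left out.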

Br,
Anshuman Gupta


>>
>> This sequence is black magic to me, so I can't really comment on the specifics. But in general, what I think I've figured out is that the PCI core calls our runtime resume callback before probe:
>>
>> local_pci_probe:
>> ...
>>          /*
>>           * Unbound PCI devices are always put in D0, regardless of
>>           * runtime PM status.  During probe, the device is set to
>>           * active and the usage count is incremented.  If the driver
>>           * supports runtime PM, it should call pm_runtime_put_noidle(),
>>           * or any other runtime PM helper function decrementing the usage
>>           * count, in its probe routine and pm_runtime_get_noresume() in
>>           * its remove routine.
>>           */
>>          pm_runtime_get_sync(dev);
>>          pci_dev->driver = pci_drv;
>>          rc = pci_drv->probe(pci_dev, ddi->id);
>>          if (!rc)
>>                  return rc;
>>          if (rc < 0) {
>>                  pci_dev->driver = NULL;
>>                  pm_runtime_put_sync(dev);
>>                  return rc;
>>          }
>>
> 
> Yes, in Linux the default is D0 for any unmanaged device. But then the
> user can go there in the sysfs and change the power/control to 'auto'
> and get the device to D3.
> 
>> And if probe fails it calls pm_runtime_put_sync which presumably does not provide the symmetry we need?
> 
> The main problem I see is that when the probe fails in our case we don't
> unregister and i915 is still listed as controlling that device, as we can
> see with lspci -nnv.
> 
> And any attempt to change the control to 'auto' fails. So we are forever
> stuck in D0.
> 
> So, I really believe it is better to bring the device to D3 than to leave
> it there blocked in D0 forever.
> 
> Or forcing users to use another parameter to keep i915 from claiming
> this device in the first place.
> 
>>
>> Anyway, since I can't provide a meaningful review, I'll copy Imre, since I think he worked in this area in the past. The more eyes, the better.
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>> +
>>>    		return -ENODEV;
>>>    	}

