[Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back

Matthew Auld matthew.auld at intel.com
Wed Feb 22 12:01:58 UTC 2023


On 21/02/2023 21:16, Lucas De Marchi wrote:
> On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
>> In local_pci_probe() the core kernel increments the rpm for the device,
>> just before calling into the probe hook. If the driver/device supports
>> runtime pm it is then meant to drop this ref during probe (like we do in
> 
> s/drop/put/ to be consistent with the terminology?
> 
>> xe_pm_runtime_init()). However when removing the device we then also need
>> to give the reference back, otherwise the ref that is dropped in
> 
> give? we are calling pm_runtime_get_sync(), which would be "take".
> 
>> pci_device_remove() will be unbalanced when for example unloading the
>> driver, leading to warnings like:
>>
>>    [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
>>
>> Fix this by incrementing the rpm ref when removing the device.
>>
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_pci.c | 1 +
>> drivers/gpu/drm/xe/xe_pm.c  | 7 +++++++
>> drivers/gpu/drm/xe/xe_pm.h  | 1 +
>> 3 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>> index 25598de3a1fc..85d337cd8fbe 100644
>> --- a/drivers/gpu/drm/xe/xe_pci.c
>> +++ b/drivers/gpu/drm/xe/xe_pci.c
>> @@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
>>         return;
>>
>>     xe_device_remove(xe);
>> +    xe_pm_runtime_fini(xe);
> 
> after xe_device_remove()? Wouldn't that end up calling the last
> drm_dev_put() and thus triggering all the drmm_* releases?

In __device_release_driver() the core first calls device_remove(), which 
eventually calls our xe_pci_remove() hook. Only a little further down does 
it call device_unbind_cleanup(), which in turn calls devres_release_all(), 
which eventually calls into drm_managed_release() and handles all the 
drmm_* releases, AFAICT. So the drmm_* actions should not have run yet at 
the point where xe_pm_runtime_fini() is called.
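
To make the ordering and the ref balance concrete, here is a rough 
user-space toy model. It is only a sketch, not kernel code: every toy_* 
name is made up for illustration, the counter merely stands in for the 
runtime PM usage count, and the unbind path is reduced to the three steps 
that matter here (driver remove hook, core dropping its probe-time ref, 
managed release).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct toy_dev {
	int usage_count;           /* models the rpm usage count */
	bool underflow;            /* a put was attempted at count 0 */
	bool managed_released;     /* the devres/drmm stage has run */
	bool fini_saw_live_device; /* fini ran before managed release */
};

static void toy_rpm_get(struct toy_dev *d)
{
	d->usage_count++;
}

static void toy_rpm_put(struct toy_dev *d)
{
	if (d->usage_count == 0) {
		/* models "Runtime PM usage count underflow!" */
		d->underflow = true;
		return;
	}
	d->usage_count--;
}

/* Bind: the core takes a ref before probe, the driver puts it in init. */
static void toy_bind(struct toy_dev *d)
{
	toy_rpm_get(d); /* core, just before the probe hook */
	toy_rpm_put(d); /* driver, in its runtime pm init */
}

/* Driver remove hook; with_fix models the added fini call. */
static void toy_driver_remove(struct toy_dev *d, bool with_fix)
{
	if (with_fix) {
		d->fini_saw_live_device = !d->managed_released;
		toy_rpm_get(d); /* take the ref again for the core */
	}
}

/* Unbind: remove hook first, then the core's put, then managed release. */
static void toy_unbind(struct toy_dev *d, bool with_fix)
{
	toy_driver_remove(d, with_fix);
	toy_rpm_put(d);             /* core drops its probe-time ref */
	d->managed_released = true; /* managed resources go last */
}
```

Running toy_bind() followed by toy_unbind() without the fix sets the 
underflow flag, while with the fix the count ends at zero and the fini 
step runs before the managed release stage.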

> 
> Lucas De Marchi
> 
>>     pci_set_drvdata(pdev, NULL);
>> }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>> index 44c38e670587..73d81621d960 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.c
>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>> @@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
>>     pm_runtime_put_autosuspend(dev);
>> }
>>
>> +void xe_pm_runtime_fini(struct xe_device *xe)
>> +{
>> +    struct device *dev = xe->drm.dev;
>> +
>> +    pm_runtime_get_sync(dev);
>> +}
>> +
>> int xe_pm_runtime_suspend(struct xe_device *xe)
>> {
>>     struct xe_gt *gt;
>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>> index b8c5f9558e26..6a885585f653 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.h
>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>> @@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
>> int xe_pm_resume(struct xe_device *xe);
>>
>> void xe_pm_runtime_init(struct xe_device *xe);
>> +void xe_pm_runtime_fini(struct xe_device *xe);
>> int xe_pm_runtime_suspend(struct xe_device *xe);
>> int xe_pm_runtime_resume(struct xe_device *xe);
>> int xe_pm_runtime_get(struct xe_device *xe);
>> -- 
>> 2.39.1
>>

