[Intel-xe] [PATCH v3 4/5] xe/drm/pm: Toggle d3cold_allowed using vram_usages
Riana Tauro
riana.tauro at intel.com
Wed Jul 5 11:10:06 UTC 2023
On 7/5/2023 12:32 PM, Gupta, Anshuman wrote:
>
>
>> -----Original Message-----
>> From: Tauro, Riana <riana.tauro at intel.com>
>> Sent: Tuesday, July 4, 2023 11:34 AM
>> To: Gupta, Anshuman <anshuman.gupta at intel.com>; intel-
>> xe at lists.freedesktop.org
>> Cc: Nilawar, Badal <badal.nilawar at intel.com>; Vivi, Rodrigo
>> <rodrigo.vivi at intel.com>; Sundaresan, Sujaritha
>> <sujaritha.sundaresan at intel.com>; Brost, Matthew
>> <matthew.brost at intel.com>
>> Subject: Re: [PATCH v3 4/5] xe/drm/pm: Toggle d3cold_allowed using
>> vram_usages
>>
>> Hi Anshuman
>>
>> On 6/27/2023 5:26 PM, Anshuman Gupta wrote:
>>> Add support to control d3cold by using the vram_usages metric from the
>>> TTM resource manager.
>>> When the root port is capable of d3cold but xe has disallowed d3cold
>>> because vram_usages is above the vram_d3cold_threshold, d3cold must be
>>> disabled on the root port to avoid any resume failure, because the
>>> root port can still transition to d3cold once all of the PCIe endpoints
>>> and {upstream, virtual} switch ports transition to d3hot.
>>> Also clean up the TODO code comment.
>>>
>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> Signed-off-by: Anshuman Gupta <anshuman.gupta at intel.com>
>>> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_pci.c | 27 ++++++++++++++++++++++++---
>>> drivers/gpu/drm/xe/xe_pm.c | 29 +++++++++++++++++++++++++++++
>>> drivers/gpu/drm/xe/xe_pm.h | 1 +
>>> 3 files changed, 54 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>>> index 848de5dcdaa5..78e906607188 100644
>>> --- a/drivers/gpu/drm/xe/xe_pci.c
>>> +++ b/drivers/gpu/drm/xe/xe_pci.c
>>> @@ -746,6 +746,24 @@ static int xe_pci_resume(struct device *dev)
>>> return 0;
>>> }
>>>
>>> +static void d3cold_toggle(struct pci_dev *pdev, bool enable)
>>> +{
>>> + struct xe_device *xe = pdev_to_xe_device(pdev);
>>> + struct pci_dev *root_pdev;
>>> +
>>> + if (!xe->d3cold.capable)
>>> + return;
>>> +
>>> + root_pdev = pcie_find_root_port(pdev);
>>> + if (!root_pdev)
>>> + return;
>>> +
>>> + if (enable)
>>> + pci_d3cold_enable(root_pdev);
>>> + else
>>> + pci_d3cold_disable(root_pdev);
>>> +}
>>> +
>>> static int xe_pci_runtime_suspend(struct device *dev)
>>> {
>>> struct pci_dev *pdev = to_pci_dev(dev);
>>> @@ -763,6 +781,7 @@ static int xe_pci_runtime_suspend(struct device *dev)
>>> pci_ignore_hotplug(pdev);
>>> pci_set_power_state(pdev, PCI_D3cold);
>>> } else {
>>> + d3cold_toggle(pdev, false);
>>> pci_set_power_state(pdev, PCI_D3hot);
>>> }
>>>
>>> @@ -787,6 +806,8 @@ static int xe_pci_runtime_resume(struct device *dev)
>>> return err;
>>>
>>> pci_set_master(pdev);
>>> + } else {
>>> + d3cold_toggle(pdev, true);
>>> }
>>>
>>> return xe_pm_runtime_resume(xe);
>>> @@ -800,15 +821,15 @@ static int xe_pci_runtime_idle(struct device *dev)
>>> if (!xe->d3cold.capable) {
>>> xe->d3cold.allowed = false;
>>> } else {
>>> + xe->d3cold.allowed = xe_pm_vram_d3cold_allowed(xe);
>>> +
>>> /*
>>> * TODO: d3cold should be allowed (true) if
>>> * (IS_DGFX(xe) && !xe_device_mem_access_ongoing(xe))
>>> * but maybe include some other conditions. So, before
>>> * we can re-enable the D3cold, we need to:
>>> * 1. rewrite the VRAM save / restore to avoid buffer object locks
>>> - * 2. block D3cold if we have a big amount of device memory in use
>>> - * in order to reduce the latency.
>>> - * 3. at resume, detect if we really lost power and avoid memory
>>> + * 2. at resume, detect if we really lost power and avoid memory
>>> * restoration if we were only up to d3cold
>>> */
>>> xe->d3cold.allowed = false;
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>>> index 7028c9b6e94c..4db4e5a1b051 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.c
>>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>>> @@ -277,3 +277,32 @@ int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)
>>>
>>> return 0;
>>> }
>>> +
>>> +bool xe_pm_vram_d3cold_allowed(struct xe_device *xe)
>>> +{
>>> + struct ttm_resource_manager *man;
>>> + u32 total_vram_used_mb = 0;
>>> + bool allowed;
>>> + u64 vram_used;
>>> + int i;
>>> +
>>> + /* TODO: Extend the logic to beyond XE_PL_VRAM1 */
>>> + for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
>>> + man = ttm_manager_type(&xe->ttm, i);
>>> + if (man) {
>>> + vram_used = ttm_resource_manager_usage(man);
>>> + total_vram_used_mb += DIV_ROUND_UP_ULL(vram_used, 1024 * 1024);
>>> + }
>>> + }
>>> +
>>> + mutex_lock(&xe->d3cold.lock);
>>> +
>>> + if (total_vram_used_mb <= xe->d3cold.vram_threshold)
>>> + allowed = true;
>> Can't xe->d3cold.allowed be modified directly here? There is already a
>> lock around this code, so the assignment could be done under it.
> The patch takes the lock to protect the vram_threshold check; it is not there to protect d3cold.allowed. I can assign d3cold.allowed here, but that would require renaming the function to
> xe_pm_vram_toggle_d3cold_allow().
No need to change it then.
LGTM
Reviewed-by: Riana Tauro <riana.tauro at intel.com>
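
For reference, this is roughly what I was picturing. It is purely a sketch,
not something this patch needs to adopt: the xe_pm_vram_toggle_d3cold_allow()
name is the hypothetical rename you mentioned, and the body just reuses the
accounting logic already in the patch.

/* Sketch only: fold the assignment into the helper under the same lock. */
void xe_pm_vram_toggle_d3cold_allow(struct xe_device *xe)
{
        struct ttm_resource_manager *man;
        u32 total_vram_used_mb = 0;
        u64 vram_used;
        int i;

        /* Same VRAM accounting as in xe_pm_vram_d3cold_allowed() */
        for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
                man = ttm_manager_type(&xe->ttm, i);
                if (man) {
                        vram_used = ttm_resource_manager_usage(man);
                        total_vram_used_mb += DIV_ROUND_UP_ULL(vram_used, 1024 * 1024);
                }
        }

        mutex_lock(&xe->d3cold.lock);
        /* Assign d3cold.allowed directly while holding the lock */
        xe->d3cold.allowed = total_vram_used_mb <= xe->d3cold.vram_threshold;
        mutex_unlock(&xe->d3cold.lock);
}

The only difference is that xe_pci_runtime_idle() would then call the helper
instead of assigning its return value, so keeping the bool-returning
xe_pm_vram_d3cold_allowed() as in the patch works just as well.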
> Br,
> Anshuman Gupta
>>
>> Thanks
>> Riana
>>> + else
>>> + allowed = false;
>>> +
>>> + mutex_unlock(&xe->d3cold.lock);
>>> +
>>> + return allowed;
>>> +}
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>>> index b50ec8bdce6f..08612cf3e67b 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.h
>>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>>> @@ -24,5 +24,6 @@ int xe_pm_runtime_put(struct xe_device *xe);
>>> bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe);
>>> int xe_pm_runtime_get_if_active(struct xe_device *xe);
>>> int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
>>> +bool xe_pm_vram_d3cold_allowed(struct xe_device *xe);
>>>
>>> #endif