[PATCH] drm/amdgpu: Add a new module param to disable d3cold

Lazar, Lijo lijo.lazar at amd.com
Thu Nov 30 10:55:13 UTC 2023



On 11/30/2023 4:17 PM, Ma, Jun wrote:
> Hi Lijo,
> 
> On 11/30/2023 5:18 PM, Lazar, Lijo wrote:
>>
>>
>> On 11/30/2023 11:59 AM, Ma, Jun wrote:
>>> Hi Alex,
>>>
>>> On 11/30/2023 12:39 AM, Alex Deucher wrote:
>>>> On Wed, Nov 29, 2023 at 11:37 AM Ma Jun <Jun.Ma2 at amd.com> wrote:
>>>>>
>>>>> Some platforms can't resume from the d3cold state, so add a
>>>>> new module parameter to disable d3cold for debugging
>>>>> purposes or as a workaround.
>>>>
>>>> Doesn't the runpm parameter already handle this?  If you set runpm=0,
>>>> that should disable d3cold.
>>>>
>>> runpm=0 prevents calls to the driver's runtime_suspend/resume functions,
>>> while d3cold=0 still allows calls to runtime_suspend/resume and puts
>>> the device in d3hot state instead of d3cold.
>>>
>>
>> Why not use the sysfs node to change "d3cold_allowed" on the device's
>> upstream bridge?
>>
> This seems to be the same question Mario asked. Please refer to my reply to him.
> 

Once you disable d3cold on the device, all upstream devices along the
path are taken care of. I don't see a special need to disable BOCO
separately. pci_d3cold_disable is the same API that the sysfs node uses.
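
For reference, a minimal userspace sketch of the sysfs route, assuming the
device directory is something like /sys/bus/pci/devices/0000:03:00.0 (a
hypothetical address) and that the caller has permission to write there. The
helper names are made up for illustration; only the d3cold_allowed attribute
itself comes from the kernel.

```c
#include <stdio.h>

/*
 * Write "0" or "1" to <dev_dir>/d3cold_allowed.
 * dev_dir is assumed to be a PCI device directory in sysfs, e.g.
 * /sys/bus/pci/devices/0000:03:00.0 (hypothetical address).
 * Returns 0 on success, -1 on failure.
 */
int set_d3cold_allowed(const char *dev_dir, int allowed)
{
	char path[512];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/d3cold_allowed", dev_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputs(allowed ? "1\n" : "0\n", f) == EOF) {
		fclose(f);
		return -1;
	}

	/* A rejected write may only show up when the stream is flushed. */
	return fclose(f) == 0 ? 0 : -1;
}

/* Read the current setting back; returns 0 or 1, or -1 on failure. */
int get_d3cold_allowed(const char *dev_dir)
{
	char path[512];
	FILE *f;
	int n, val;

	n = snprintf(path, sizeof(path), "%s/d3cold_allowed", dev_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	n = fscanf(f, "%d", &val);
	fclose(f);

	return n == 1 ? (val != 0) : -1;
}
```

Calling set_d3cold_allowed(dev_dir, 0) on the GPU's directory would be the
programmatic equivalent of echoing 0 into the node from a shell.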

Thanks,
Lijo

> Regards,
> Ma Jun
> 
>> Thanks,
>> Lijo
>>
>>> Regards,
>>> Ma Jun
>>>
>>>> Alex
>>>>
>>>>>
>>>>> Signed-off-by: Ma Jun <Jun.Ma2 at amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 1 +
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 8 ++++++++
>>>>>    3 files changed, 16 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index a9f54df9d33e..db9f60790267 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -166,6 +166,7 @@ extern char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH];
>>>>>    extern int amdgpu_dpm;
>>>>>    extern int amdgpu_fw_load_type;
>>>>>    extern int amdgpu_aspm;
>>>>> +extern int amdgpu_d3cold;
>>>>>    extern int amdgpu_runtime_pm;
>>>>>    extern uint amdgpu_ip_block_mask;
>>>>>    extern int amdgpu_bapm;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 22b6a910b7f2..90501c44e7d0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -264,6 +264,13 @@ bool amdgpu_device_supports_px(struct drm_device *dev)
>>>>>    bool amdgpu_device_supports_boco(struct drm_device *dev)
>>>>>    {
>>>>>           struct amdgpu_device *adev = drm_to_adev(dev);
>>>>> +       struct pci_dev *parent;
>>>>> +
>>>>> +       if (!amdgpu_d3cold) {
>>>>> +               parent = pcie_find_root_port(adev->pdev);
>>>>> +               pci_d3cold_disable(parent);
>>>>> +               return false;
>>>>> +       }
>>>>>
>>>>>           if (adev->has_pr3 ||
>>>>>               ((adev->flags & AMD_IS_PX) && amdgpu_is_atpx_hybrid()))
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> index 5f14f04cb553..c9fbb8bd4169 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> @@ -145,6 +145,7 @@ char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH];
>>>>>    int amdgpu_dpm = -1;
>>>>>    int amdgpu_fw_load_type = -1;
>>>>>    int amdgpu_aspm = -1;
>>>>> +int amdgpu_d3cold = -1;
>>>>>    int amdgpu_runtime_pm = -1;
>>>>>    uint amdgpu_ip_block_mask = 0xffffffff;
>>>>>    int amdgpu_bapm = -1;
>>>>> @@ -359,6 +360,13 @@ module_param_named(fw_load_type, amdgpu_fw_load_type, int, 0444);
>>>>>    MODULE_PARM_DESC(aspm, "ASPM support (1 = enable, 0 = disable, -1 = auto)");
>>>>>    module_param_named(aspm, amdgpu_aspm, int, 0444);
>>>>>
>>>>> +/**
>>>>> + * DOC: d3cold (int)
>>>>> + * Override for d3cold support (1 = enable, 0 = disable). The default is -1 (auto, enabled).
>>>>> + */
>>>>> +MODULE_PARM_DESC(d3cold, "d3cold support (1 = enable, 0 = disable, -1 = auto)");
>>>>> +module_param_named(d3cold, amdgpu_d3cold, int, 0444);
>>>>> +
>>>>>    /**
>>>>>     * DOC: runpm (int)
>>>>>     * Override for runtime power management control for dGPUs. The amdgpu driver can dynamically power down
>>>>> --
>>>>> 2.34.1
>>>>>
>>
>>

