[PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state

Chen, Guchun Guchun.Chen at amd.com
Wed Dec 28 02:02:27 UTC 2022


Hi Bjorn,

Thanks for your comments.

Exactly, this patch actually fixes an amdgpu driver issue, however it's only exposed by 8795e182b02d. Without it, it was running well over the past.

Regarding the question, I will provide a similar change in smu13 later on.

Regards,
Guchun

-----Original Message-----
From: Bjorn Helgaas <helgaas at kernel.org> 
Sent: Wednesday, December 28, 2022 2:24 AM
To: Chen, Guchun <Guchun.Chen at amd.com>
Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; Lazar, Lijo <Lijo.Lazar at amd.com>; Quan, Evan <Evan.Quan at amd.com>; Stefan Roese <sr at denx.de>; linux-pci at vger.kernel.org
Subject: Re: [PATCH 1/3] drm/amd/pm/smu11: BACO is supported when it's in BACO state

[+cc Stefan, linux-pci]

On Wed, Nov 23, 2022 at 09:43:07AM +0800, Guchun Chen wrote:
> Return true early if ASIC is in BACO state already, no need to talk to 
> SMU. It can fix the issue that driver was not calling BACO exit at all 
> in runtime pm resume, and a timing issue leading to a PCI AER error 
> happened eventually.

This sounds suspiciously racy.

> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in 
> get_port_device_capability()")

To clarify, this patch avoids a driver problem, not a problem with 8795e182b02d.

See question below.

> Suggested-by: Lijo Lazar <lijo.lazar at amd.com>
> Signed-off-by: Guchun Chen <guchun.chen at amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> index 70b560737687..ad5f6a15a1d7 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> @@ -1588,6 +1588,10 @@ bool smu_v11_0_baco_is_support(struct smu_context *smu)
>  	if (amdgpu_sriov_vf(smu->adev) || !smu_baco->platform_support)
>  		return false;
>  
> +	/* return true if ASIC is in BACO state already */
> +	if (smu_v11_0_baco_get_state(smu) == SMU_BACO_STATE_ENTER)
> +		return true;

smu_v13_0_baco_is_support() is essentially identical to smu_v11_0_baco_is_support().  Does it need a similar change?

>  	/* Arcturus does not support this bit mask */
>  	if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_BACO_BIT) &&
>  	   !smu_cmn_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT))
> --
> 2.25.1
> 


More information about the amd-gfx mailing list