[PATCH] drm/amd: Add a new parameter that can force reset on suspend

Alex Deucher alexdeucher at gmail.com
Tue Sep 27 18:09:52 UTC 2022


On Tue, Sep 27, 2022 at 1:57 PM Mario Limonciello
<mario.limonciello at amd.com> wrote:
>
> We previously had in place some behavior that would cause APU or GPU
> to be reset during suspend entry.  This caused various problems, and
> eventually was reverted.  There are however users that preferred this
> behavior, so add an option that they can use to force it.

You should never need to reset on suspend unless the platform is not
actually going into a low power state.  The reason the reset was added
was as a workaround for aborted suspends so that we can put the GPU
into a known good state if we need to re-initialize the hardware with
the device not actually having been powered down.  In most cases I've
seen where a reset "helps" it's because the GPU was not actually
powered off during suspend.  Adding a workaround like this will just
hide actual bugs or users will read somewhere that they need to set it
and it will break suspend for them and then we'll get spurious bug
reports.

Alex


>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216516
> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h      | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  | 8 ++++++++
>  3 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ae9371b172e3a..85999f48e2835 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -230,6 +230,7 @@ extern bool pcie_p2p;
>
>  extern int amdgpu_tmz;
>  extern int amdgpu_reset_method;
> +extern int amdgpu_reset_on_suspend;
>
>  #ifdef CONFIG_DRM_AMDGPU_SI
>  extern int amdgpu_si_support;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> index b14800ac179ee..17c0a0ec21bd6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> @@ -1051,6 +1051,8 @@ bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev)
>   */
>  bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev)
>  {
> +       if (amdgpu_reset_on_suspend >= 0)
> +               return !!amdgpu_reset_on_suspend;
>         if (adev->flags & AMD_IS_APU)
>                 return false;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e9..6a786dacf2c2d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -179,6 +179,7 @@ int amdgpu_noretry = -1;
>  int amdgpu_force_asic_type = -1;
>  int amdgpu_tmz = -1; /* auto */
>  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_reset_on_suspend = -1; /* auto */
>  int amdgpu_num_kcq = -1;
>  int amdgpu_smartshift_bias;
>  int amdgpu_use_xgmi_p2p = 1;
> @@ -870,6 +871,13 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco/bamaco)");
>  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +/**
> + * DOC: reset_on_suspend (int)
> + * GPUs should be reset on suspend (-1 = auto (default), 0 = no, 1 = yes)
> + */
> +MODULE_PARM_DESC(reset_on_suspend, "GPUs should be reset on suspend (-1 = auto (default), 0 = no 1 = yes");
> +module_param_named(reset_on_suspend, amdgpu_reset_on_suspend, int, 0444);
> +
>  /**
>   * DOC: bad_page_threshold (int) Bad page threshold is specifies the
>   * threshold value of faulty pages detected by RAS ECC, which may
> --
> 2.25.1
>


More information about the amd-gfx mailing list