[PATCH] drm/amdgpu: fix the hw hang during perform system reboot and reset

Alex Deucher alexdeucher at gmail.com
Mon Apr 13 20:32:45 UTC 2020


On Mon, Apr 13, 2020 at 2:17 PM Paul Menzel
<pmenzel+amd-gfx at molgen.mpg.de> wrote:
>
> Dear Alex, dear Prike,
>
>
> Am 13.04.20 um 17:14 schrieb Alex Deucher:
> > On Mon, Apr 13, 2020 at 11:09 AM Prike Liang <Prike.Liang at amd.com> wrote:
> >>
> >> Unify set device CGPG to ungate state before enter poweroff or reset.
> >>
> >> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> >> Tested-by: Mengbing Wang <Mengbing.Wang at amd.com>
> >
> > Acked-by: Alex Deucher <alexander.deucher at amd.com>
>
> First:
>
> Tested-by: Paul Menzel <pmenzel at molgen.mpg.de> (MSI B350M MORTAR
> (MS-7A37) with an AMD Ryzen 3 2200G)
>
> Second, I am having trouble to understand, how you can add your Acked-by
> tag to a commit with such a commit message?
>
> The problem is not described (apparently it only affected certain
> devices), it is not mentioned that it’s a regression (Fixes: tag/line is
> missing), and I am having a hard time to understand the commit message
> at all (and the one from the commit introducing the regression). Why is
> it more or less reverting part of the other commit, while the issue was
> not reproducible on Prike’s system?

The original issue was that we were not properly ungating some of the
hw blocks in the right order for S3 suspend on renoir.  So the fix was
to add ungate calls to amdgpu_device_suspend() to handle that case.
However, the original fix should not have removed the calls from
amdgpu_device_ip_suspend_phase1() since that is called separately for
some other use cases (e.g., pci shutdown).  It didn't matter for some
asics as they don't have different levels of powergating
functionality.  I'll add the fixes tag before the patch goes upstream.

Alex

>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index 87f7c12..bbe090a 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -2413,6 +2413,8 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
> >>   {
> >>          int i, r;
> >>
> >> +       amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> >> +       amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> >>
> >>          for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
> >>                  if (!adev->ip_blocks[i].status.valid)
> >> --
> >> 2.7.4
>
> Kind regards,
>
> Paul


More information about the amd-gfx mailing list