[PATCH] drm/amdgpu: fix the hw hang during perform system reboot and reset

Liang, Prike Prike.Liang at amd.com
Tue Apr 14 02:06:17 UTC 2020


> 
> On Mon, Apr 13, 2020 at 2:17 PM Paul Menzel <pmenzel+amd-
> gfx at molgen.mpg.de> wrote:
> >
> > Dear Alex, dear Prike,
> >
> >
> > Am 13.04.20 um 17:14 schrieb Alex Deucher:
> > > On Mon, Apr 13, 2020 at 11:09 AM Prike Liang <Prike.Liang at amd.com>
> wrote:
> > >>
> > >> Unify set device CGPG to ungate state before enter poweroff or reset.
> > >>
> > >> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > >> Tested-by: Mengbing Wang <Mengbing.Wang at amd.com>
> > >
> > > Acked-by: Alex Deucher <alexander.deucher at amd.com>
> >
> > First:
> >
> > Tested-by: Paul Menzel <pmenzel at molgen.mpg.de> (MSI B350M MORTAR
> > (MS-7A37) with an AMD Ryzen 3 2200G)
> >
> > Second, I am having trouble to understand, how you can add your
> > Acked-by tag to a commit with such a commit message?
> >
> > The problem is not described (apparently it only affected certain
> > devices), it is not mentioned that it’s a regression (Fixes: tag/line
> > is missing), and I am having a hard time to understand the commit
> > message at all (and the one from the commit introducing the
> > regression). Why is it more or less reverting part of the other
> > commit, while the issue was not reproducible on Prike’s system?
> 
> The original issue was that we were not properly ungating some of the hw
> blocks in the right order for S3 suspend on renoir.  So the fix was to add
> ungate calls to amdgpu_device_suspend() to handle that case.
> However, the original fix should not have removed the calls from
> amdgpu_device_ip_suspend_phase1() since that is called separately for
> some other use cases (e.g., pci shutdown).  It didn't matter for some asics as
> they don't have different levels of powergating functionality.  I'll add the fixes
> tag before the patch goes upstream.
> 
> Alex
> 
[Prike]  Thanks Alex help clarify and I will give more detail in the message.
> >
> > >> ---
> > >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
> > >>   1 file changed, 2 insertions(+)
> > >>
> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >> index 87f7c12..bbe090a 100644
> > >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >> @@ -2413,6 +2413,8 @@ static int
> amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
> > >>   {
> > >>          int i, r;
> > >>
> > >> +       amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> > >> +       amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> > >>
> > >>          for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
> > >>                  if (!adev->ip_blocks[i].status.valid)
> > >> --
> > >> 2.7.4
> >
> > Kind regards,
> >
> > Paul


More information about the amd-gfx mailing list