[PATCH] drm/amdgpu: resove reboot exception for si oland

Lazar, Lijo Lijo.Lazar at amd.com
Fri Mar 10 15:55:02 UTC 2023


I recall a previous discussion about this. At that time we found that the temperature range is already set earlier, during DPM enablement.

The suspected root cause was the disabling and re-enabling of the thermal alert inside this call, which sets the range a second time.
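
A minimal sketch of the suspected sequence, written as a standalone mock rather than the real driver code. Every name here (struct mock_dpm, the mock_* functions, the range constants) is hypothetical and only models what is described in this thread: the range is programmed once during DPM enablement and again in late init, and each call toggles the thermal alert. The -EINVAL return is included only because the log in the patch reports -22, which is EINVAL on Linux; whether the real failure comes from a range check is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical range limits in millidegrees Celsius; not taken from the driver. */
#define MOCK_TEMP_MIN_MC 0
#define MOCK_TEMP_MAX_MC 90000

/* Hypothetical mock state; not the real amdgpu structures. */
struct mock_dpm {
	bool dpm_enabled;
	bool thermal_alert_enabled;
	bool range_set;
	int alert_toggles; /* number of disable/enable cycles of the alert */
};

/* Mock of setting the range: the update is bracketed by alert disable/enable. */
static int mock_set_temperature_range(struct mock_dpm *d, int min_mc, int max_mc)
{
	if (min_mc >= max_mc)
		return -EINVAL; /* -22, the code seen in the log (assumed cause) */

	d->thermal_alert_enabled = false; /* alert off while the range changes */
	d->range_set = true;
	d->thermal_alert_enabled = true;  /* alert back on */
	d->alert_toggles++;
	return 0;
}

/* Mock of DPM enablement: the range is programmed here the first time. */
static int mock_dpm_enable(struct mock_dpm *d)
{
	int ret = mock_set_temperature_range(d, MOCK_TEMP_MIN_MC, MOCK_TEMP_MAX_MC);

	if (ret)
		return ret;
	d->dpm_enabled = true;
	return 0;
}

/* Mock of late init: with DPM enabled, the range is programmed a second time. */
static int mock_late_init(struct mock_dpm *d)
{
	if (!d->dpm_enabled)
		return 0;
	return mock_set_temperature_range(d, MOCK_TEMP_MIN_MC, MOCK_TEMP_MAX_MC);
}
```

Calling mock_dpm_enable() and then mock_late_init() on a zeroed struct mock_dpm leaves alert_toggles at 2, which is the repeated disable/enable that was suspected.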

Thanks,
Lijo
________________________________
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Alex Deucher <alexdeucher at gmail.com>
Sent: Friday, March 10, 2023 8:51:06 PM
To: Chen, Guchun <Guchun.Chen at amd.com>
Cc: David Airlie <airlied at linux.ie>; Pan, Xinhui <Xinhui.Pan at amd.com>; Zhenneng Li <lizhenneng at kylinos.cn>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; linux-kernel at vger.kernel.org <linux-kernel at vger.kernel.org>; dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>; Daniel Vetter <daniel at ffwll.ch>; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
Subject: Re: [PATCH] drm/amdgpu: resove reboot exception for si oland

On Fri, Mar 10, 2023 at 3:18 AM Chen, Guchun <Guchun.Chen at amd.com> wrote:
>
>
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
> > Zhenneng Li
> > Sent: Friday, March 10, 2023 3:40 PM
> > To: Deucher, Alexander <Alexander.Deucher at amd.com>
> > Cc: David Airlie <airlied at linux.ie>; Pan, Xinhui <Xinhui.Pan at amd.com>;
> > linux-kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Zhenneng Li
> > <lizhenneng at kylinos.cn>; amd-gfx at lists.freedesktop.org; Daniel Vetter
> > <daniel at ffwll.ch>; Koenig, Christian <Christian.Koenig at amd.com>
> > Subject: [PATCH] drm/amdgpu: resove reboot exception for si oland
> >
> > During reboot tests on an arm64 platform, the boot may fail.
> >
> > The error messages are as follows:
> > [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]]
> > *ERROR*
> >                           late_init of IP block <si_dpm> failed -22
> > [    7.006919][ 7] [  T295] amdgpu 0000:04:00.0: amdgpu_device_ip_late_init
> > failed
> > [    7.014224][ 7] [  T295] amdgpu 0000:04:00.0: Fatal error during GPU init
> > ---
> >  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > index d6d9e3b1b2c0..dee51c757ac0 100644
> > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> > @@ -7632,9 +7632,6 @@ static int si_dpm_late_init(void *handle)
> >       if (!adev->pm.dpm_enabled)
> >               return 0;
> >
> > -     ret = si_set_temperature_range(adev);
> > -     if (ret)
> > -             return ret;
>
> si_set_temperature_range should be platform agnostic. Can you please elaborate on why removing it helps?
>

Yes.  Not setting this means we won't get thermal interrupts.  We
shouldn't skip this.

Alex


> Regards,
> Guchun
>
> >  #if 0 //TODO ?
> >       si_dpm_powergate_uvd(adev, true);
> >  #endif
> > --
> > 2.25.1
>