[PATCH] drm/amdgpu: add mb for si

Quan, Evan Evan.Quan at amd.com
Fri Nov 25 02:06:34 UTC 2022


[AMD Official Use Only - General]

Did you see that? It is a patch that I created with git format-patch.
Anyway, I will paste the changes below. I suspect we may need to wait for the SMU to start running.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 49c398ec0aaf..9f308a021b2d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -6814,6 +6814,7 @@ static int si_dpm_enable(struct amdgpu_device *adev)
        struct si_power_info *si_pi = si_get_pi(adev);
        struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
        int ret;
+       int i;

        if (amdgpu_si_is_smc_running(adev))
                return -EINVAL;
@@ -6909,6 +6910,17 @@ static int si_dpm_enable(struct amdgpu_device *adev)
        si_program_response_times(adev);
        si_program_ds_registers(adev);
        si_dpm_start_smc(adev);
+       /* Waiting for smc alive */
+       for (i = 0; i < adev->usec_timeout; i++) {
+               if (amdgpu_si_is_smc_running(adev))
+                       break;
+               udelay(1);
+       }
+       if (i >= adev->usec_timeout) {
+               DRM_ERROR("Timedout on waiting for smu running\n");
+               return -EINVAL;
+       }
+
        ret = si_notify_smc_display_change(adev, false);
        if (ret) {
                DRM_ERROR("si_notify_smc_display_change failed\n");

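For anyone who wants to try the idea outside the kernel, below is a minimal user-space sketch of the same poll-with-timeout loop. It is only an illustration under stated assumptions: struct fake_smc, fake_smc_is_running() and fake_udelay() are hypothetical stand-ins for the device, amdgpu_si_is_smc_running() and udelay(), and the sketch returns -ETIMEDOUT where the patch above returns -EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device: it reports "running" only
 * after a given number of polls have been made. */
struct fake_smc {
    uint32_t polls_until_running; /* polls that still report "not running" */
};

/* Hypothetical stand-in for amdgpu_si_is_smc_running(). */
static bool fake_smc_is_running(struct fake_smc *smc)
{
    if (smc->polls_until_running == 0)
        return true;
    smc->polls_until_running--;
    return false;
}

/* Stand-in for udelay(1); a real driver would busy-wait here. */
static void fake_udelay(unsigned int usecs)
{
    (void)usecs;
}

/*
 * Poll until the SMC reports running, checking at most timeout_us times
 * with a (simulated) 1 us delay between checks.
 * Returns 0 on success and -ETIMEDOUT if the SMC never came up.
 */
static int wait_for_smc_running(struct fake_smc *smc, uint32_t timeout_us)
{
    uint32_t i;

    for (i = 0; i < timeout_us; i++) {
        if (fake_smc_is_running(smc))
            return 0;
        fake_udelay(1);
    }
    return -ETIMEDOUT;
}
```

Calling wait_for_smc_running() on a fake SMC that needs 3 polls with a budget of 10 succeeds, while one that needs 20 polls with the same budget times out. As in the patch, a budget of 0 never polls at all and is reported as a timeout.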

BR
Evan
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Thursday, November 24, 2022 6:06 PM
> To: Quan, Evan <Evan.Quan at amd.com>; 李真能 <lizhenneng at kylinos.cn>;
> Michel Dänzer <michel.daenzer at mailbox.org>; Koenig, Christian
> <Christian.Koenig at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>
> Cc: dri-devel at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>;
> linux-kernel at vger.kernel.org; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> 
> That's not a patch but some binary file?
> 
> Christian.
> 
> On 24.11.22 at 11:04, Quan, Evan wrote:
> > [AMD Official Use Only - General]
> >
> > Could the attached patch help?
> >
> > Evan
> >> -----Original Message-----
> >> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of 李真能
> >> Sent: Friday, November 18, 2022 5:25 PM
> >> To: Michel Dänzer <michel.daenzer at mailbox.org>; Koenig, Christian
> >> <Christian.Koenig at amd.com>; Deucher, Alexander
> >> <Alexander.Deucher at amd.com>
> >> Cc: amd-gfx at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>;
> >> linux-kernel at vger.kernel.org; dri-devel at lists.freedesktop.org
> >> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> >>
> >>
> >> On 2022/11/18 17:18, Michel Dänzer wrote:
> >>> On 11/18/22 09:01, Christian König wrote:
> >>>> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>>>> During reboot testing on an arm64 platform, boot may fail, so
> >>>>> add this mb() in the SMC code.
> >>>>>
> >>>>> The error messages are as follows:
> >>>>> [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22
> >>>>> [    7.006919][ 7] [  T295] amdgpu 0000:04:00.0: amdgpu_device_ip_late_init failed
> >>>>> [    7.014224][ 7] [  T295] amdgpu 0000:04:00.0: Fatal error during GPU init
> >>>> Memory barriers are not supposed to be sprinkled around like this; you
> >>>> need to give a detailed explanation of why this is necessary.
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Signed-off-by: Zhenneng Li <lizhenneng at kylinos.cn>
> >>>>> ---
> >>>>>     drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>>>     1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> index 8f994ffa9cd1..c7656f22278d 100644
> >>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
> >>>>>         u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>>>         u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>>>>     +    mb();
> >>>>> +
> >>>>>         if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>>>             return true;
> >>> In particular, it makes no sense in this specific place, since it
> >>> cannot directly affect the values of rst & clk.
> >>
> >> I think so too.
> >>
> >> But when I run the reboot test on nine desktop machines, one or two of
> >> them may report this error after hundreds or thousands of reboots. At
> >> first I used msleep() instead of mb(). Both methods work, but I don't
> >> know what the root cause is.
> >>
> >> When I used this method with another vendor's Oland card, this error
> >> message was reported again.
> >>
> >> What could be the root cause?
> >>
> >> Test environment:
> >>
> >> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
> >>
> >> driver: amdgpu
> >>
> >> os: Ubuntu 20.04
> >>
> >> platform: arm64
> >>
> >> kernel: 5.4.18
> >>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 18214 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20221125/510b038c/attachment-0001.bin>

