[PATCH v2] drm/amdgpu: Fix discovery initialization failure during pci rescan

Alex Deucher alexdeucher at gmail.com
Mon Apr 8 18:47:21 UTC 2024


On Tue, Apr 2, 2024 at 7:56 AM Christian König <christian.koenig at amd.com> wrote:
>
> On 02.04.24 at 12:05, Ma Jun wrote:
> > Wait for the system to be ready in order to fix the discovery
> > initialization failure. This failure usually occurs when a dGPU is
> > removed and then rescanned via the command line.
> > It is caused by the following two errors:
> > [1] vram size is 0
> > [2] wrong binary signature
> >
> > Signed-off-by: Ma Jun <Jun.Ma2 at amd.com>
>
> I'm not an expert on this, but using dev_is_removable() does indeed
> seem to be incorrect here.
>
> Feel free to add an Acked-by: Christian König
> <christian.koenig at amd.com>, but I would rather wait for Alex to come
> back from vacation and take a look as well.
>
> It might be that I missed some reason why the dev_is_removable() check
> is mandatory.

I added it originally for USB4/thunderbolt connected devices (hence
the removable check) and didn't want to add the extra latency all the
time, but I hadn't considered the rescan case.

Patch is:
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>

Alex


>
> Regards,
> Christian.
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 17 ++++++-----------
> >   1 file changed, 6 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > index 07c5fca06178..90735e966318 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > @@ -255,7 +255,6 @@ static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev,
> >       uint64_t vram_size;
> >       u32 msg;
> >       int i, ret = 0;
> > -     int ip_discovery_ver = 0;
> >
> >       /* It can take up to a second for IFWI init to complete on some dGPUs,
> >        * but generally it should be in the 60-100ms range.  Normally this starts
> > @@ -265,17 +264,13 @@ static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev,
> >        * continue.
> >        */
> >
> > -     ip_discovery_ver = RREG32(mmIP_DISCOVERY_VERSION);
> > -     if ((dev_is_removable(&adev->pdev->dev)) ||
> > -         (ip_discovery_ver == IP_DISCOVERY_V2) ||
> > -         (ip_discovery_ver == IP_DISCOVERY_V4)) {
> > -             for (i = 0; i < 1000; i++) {
> > -                     msg = RREG32(mmMP0_SMN_C2PMSG_33);
> > -                     if (msg & 0x80000000)
> > -                             break;
> > -                     msleep(1);
> > -             }
> > +     for (i = 0; i < 1000; i++) {
> > +             msg = RREG32(mmMP0_SMN_C2PMSG_33);
> > +             if (msg & 0x80000000)
> > +                     break;
> > +             usleep_range(1000, 1100);
> >       }
> > +
> >       vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20;
> >
> >       if (vram_size) {
>
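
For readers following the diff above, here is a minimal userspace sketch of
the unconditional bounded poll the patch moves to. It is not the driver code:
fake_read_status() is a hypothetical stand-in for
RREG32(mmMP0_SMN_C2PMSG_33), sleep_about_1ms() stands in for
usleep_range(1000, 1100), and wait_for_ready() with its parameters is an
illustrative name only. The ready bit (0x80000000) and the 1000-iteration
bound are taken from the patch.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() */
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define READY_BIT 0x80000000u    /* bit 31, as tested in the patch */

/* Hypothetical stand-in for the register read: the ready bit
 * appears only after 'ready_after' reads have returned 0. */
static int reads_done;

static uint32_t fake_read_status(int ready_after)
{
    return (++reads_done > ready_after) ? READY_BIT : 0;
}

/* Userspace stand-in for usleep_range(1000, 1100). */
static void sleep_about_1ms(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000L };
    nanosleep(&ts, NULL);
}

/* Poll at most 'max_polls' times, sleeping about 1 ms between reads.
 * Returns the number of sleeps taken before the ready bit was seen,
 * or -1 if the bit never appeared within the bound. */
static int wait_for_ready(int ready_after, int max_polls)
{
    assert(max_polls > 0);
    reads_done = 0;
    for (int i = 0; i < max_polls; i++) {
        if (fake_read_status(ready_after) & READY_BIT)
            return i;
        sleep_about_1ms();
    }
    return -1;
}
```

With a bound of 1000 and roughly 1 ms per iteration, the worst case is about
one second, which matches the "up to a second" figure in the code comment,
while a device that is already ready returns on the first read without
sleeping. Note that the driver loop in the patch does not return an error on
timeout; it simply falls through to the vram size read, so the -1 return here
is an addition for illustration.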

