[PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0
Zhang, Yifan
Yifan1.Zhang at amd.com
Thu Nov 16 15:05:17 UTC 2023
[AMD Official Use Only - General]
Yes, it fixes the KFDTest regressions introduced by commit b93ed51c32ca ("drm/amdgpu: fix AGP init order"),
e.g. the KFDMemoryTest.MemoryRegister failure below.
The fault address 0x00008084575a6000 was wrongly calculated as (gart_start + AGP aperture MC address).
[ 46.662856] amdgpu 0000:c2:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 46.662890] amdgpu 0000:c2:00.0: amdgpu: in page starting at address 0x00008084575a6000 from client 10
[ 46.662909] amdgpu 0000:c2:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040B52
[ 46.662923] amdgpu 0000:c2:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 46.662936] amdgpu 0000:c2:00.0: amdgpu: MORE_FAULTS: 0x0
[ 46.662947] amdgpu 0000:c2:00.0: amdgpu: WALKER_ERROR: 0x1
[ 46.662957] amdgpu 0000:c2:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 46.662968] amdgpu 0000:c2:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 46.662979] amdgpu 0000:c2:00.0: amdgpu: RW: 0x1
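
To illustrate the arithmetic behind that fault address, here is a minimal, self-contained sketch in plain C. It is not the driver code: the helper names, the 4 KiB page size, and the UINT64_MAX sentinel are assumptions made for illustration only (the kernel uses PAGE_SHIFT and AMDGPU_BO_INVALID_OFFSET), and the final sign extension done by amdgpu_gmc_sign_extend() is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
/* Hypothetical stand-in for AMDGPU_BO_INVALID_OFFSET. */
#define SKETCH_INVALID_OFFSET UINT64_MAX

/*
 * Old behaviour as described in the thread: the AGP MC address was stored
 * in resource->start (in pages), and the generic offset path then added
 * the GART start on top of it. This is only correct when gart_start == 0.
 */
static uint64_t sketch_offset_before_fix(uint64_t agp_mc_addr,
                                         uint64_t gart_start)
{
    /* The sketch expects a page-aligned AGP address. */
    assert((agp_mc_addr & ((1ULL << SKETCH_PAGE_SHIFT) - 1)) == 0);

    uint64_t start_pages = agp_mc_addr >> SKETCH_PAGE_SHIFT;

    return (start_pages << SKETCH_PAGE_SHIFT) + gart_start;
}

/*
 * Behaviour after the patch: a valid AGP MC address is used directly;
 * otherwise the GART offset (in pages) is added to the GART start.
 */
static uint64_t sketch_offset_after_fix(uint64_t agp_mc_addr,
                                        uint64_t gart_offset_pages,
                                        uint64_t gart_start)
{
    if (agp_mc_addr != SKETCH_INVALID_OFFSET)
        return agp_mc_addr;

    return (gart_offset_pages << SKETCH_PAGE_SHIFT) + gart_start;
}
```

With a made-up AGP address of 0x100000 and a made-up non-zero GART start of 0x8000000000, the first helper returns 0x8000100000 instead of 0x100000, which is the same kind of double offset seen in the fault above; with a GART start of 0 both helpers agree.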
-----Original Message-----
From: Alex Deucher <alexdeucher at gmail.com>
Sent: Thursday, November 16, 2023 10:26 PM
To: Zhang, Yifan <Yifan1.Zhang at amd.com>
Cc: Koenig, Christian <Christian.Koenig at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; amd-gfx at lists.freedesktop.org; Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>
Subject: Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0
On Thu, Nov 16, 2023 at 4:37 AM Zhang, Yifan <Yifan1.Zhang at amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> Ping... this patch still does not seem to be merged.
>
Can you confirm it fixes the AGP issues you saw?
Alex
> Best Regards,
> Yifan
>
> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Monday, November 13, 2023 2:13 AM
> To: Koenig, Christian <Christian.Koenig at amd.com>
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>;
> amd-gfx at lists.freedesktop.org; Zhang, Yifan <Yifan1.Zhang at amd.com>;
> Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not
> at 0
>
> On Sat, Nov 11, 2023 at 2:17 AM Christian König <christian.koenig at amd.com> wrote:
> >
> > Am 10.11.23 um 15:47 schrieb Alex Deucher:
> > > This worked by luck if the GART aperture ended up at 0. When we
> > > ended up moving GART on some chips, the GART aperture ended up
> > > > offsetting the AGP address since the resource->start is a GART
> > > offset, not an MC address. Fix this by moving the AGP address
> > > setup into amdgpu_bo_gpu_offset_no_check().
> > >
> > > Reported-by: Jesse Zhang <Jesse.Zhang at amd.com>
> > > Reported-by: Yifan Zhang <yifan1.zhang at amd.com>
> > > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> > > Cc: christian.koenig at amd.com
> > > ---
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++---
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 +---
> > > 2 files changed, 8 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > index cef920a93924..1b3e97522838 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > @@ -1527,10 +1527,14 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
> > > u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo)
> > > {
> > > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> > > - uint64_t offset;
> > > + uint64_t offset, addr;
> > >
> > > - offset = (bo->tbo.resource->start << PAGE_SHIFT) +
> > > - amdgpu_ttm_domain_start(adev, bo->tbo.resource->mem_type);
> > > + addr = amdgpu_gmc_agp_addr(&bo->tbo);
> >
> > IIRC you must check bo->tbo.resource->mem_type before calling
> > amdgpu_gmc_agp_addr().
>
> Yes, this was fixed in v2.
>
> Alex
>
> >
> > Regards,
> > Christian.
> >
> > > + if (addr != AMDGPU_BO_INVALID_OFFSET)
> > > + offset = addr;
> > > + else
> > > + offset = (bo->tbo.resource->start << PAGE_SHIFT) +
> > > + amdgpu_ttm_domain_start(adev,
> > > + bo->tbo.resource->mem_type);
> > >
> > > return amdgpu_gmc_sign_extend(offset);
> > > }
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > index 05991c5c8ddb..ab4a762aed5b 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > @@ -959,10 +959,8 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
> > > return 0;
> > >
> > > addr = amdgpu_gmc_agp_addr(bo);
> > > - if (addr != AMDGPU_BO_INVALID_OFFSET) {
> > > - bo->resource->start = addr >> PAGE_SHIFT;
> > > + if (addr != AMDGPU_BO_INVALID_OFFSET)
> > > return 0;
> > > - }
> > >
> > > /* allocate GART space */
> > > placement.num_placement = 1;
> >