[PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

Thu Jan 17 08:04:41 UTC 2019

Hi Christian

I believe Wentao can fix the issue we it by below step:

  1.  Return Virtual_address_max (UMD use it) to HOLE_START – RESERVED_SIZE
  2.  [optional] Still Keep virtual_address_offset to RESERVED_SIZE (current way, I think it’s because previously we put CSA in 0 --> RESERVED_SIZE space)
  3.  Put CSA in HOLE_START – RESERVED_SIZE  ==> HOLE_START (it’s current design)

I don’t get where above scheme is not correct … can you give more explain for the GMC_HOLE_START ?

e.g.

  1.  why you set GMC_HOLE_START to 0x8’000’0000’0000 (half size of MAX of 48bit address space) ? is it for HSA purpose to make sure GPU address can also be used for CPU address ?
  2.  now MAX_PFN is 1’000’0000’0000, do you need to change GMC_HOLE_START ?

thanks
we need some catch up

/Monk

From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Koenig, Christian
Sent: Thursday, January 17, 2019 3:39 PM
To: Lou, Wentao <Wentao.Lou at amd.com>; Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org; Zhu, Rex <Rex.Zhu at amd.com>
Cc: Deng, Emily <Emily.Deng at amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

Am 17.01.19 um 04:17 schrieb Lou, Wentao:
Hi Christian,

Your solution as:
addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;
now max_pfn = 0x10 0000 0000, AMDGPU_VA_RESERVED_SIZE = 0x10 0000, AMDGPU_PAGE_SHIFT = 12
Still got addr = 0xFFFF FFF0 0000, which would cause ring gfx timeout.

But 0xFFFF FFF0 0000 is the correct address, if that is causing a problem then there is a bug somewhere else.

Please try to use AMDGPU_GMC_HOLE_START-AMDGPU_VA_RESERVED_SIZE as well. Does that work?

Before commit 1bf621c42137926ac249af761c0190a9258aa0db, vm_size was 32GB, and csa_addr was under AMDGPU_GMC_HOLE_START.

Wait a second why was the vm_size 32GB? This is on a Vega10 isn't it?

I didn’t understand why csa_addr need to be above AMDGPU_GMC_HOLE_START now.

On Vega10 the lower range, e.g. everything below AMDGPU_GMC_HOLE_START is reserved for SVA.

Regards,
Christian.

Thanks.

BR,
Wentao

From: Koenig, Christian <Christian.Koenig at amd.com><mailto:Christian.Koenig at amd.com>
Sent: Wednesday, January 16, 2019 5:48 PM
To: Lou, Wentao <Wentao.Lou at amd.com><mailto:Wentao.Lou at amd.com>; Liu, Monk <Monk.Liu at amd.com><mailto:Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org>; Zhu, Rex <Rex.Zhu at amd.com><mailto:Rex.Zhu at amd.com>
Cc: Deng, Emily <Emily.Deng at amd.com><mailto:Emily.Deng at amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

Hi Wentao,

well the problem is you don't seem to understand how the hardware works.

See the engines see an MC address space with a hole in the middle, similar to the how x86 64bit CPU address space works. But the page tables are programmed linearly.

So the calculation in amdgpu_driver_open_kms() is correct because it takes the MC address and mages a linear page table index from it again.

The only thing we might need to fix here is shifting max_pfn before the subtraction and I doubt that even that is necessary.

Regards,
Christian.

Am 16.01.19 um 10:34 schrieb Lou, Wentao:

Hi Christian,

Now vm_size was set to 0x4 0000 GB by below commit:

1bf621c42137926ac249af761c0190a9258aa0db drm/amdgpu: Remove unnecessary VM size calculations

So that max_pfn would be 0x10 0000 0000.

amdgpu_csa_vaddr would make max_pfn << 12 to get 0x1 0000 0000 0000, and then minus AMDGPU_VA_RESERVED_SIZE, to get 0xFFFF FFF0 0000

unfortunately this number was between AMDGPU_GMC_HOLE_START and AMDGPU_GMC_HOLE_END, so that amdgpu_gmc_sign_extend was called to make it 0xFFFF FFFF FFF0 0000

in amdgpu_driver_open_kms, extended csa_addr cannot be passed into amdgpu_map_static_csa directly, it would be above the limit of max_pfn.

So that csa_addr was restricted by AMDGPU_GMC_HOLE_MASK to make it possible for amdgpu_vm_alloc_pts.

But this restriction by AMDGPU_GMC_HOLE_MASK would make the address fall back into AMDGPU_GMC_HOLE again,  which causing GPU reset.

We just put amdgpu_csa_vaddr back to AMDGPU_GMC_HOLE_START, to avoid the address touching AMDGPU_GMC_HOLE.

By the way, if max_pfn was shift much to the left, it would always get zero, with or without min(*,*).

BR,

Wentao

-----Original Message-----
From: Koenig, Christian <Christian.Koenig at amd.com><mailto:Christian.Koenig at amd.com>
Sent: Tuesday, January 15, 2019 4:02 PM
To: Liu, Monk <Monk.Liu at amd.com><mailto:Monk.Liu at amd.com>; Lou, Wentao <Wentao.Lou at amd.com><mailto:Wentao.Lou at amd.com>; amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org>; Zhu, Rex <Rex.Zhu at amd.com><mailto:Rex.Zhu at amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

Am 15.01.19 um 07:19 schrieb Liu, Monk:

> The max_pfn is now 1'0000'0000'0000'0000 (bytes) which is above 48 bit now, and it with AMDGPU_GMC_HOLE_MASK make it to zero ....

>

> And in code "amdgpu_driver_open_kms()" I saw @Zhu, Rex write the code as :

>

> "csa_addr = amdgpu_csa_vadr(adev) & AMDGPU_GMC_HOLE_MASK", I think this is wrong since you intentionally place the csa above GMC hole, right ?

The fix is just completely incorrect since min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT, AMDGPU_GMC_HOLE_START) still gives you 0 when we shift max_pfn to much to the left.

The correct solution is to substract the reserved size first and then shift. E.g.:

addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;

Regards,

Christian.

>

> Looks like  we should modify this place

>

> /Monk

>

> -----Original Message-----

> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org<mailto:amd-gfx-bounces at lists.freedesktop.org>> On Behalf Of

> Christian K?nig

> Sent: Monday, January 14, 2019 9:05 PM

> To: Lou, Wentao <Wentao.Lou at amd.com<mailto:Wentao.Lou at amd.com>>; amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org>

> Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than

> AMDGPU_GMC_HOLE_START

>

> Am 14.01.19 um 09:40 schrieb wentalou:

>> After removing unnecessary VM size calculations, vm_manager.max_pfn

>> would reach 0x10,0000,0000 max_pfn << AMDGPU_GPU_PAGE_SHIFT exceeding

>> AMDGPU_GMC_HOLE_START would caused GPU reset.

>>

>> Change-Id: I47ad0be2b0bd9fb7490c4e1d7bb7bdacf71132cb

>> Signed-off-by: wentalou <Wentao.Lou at amd.com<mailto:Wentao.Lou at amd.com>>

> NAK, that is incorrect. We intentionally place the csa above the GMC hole.

>

> Regards,

> Christian.

>

>> ---

>>    drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 3 ++-

>>    1 file changed, 2 insertions(+), 1 deletion(-)

>>

>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> index 7e22be7..dd3bd01 100644

>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> @@ -26,7 +26,8 @@

>>

>>    uint64_t amdgpu_csa_vaddr(struct amdgpu_device *adev)

>>    {

>> -        uint64_t addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT;

>> +       uint64_t addr = min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT,

>> +                                                    AMDGPU_GMC_HOLE_START);

>>

>>          addr -= AMDGPU_VA_RESERVED_SIZE;

>>          addr = amdgpu_gmc_sign_extend(addr);

> _______________________________________________

> amd-gfx mailing list

> amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org>

> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20190117/29c9750c/attachment-0001.html>