[PATCH 3/3] drm/amdgpu: fix another fundamental VM bug

Zhang, Hawking Hawking.Zhang at amd.com
Tue May 16 08:38:34 UTC 2017


Hi Christian,

Basically we are on the same page that we should use relative address of vram bo to program hw.  

What I missed part is that you used the tbo.mem.start, the relative address, rather than tbo.offset, the absolute address, for the calculation. Based on that, I'm okay with the follow up logic: addr = adev->vm_manager.vram_base_offset + addr; However, I'm under impression that even with that fixed, and pre-GFX9 also changed to use FB_LOCATION_BASE programmed by bios as mc start, there is still some issue in VCE/UVD.etc. 

Have you verified the patch on pre-GFX9 adapter?

Regards,
Hawking

-----Original Message-----
From: Christian König [mailto:deathsimple at vodafone.de] 
Sent: Tuesday, May 16, 2017 15:49
To: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>; amd-gfx at lists.freedesktop.org; Koenig, Christian <Christian.Koenig at amd.com>
Subject: Re: [PATCH 3/3] drm/amdgpu: fix another fundamental VM bug

Am 16.05.2017 um 09:21 schrieb Zhang, Hawking:
>>    static u64 gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, u64 addr)
>>    {
>> -	addr = adev->vm_manager.vram_base_offset + addr - adev->mc.vram_start;
>> +	addr = adev->vm_manager.vram_base_offset + addr;
> Please be noted that mv.vram_start is not 0, it is equal to the value from mmMC_VM_FB_LOCATION_BASE, which is initialized by vbios. Typically is 0xf400000000, mc address. Thus any addr here should be based on 0xf400000000.
>
> + addr = adev->vm_manager.vram_base_offset + addr;
>
> I don't think it is correct to add an absolute addr here. vram_base_offset is already the base physical address of vram (typically 0 for dGPU, while some non-zero value for APU), we need an relative address/offset to reflect the location of vram bo.
> addr - adev->mc.vram_start just calculate the relative address of the vram bo, or the offset from vram start point.

Yes, completely agree. We used the MC address space to program the page directory address into the hardware instead of the relative address/offset to reflect the location of VRAM BO.

During GFX9 bringup somebody noticed that this in incorrect and doesn't work when mmMC_VM_FB_LOCATION_BASE is != 0.

The problem is that whoever did this didn't fixed the underlying bug and started to use the relative address/offset for programming the PD address, but rather added the code snippet " - adev->mc.vram_start" to just get GFX9 working.

Alex was wondering for quite a while why programming mmMC_VM_FB_LOCATION_BASE != 0 didn't worked correctly on GFX7 and GFX8 and this is the explanation.

Please review,
Christian.

>
> Anyway, the base physical address (vram_base_offset) should plus an offset to reflect the real physical address of vram bo.
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: Zhou, David(ChunMing)
> Sent: Tuesday, May 16, 2017 10:45
> To: Christian König <deathsimple at vodafone.de>; 
> amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>
> Subject: Re: [PATCH 3/3] drm/amdgpu: fix another fundamental VM bug
>
>
>
> On 2017年05月15日 19:57, Christian König wrote:
>> From: Christian König <christian.koenig at amd.com>
>>
>> I always wondered why this code uses the MC address. Now it came to 
>> me that this is actually a bug and only works by coincident because 
>> we used to have VRAM mapped at zero.
> adding Hawking to confirm this.
>
> Regards,
> David Zhou
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
>>    3 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index e4c5d31..ba6eb73 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -884,7 +884,7 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
>>    	}
>>    
>>    	if (p->job->vm) {
>> -		p->job->vm_pd_addr = amdgpu_bo_gpu_offset(vm->root.bo);
>> +		p->job->vm_pd_addr = vm->root.bo->tbo.mem.start << PAGE_SHIFT;
>>    
>>    		r = amdgpu_bo_vm_update_pte(p);
>>    		if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index b877f9f3..7256fcc 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1012,7 +1012,7 @@ static int amdgpu_vm_update_level(struct amdgpu_device *adev,
>>    				return r;
>>    		}
>>    
>> -		pt = amdgpu_bo_gpu_offset(bo);
>> +		pt = bo->tbo.mem.start << PAGE_SHIFT;
>>    		if (parent->entries[pt_idx].addr == pt)
>>    			continue;
>>    
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 047b1a7..63cb573 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -360,7 +360,7 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct 
>> amdgpu_device *adev,
>>    
>>    static u64 gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, u64 addr)
>>    {
>> -	addr = adev->vm_manager.vram_base_offset + addr - adev->mc.vram_start;
>> +	addr = adev->vm_manager.vram_base_offset + addr;
>>    	BUG_ON(addr & 0xFFFF00000000003FULL);
>>    	return addr;
>>    }




More information about the amd-gfx mailing list