[PATCH 07/25] drm/amdgpu: Update kgd2kfd_shared_resources for dGPU support
Christian König
ckoenig.leichtzumerken at gmail.com
Tue Jan 30 09:13:59 UTC 2018
Am 29.01.2018 um 21:25 schrieb Felix Kuehling:
> On 2018-01-29 06:42 AM, Christian König wrote:
>> Am 29.01.2018 um 00:02 schrieb Felix Kuehling:
>>> On 2018-01-27 04:19 AM, Christian König wrote:
>>>> Am 27.01.2018 um 02:09 schrieb Felix Kuehling:
>>>>> Add GPUVM size and DRM render node. Also add a function to query the
>>>>> VMID mask to avoid hard-coding it in multiple places later.
>>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 19 +++++++++++++++++--
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 ++
>>>>> drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 6 ++++++
>>>>> 3 files changed, 25 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> index c9f204d..294c467 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> @@ -30,6 +30,8 @@
>>>>> const struct kgd2kfd_calls *kgd2kfd;
>>>>> bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>>>> +static const unsigned int compute_vmid_bitmap = 0xFF00;
>>>>> +
>>>>> int amdgpu_amdkfd_init(void)
>>>>> {
>>>>> int ret;
>>>>> @@ -137,9 +139,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>>> int last_valid_bit;
>>>>> if (adev->kfd) {
>>>>> struct kgd2kfd_shared_resources gpu_resources = {
>>>>> - .compute_vmid_bitmap = 0xFF00,
>>>>> + .compute_vmid_bitmap = compute_vmid_bitmap,
>>>>> .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
>>>>> - .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe
>>>>> + .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
>>>>> + .gpuvm_size = adev->vm_manager.max_pfn
>>>>> + << AMDGPU_GPU_PAGE_SHIFT,
>>>> That most likely doesn't work as intended on Vega10. The address space
>>>> is divided into an upper and a lower range, but max_pfn includes both.
>>>>
>>>> I suggest using something like min(adev->vm_manager.max_pfn <<
>>>> AMDGPU_GPU_PAGE_SHIFT, AMDGPU_VM_HOLE_START).
>>> I think this is fine as it is. This just tells the Thunk the size of the
>>> virtual address space supported by the GPU. Currently the Thunk only
>>> uses 40 bits for SVM. But eventually it will be able to use the entire
>>> 47 bits of user address space. Any excess address space will just go
>>> unused.
>>>
>>> I'm also wondering how universal the split 48-bit virtual address space
>>> layout is. Even for x86_64 there seems to be a 5-level page table layout
>>> that supports 56 bits of user mode addresses
>>> (Documentation/x86/x86_64/mm.txt). AArch64 seems to support 48-bit user
>>> mode addresses (Documentation/arm64/memory.txt). I haven't found similar
>>> information for PowerPC yet.
>>>
>>> We should avoid coding too much architecture-specific logic into this
>>> driver that's supposed to support other architectures as well. I should
>>> also review the aperture placement with bigger user mode address spaces
>>> in mind.
>> And that is exactly the reason why I've suggested changing that.
>>
>> See, the split between the lower and upper range is not related to the CPU
>> configuration; instead, it is a property of our GPU setup and hardware
>> generation.
> How exactly does the GPUVM hardware behave when it sees a virtual
> address "in the hole". Does it generate a VM fault? Or does it simply
> ignore the high bits?
At least on the Vega10 system I've tested, you get a range fault when you
try to access the hole.
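To make that concrete, here is a rough sketch of the resulting address
map. The macro names below are illustrative only, not the driver's actual
definitions, and the boundaries assume sign-extension from bit 47:

/*
 * Hypothetical sketch of a 48-bit GPUVM address space with a hole in
 * the middle:
 *
 *   0x0000000000000000 .. GPUVM_HOLE_START - 1  valid (lower range)
 *   GPUVM_HOLE_START   .. GPUVM_HOLE_END - 1    range fault
 *   GPUVM_HOLE_END     .. 0xFFFFFFFFFFFFFFFF    valid (upper range)
 */
#define GPUVM_VA_BITS    48
#define GPUVM_HOLE_START (1ULL << (GPUVM_VA_BITS - 1))
#define GPUVM_HOLE_END   (~((1ULL << (GPUVM_VA_BITS - 1)) - 1))

/* Would an access to this GPU virtual address hit the hole? */
static bool gpuva_in_hole(u64 va)
{
	return va >= GPUVM_HOLE_START && va < GPUVM_HOLE_END;
}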
> If it ignored the high bits, it would work OK for both x86_64 (with a
> split address space) and ARM64 (with a full 48-bit user mode virtual
> address space).
>
> On the other hand, if addresses in the hole generate a VM fault, then I
> agree with you, and we should only report 47 bits of virtual address
> space to user mode for SVM purposes.
Yes, according to my testing exactly that is the case here.
The hardware documentation is a bit ambiguous, so I initially thought
as well that the high bits are just ignored, but that doesn't seem to be
the case.
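In code, the clamping I suggested earlier would then look roughly like
this. This is an untested sketch against this patch, assuming
AMDGPU_VM_HOLE_START marks the first address of the unusable range:

		/* Untested sketch: report only the lower, contiguous
		 * part of the address space to the KFD, since
		 * accesses into the hole generate range faults.
		 */
		.gpuvm_size = min_t(u64, adev->vm_manager.max_pfn
				    << AMDGPU_GPU_PAGE_SHIFT,
				    AMDGPU_VM_HOLE_START),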
Regards,
Christian.
>
> Regards,
> Felix
>
>> So we should either split it into gpuvm_size_low/gpuvm_size_high or
>> just use the lower range and limit the value as suggested above.
>>
>> This should avoid having any GPU-generation- or CPU-configuration-specific
>> logic in the common interface.
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Felix