Making drm_gpuvm work across gpu devices

Christian König christian.koenig at amd.com
Fri Jan 26 10:09:54 UTC 2024


Hi Oak,

you can still use SVM, but it should not be a design criterion for the 
kernel UAPI. In other words, the UAPI should be designed in such a way 
that the GPU virtual address can be equal to the CPU virtual address of 
a buffer, but can also differ, to support use cases where this 
isn't the case.

In addition to what Dave wrote, I can summarize a few things I have 
learned while working on the AMD GPU drivers over the last decade or so:

1. Userspace requirements are *not* relevant for UAPI or even more 
general kernel driver design.

2. What should be done instead is to look at the hardware capabilities 
and try to expose those in a safe manner to userspace.

3. The userspace requirements are then used to validate the kernel 
driver and especially the UAPI design to ensure that nothing was missed.

The consequence of this is that nobody should ever use things like CUDA, 
Vulkan, OpenCL, OpenGL etc. as an argument to propose a certain UAPI design.

What should be done instead is to say: my hardware works in this and 
that way -> we want to expose it like this -> because that enables us to 
implement the high-level API in this and that way.

Only this gives a complete picture of how things interact 
and allows the kernel community to influence and validate the design.

This doesn't mean that you need to throw away everything, but it does 
mean that designs are not set in stone and that, for example, 
you can't use something like a waterfall model.

I'm going to answer your other questions separately.

Regards,
Christian.

Am 25.01.24 um 06:25 schrieb Zeng, Oak:
> Hi Dave,
>
> Let me step back. When I wrote "shared virtual address space b/t cpu and all gpu devices is a hard requirement for our system allocator design", I meant this is not only Intel's design requirement; rather, it is a common requirement for Intel, AMD and Nvidia alike. Take a look at the cuda driver API definition of cuMemAllocManaged (search for this API on https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM), which says:
>
> "The pointer is valid on the CPU and on all GPUs in the system that support managed memory."
>
> This means the program's virtual address space is shared b/t the CPU and all GPU devices on the system. The system allocator we are discussing is just one step more advanced than cuMemAllocManaged: it allows malloc'ed memory to be shared b/t the CPU and all GPU devices.
>
> I hope we all agree with this point.
>
> With that, I agree with Christian that in kmd we should make driver code per-device instead of managing all devices in one driver instance. Our system allocator (and xekmd generally) design follows this rule: we make xe_vm per-device - one device is *not* aware of another device's address space, as I explained in a previous email. I started this email thread seeking one drm_gpuvm instance to cover all GPU devices. I gave up on this approach (at least for now) per Danilo's and Christian's feedback: we will continue to have per-device drm_gpuvm. I hope this is aligned with Christian, but I will have to wait for Christian's reply to my previous email.
>
> I hope this clarifies things a little.
>
> Regards,
> Oak
>
>> -----Original Message-----
>> From: dri-devel <dri-devel-bounces at lists.freedesktop.org> On Behalf Of David
>> Airlie
>> Sent: Wednesday, January 24, 2024 8:25 PM
>> To: Zeng, Oak <oak.zeng at intel.com>
>> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>;
>> Thomas.Hellstrom at linux.intel.com; Winiarski, Michal
>> <michal.winiarski at intel.com>; Felix Kuehling <felix.kuehling at amd.com>; Welty,
>> Brian <brian.welty at intel.com>; Shah, Ankur N <ankur.n.shah at intel.com>; dri-
>> devel at lists.freedesktop.org; intel-xe at lists.freedesktop.org; Gupta, saurabhg
>> <saurabhg.gupta at intel.com>; Danilo Krummrich <dakr at redhat.com>; Daniel
>> Vetter <daniel at ffwll.ch>; Brost, Matthew <matthew.brost at intel.com>; Bommu,
>> Krishnaiah <krishnaiah.bommu at intel.com>; Vishwanathapura, Niranjana
>> <niranjana.vishwanathapura at intel.com>; Christian König
>> <christian.koenig at amd.com>
>> Subject: Re: Making drm_gpuvm work across gpu devices
>>
>>>
>>> For us, Xekmd doesn't need to know whether it is running on bare metal or
>>> in a virtualized environment. Xekmd is always a guest driver. All the virtual
>>> addresses used in xekmd are guest virtual addresses. For SVM, we require all
>>> the VF devices to share one single address space with the guest CPU program.
>>> So any design that works in a bare metal environment automatically works in a
>>> virtualized environment. + at Shah, Ankur N + at Winiarski, Michal to back me
>>> up if I am wrong.
>>>
>>>
>>> Again, a shared virtual address space b/t the cpu and all gpu devices is a
>>> hard requirement for our system allocator design (which means malloc'ed
>>> memory, cpu stack variables, and globals can be used directly in a gpu
>>> program; the same requirement as the kfd SVM design). This was aligned with
>>> our user space software stack.
>>
>> Just to make a very general point here (I'm hoping you listen to
>> Christian a bit more, and hoping he replies in more detail): just
>> because you have a system allocator design done, it doesn't in any way
>> oblige the kernel driver to accept that design.
>> Bad system design should be pushed back on, not enforced at the
>> implementation stage. It's a trap Intel falls into regularly: saying
>> "we already agreed this design with the userspace team and we can't
>> change it now". This isn't acceptable. Design includes
>> upstream discussion and feedback; if you have misdesigned the system
>> allocator (and I'm not saying you definitely have), and this is
>> pushing back on that, then you have to go fix your system
>> architecture.
>>
>> KFD was an experiment like this. I pushed back on AMD at the start,
>> saying it was likely a bad plan; we let it go and gained a lot of
>> experience in why it was a bad design.
>>
>> Dave.


