Making drm_gpuvm work across gpu devices

David Airlie airlied at redhat.com
Thu Jan 25 01:25:11 UTC 2024


>
>
> For us, Xekmd doesn't need to know it is running under bare metal or virtualized environment. Xekmd is always a guest driver. All the virtual address used in xekmd is guest virtual address. For SVM, we require all the VF devices share one single shared address space with guest CPU program. So all the design works in bare metal environment can automatically work under virtualized environment. + at Shah, Ankur N + at Winiarski, Michal to backup me if I am wrong.
>
>
>
> Again, shared virtual address space b/t cpu and all gpu devices is a hard requirement for our system allocator design (which means malloc’ed memory, cpu stack variables, globals can be directly used in gpu program. Same requirement as kfd SVM design). This was aligned with our user space software stack.

Just to make a very general point here (I'm hoping you listen to
Christian a bit more and hoping he replies in more detail), but just
because you have a system allocator design done, it doesn't in any way
enforce the requirements on the kernel driver to accept that design.
Bad system design should be pushed back on, not enforced in
implementation stages. It's a trap Intel falls into regularly since
they say well we already agreed this design with the userspace team
and we can't change it now. This isn't acceptable. Design includes
upstream discussion and feedback, if you say misdesigned the system
allocator (and I'm not saying you definitely have), and this is
pushing back on that, then you have to go fix your system
architecture.

KFD was an experiment like this, I pushed back on AMD at the start
saying it was likely a bad plan, we let it go and got a lot of
experience in why it was a bad design.

Dave.



More information about the dri-devel mailing list