Making drm_gpuvm work across gpu devices
Christian König
christian.koenig at amd.com
Tue Jan 23 11:13:12 UTC 2024
Hi Oak,
On 23.01.24 at 04:21, Zeng, Oak wrote:
> Hi Danilo and all,
>
> While working on Intel's SVM code, we came up with the idea of making drm_gpuvm work across multiple GPU devices. See some discussion here: https://lore.kernel.org/dri-devel/PH7PR11MB70049E7E6A2F40BF6282ECC292742@PH7PR11MB7004.namprd11.prod.outlook.com/
>
> The reason we are trying to do this is that, for an SVM process (shared virtual memory across the CPU program and all GPU programs on all GPU devices), the address space has to span all GPU devices. So if we make drm_gpuvm work across devices, our SVM code can leverage drm_gpuvm as well.
>
> At first glance this seems feasible, because drm_gpuvm doesn't really use the drm_device *drm pointer much. It is only used for printing/warning. So I think maybe we can delete this drm field from drm_gpuvm.
>
> This way, on a multiple gpu device system, for one process, we can have only one drm_gpuvm instance, instead of multiple drm_gpuvm instances (one for each gpu device).
>
> What do you think?
Well, from the GPUVM side I don't think it would make much difference
whether we have the drm device or not.
But given the experience we had with the KFD, I think I should mention
that we should absolutely *not* deal with multiple devices at the same
time in the UAPI or in VM objects inside the driver.
The background is that all the APIs inside the Linux kernel are built
around the idea that they work with only one device at a time. This
holds both for low-level APIs like the DMA API and for pretty
high-level things like, for example, file system address spaces.
So when you have multiple GPUs, you either have an inseparable cluster
of them, in which case you would also have only one drm_device, or you
have separate drm_devices, which also results in separate drm render
nodes, separate virtual address spaces and eventually separate IOMMU
domains, which give you separate dma_addresses for the same page and
therefore separate GPUVM page tables.
It's up to you how to implement it, but I think it's pretty clear that
you need separate drm_gpuvm objects to manage those.
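To illustrate the point about separate address spaces, here is a minimal
userspace sketch. It is not kernel code and does not use the real
drm_gpuvm or DMA APIs. All toy_* names are hypothetical, and the
per-device IOVA window is an assumption that stands in for separate
IOMMU domains. It only shows that the same page mapped at the same
virtual address still ends up with a different DMA address per device,
so each device needs its own VM object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_MAX_MAPPINGS 16

/* Hypothetical stand-in for a device behind its own IOMMU domain. */
struct toy_device {
    int id;
    uint64_t iova_base; /* assumption: each domain has its own IOVA window */
};

struct toy_mapping {
    uint64_t va;       /* GPU virtual address */
    uint64_t dma_addr; /* device-specific DMA address backing it */
};

/* Hypothetical stand-in for one per-device GPU VM. */
struct toy_vm {
    const struct toy_device *dev;
    struct toy_mapping maps[TOY_MAX_MAPPINGS];
    size_t count;
};

/* The same page (pfn) yields a different DMA address on each device. */
static uint64_t toy_dma_map_page(const struct toy_device *dev, uint64_t pfn)
{
    return dev->iova_base + (pfn << TOY_PAGE_SHIFT);
}

static void toy_vm_init(struct toy_vm *vm, const struct toy_device *dev)
{
    assert(dev != NULL);
    vm->dev = dev;
    vm->count = 0;
}

/* Record va -> dma_addr in this device's VM; returns 0 or -1 if full. */
static int toy_vm_map(struct toy_vm *vm, uint64_t va, uint64_t pfn)
{
    if (vm->count == TOY_MAX_MAPPINGS)
        return -1;
    vm->maps[vm->count].va = va;
    vm->maps[vm->count].dma_addr = toy_dma_map_page(vm->dev, pfn);
    vm->count++;
    return 0;
}

/* Look up the DMA address behind va; returns 0 on hit, -1 on miss. */
static int toy_vm_lookup(const struct toy_vm *vm, uint64_t va,
                         uint64_t *dma_addr)
{
    for (size_t i = 0; i < vm->count; i++) {
        if (vm->maps[i].va == va) {
            *dma_addr = vm->maps[i].dma_addr;
            return 0;
        }
    }
    return -1;
}
```

Mapping page 42 at the same virtual address in two toy_vm objects bound
to different devices gives two different DMA addresses, which is why a
single shared mapping table could not describe both devices.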
That you map the same thing in all those virtual address spaces at the
same address is a completely different optimization problem I think.
What we could certainly do is to optimize hmm_range_fault by making
hmm_range a reference counted object and using it for multiple devices
at the same time if those devices request the same range of an mm_struct.
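The reference-counting idea could look roughly like the following
userspace sketch. This is only an illustration under stated
assumptions: struct toy_range is hypothetical and is not the kernel's
struct hmm_range, the "fault" is just a counter, and locking and
invalidation handling are left out entirely. It shows a second device
asking for the same range of the same mm getting the existing object
instead of triggering another walk:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RANGES 8

/* Hypothetical refcounted range; NOT the kernel's struct hmm_range. */
struct toy_range {
    const void *mm;        /* stand-in for the owning mm_struct */
    uint64_t start, end;   /* address range covered */
    unsigned int refcount; /* 0 means the slot is free */
};

/* Must be zero-initialized before use. */
struct toy_range_cache {
    struct toy_range slots[TOY_MAX_RANGES];
    unsigned int faults;   /* number of simulated range faults */
};

/*
 * Return a range for (mm, start, end). If a live range with the same
 * key exists, take another reference on it. Otherwise claim a free
 * slot and count one simulated fault. Returns NULL if the cache is full.
 */
static struct toy_range *toy_range_get(struct toy_range_cache *c,
                                       const void *mm,
                                       uint64_t start, uint64_t end)
{
    struct toy_range *free_slot = NULL;

    for (size_t i = 0; i < TOY_MAX_RANGES; i++) {
        struct toy_range *r = &c->slots[i];

        if (r->refcount == 0) {
            if (!free_slot)
                free_slot = r;
            continue;
        }
        if (r->mm == mm && r->start == start && r->end == end) {
            r->refcount++;
            return r;
        }
    }
    if (!free_slot)
        return NULL;

    free_slot->mm = mm;
    free_slot->start = start;
    free_slot->end = end;
    free_slot->refcount = 1;
    c->faults++; /* only the first requester pays for the walk */
    return free_slot;
}

/* Drop one reference; the slot becomes reusable when it reaches zero. */
static void toy_range_put(struct toy_range *r)
{
    assert(r->refcount > 0);
    r->refcount--;
}
```

Note that the per-device VM objects stay separate in this scheme. Only
the result of walking the CPU page tables is shared, and each device
would still have to create its own DMA mappings for the pages.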
I think if you start using the same drm_gpuvm for multiple devices you
will sooner or later start to run into the same mess we have seen with
KFD, where we moved more and more functionality from the KFD to the DRM
render node because we found that a lot of the stuff simply doesn't work
correctly with a single object to maintain the state.
Just one more point on your original discussion on the xe list: I think
it's perfectly valid for an application to map something at an address
where it already has something else mapped.
Cheers,
Christian.
>
> Thanks,
> Oak