[PATCH 0/9] amdkfd: Implement kfd multiple contexts
Zhu Lingshan
lingshan.zhu at amd.com
Fri Jul 25 02:43:07 UTC 2025
Currently kfd manages kfd_process on a one-context-per-program basis,
so each user space program owns exactly one kfd context (kfd_process).
This model works fine for most programs, but it is a poor fit
for a hypervisor like QEMU, because all programs in the guest
user space share a single kfd context. This is
problematic for reasons including, but not limited to, the following:
As illustrated in Figure 1, all guest user space programs share the same fd of /dev/kfd,
the same kfd_process and the same PASID, and therefore the same
GPU_VM address space. The IOVA ranges of the
guest user space programs are thus not isolated from each other,
and the programs can attack each other through GPU DMA.
+----------------------------------------------------------------------------------+
| |
| +-----------+ +-----------+ +------------+ +------------+ |
| | | | | | | | | |
| | Program 1 | | Program 2 | | Program 3 | | Program N | |
| | | | | | | | | |
| +----+------+ +--------+--+ +--+---------+ +-----+------+ |
| | | | | |
| | | | | Guest |
| | | | | |
+-------+----------------------+------------+----------------------+---------------+
| | | |
| | | |
| | | |
| | | |
| +--+------------+---+ |
| | file descriptor | |
+-------------------+ of /dev/kfd +------------------+
| opened by QEMU |
| |
+---------+---------+ User Space
| QEMU
|
---------------------------------------+-----------------------------------------------------
| Kernel Space
| KFD Module
|
+--------+--------+
| |
| kfd_process |<------------------The only one KFD context
| |
+--------+--------+
|
+--------+--------+
| PASID |
+--------+--------+
|
+--------+--------+
| GPU_VM |
+-----------------+
Figure 1
This series implements a multiple contexts solution:
- Allow each program to create and hold multiple contexts (kfd processes)
- Each context has its own fd of /dev/kfd and an exclusive kfd_process,
which is a secondary kfd context. Each context therefore has its own PASID and GPU VM, which isolate the IOVA address spaces of the contexts from each other.
As a result, the contexts cannot attack each other through GPU DMA.
The design is illustrated in Figure 2 below:
+---------------------------------------------------------------------------------------------------------+
| |
| |
| |
| +----------------------------------------------------------------------------------+ |
| | | |
| | +-----------+ +-----------+ +-----------+ +-----------+ | |
| | | | | | | | | | | |
| | | Program 1 | | Program 2 | | Program 3 | | Program N | | |
| | | | | | | | | | | |
| | +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ | |
| | | | | | | |
| | | | | | Guest | |
| | | | | | | |
| +-------+------------------+-----------------+----------------+--------------------+ |
| | | | | QEMU |
| | | | | |
+---------------+------------------+-----------------+----------------+--------------------------+--------+
| | | | |
| | | | |
| | | | |
+---+----+ +---+----+ +---+----+ +---+----+ +---+-----+
| | | | | | | | | Primary |
| FD 1 | | FD 2 | | FD 3 | | FD 4 | | FD |
| | | | | | | | | |
+---+----+ +---+----+ +---+----+ +----+---+ +----+----+
| | | | | User Space
| | | | |
-------------------+------------------+-----------------+-----------------+--------------------------+----------------------------
| | | | | Kernel Space
| | | | |
| | | | |
+--------------------------------------------------------------------------------------------------------------------------+
| +------+------+ +------+------+ +------+------+ +------+------+ +------+------+ |
| | Secondary | | Secondary | | Secondary | | Secondary | | Primary | KFD Module |
| |kfd_process 1| |kfd_process 2| |kfd_process 3| |kfd_process 4| | kfd_process | |
| | | | | | | | | | | |
| +------+------+ +------+------+ +------+------+ +------+------+ +------+------+ |
| | | | | | |
| +------+------+ +------+------+ +------+------+ +------+------+ +------+------+ |
| | PASID | | PASID | | PASID | | PASID | | PASID | |
| +------+------+ +------+------+ +------+------+ +------+------+ +------+------+ |
| | | | | | |
| | | | | | |
| +------+------+ +------+------+ +------+------+ +------+------+ +------+------+ |
| | GPU_VM | | GPU_VM | | GPU_VM | | GPU_VM | | GPU_VM | |
| +-------------+ +-------------+ +-------------+ +-------------+ +-------------+ |
| |
+--------------------------------------------------------------------------------------------------------------------------+
Figure 2
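The ownership layout in Figure 2 can be sketched as a small user-space model.
This is only an illustration, not code from this series: struct ctx,
ctx_create(), ctx_find_primary() and ctx_close() are hypothetical names, a
table slot index stands in for the fd, and an integer stands in for the
owning mm. The model assumes that the first context of a program is the
primary one and that every context gets its own PASID.

```c
/*
 * Toy user-space model of the layout in Figure 2. This is NOT kernel code:
 * every name below (struct ctx, ctx_create, ...) is hypothetical and only
 * mirrors the ideas described in this cover letter.
 */
#include <assert.h>

#define MAX_CTX 8

struct ctx {                 /* stands in for a kfd_process */
	int in_use;
	int primary;         /* 1 for the first context of a program */
	int owner_mm;        /* stands in for the owning mm_struct */
	unsigned int pasid;  /* one PASID, and so one GPU VM, per context */
};

static struct ctx ctx_table[MAX_CTX];
static unsigned int next_pasid = 1;

/* Return the slot of a program's primary context, or -1 if it has none. */
static int ctx_find_primary(int owner_mm)
{
	for (int i = 0; i < MAX_CTX; i++)
		if (ctx_table[i].in_use && ctx_table[i].primary &&
		    ctx_table[i].owner_mm == owner_mm)
			return i;
	return -1;
}

/*
 * Model "open /dev/kfd and create a context". The returned slot index plays
 * the role of the fd. The first context of a program is marked primary,
 * later ones are secondary. Each context gets a fresh PASID.
 */
static int ctx_create(int owner_mm)
{
	int primary = ctx_find_primary(owner_mm) < 0;

	for (int i = 0; i < MAX_CTX; i++) {
		if (ctx_table[i].in_use)
			continue;
		ctx_table[i].in_use = 1;
		ctx_table[i].primary = primary;
		ctx_table[i].owner_mm = owner_mm;
		ctx_table[i].pasid = next_pasid++;
		return i;
	}
	return -1; /* table full */
}

/* Model closing the fd: the context behind it is destroyed. */
static void ctx_close(int fd)
{
	assert(fd >= 0 && fd < MAX_CTX && ctx_table[fd].in_use);
	ctx_table[fd].in_use = 0;
}
```

In this model a QEMU-like program would call ctx_create() once per guest
program. A lookup by owner always lands on the primary context no matter how
many secondary contexts exist, and closing a secondary "fd" leaves the
primary context untouched.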
Zhu Lingshan (9):
amdkfd: enlarge the hashtable of kfd_process
amdkfd: mark the first kfd_process as the primary one
amdkfd: find_process_by_mm always return the primary context
amdkfd: Introduce kfd_create_process_sysfs as a separate function
amdkfd: destroy kfd secondary contexts through fd close
amdkfd: process svm ioctl only on the primary kfd process
amdkfd: process USERPTR allocation only on the primary kfd process
amdkfd: identify a secondary kfd process by its id
amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 62 ++++++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 14 +-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 204 +++++++++++++++++------
include/uapi/linux/kfd_ioctl.h | 8 +-
4 files changed, 231 insertions(+), 57 deletions(-)
--
2.47.1