[PATCH V2 00/10] amdkfd: Implement kfd multiple contexts
Zhu Lingshan
lingshan.zhu at amd.com
Fri Aug 1 08:55:46 UTC 2025
Currently kfd manages kfd_process in a one-context-per-program
manner, so each user space program owns only one kfd context
(kfd_process).
This model works fine for most programs, but it falls short
for a hypervisor like QEMU, because all programs in the guest
user space share the hypervisor's single kfd context. This is
problematic for reasons including, but not limited to, the following:
As illustrated in Figure 1, all guest user space programs share the same fd
of /dev/kfd, the same kfd_process, and the same PASID, which leads to the
same GPU_VM address space. Therefore the IOVA ranges of the guest user
space programs are not isolated from each other, and the programs can
attack each other through GPU DMA.
+----------------------------------------------------------------------------------+
|                                                                                  |
|  +-----------+      +-----------+      +------------+      +------------+        |
|  |           |      |           |      |            |      |            |        |
|  | Program 1 |      | Program 2 |      | Program 3  |      | Program N  |        |
|  |           |      |           |      |            |      |            |        |
|  +----+------+      +--------+--+      +--+---------+      +-----+------+        |
|       |                      |            |                      |               |
|       |                      |            |                      |      Guest    |
|       |                      |            |                      |               |
+-------+----------------------+------------+----------------------+---------------+
        |                      |            |                      |
        |                      |            |                      |
        |                      |            |                      |
        |                      |            |                      |
        |                   +--+------------+---+                  |
        |                   | file descriptor   |                  |
        +-------------------+    of /dev/kfd    +------------------+
                            | opened by QEMU    |
                            |                   |
                            +---------+---------+            User Space
                                      |                      QEMU
                                      |
--------------------------------------+-----------------------------------------------------
                                      |                      Kernel Space
                                      |                      KFD Module
                                      |
                             +--------+--------+
                             |                 |
                             |   kfd_process   |<------------------The only one KFD context
                             |                 |
                             +--------+--------+
                                      |
                             +--------+--------+
                             |      PASID      |
                             +--------+--------+
                                      |
                             +--------+--------+
                             |     GPU_VM      |
                             +-----------------+
Figure 1
This series implements a multiple-context solution:
- Allow each program to create and hold multiple contexts (kfd processes).
- Each context has its own fd of /dev/kfd and an exclusive kfd_process,
  which is a secondary kfd context. Each context therefore gets its own
  PASID and GPU VM, which isolate its IOVA address space from the others,
  so the contexts cannot attack each other through GPU DMA.
The design is illustrated in Figure 2 below:
   +---------------------------------------------------------------------------------------------------------+
   |                                                                                                         |
   |                                                                                                         |
   |                                                                                                         |
   |       +----------------------------------------------------------------------------------+              |
   |       |                                                                                  |              |
   |       | +-----------+      +-----------+     +-----------+    +-----------+              |              |
   |       | |           |      |           |     |           |    |           |              |              |
   |       | | Program 1 |      | Program 2 |     | Program 3 |    | Program N |              |              |
   |       | |           |      |           |     |           |    |           |              |              |
   |       | +-----+-----+      +-----+-----+     +-----+-----+    +-----+-----+              |              |
   |       |       |                  |                 |                |                    |              |
   |       |       |                  |                 |                |       Guest        |              |
   |       |       |                  |                 |                |                    |              |
   |       +-------+------------------+-----------------+----------------+--------------------+              |
   |               |                  |                 |                |              QEMU                 |
   |               |                  |                 |                |                                   |
   +---------------+------------------+-----------------+----------------+--------------------------+--------+
                   |                  |                 |                |                          |
                   |                  |                 |                |                          |
                   |                  |                 |                |                          |
               +---+----+         +---+----+        +---+----+       +---+----+                 +---+-----+
               |        |         |        |        |        |       |        |                 | Primary |
               |  FD 1  |         |  FD 2  |        |  FD 3  |       |  FD 4  |                 |   FD    |
               |        |         |        |        |        |       |        |                 |         |
               +---+----+         +---+----+        +---+----+       +----+---+                 +----+----+
                   |                  |                 |                 |                          |          User Space
                   |                  |                 |                 |                          |
-------------------+------------------+-----------------+-----------------+--------------------------+----------------------------
                   |                  |                 |                 |                          |          Kernel Space
                   |                  |                 |                 |                          |
                   |                  |                 |                 |                          |
+--------------------------------------------------------------------------------------------------------------------------+
|           +------+------+    +------+------+   +------+------+   +------+------+            +------+------+              |
|           |  Secondary  |    |  Secondary  |   |  Secondary  |   |  Secondary  |            |   Primary   |  KFD Module  |
|           |kfd_process 1|    |kfd_process 2|   |kfd_process 3|   |kfd_process 4|            | kfd_process |              |
|           |             |    |             |   |             |   |             |            |             |              |
|           +------+------+    +------+------+   +------+------+   +------+------+            +------+------+              |
|                  |                  |                 |                 |                          |                     |
|           +------+------+    +------+------+   +------+------+   +------+------+            +------+------+              |
|           |    PASID    |    |    PASID    |   |    PASID    |   |    PASID    |            |    PASID    |              |
|           +------+------+    +------+------+   +------+------+   +------+------+            +------+------+              |
|                  |                  |                 |                 |                          |                     |
|                  |                  |                 |                 |                          |                     |
|           +------+------+    +------+------+   +------+------+   +------+------+            +------+------+              |
|           |   GPU_VM    |    |   GPU_VM    |   |   GPU_VM    |   |   GPU_VM    |            |   GPU_VM    |              |
|           +-------------+    +-------------+   +-------------+   +-------------+            +-------------+              |
|                                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------+
Figure 2
Appendix, a minimal test program:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>

int main(void)
{
	int fd1, fd2, ret;

	// open FDs; each getchar() waits for Enter before the next step
	fd1 = open("/dev/kfd", O_RDWR);
	if (fd1 < 0) {
		perror("Failed to open FD1 /dev/kfd");
		return EXIT_FAILURE;
	}
	printf("FD1 == %d /dev/kfd opened successfully\n", fd1);
	getchar();

	fd2 = open("/dev/kfd", O_RDWR);
	if (fd2 < 0) {
		perror("Failed to open FD2 /dev/kfd");
		close(fd1);
		return EXIT_FAILURE;
	}
	printf("FD2 == %d /dev/kfd opened successfully\n", fd2);
	getchar();

	// create a new secondary context
	ret = ioctl(fd2, AMDKFD_IOC_CREATE_PROCESS);
	if (ret < 0)
		perror("AMDKFD_IOC_CREATE_PROCESS failed");
	printf("AMDKFD_IOC_CREATE_PROCESS ioctl ret = %d\n", ret);
	getchar();

	// close FDs
	close(fd2);
	getchar();
	close(fd1);
	getchar();

	return ret < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}
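As a rough sketch of how a VMM might wrap the steps of the test program
above (this is not part of the series): the helper below, with the
hypothetical name kfd_open_secondary_context(), opens the device node and
issues AMDKFD_IOC_CREATE_PROCESS without an argument, as the test program
does. The device path is a parameter only so that the error path can be
exercised on a machine without a GPU. With uapi headers that predate this
series the ioctl macro is undefined, so the sketch fails with ENOTSUP
rather than guessing an ioctl number.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>

/*
 * Hypothetical helper, not part of the series.
 * Returns an fd backed by its own secondary kfd context,
 * or -1 with errno set.
 */
int kfd_open_secondary_context(const char *kfd_path)
{
	int fd, saved_errno;

	fd = open(kfd_path, O_RDWR);
	if (fd < 0)
		return -1;

#ifdef AMDKFD_IOC_CREATE_PROCESS
	/* Defined only by headers that include this series. */
	if (ioctl(fd, AMDKFD_IOC_CREATE_PROCESS) == 0)
		return fd;
#else
	/* Assumption: older headers, so the ioctl cannot be issued. */
	errno = ENOTSUP;
#endif

	/* Do not leak the fd on failure; preserve the original errno. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

A hypervisor could call this once per guest program and keep the returned
fd for that program's lifetime. Per patch 05 of this series, closing the fd
is what destroys the secondary context.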
Zhu Lingshan (10):
  amdkfd: enlarge the hashtable of kfd_process
  amdkfd: mark the first kfd_process as the primary one
  amdkfd: find_process_by_mm always return the primary context
  amdkfd: Introduce kfd_create_process_sysfs as a separate function
  amdkfd: destroy kfd secondary contexts through fd close
  amdkfd: process svm ioctl only on the primary kfd process
  amdkfd: process USERPTR allocation only on the primary kfd process
  amdkfd: identify a secondary kfd process by its id
  amdkfd: decommission kfd_get_process and remove DIQ support
  amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  73 +++++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  60 +----
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c    |   4 -
 .../drm/amd/amdkfd/kfd_packet_manager_vi.c    |   4 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  15 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c      | 223 ++++++++++++------
 .../amd/amdkfd/kfd_process_queue_manager.c    |  35 +--
 include/uapi/linux/kfd_ioctl.h                |   8 +-
 9 files changed, 248 insertions(+), 180 deletions(-)
--
2.47.1