[RFC PATCH 0/4] uapi, drm: Add and implement RLIMIT_GPUPRIO

Christian König christian.koenig at amd.com
Mon Apr 3 19:54:53 UTC 2023


On 03.04.23 at 21:40, Joshua Ashton wrote:
> Hello all!
>
> I would like to propose a new API for allowing processes to control
> the priority of GPU queues similar to RLIMIT_NICE/RLIMIT_RTPRIO.
>
> The main reason for this is for compositors such as Gamescope and
> SteamVR vrcompositor to be able to create realtime async compute
> queues on AMD without the need of CAP_SYS_NICE.
>
> The current situation is bad for a few reasons, one being that in order
> to setcap the executable, typically one must run as root, which involves
> a pretty high privilege escalation in order to achieve one
> small feat: a realtime async compute queue for VR or a compositor.
> The executable cannot be setcap'ed inside a
> container, nor can the setcap'ed executable be run in a container with
> NO_NEW_PRIVS.
>
> I go into more detail in the description in
> `uapi: Add RLIMIT_GPUPRIO`.
>
> My initial proposal here is to add a new RLIMIT, `RLIMIT_GPUPRIO`,
> which initially seems to me the most sensible way to solve the problem.
>
> I am definitely not set on this being the best formulation, however,
> or on whether this should be linked to DRM (in terms of its scheduler
> priority enum/definitions) in any way, and would really like other
> people's opinions across the stack on this.
>
> One initial concern is that potentially this RLIMIT could outlive
> the lifespan of DRM. It sounds crazy saying it right now, but it is something
> that definitely popped into my mind when touching `resource.h`. :-)
>
> Anyway, please let me know what you think!
> Definitely open to any feedback and advice you may have. :D

Well, the basic problem is that higher priority queues can be used to
starve low priority queues.

This starvation in turn is very bad for memory management, since the
dma_fences the GPU scheduler deals with have very strong restrictions.

Even exposing this under CAP_SYS_NICE is questionable, so we will most 
likely have to NAK this.

Regards,
Christian.

>
> Thanks!
>   - Joshie
>
> Joshua Ashton (4):
>    drm/scheduler: Add DRM_SCHED_PRIORITY_VERY_HIGH
>    drm/scheduler: Split out drm_sched_priority to own file
>    uapi: Add RLIMIT_GPUPRIO
>    drm/amd/amdgpu: Check RLIMIT_GPUPRIO in priority permissions
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 13 ++++++--
>   drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
>   fs/proc/base.c                          |  1 +
>   include/asm-generic/resource.h          |  3 +-
>   include/drm/drm_sched_priority.h        | 41 +++++++++++++++++++++++++
>   include/drm/gpu_scheduler.h             | 14 +--------
>   include/uapi/asm-generic/resource.h     |  3 +-
>   7 files changed, 58 insertions(+), 19 deletions(-)
>   create mode 100644 include/drm/drm_sched_priority.h
>


