[RFC] Exclusive gpu access for SteamVR usecases
Christian König
deathsimple at vodafone.de
Tue May 30 15:19:53 UTC 2017
Looks like a good start, but a few notes in general:
1. Split the patches into two sets: one for implementing the priority
changes and one for limiting the minimum priority.
2. How are the per-process priorities supposed to interact with the
per-context priority?
3. Thinking about it some more, we can't simply limit the minimum priority
in the scheduler.
For example, a low-priority job might hold resources that a high-priority
job needs to run, e.g. VRAM memory.
We would need something like blocking the submitter instead (bad) or
detecting such dependencies in the scheduler (good, but tricky to implement).
Otherwise we can easily run into a deadlock with that approach.
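For illustration only, a very rough sketch of what such dependency detection
could look like when the scheduler decides whether to hold a job back. The
struct and field names here are hypothetical, not the real amd scheduler
structures:

    #include <linux/list.h>
    #include <linux/types.h>

    /* Hypothetical job representation for the sketch. */
    struct example_sched_job {
        int32_t priority;
        struct list_head dependents; /* jobs whose fences wait on this one */
        struct list_head dep_node;   /* link into another job's dependents */
    };

    /* Does a job at or above the threshold wait on this job's fence? */
    static bool job_blocks_high_prio(struct example_sched_job *job,
                                     int32_t min_priority)
    {
        struct example_sched_job *waiter;

        list_for_each_entry(waiter, &job->dependents, dep_node)
            if (waiter->priority >= min_priority)
                return true;
        return false;
    }

    /* Only hold back a below-threshold job if nothing above the threshold
     * depends on it; otherwise we run into exactly the deadlock above. */
    static bool job_is_runnable(struct example_sched_job *job,
                                int32_t min_priority)
    {
        return job->priority >= min_priority ||
               job_blocks_high_prio(job, min_priority);
    }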
Regards,
Christian.
On 25.05.2017 at 02:00, Andres Rodriguez wrote:
> When multiple environments are running simultaneously on a system, e.g.
> an X desktop + a SteamVR game session, it may be useful to sacrifice
> performance in one environment in order to boost it on the other.
>
> This series provides a mechanism for a DRM_MASTER to provide exclusive
> gpu access to a group of processes.
>
> Note: This series is built on the assumption that the drm lease patch series
> will extend DRM_MASTER status to lessees.
>
> The libdrm interface we intend to provide is as follows:
>
> /**
> * Set the priority of all contexts in a process
> *
> * This function will change the priority of all contexts owned by
> * the process identified by fd.
> *
> * \param dev - \c [in] device handle
> * \param fd - \c [in] fd from target process
> * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
> *
> * \return 0 on success\n
> * <0 - Negative POSIX error code
> *
> * \notes @fd can be *any* file descriptor from the target process.
> * \notes this function requires DRM_MASTER
> */
> int amdgpu_sched_process_priority_set(amdgpu_device_handle dev,
> int fd, int32_t priority);
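>
> As a rough usage sketch (not part of the series): assuming the compositor has
> already received some file descriptor from the VR app, e.g. over a Unix domain
> socket, and that AMDGPU_CTX_PRIORITY_HIGH is one of the proposed
> AMDGPU_CTX_PRIORITY_* levels, the call could look like this:
>
>     #include <stdio.h>
>     #include <amdgpu.h>
>
>     /* Hypothetical helper: raise every context of the VR app to high
>      * priority, using any fd received from that process. */
>     static int raise_vrapp_priority(amdgpu_device_handle dev, int vrapp_fd)
>     {
>         int r = amdgpu_sched_process_priority_set(dev, vrapp_fd,
>                                                   AMDGPU_CTX_PRIORITY_HIGH);
>         if (r < 0)
>             fprintf(stderr, "failed to raise VR app priority: %d\n", r);
>         return r;
>     }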
>
> /**
> * Request to raise the minimum required priority to schedule a gpu job
> *
> * Submit a request to increase the minimum required priority to schedule
> * a gpu job. Once this function returns, the gpu scheduler will no longer
> * consider jobs from contexts with priority lower than @priority.
> *
> * The minimum priority considered by the scheduler will be the highest from
> * all currently active requests.
> *
> * Requests are refcounted, and must be balanced using
> * amdgpu_sched_min_priority_put()
> *
> * \param dev - \c [in] device handle
> * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
> *
> * \return 0 on success\n
> * <0 - Negative POSIX error code
> *
> * \notes this function requires DRM_MASTER
> */
> int amdgpu_sched_min_priority_get(amdgpu_device_handle dev,
> int32_t priority);
>
> /**
> * Drop a request to raise the minimum required scheduler priority
> *
> * This call balances amdgpu_sched_min_priority_get()
> *
> * If no other active requests exists for @priority, the minimum required
> * priority will decay to a lower level until one is reached with an active
> * request or the lowest priority is reached.
> *
> * \param dev - \c [in] device handle
> * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
> *
> * \return 0 on success\n
> * <0 - Negative POSIX error code
> *
> * \notes this function requires DRM_MASTER
> */
> int amdgpu_sched_min_priority_put(amdgpu_device_handle dev,
> int32_t priority);
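>
> And a sketch of how the refcounted pair might be balanced around an exclusive
> session (submit_vr_frames() is a hypothetical stand-in for the compositor's
> render loop; error handling omitted):
>
>     /* While the request is held, the scheduler ignores contexts below HIGH. */
>     static void run_exclusive_session(amdgpu_device_handle dev)
>     {
>         amdgpu_sched_min_priority_get(dev, AMDGPU_CTX_PRIORITY_HIGH);
>
>         submit_vr_frames(); /* hypothetical render loop at high priority */
>
>         /* Drop our request; the minimum decays once no other active
>          * request remains at this level. */
>         amdgpu_sched_min_priority_put(dev, AMDGPU_CTX_PRIORITY_HIGH);
>     }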
>
> Using this API, VRComposer can raise the priority of the VRapp and of itself.
> It can then restrict the minimum scheduler priority so that the two become
> exclusive gpu clients.
>
> One of the areas I'd like feedback is the following scenario. If a VRapp opens
> a new fd and creates a new context after a call to set_priority, this specific
> context will be lower priority than the rest. If the minimum required priority
> is then raised, it is possible that this new context will be starved and
> deadlock the VRapp.
>
> One solution I had in mind to address this situation is to make set_priority
> also raise the priority of future contexts created by the VRapp. However, that
> would require keeping track of the requested priority in a per-process data
> structure. The current design appears to steer clear of keeping any
> process-specific data; everything is instead stored on a per-file basis, which
> is why I did not pursue this approach. But if this is something you'd like me
> to implement, let me know.
>
> One could also argue that preventing an application deadlock should be handled
> between the VRComposer and the VRApp. It is not the kernel's responsibility to
> babysit userspace applications and prevent them from shooting themselves in
> the foot. The same result could be achieved by improper usage of shared fences
> between processes.
>
> Thoughts/feedback/comments on this issue, or others, are appreciated.
>
> Regards,
> Andres
>
>