Add support for high priority scheduling in amdgpu

Bridgman, John John.Bridgman at
Wed Mar 1 16:14:37 UTC 2017

In patch "drm/amdgpu: implement ring set_priority for gfx_v8 compute", can you remind me why you are only passing pipe and not queue to vi_srbm_select()?

+static void gfx_v8_0_ring_set_priority_compute(struct amdgpu_ring *ring,
+					       int priority)
+{
+	struct amdgpu_device *adev = ring->adev;
+	if (ring->hw_ip != AMDGPU_HW_IP_COMPUTE)
+		return;
+	mutex_lock(&adev->srbm_mutex);
+	vi_srbm_select(adev, ring->me, ring->pipe, 0, 0);
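
To illustrate why the queue argument can matter, here is a toy model of banked register selection. It is not driver code. It assumes, purely for illustration, that the register being programmed is banked per (pipe, queue). The names toy_srbm, toy_srbm_select() and toy_write_queue_priority() are hypothetical, and the me and vmid arguments of vi_srbm_select() are left out for brevity. Whether the registers touched by the patch are banked per pipe or per queue is the open question here.

```c
#include <assert.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define TOY_NUM_PIPES  4
#define TOY_NUM_QUEUES 8

/* Toy model: one register instance per (pipe, queue) pair. */
struct toy_srbm {
	unsigned int pipe;   /* currently selected pipe */
	unsigned int queue;  /* currently selected queue */
	int queue_priority[TOY_NUM_PIPES][TOY_NUM_QUEUES];
};

/* Choose which register instance later writes will hit. */
static void toy_srbm_select(struct toy_srbm *s, unsigned int pipe,
			    unsigned int queue)
{
	assert(pipe < TOY_NUM_PIPES && queue < TOY_NUM_QUEUES);
	s->pipe = pipe;
	s->queue = queue;
}

/* Write the (assumed) per-queue register of the current selection. */
static void toy_write_queue_priority(struct toy_srbm *s, int value)
{
	s->queue_priority[s->pipe][s->queue] = value;
}
```

Under that assumption, selecting (pipe, 0) always programs queue 0 of the pipe, regardless of which queue the ring actually uses.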

>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces at] On Behalf Of
>Andres Rodriguez
>Sent: Tuesday, February 28, 2017 5:14 PM
>To: amd-gfx at
>Subject: Add support for high priority scheduling in amdgpu
>
>This patch series introduces a mechanism that allows users with sufficient
>privileges to categorize their work as "high priority". A userspace app can
>create a high priority amdgpu context, where any work submitted to this
>context will receive preferential treatment over any other work.
>
>High priority contexts will be scheduled ahead of other contexts by the SW GPU
>scheduler. This functionality is generic for all HW blocks.
>
>Optionally, a ring can implement a set_priority() function that allows
>programming HW specific features to elevate a ring's priority.
>
>This patch series implements set_priority() for gfx8 compute rings. It takes
>advantage of SPI scheduling and CU reservation to provide improved frame
>latencies for high priority contexts.
>
>For compute + compute scenarios we get near perfect scheduling latency. E.g.
>one high priority ComputeParticles + one low priority ComputeParticles:
>    - High priority ComputeParticles: 2.0-2.6 ms/frame
>    - Regular ComputeParticles: 35.2-68.5 ms/frame
>
>For compute + gfx scenarios the high priority compute application does
>experience some latency variance. However, the variance has smaller bounds
>and a smaller deviation than without high priority scheduling.
>
>Following is a graph of the frame time experienced by a high priority compute
>app in 4 different scenarios to exemplify the compute + gfx latency variance:
>    - ComputeParticles: this scenario involves running the compute particles
>      sample on its own.
>    - +SSAO: Previous scenario with the addition of running the ssao sample
>      application that clogs the GFX ring with constant work.
>    - +SPI Priority: Previous scenario with the addition of SPI priority
>      programming for compute rings.
>    - +CU Reserve: Previous scenario with the addition of dynamic CU
>      reservation for compute rings.
>
>Graph link:
>
>As seen above, high priority contexts for compute allow us to schedule work
>with enhanced confidence of completion latency under high GPU loads. This
>property will be important for VR reprojection workloads.
>
>Note: The first part of this series is a resend of "Change queue/pipe split
>between amdkfd and amdgpu" with the following changes:
>    - Fixed kfdtest on Kaveri due to shift overflow. Refer to: "drm/amdkfd:
>      allow split HQD on per-queue granularity v3"
>    - Used Felix's suggestions for a simplified HQD programming sequence
>    - Added a workaround for a Tonga HW bug during HQD programming
>
>This series is also available at:
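
The scheduling model described in the quoted cover letter can be sketched as follows. This is a minimal, self-contained toy and not the amdgpu scheduler: every name (toy_ring, toy_pick_next(), toy_compute_set_priority(), the TOY_PRIORITY_* levels) is hypothetical. It assumes two priority levels and that the optional hook is invoked when a job is picked, neither of which is confirmed by the series itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priority levels; a lower value is served first. */
enum toy_priority {
	TOY_PRIORITY_HIGH = 0,
	TOY_PRIORITY_NORMAL = 1,
	TOY_PRIORITY_COUNT = 2
};

struct toy_ring;

/* Optional per-ring hook, in the spirit of set_priority(). */
typedef void (*toy_set_priority_fn)(struct toy_ring *ring,
				    enum toy_priority prio);

struct toy_ring {
	toy_set_priority_fn set_priority; /* NULL: no HW priority support */
	enum toy_priority hw_priority;    /* last value given to the hook */
	int pending[TOY_PRIORITY_COUNT];  /* queued jobs per priority level */
};

/* Example hook: only records the requested priority. */
static void toy_compute_set_priority(struct toy_ring *ring,
				     enum toy_priority prio)
{
	ring->hw_priority = prio;
}

/*
 * Pick the next job: high priority work always goes first. If the ring
 * provides a hook, it is called so HW specific state can follow the job.
 * Returns the priority level of the picked job, or -1 if nothing is queued.
 */
static int toy_pick_next(struct toy_ring *ring)
{
	assert(ring != NULL);
	for (int p = 0; p < TOY_PRIORITY_COUNT; p++) {
		if (ring->pending[p] > 0) {
			ring->pending[p]--;
			if (ring->set_priority)
				ring->set_priority(ring, (enum toy_priority)p);
			return p;
		}
	}
	return -1;
}
```

A ring that leaves set_priority as NULL still gets the software ordering, which mirrors the split between the generic scheduler change and the optional HW specific part.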
