[PATCH] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly

Andres Rodriguez andresx7 at gmail.com
Thu Jul 13 18:36:18 UTC 2017


On 2017-07-12 02:26 PM, Jay Cornwall wrote:
> The number of compute queues available to the KFD was erroneously
> calculated as 64. Only the first MEC can execute compute queues and
> it has 32 queue slots.
> 
> This caused the oversubscription limit to be calculated incorrectly,
> leading to a missing chained runlist command at the end of an
> oversubscribed runlist.
> 
> Change-Id: Ic4a139c04b8a6d025fbb831a0a67e98728bfe461
> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 7060daf..aa4006a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   		/* According to linux/bitmap.h we shouldn't use bitmap_clear if
>   		 * nbits is not compile time constant
>   		 */
> -		last_valid_bit = adev->gfx.mec.num_mec
> +		last_valid_bit = 1 /* only first MEC can have compute queues */

Hey Jay,

Minor nitpick: we already do similar resource patching in
kgd2kfd_device_init(), and I think it would be good to keep all of it
together.

Otherwise, looks good to me.

Regards,
Andres

>   				* adev->gfx.mec.num_pipe_per_mec
>   				* adev->gfx.mec.num_queue_per_pipe;
>   		for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
> 
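
For reference, the arithmetic behind the change can be sketched in a few lines of standalone C. This is only an illustration, not the driver code: struct mec_layout, SKETCH_MAX_QUEUES and the helper names are hypothetical, and the 2 MEC x 4 pipes x 8 queues layout is an assumption chosen to reproduce the 64 and 32 figures from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for KGD_MAX_QUEUES; the real value lives in the kernel headers. */
#define SKETCH_MAX_QUEUES 128

/* Hypothetical mirror of the adev->gfx.mec fields used in the patch. */
struct mec_layout {
	unsigned int num_mec;
	unsigned int num_pipe_per_mec;
	unsigned int num_queue_per_pipe;
};

/* Old calculation: counts queue slots on every MEC. */
static unsigned int queues_all_mecs(const struct mec_layout *m)
{
	return m->num_mec * m->num_pipe_per_mec * m->num_queue_per_pipe;
}

/* Patched calculation: only the first MEC can have compute queues. */
static unsigned int queues_first_mec(const struct mec_layout *m)
{
	return 1 * m->num_pipe_per_mec * m->num_queue_per_pipe;
}

/* Mark every slot at or beyond last_valid_bit as unusable, similar in
 * spirit to the loop that follows the changed line in the patch.
 */
static void mark_usable_queues(bool usable[SKETCH_MAX_QUEUES],
			       unsigned int last_valid_bit)
{
	unsigned int i;

	for (i = 0; i < SKETCH_MAX_QUEUES; ++i)
		usable[i] = i < last_valid_bit;
}
```

With the assumed layout, queues_all_mecs() gives 64 and queues_first_mec() gives 32, so the patched bound leaves only slots 0..31 marked usable.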

