[PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly

Alex Deucher alexdeucher at gmail.com
Fri Jul 14 16:24:30 UTC 2017


On Thu, Jul 13, 2017 at 9:21 PM, Jay Cornwall <Jay.Cornwall at amd.com> wrote:
> The number of compute queues available to the KFD was erroneously
> calculated as 64. Only the first MEC can execute compute queues and
> it has 32 queue slots.
>
> This caused the oversubscription limit to be calculated incorrectly,
> leading to a missing chained runlist command at the end of an
> oversubscribed runlist.
>
> v2: Remove unused num_mec field to avoid duplicate logic
> v3: Split num_mec removal into separate patches
>
> Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>

Series is:
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 7060daf..aa4006a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>                 /* According to linux/bitmap.h we shouldn't use bitmap_clear if
>                  * nbits is not compile time constant
>                  */
> -               last_valid_bit = adev->gfx.mec.num_mec
> +               last_valid_bit = 1 /* only first MEC can have compute queues */
>                                 * adev->gfx.mec.num_pipe_per_mec
>                                 * adev->gfx.mec.num_queue_per_pipe;
>                 for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
> --
> 2.7.4
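
For readers following the arithmetic in the commit message, here is a minimal sketch of the before and after calculation. It is not the driver code: struct mec_config and both helper functions are hypothetical stand-ins for the adev->gfx.mec fields used in the hunk above, and the example values (2 MECs, 4 pipes per MEC, 8 queues per pipe) are an assumption chosen only because they reproduce the 64 and 32 figures quoted in the description.

```c
/* Hypothetical stand-in for the adev->gfx.mec fields used in the patch. */
struct mec_config {
	unsigned int num_mec;
	unsigned int num_pipe_per_mec;
	unsigned int num_queue_per_pipe;
};

/* Pre-patch calculation: counts queue slots on every MEC. */
unsigned int queues_all_mecs(const struct mec_config *c)
{
	return c->num_mec * c->num_pipe_per_mec * c->num_queue_per_pipe;
}

/* Post-patch calculation: only the first MEC can have compute queues. */
unsigned int queues_first_mec(const struct mec_config *c)
{
	return 1 * c->num_pipe_per_mec * c->num_queue_per_pipe;
}
```

With the assumed configuration {2, 4, 8}, queues_all_mecs() gives 64 and queues_first_mec() gives 32, so the loop in the hunk starts clearing bits at slot 32 instead of slot 64.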
