[PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly

Deucher, Alexander Alexander.Deucher at amd.com
Tue Jul 18 12:46:34 UTC 2017


> -----Original Message-----
> From: Kuehling, Felix
> Sent: Tuesday, July 18, 2017 12:41 AM
> To: amd-gfx at lists.freedesktop.org; Deucher, Alexander
> Subject: Re: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by
> tracking queues correctly
> 
> Hi Alex,
> 
> This patch series went into amd-kfd-staging. I'd like to also push it
> into amd-staging-4.11 as I'm just working to minimize any unnecessary
> differences between the branches before the big KFD history rework.
> 
> I rebased it, resolved some conflicts, and removed the declaration of
> get_mec_num from kfd_device_queue_manager.h. Do you want me to push
> that rebased patch series?

Sure.  Sounds good.

Alex

> 
> Thanks,
>   Felix
> 
> 
> On 17-07-17 11:52 AM, Oded Gabbay wrote:
> > On Fri, Jul 14, 2017 at 7:24 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
> >> On Thu, Jul 13, 2017 at 9:21 PM, Jay Cornwall <Jay.Cornwall at amd.com> wrote:
> >>> The number of compute queues available to the KFD was erroneously
> >>> calculated as 64. Only the first MEC can execute compute queues and
> >>> it has 32 queue slots.
> >>>
> >>> This caused the oversubscription limit to be calculated incorrectly,
> >>> leading to a missing chained runlist command at the end of an
> >>> oversubscribed runlist.
> >>>
> >>> v2: Remove unused num_mec field to avoid duplicate logic
> >>> v3: Separate num_mec removal into separate patches
> >>>
> >>> Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
> >>> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
> >> Series is:
> >> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> >>
> > Hi Jay,
> > Thanks for the patches, I applied them to amdkfd-fixes (after rebasing
> > them over 4.13-rc1)
> >
> > Oded
> >
> >>> ---
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>> index 7060daf..aa4006a 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>> @@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
> >>>                 /* According to linux/bitmap.h we shouldn't use bitmap_clear if
> >>>                  * nbits is not compile time constant
> >>>                  */
> >>> -               last_valid_bit = adev->gfx.mec.num_mec
> >>> +               last_valid_bit = 1 /* only first MEC can have compute queues */
> >>>                                 * adev->gfx.mec.num_pipe_per_mec
> >>>                                 * adev->gfx.mec.num_queue_per_pipe;
> >>>                 for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
> >>> --
> >>> 2.7.4
> >>>
> >>> _______________________________________________
> >>> amd-gfx mailing list
> >>> amd-gfx at lists.freedesktop.org
> >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
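
For readers following the arithmetic in the quoted commit message, here is a minimal sketch of the two calculations. It is not the driver code. The struct, the function names, and the example values (2 MECs, 4 pipes per MEC, 8 queues per pipe) are assumptions, chosen only so that the products match the 64 and 32 figures quoted above.

```c
/* Hypothetical stand-in for the adev->gfx.mec fields used in the patch. */
struct mec_config {
	unsigned int num_mec;
	unsigned int num_pipe_per_mec;
	unsigned int num_queue_per_pipe;
};

/* Pre-patch calculation: counts queue slots on every MEC. */
static unsigned int queue_slots_all_mecs(const struct mec_config *c)
{
	return c->num_mec * c->num_pipe_per_mec * c->num_queue_per_pipe;
}

/* Patched calculation: only the first MEC can have compute queues. */
static unsigned int queue_slots_first_mec(const struct mec_config *c)
{
	return 1 * c->num_pipe_per_mec * c->num_queue_per_pipe;
}
```

With the assumed values, the pre-patch product is 2 * 4 * 8 = 64 and the patched product is 1 * 4 * 8 = 32. The patched value is what the diff assigns to last_valid_bit, the starting index of the loop that runs up to KGD_MAX_QUEUES.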


