[PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly

Felix Kuehling felix.kuehling at amd.com
Tue Jul 18 04:41:04 UTC 2017


Hi Alex,

This patch series went into amd-kfd-staging. I'd also like to push it
into amd-staging-4.11, since I'm working to minimize unnecessary
differences between the branches before the big KFD history rework.

I rebased it, resolved some conflicts, and removed the declaration of
get_mec_num from kfd_device_queue_manager.h. Do you want me to push that
rebased patch series?

Thanks,
  Felix


On 17-07-17 11:52 AM, Oded Gabbay wrote:
> On Fri, Jul 14, 2017 at 7:24 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
>> On Thu, Jul 13, 2017 at 9:21 PM, Jay Cornwall <Jay.Cornwall at amd.com> wrote:
>>> The number of compute queues available to the KFD was erroneously
>>> calculated as 64. Only the first MEC can execute compute queues and
>>> it has 32 queue slots.
>>>
>>> This caused the oversubscription limit to be calculated incorrectly,
>>> leading to a missing chained runlist command at the end of an
>>> oversubscribed runlist.
>>>
>>> v2: Remove unused num_mec field to avoid duplicate logic
>>> v3: Separate num_mec removal into separate patches
>>>
>>> Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
>>> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
>> Series is:
>> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>>
> Hi Jay,
> Thanks for the patches, I applied them to amdkfd-fixes (after rebasing
> them over 4.13-rc1)
>
> Oded
>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 7060daf..aa4006a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>                 /* According to linux/bitmap.h we shouldn't use bitmap_clear if
>>>                  * nbits is not compile time constant
>>>                  */
>>> -               last_valid_bit = adev->gfx.mec.num_mec
>>> +               last_valid_bit = 1 /* only first MEC can have compute queues */
>>>                                 * adev->gfx.mec.num_pipe_per_mec
>>>                                 * adev->gfx.mec.num_queue_per_pipe;
>>>                 for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
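
To make the arithmetic in the commit message concrete, here is a minimal
user-space sketch of the queue count before and after the patch. It is not
the driver code: struct mec_config, MAX_QUEUES, the bool array standing in
for the kernel bitmap, and the helper names are all hypothetical. The
example split of 2 MECs x 4 pipes x 8 queues per pipe is an assumption,
chosen only because it reproduces the 64 and 32 figures quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stand-in for KGD_MAX_QUEUES; the real value is not shown in the patch. */
#define MAX_QUEUES 128

/* Hypothetical stand-in for the adev->gfx.mec fields used in the diff. */
struct mec_config {
	unsigned int num_mec;
	unsigned int num_pipe_per_mec;
	unsigned int num_queue_per_pipe;
};

/* Pre-patch calculation: counts queue slots on every MEC. */
static unsigned int queue_slots_all_mecs(const struct mec_config *c)
{
	return c->num_mec * c->num_pipe_per_mec * c->num_queue_per_pipe;
}

/* Patched calculation: only the first MEC can execute compute queues. */
static unsigned int queue_slots_first_mec(const struct mec_config *c)
{
	return 1 * c->num_pipe_per_mec * c->num_queue_per_pipe;
}

/*
 * Clear every slot at or beyond last_valid_bit, mirroring the loop that
 * follows the changed line in the diff. Returns how many slots remain set.
 */
static unsigned int mask_invalid_queues(bool bitmap[MAX_QUEUES],
					unsigned int last_valid_bit)
{
	unsigned int i, available = 0;

	assert(last_valid_bit <= MAX_QUEUES);

	for (i = last_valid_bit; i < MAX_QUEUES; ++i)
		bitmap[i] = false;

	for (i = 0; i < MAX_QUEUES; ++i)
		available += bitmap[i] ? 1 : 0;

	return available;
}
```

With the assumed configuration { 2, 4, 8 }, queue_slots_all_mecs() gives 64
and queue_slots_first_mec() gives 32. Passing the latter to
mask_invalid_queues() on a fully set array leaves only slots 0..31
available, which is the count the oversubscription limit should be based on.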