[PATCH 04/14] drm/amdkfd: Fix oversubscription accounting

Felix Kuehling felix.kuehling at amd.com
Tue Dec 5 19:27:17 UTC 2017


On 2017-12-05 03:10 AM, Oded Gabbay wrote:
> On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling at amd.com> wrote:
>> Don't count SDMA queues towards compute HQD oversubscription when
>> deciding whether to create a chained runlist.
>>
>> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> index 0b7092e..c3230b9 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> @@ -55,13 +55,14 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>                                 unsigned int *rlib_size,
>>                                 bool *over_subscription)
>>  {
>> -       unsigned int process_count, queue_count;
>> +       unsigned int process_count, queue_count, compute_queue_count;
>>         unsigned int map_queue_size;
>>         unsigned int max_proc_per_quantum = 1;
>>         struct kfd_dev *dev = pm->dqm->dev;
>>
>>         process_count = pm->dqm->processes_count;
>>         queue_count = pm->dqm->queue_count;
>> +       compute_queue_count = queue_count - pm->dqm->sdma_queue_count;
>>
>>         /* check if there is over subscription
>>          * Note: the arbitration between the number of VMIDs and
>> @@ -74,7 +75,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>                 max_proc_per_quantum = dev->max_proc_per_quantum;
>>
>>         if ((process_count > max_proc_per_quantum) ||
>> -           queue_count > get_queues_num(pm->dqm)) {
>> +           compute_queue_count > get_queues_num(pm->dqm)) {
>>                 *over_subscription = true;
>>                 pr_debug("Over subscribed runlist\n");
>>         }
>> --
>> 2.7.4
>>
> Don't you need to update this line as well (I'm less familiar with the
> runlist so just asking) ?
>
> *rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
> queue_count * map_queue_size;

No. This change doesn't directly affect the runlist size. It deals with
HW resource limitations and whether the HWS needs to handle compute
queue oversubscription. SDMA queues don't count against the limited
number of HQDs for compute queues. So we should not count them for
determining compute queue oversubscription. But the SDMA queues are
still part of the runlist IB, so rlib_size doesn't change.
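
To illustrate the accounting, here is a standalone sketch (not the driver
code). struct dqm_counts and is_oversubscribed() are hypothetical names, and
the two limits are passed in as plain parameters instead of being read from
the device:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the counters kept by the device queue manager. */
struct dqm_counts {
	unsigned int processes_count;
	unsigned int queue_count;      /* all queues: compute + SDMA */
	unsigned int sdma_queue_count; /* SDMA queues only */
};

/*
 * Return true if the HWS would have to handle oversubscription.
 * max_proc_per_quantum and compute_hqd_count are assumed inputs here.
 */
bool is_oversubscribed(const struct dqm_counts *c,
		       unsigned int max_proc_per_quantum,
		       unsigned int compute_hqd_count)
{
	/* SDMA queues don't occupy compute HQDs, so leave them out. */
	unsigned int compute_queue_count =
		c->queue_count - c->sdma_queue_count;

	return c->processes_count > max_proc_per_quantum ||
	       compute_queue_count > compute_hqd_count;
}
```

With 26 queues of which 2 are SDMA, and an assumed limit of 24 compute HQDs,
this reports no oversubscription, whereas comparing the raw queue_count
against the same limit would.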

rlib_size is affected indirectly, though, because the code just below
this check increases the runlist size in the oversubscription case:

        /*
         * Increase the allocation size in case we need a chained run list
         * when over subscription
         */
        if (*over_subscription)
                *rlib_size += sizeof(struct pm4_mes_runlist);
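
Put together, the size calculation looks roughly like the following sketch.
The struct packet_sizes type and calc_rlib_size() are hypothetical, and the
packet sizes are supplied by the caller; in the driver they come from
sizeof() on the PM4 packet structs:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the packet sizes, in bytes. */
struct packet_sizes {
	size_t map_process; /* one per process */
	size_t map_queue;   /* one per queue, compute or SDMA */
	size_t runlist;     /* extra packet used to chain the runlist */
};

size_t calc_rlib_size(const struct packet_sizes *sz,
		      unsigned int process_count,
		      unsigned int queue_count,
		      bool over_subscription)
{
	/* Every queue is mapped in the runlist IB, SDMA queues included. */
	size_t size = process_count * sz->map_process +
		      queue_count * sz->map_queue;

	/* Leave room for the chaining packet when oversubscribed. */
	if (over_subscription)
		size += sz->runlist;

	return size;
}
```

So the full queue_count still determines the base size, and the
oversubscription flag only decides whether the one extra runlist packet
is added.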

Regards,
  Felix

>
> Oded


