[PATCH] drm/amdkfd: fix add queue process context clear for hsa non-init cases

Felix Kuehling felix.kuehling at amd.com
Wed Sep 13 01:01:19 UTC 2023


On 2023-09-12 20:53, Kim, Jonathan wrote:
> [Public]
>
>> -----Original Message-----
>> From: Kuehling, Felix <Felix.Kuehling at amd.com>
>> Sent: Tuesday, September 12, 2023 8:36 PM
>> To: Kim, Jonathan <Jonathan.Kim at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Ji, Ruili <Ruili.Ji at amd.com>; Guo, Shikai <Shikai.Guo at amd.com>;
>> JinHuiEricHuang at amd.com
>> Subject: Re: [PATCH] drm/amdkfd: fix add queue process context clear for hsa
>> non-init cases
>>
>> On 2023-09-12 8:17, Jonathan Kim wrote:
>>> There are cases where HSA is not initialized when adding queues
>> This statement doesn't make sense to me. If HSA is not initialized, it
>> means user mode hasn't opened the KFD device. So it can't create queues.
>> What do you really mean here?
> I meant the runtime enable call; e.g. the KFD test can add a queue without making a runtime enable call.

OK, this can also happen when you run an older version of the HSA 
runtime that doesn't support the ROCm debugger yet. Please update the 
patch description accordingly.

Thanks,
   Felix


>
> Thanks,
>
> Jon
>
>> Regards,
>>     Felix
>>
>>
>>>    and
>>> the ADD_QUEUE API should clear the MES process context instead of
>>> SET_SHADER_DEBUGGER.
>>>
>>> The only time ADD_QUEUE.skip_process_ctx_clear is required is for
>>> debugger use cases and a debugged process is always runtime enabled
>>> when adding a queue.
>>>
>>> Tested-by: Shikai Guo <shikai.guo at amd.com>
>>> Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 ++++--
>>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> index 6d07a5dd2648..77159b03a422 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> @@ -227,8 +227,10 @@ static int add_queue_mes(struct device_queue_manager *dqm, struct queue *q,
>>>      queue_input.tba_addr = qpd->tba_addr;
>>>      queue_input.tma_addr = qpd->tma_addr;
>>>      queue_input.trap_en = !kfd_dbg_has_cwsr_workaround(q->device);
>>> -   queue_input.skip_process_ctx_clear = qpd->pqm->process->debug_trap_enabled ||
>>> -                                        kfd_dbg_has_ttmps_always_setup(q->device);
>>> +   queue_input.skip_process_ctx_clear =
>>> +           qpd->pqm->process->runtime_info.runtime_state == DEBUG_RUNTIME_STATE_ENABLED &&
>>> +                                           (qpd->pqm->process->debug_trap_enabled ||
>>> +                                            kfd_dbg_has_ttmps_always_setup(q->device));
>>>      queue_type = convert_to_mes_queue_type(q->properties.type);
>>>      if (queue_type < 0) {
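
The condition change in the hunk above can be read as a small boolean predicate. The following is a minimal standalone sketch of that logic, not the kernel code: struct process_sketch, the SKETCH_* enum values, and both function names are hypothetical stand-ins, and the ttmps_always_setup parameter stands in for the result of kfd_dbg_has_ttmps_always_setup(q->device).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the runtime state; the patch itself only
 * references DEBUG_RUNTIME_STATE_ENABLED. */
enum runtime_state_sketch {
	SKETCH_RUNTIME_STATE_DISABLED = 0,
	SKETCH_RUNTIME_STATE_ENABLED = 1,
};

/* Hypothetical stand-in for the per-process fields the condition reads. */
struct process_sketch {
	enum runtime_state_sketch runtime_state;
	bool debug_trap_enabled;
};

/* Condition before the patch: skip the process context clear whenever
 * the debug trap is enabled or the device always sets up ttmps. */
static inline bool skip_ctx_clear_old(const struct process_sketch *p,
				      bool ttmps_always_setup)
{
	return p->debug_trap_enabled || ttmps_always_setup;
}

/* Condition after the patch: the same test, but only honoured once the
 * process is runtime enabled. Otherwise ADD_QUEUE performs the clear. */
static inline bool skip_ctx_clear_new(const struct process_sketch *p,
				      bool ttmps_always_setup)
{
	return p->runtime_state == SKETCH_RUNTIME_STATE_ENABLED &&
	       (p->debug_trap_enabled || ttmps_always_setup);
}
```

For a process that never made the runtime enable call (the KFD test, or an older HSA runtime without debugger support) on a device where the ttmps are always set up, the old predicate returns true and the clear is skipped, while the new predicate returns false so that ADD_QUEUE clears the MES process context.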

