[PATCH 21/24] drm/amdkfd: add queue remapping

James Zhu jamesz at amd.com
Thu Nov 23 16:25:15 UTC 2023


On 2023-11-22 17:35, Felix Kuehling wrote:
>
> On 2023-11-03 09:11, James Zhu wrote:
>> Add queue remapping to force the waves in any running
>> processes to complete a CWSR trap.
>
> Please add an explanation why this is needed.

[JZ] Even though the profiling-enabled bits is turned off, the CWSR trap 
handlers for some kernels with this process may still in running stage, 
this will

force the waves in any running processes to complete a CWSR trap, and 
make sure pc sampling is completely stopped with this process.   I will 
add it later.

>
>
>>
>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +++++++++++
>>   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h |  5 +++++
>>   drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c          |  3 +++
>>   3 files changed, 19 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index c0e71543389a..a3f57be63f4f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -3155,6 +3155,17 @@ int debug_refresh_runlist(struct 
>> device_queue_manager *dqm)
>>       return debug_map_and_unlock(dqm);
>>   }
>>   +void remap_queue(struct device_queue_manager *dqm,
>> +                enum kfd_unmap_queues_filter filter,
>> +                uint32_t filter_param,
>> +                uint32_t grace_period)
>
> Not sure if you need the filter and grace period parameters in this 
> function. What's the point of exposing that to callers who just want 
> to trigger a CWSR? You could also change the function name to reflect 
> the purpose of the function, rather than the implementation.
[JZ] Just want to create a general function in case that used by others. 
I am fine to remove passing filter_param/grace_period
>
> Regards,
>   Felix
>
>
>> +{
>> +    dqm_lock(dqm);
>> +    if (!dqm->dev->kfd->shared_resources.enable_mes)
>> +        execute_queues_cpsch(dqm, filter, filter_param, grace_period);
>> +    dqm_unlock(dqm);
>> +}
>> +
>>   #if defined(CONFIG_DEBUG_FS)
>>     static void seq_reg_dump(struct seq_file *m,
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> index cf7e182588f8..f8aae3747a36 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> @@ -303,6 +303,11 @@ int debug_lock_and_unmap(struct 
>> device_queue_manager *dqm);
>>   int debug_map_and_unlock(struct device_queue_manager *dqm);
>>   int debug_refresh_runlist(struct device_queue_manager *dqm);
>>   +void remap_queue(struct device_queue_manager *dqm,
>> +                enum kfd_unmap_queues_filter filter,
>> +                uint32_t filter_param,
>> +                uint32_t grace_period);
>> +
>>   static inline unsigned int get_sh_mem_bases_32(struct 
>> kfd_process_device *pdd)
>>   {
>>       return (pdd->lds_base >> 16) & 0xFF;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>> index e8f0559b618e..66670cdb813a 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>> @@ -24,6 +24,7 @@
>>   #include "kfd_priv.h"
>>   #include "amdgpu_amdkfd.h"
>>   #include "kfd_pc_sampling.h"
>> +#include "kfd_device_queue_manager.h"
>>     struct supported_pc_sample_info {
>>       uint32_t ip_version;
>> @@ -164,6 +165,8 @@ static int kfd_pc_sample_stop(struct 
>> kfd_process_device *pdd,
>> cancel_work_sync(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sampling_work);
>> kfd_process_set_trap_pc_sampling_flag(&pdd->qpd,
>> pdd->dev->pcs_data.hosttrap_entry.base.pc_sample_info.method, false);
>> +        remap_queue(pdd->dev->dqm,
>> +            KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, 
>> USE_DEFAULT_GRACE_PERIOD);
>>             mutex_lock(&pdd->dev->pcs_data.mutex);
>> pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0;


More information about the amd-gfx mailing list