[PATCH 07/24] drm/amdkfd: check pcs_enrty valid

Felix Kuehling felix.kuehling at amd.com
Thu Nov 23 20:32:45 UTC 2023


On 2023-11-23 15:18, James Zhu wrote:
>
> On 2023-11-22 17:15, Felix Kuehling wrote:
>>
>> On 2023-11-03 09:11, James Zhu wrote:
>>> Check pcs_enrty valid for pc sampling ioctl.
>>>
>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 30 
>>> ++++++++++++++++++--
>>>   1 file changed, 27 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>>> index 4c9fc48e1a6a..36366c8847de 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
>>> @@ -179,6 +179,21 @@ static int kfd_pc_sample_destroy(struct 
>>> kfd_process_device *pdd, uint32_t trace_
>>>   int kfd_pc_sample(struct kfd_process_device *pdd,
>>>                       struct kfd_ioctl_pc_sample_args __user *args)
>>>   {
>>> +    struct pc_sampling_entry *pcs_entry;
>>> +
>>> +    if (args->op != KFD_IOCTL_PCS_OP_QUERY_CAPABILITIES &&
>>> +        args->op != KFD_IOCTL_PCS_OP_CREATE) {
>>> +
>>> +        mutex_lock(&pdd->dev->pcs_data.mutex);
>>> +        pcs_entry = 
>>> idr_find(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sampling_idr,
>>> +                args->trace_id);
>>> +        mutex_unlock(&pdd->dev->pcs_data.mutex);
>>
>> You need to keep holding the lock while the pcs_entry is still used. 
>> That includes any of the kfd_pc_sample_<op> functions below. 
>> Otherwise someone could free it concurrently. It would also simplify 
>> the ..._<op> functions, if they didn't have to worry about the 
>> locking themselves.
> [JZ] pcs_entry is only for this pc sampling process, which has 
> kfd_process->mutex protected here.

OK. That's not obvious. I'm also wary about depending too much on the 
big process lock. We will need to make that locking more granular soon, 
because it is causing performance issues with multi-threaded processes.

Regards,
   Felix


>>
>> Regards,
>>   Felix
>>
>>
>>> +
>>> +        if (!pcs_entry ||
>>> +            pcs_entry->pdd != pdd)
>>> +            return -EINVAL;
>>> +    }
>>> +
>>>       switch (args->op) {
>>>       case KFD_IOCTL_PCS_OP_QUERY_CAPABILITIES:
>>>           return kfd_pc_sample_query_cap(pdd, args);
>>> @@ -187,13 +202,22 @@ int kfd_pc_sample(struct kfd_process_device *pdd,
>>>           return kfd_pc_sample_create(pdd, args);
>>>         case KFD_IOCTL_PCS_OP_DESTROY:
>>> -        return kfd_pc_sample_destroy(pdd, args->trace_id);
>>> +        if (pcs_entry->enabled)
>>> +            return -EBUSY;
>>> +        else
>>> +            return kfd_pc_sample_destroy(pdd, args->trace_id);
>>>         case KFD_IOCTL_PCS_OP_START:
>>> -        return kfd_pc_sample_start(pdd);
>>> +        if (pcs_entry->enabled)
>>> +            return -EALREADY;
>>> +        else
>>> +            return kfd_pc_sample_start(pdd);
>>>         case KFD_IOCTL_PCS_OP_STOP:
>>> -        return kfd_pc_sample_stop(pdd);
>>> +        if (!pcs_entry->enabled)
>>> +            return -EALREADY;
>>> +        else
>>> +            return kfd_pc_sample_stop(pdd);
>>>       }
>>>         return -EINVAL;


More information about the amd-gfx mailing list