[PATCH libdrm 2/4] amdgpu: add a function to find bo by cpu mapping (v2)

Wed Aug 8 06:48:50 UTC 2018

On 08.08.2018 at 06:23, zhoucm1 wrote:
>
>
> On 2018-08-08 12:08, Junwei Zhang wrote:
>> Userspace needs to know if the user memory is from BO or malloc.
>>
>> v2: update mutex range and rebase
>>
>> Signed-off-by: Junwei Zhang <Jerry.Zhang at amd.com>
>> ---
>>   amdgpu/amdgpu.h    | 23 +++++++++++++++++++++++
>>   amdgpu/amdgpu_bo.c | 34 ++++++++++++++++++++++++++++++++++
>>   2 files changed, 57 insertions(+)
>>
>> diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
>> index be83b45..a8c353c 100644
>> --- a/amdgpu/amdgpu.h
>> +++ b/amdgpu/amdgpu.h
>> @@ -678,6 +678,29 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
>>                       amdgpu_bo_handle *buf_handle);
>>     /**
>> + * Validate if the user memory comes from BO
>> + *
>> + * \param dev - [in] Device handle. See #amdgpu_device_initialize()
>> + * \param cpu - [in] CPU address of user allocated memory which we
>> + * want to map to GPU address space (make GPU accessible)
>> + * (This address must be correctly aligned).
>> + * \param size - [in] Size of allocation (must be correctly aligned)
>> + * \param buf_handle - [out] Buffer handle for the userptr memory
>> + * if the user memory is not from BO, the buf_handle will be NULL.
>> + * \param offset_in_bo - [out] offset in this BO for this user memory
>> + *
>> + *
>> + * \return   0 on success\n
>> + *          <0 - Negative POSIX Error code
>> + *
>> +*/
>> +int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,
>> +                  void *cpu,
>> +                  uint64_t size,
>> +                  amdgpu_bo_handle *buf_handle,
>> +                  uint64_t *offset_in_bo);
>> +
>> +/**
>>    * Free previosuly allocated memory
>>    *
>>    * \param   dev           - \c [in] Device handle. See #amdgpu_device_initialize()
>> diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
>> index b24e698..a7f0662 100644
>> --- a/amdgpu/amdgpu_bo.c
>> +++ b/amdgpu/amdgpu_bo.c
>> @@ -529,6 +529,40 @@ int amdgpu_bo_wait_for_idle(amdgpu_bo_handle bo,
>>       }
>>   }
>> +int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,
>> +                  void *cpu,
>> +                  uint64_t size,
>> +                  amdgpu_bo_handle *buf_handle,
>> +                  uint64_t *offset_in_bo)
>> +{
>> +    int i;
>> +    struct amdgpu_bo *bo;
>> +
>> +    if (cpu == NULL || size == 0)
>> +        return -EINVAL;
>> +
>> +    pthread_mutex_lock(&dev->bo_table_mutex);
>> +    for (i = 0; i < dev->bo_handles.max_key; i++) {
> Hi Jerry,
>
> As Christian caught before, iterating over all BOs of the device will
> introduce a lot of CPU overhead; this isn't a good direction.
> Since CPU virtual addresses are per-process, you should go to the kernel
> and look them up in the VM tree, which obviously takes less time.

Yeah, but it is also much more overhead to maintain.

Since this is only to fix the behavior of a single buggy application,
I, at least, am fine with keeping the workaround as simple as this.

If we find a wider use for it, we can still switch back to the kernel
implementation.
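
For reference, the userspace linear scan under discussion can be sketched
standalone roughly as below. This is only an illustration, not the libdrm
code: struct demo_bo and demo_find_bo() are hypothetical stand-ins for
struct amdgpu_bo and the handle-table walk, locking is left out, and,
unlike the quoted hunk, the sketch also checks that the end of the
requested range lies inside the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct amdgpu_bo; only the fields the scan needs. */
struct demo_bo {
    void    *cpu_ptr;     /* CPU mapping, NULL if the BO is not mapped */
    uint64_t alloc_size;  /* size of the allocation in bytes */
};

/*
 * Linear scan: return the index of the first BO whose CPU mapping fully
 * contains [cpu, cpu + size), or -1 if none does. Cost is O(n) in the
 * number of table entries, which is the overhead raised in the review.
 */
static int demo_find_bo(const struct demo_bo *table, size_t count,
                        const void *cpu, uint64_t size,
                        uint64_t *offset_in_bo)
{
    assert(offset_in_bo != NULL);
    uintptr_t addr = (uintptr_t)cpu;

    for (size_t i = 0; i < count; i++) {
        const struct demo_bo *bo = &table[i];
        uintptr_t base = (uintptr_t)bo->cpu_ptr;

        /* Skip unmapped BOs and BOs too small to hold the range. */
        if (!bo->cpu_ptr || size > bo->alloc_size)
            continue;
        /* Check both start and end without overflowing addr + size. */
        if (addr >= base && addr - base <= bo->alloc_size - size) {
            *offset_in_bo = addr - base;
            return (int)i;
        }
    }

    *offset_in_bo = 0;
    return -1;
}
```

A kernel-side lookup in the VM tree would replace the loop with a tree
search, at the price of the extra interface to maintain.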

Regards,
Christian.

>
> Regards,
> David Zhou
>> +        bo = handle_table_lookup(&dev->bo_handles, i);
>> +        if (!bo || !bo->cpu_ptr || size > bo->alloc_size)
>> +            continue;
>> +        if (cpu >= bo->cpu_ptr && cpu < (bo->cpu_ptr + bo->alloc_size))
>> +            break;
>> +    }
>> +
>> +    if (i < dev->bo_handles.max_key) {
>> +        atomic_inc(&bo->refcount);
>> +        *buf_handle = bo;
>> +        *offset_in_bo = cpu - bo->cpu_ptr;
>> +    } else {
>> +        *buf_handle = NULL;
>> +        *offset_in_bo = 0;
>> +    }
>> +    pthread_mutex_unlock(&dev->bo_table_mutex);
>> +
>> +    return 0;
>> +}
>> +
>>   int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
>>                       void *cpu,
>>                       uint64_t size,
>