[Intel-gfx] [PATCH] drm/i915/ttm: Fix access_memory null pointer exception

Das, Nirmoy nirmoy.das at intel.com
Fri Oct 14 10:38:48 UTC 2022


Hi Matt,

On 10/14/2022 12:13 PM, Matthew Auld wrote:
> On 14/10/2022 10:27, Das, Nirmoy wrote:
>> Hi Matt
>>
>> On 10/14/2022 10:39 AM, Matthew Auld wrote:
>>> On 13/10/2022 18:56, Jonathan Cavitt wrote:
>>>> i915_ttm_to_gem can return a NULL pointer, which is
>>>> dereferenced in i915_ttm_access_memory without first
>>>> checking if it is NULL.  Inspecting
>>>> i915_ttm_io_mem_reserve, it appears the correct
>>>> behavior in this case is to return -EINVAL.
>>>
>>> The GEM object has already been dereferenced before this point, if 
>>> you look at the caller (vm_access_ttm). The NULL obj thing is to 
>>> identify "ttm ghost objects", and I don't think a normal userpace 
>>> object can suddenly become one (access_memory comes from ptrace). 
>>> AFAIK ghost objects are just for temporarily hanging on to some 
>>> memory/state, while the dma-resv is busy. In the places where ttm is 
>>> the one giving us the object, then it might be possible to see these 
>>> types of objects, since ttm could in theory pass one in (like during 
>>> eviction).
>>
>>
>> Yes, we should not hit this.  Thanks for the nice "ttm ghost objects" 
>> reminder :)
>>
>>
>> I think we can still have this check to avoid code analysis tool 
>> warnings, what do you think ?
>
> IMHO I think it just makes it harder to understand the code, since 
> conceptually it should be impossible, given how "ghost objects" 
> actually work. Adding such a check gives the impression that it is 
> somehow now possible to be given one here (like with eviction etc). 
> AFAIK just letting it crash is fine, instead of littering the code 
> with NULL checks for stuff that is never meant to be NULL and would be 
> a driver bug. Also there are a bunch of other places not checking that 
> i915_ttm_to_gem() returns NULL, so why just here?

This is tricky because some place we might receive NULL and some other 
places we might not(from i915_ttm_to_gem). I also don't like the idea of 
sprinkling NULL check everywhere.

I think the issue is i915_ttm_to_gem  returns NULL for non-i915 BO. We 
should move "if (bo->destroy != i915_ttm_bo_destroy)" check to the 
respective function where we

expect ghost object. That should make the static code analyzer happy and 
also makes it very clear which function expect ghost objects.


> Did the code analysis tool find something? Also why doesn't it 
> complain about vm_access_ttm(), which is the one actually calling 
> access_memory() and is itself also doing i915_ttm_to_gem() and also 
> not checking for NULL?


Yes, I think the patch idea came from our static code analyzer warning 
but I can't seem to open the URL. I am also not sure why it doesn't 
complain for other cases.


Thanks,

Nirmoy

>
>>
>>
>> Thanks,
>>
>> Nirmoy
>>
>>>
>>>>
>>>> Fixes: 26b15eb0 ("drm/i915/ttm: implement access_memory")
>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>>> Suggested-by: John C Harrison <John.C.Harrison at intel.com>
>>>> CC: Matthew Auld <matthew.auld at intel.com>
>>>> CC: Andrzej Hajda <andrzej.hajda at intel.com>
>>>> CC: Nirmoy Das <nirmoy.das at intel.com>
>>>> CC: Andi Shyti <andi.shyti at linux.intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 +++++++--
>>>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> index d63f30efd631..b569624f2ed9 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> @@ -704,11 +704,16 @@ static int i915_ttm_access_memory(struct 
>>>> ttm_buffer_object *bo,
>>>>                     int len, int write)
>>>>   {
>>>>       struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>>>> -    resource_size_t iomap = obj->mm.region->iomap.base -
>>>> -        obj->mm.region->region.start;
>>>> +    resource_size_t iomap;
>>>>       unsigned long page = offset >> PAGE_SHIFT;
>>>>       unsigned long bytes_left = len;
>>>>   +    if (!obj)
>>>> +        return -EINVAL;
>>>> +
>>>> +    iomap = obj->mm.region->iomap.base -
>>>> +        obj->mm.region->region.start;
>>>> +
>>>>       /*
>>>>        * TODO: For now just let it fail if the resource is 
>>>> non-mappable,
>>>>        * otherwise we need to perform the memcpy from the gpu here, 
>>>> without


More information about the Intel-gfx mailing list