[Intel-gfx] [PATCH] drm/i915/ttm: Fix access_memory null pointer exception
Matthew Auld
matthew.auld at intel.com
Fri Oct 14 10:52:56 UTC 2022
On 14/10/2022 11:38, Das, Nirmoy wrote:
> Hi Matt,
>
> On 10/14/2022 12:13 PM, Matthew Auld wrote:
>> On 14/10/2022 10:27, Das, Nirmoy wrote:
>>> Hi Matt
>>>
>>> On 10/14/2022 10:39 AM, Matthew Auld wrote:
>>>> On 13/10/2022 18:56, Jonathan Cavitt wrote:
>>>>> i915_ttm_to_gem can return a NULL pointer, which is
>>>>> dereferenced in i915_ttm_access_memory without first
>>>>> checking if it is NULL. Inspecting
>>>>> i915_ttm_io_mem_reserve, it appears the correct
>>>>> behavior in this case is to return -EINVAL.
>>>>
>>>> The GEM object has already been dereferenced before this point, if
>>>> you look at the caller (vm_access_ttm). The NULL obj thing is to
>>>> identify "ttm ghost objects", and I don't think a normal userpace
>>>> object can suddenly become one (access_memory comes from ptrace).
>>>> AFAIK ghost objects are just for temporarily hanging on to some
>>>> memory/state, while the dma-resv is busy. In the places where ttm is
>>>> the one giving us the object, then it might be possible to see these
>>>> types of objects, since ttm could in theory pass one in (like during
>>>> eviction).
>>>
>>>
>>> Yes, we should not hit this. Thanks for the nice "ttm ghost objects"
>>> reminder :)
>>>
>>>
>>> I think we can still have this check to avoid code analysis tool
>>> warnings, what do you think ?
>>
>> IMHO I think it just makes it harder to understand the code, since
>> conceptually it should be impossible, given how "ghost objects"
>> actually work. Adding such a check gives the impression that it is
>> somehow now possible to be given one here (like with eviction etc).
>> AFAIK just letting it crash is fine, instead of littering the code
>> with NULL checks for stuff that is never meant to be NULL and would be
>> a driver bug. Also there are a bunch of other places not checking that
>> i915_ttm_to_gem() returns NULL, so why just here?
>
> This is tricky because some place we might receive NULL and some other
> places we might not(from i915_ttm_to_gem). I also don't like the idea of
> sprinkling NULL check everywhere.
>
> I think the issue is i915_ttm_to_gem returns NULL for non-i915 BO. We
> should move "if (bo->destroy != i915_ttm_bo_destroy)" check to the
> respective function where we
>
> expect ghost object. That should make the static code analyzer happy and
> also makes it very clear which function expect ghost objects.
Yeah, that sounds like a really nice idea to me. amdgpu looks to have
something like amdgpu_bo_is_amdgpu_bo() for the spots that might be
"ghost objects". Maybe we can add something like i915_ttm_is_ghost_bo()
or similar for our needs.
>
>
>> Did the code analysis tool find something? Also why doesn't it
>> complain about vm_access_ttm(), which is the one actually calling
>> access_memory() and is itself also doing i915_ttm_to_gem() and also
>> not checking for NULL?
>
>
> Yes, I think the patch idea came from our static code analyzer warning
> but I can't seem to open the URL. I am also not sure why it doesn't
> complain for other cases.
>
>
> Thanks,
>
> Nirmoy
>
>>
>>>
>>>
>>> Thanks,
>>>
>>> Nirmoy
>>>
>>>>
>>>>>
>>>>> Fixes: 26b15eb0 ("drm/i915/ttm: implement access_memory")
>>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>>>> Suggested-by: John C Harrison <John.C.Harrison at intel.com>
>>>>> CC: Matthew Auld <matthew.auld at intel.com>
>>>>> CC: Andrzej Hajda <andrzej.hajda at intel.com>
>>>>> CC: Nirmoy Das <nirmoy.das at intel.com>
>>>>> CC: Andi Shyti <andi.shyti at linux.intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 +++++++--
>>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> index d63f30efd631..b569624f2ed9 100644
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> @@ -704,11 +704,16 @@ static int i915_ttm_access_memory(struct
>>>>> ttm_buffer_object *bo,
>>>>> int len, int write)
>>>>> {
>>>>> struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>>>>> - resource_size_t iomap = obj->mm.region->iomap.base -
>>>>> - obj->mm.region->region.start;
>>>>> + resource_size_t iomap;
>>>>> unsigned long page = offset >> PAGE_SHIFT;
>>>>> unsigned long bytes_left = len;
>>>>> + if (!obj)
>>>>> + return -EINVAL;
>>>>> +
>>>>> + iomap = obj->mm.region->iomap.base -
>>>>> + obj->mm.region->region.start;
>>>>> +
>>>>> /*
>>>>> * TODO: For now just let it fail if the resource is
>>>>> non-mappable,
>>>>> * otherwise we need to perform the memcpy from the gpu here,
>>>>> without
More information about the Intel-gfx
mailing list