[PATCH 1/2] drm/ttm: set ttm_buffer_object pointer as null after it's freed

Christian König ckoenig.leichtzumerken at gmail.com
Mon Sep 10 13:04:00 UTC 2018


Hi Tom,

I'm talking about adding new printks to figure out what the heck is 
going wrong here.
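
As a rough userspace illustration of that kind of instrumentation (a sketch only, not the kernel code): struct bo, struct bulk_pos, bo_destroy() and bulk_move_helper() are hypothetical stand-ins for ttm_buffer_object, the bulk move position, amdgpu_bo_destroy and ttm_bo_bulk_move_helper, and fprintf() stands in for trace_printk. The BO is only marked as destroyed rather than freed, so the stale reference can be observed safely.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ttm_buffer_object. */
struct bo {
    int id;
    bool destroyed; /* set instead of freeing, so the sketch stays well-defined */
};

/* Hypothetical stand-in for a bulk move position (first/last BO of a range). */
struct bulk_pos {
    struct bo *first;
    struct bo *last;
};

/* Models a destroy path with a trace line. Note that it does not touch
 * any bulk_pos that may still point at this BO. */
void bo_destroy(struct bo *bo)
{
    fprintf(stderr, "destroy: bo %d (%p)\n", bo->id, (void *)bo);
    bo->destroyed = true;
}

/* Models the bulk move helper with a trace line.
 * Returns false when the position still references a destroyed BO. */
bool bulk_move_helper(const struct bulk_pos *pos)
{
    if (!pos->first)
        return true; /* nothing to move */

    fprintf(stderr, "bulk move: first=%d last=%d\n",
            pos->first->id, pos->last->id);

    if (pos->first->destroyed || pos->last->destroyed) {
        fprintf(stderr, "bulk move: stale BO still referenced\n");
        return false;
    }
    return true;
}
```

Calling bulk_move_helper() on a position, then bo_destroy() on its first BO, then bulk_move_helper() again produces a trace in which the destroy line appears before a bulk move that still names the same BO, which is the ordering such printks would be meant to expose.
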

Thanks,
Christian.
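
For reference, the review comment quoted below (the pointer is only local to the function) can be sketched in plain C. This is a userspace illustration under assumed names: struct bo, destroy_by_value() and destroy_and_clear() are hypothetical and not part of TTM. C passes the pointer by value, so assigning NULL to the parameter changes only the callee's copy.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ttm_buffer_object. */
struct bo {
    int id;
};

/* Same shape as the patched function: the assignment only changes the
 * callee's local copy of the pointer, so the caller's pointer (and any
 * other copy of it) still holds the freed address. */
void destroy_by_value(struct bo *bo)
{
    free(bo);
    bo = NULL; /* no effect outside this function */
    (void)bo;
}

/* For contrast: taking the address of the caller's pointer lets the
 * callee clear that one pointer. Other copies remain untouched. */
void destroy_and_clear(struct bo **pbo)
{
    free(*pbo);
    *pbo = NULL;
}
```

Even the second form clears only the one pointer whose address is passed in; a separately stored copy, such as a first pointer kept in a bulk move position, would still be stale.
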

On 10.09.2018 at 14:59, Tom St Denis wrote:
> Hi Christian,
>
> Are you adding new traces or turning on existing ones?  Would you like 
> me to try them out in my setup?
>
> Tom
>
> On 2018-09-10 8:49 a.m., Christian König wrote:
>> On 10.09.2018 at 14:05, Huang Rui wrote:
>>> On Mon, Sep 10, 2018 at 05:25:48PM +0800, Koenig, Christian wrote:
>>>> On 10.09.2018 at 11:23, Huang Rui wrote:
>>>>> On Mon, Sep 10, 2018 at 11:00:04AM +0200, Christian König wrote:
>>>>>> Hi Ray,
>>>>>>
>>>>>> well, those patches don't make sense; the pointer is only local to
>>>>>> the function.
>>>>> You're right.
>>>>> I narrowed it down with a gdb dump at ttm_bo_bulk_move_lru_tail+0x2b; the
>>>>> use-after-free should be in the code below:
>>>>>
>>>>> man = &bulk->tt[i].first->bdev->man[TTM_PL_TT];
>>>>> ttm_bo_bulk_move_helper(&bulk->tt[i], &man->lru[i], false);
>>>>>
>>>>> Is there a case where the original BO is destroyed while it is still in
>>>>> the bulk pos, but the pos->first pointer isn't updated, so we still use
>>>>> it during the bulk move?
>>>> Only when a per VM BO is freed or the VM destroyed.
>>>>
>>>> The first case should now be handled by "drm/amdgpu: set bulk_moveable
>>>> to false when a per VM is released" and when we use a destroyed VM we
>>>> would see other problems as well.
>>>>
>>> If a VM instance is torn down, all BOs which belong to that VM should be
>>> removed from the LRU. But how can we submit a command based on a destroyed
>>> VM? You know, we do the bulk move at the last step of submission.
>>
>> Well, exactly, that's the point: this can't happen :)
>>
>> Otherwise we would crash because of using freed up memory much 
>> earlier in the command submission.
>>
>> The best idea I have to track this down further is to add some 
>> trace_printk in ttm_bo_bulk_move_helper and amdgpu_bo_destroy and see 
>> why and when we are actually using a destroyed BO.
>>
>> Christian.
>>
>>>
>>>
>>> Thanks,
>>> Ray
>>>
>>>> BTW: Just pushed this commit to the repository, should show up any 
>>>> second.
>>>>
>>>> Christian.
>>>>
>>>>> Thanks,
>>>>> Ray
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> On 10.09.2018 at 10:57, Huang Rui wrote:
>>>>>>> This avoids the pointer being referenced again after it is freed.
>>>>>>>
>>>>>>> Signed-off-by: Huang Rui <ray.huang at amd.com>
>>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>>> Cc: Tom StDenis <Tom.StDenis at amd.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>>>>>    1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>> index 138c989..d3ef5f8 100644
>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>> @@ -54,6 +54,7 @@ static struct attribute ttm_bo_count = {
>>>>>>>    static void ttm_bo_default_destroy(struct ttm_buffer_object *bo)
>>>>>>>    {
>>>>>>>        kfree(bo);
>>>>>>> +    bo = NULL;
>>>>>>>    }
>>>>>>>    static inline int ttm_mem_type_from_place(const struct ttm_place *place,
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>


