ttm crash on init
Tom St Denis
tom.stdenis at amd.com
Mon Sep 17 18:01:54 UTC 2018
On 2018-09-17 1:55 p.m., Christian König wrote:
> Am 17.09.2018 um 19:50 schrieb Tom St Denis:
>> On 2018-09-17 1:45 p.m., Christian König wrote:
>>> Mhm, not the slightest idea.
>>>
>>> That nearly looks like adev->stolen_vga_memory already contains
>>> something.
>>
>> Nope,
>>
>> [ 51.564605] >>>adev->stolen_vga_memory == (null)
>> [ 51.564619] kasan: CONFIG_KASAN_INLINE enabled
>> [ 51.564877] kasan: GPF could be caused by NULL-ptr deref or user
>> memory access
>> [ 51.565071] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>> KASAN NOPTI
>> [ 51.565254] CPU: 6 PID: 3863 Comm: modprobe Not tainted 4.19.0-rc1+
>> #30
>> [ 51.565425] Hardware name: System manufacturer System Product
>> Name/TUF B350M-PLUS GAMING, BIOS 4011 04/19/2018
>> [ 51.565714] RIP: 0010:amdgpu_bo_create_kernel+0x59/0x1a0 [amdgpu]
>>
>> That's me printing out the value of stolen_vga_memory
>> before the call to allocate it.
>
> What does amdgpu_bo_create_kernel+0x59 point to?
I've never really gotten line numbers to work with the kernel, but if I had
to guess I'd say right here:
int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
                            unsigned long size, int align,
                            u32 domain, struct amdgpu_bo **bo_ptr,
                            u64 *gpu_addr, void **cpu_addr)
{
        int r;

        r = amdgpu_bo_create_reserved(adev, size, align, domain, bo_ptr,
                                      gpu_addr, cpu_addr);
        if (r)
                return r;

*bo_ptr is NULL ===>    amdgpu_bo_unreserve(*bo_ptr);

        return 0;
}
Which then results in:

static inline void amdgpu_bo_unreserve(struct amdgpu_bo *bo)
{
        ttm_bo_unreserve(&bo->tbo);
}
Which then passes the address NULL + offsetof(tbo) to ttm_bo_unreserve:
static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
{
        if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
                spin_lock(&bo->bdev->glob->lru_lock);
                ttm_bo_add_to_lru(bo);
                spin_unlock(&bo->bdev->glob->lru_lock);
        }
        reservation_object_unlock(bo->resv);
}
Which likely faults on reading bo->mem.placement since the address is bogus.
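
To make the "NULL + offsetof(tbo)" point concrete, here is a minimal user-space sketch. The mock_* structs and tbo_address() are hypothetical stand-ins invented for illustration, not the real amdgpu or TTM definitions, and the field layout is an assumption. It only shows that taking the address of an embedded member through a NULL base yields a small non-NULL address, so a later NULL check on the member pointer would not catch it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ttm_buffer_object (assumption). */
struct mock_ttm_bo {
    uint32_t placement;
};

/* Hypothetical stand-in for struct amdgpu_bo (assumption): some members
 * come first, then the embedded TTM object. */
struct mock_amdgpu_bo {
    uint64_t preceding_fields[4];   /* placeholder for members before tbo */
    struct mock_ttm_bo tbo;
};

/* The mock deliberately places tbo at a non-zero offset. */
static_assert(offsetof(struct mock_amdgpu_bo, tbo) > 0,
              "tbo must not be the first member in this mock");

/* Address that &bo->tbo would evaluate to for a given base address.
 * Plain integer arithmetic is used so nothing is ever dereferenced. */
static uintptr_t tbo_address(uintptr_t bo_base)
{
    return bo_base + offsetof(struct mock_amdgpu_bo, tbo);
}
```

Calling tbo_address(0) models the NULL case: the result equals the member offset rather than 0, which is consistent with a fault on the first read through that pointer.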
The report is from amdgpu_bo_create_kernel because everything is a macro
or inlined... :-)
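
One way the unconditional unreserve could be guarded, sketched in user space. Everything named mock_* is hypothetical, and the behaviour of mock_create_reserved() (returning 0 while leaving *bo_ptr NULL for a zero size) is an assumption based on the trace above, not a statement about what the real amdgpu_bo_create_reserved() does or how this was eventually fixed.

```c
#include <stddef.h>

/* Hypothetical buffer object; 'reserved' models the reservation lock. */
struct mock_bo {
    int reserved;
};

static struct mock_bo mock_storage;  /* single backing object for the demo */

/* Hypothetical stand-in for amdgpu_bo_create_reserved(). Assumption:
 * a zero size succeeds but leaves *bo_ptr NULL. */
static int mock_create_reserved(unsigned long size, struct mock_bo **bo_ptr)
{
    if (size == 0) {
        *bo_ptr = NULL;
        return 0;
    }
    mock_storage.reserved = 1;
    *bo_ptr = &mock_storage;
    return 0;
}

/* Stand-in for amdgpu_bo_unreserve(); must not be called with NULL. */
static void mock_unreserve(struct mock_bo *bo)
{
    bo->reserved = 0;
}

/* Same shape as amdgpu_bo_create_kernel(), with a NULL guard added
 * before the unreserve. */
static int mock_create_kernel(unsigned long size, struct mock_bo **bo_ptr)
{
    int r = mock_create_reserved(size, bo_ptr);

    if (r)
        return r;

    if (*bo_ptr)
        mock_unreserve(*bo_ptr);

    return 0;
}
```

With the guard in place, mock_create_kernel(0, &bo) returns 0 with bo left NULL and never touches the unreserve path, while a non-zero size still returns an unreserved object.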
Tom
>
> Christian.
>
>>
>> Tom
>>
>>
>>>
>>> Christian.
>>>
>>> Am 17.09.2018 um 18:47 schrieb Tom St Denis:
>>>> On 2018-09-17 12:21 p.m., Tom St Denis wrote:
>>>>> (attached). I'll try to bisect in a second. Is anyone aware of this?
>>>>>
>>>>> Tom
>>>>
>>>> Bisection led to:
>>>>
>>>> a327772a5655ff4fb104c8aae6515faa461df466 is the first bad commit
>>>> commit a327772a5655ff4fb104c8aae6515faa461df466
>>>> Author: Christian König <christian.koenig at amd.com>
>>>> Date: Fri Sep 14 21:06:50 2018 +0200
>>>>
>>>> drm/amdgpu: drop size check
>>>>
>>>> We no don't allocate zero sized kernel BOs any longer.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>>>>
>>>> :040000 040000 265e4fa231d367d354e4c66600b8f98a4d2f04c4
>>>> 3702baaeb2423361dcd7eac8c533edace760ae3e M drivers
>>>>
>>>>
>>>> As the culprit.
>>>>
>>>> Cheers,
>>>> Tom
>>>
>>
>