[Intel-gfx] [PATCH 1/5] drm/i915: audit bo->resource usage v3

Christian König ckoenig.leichtzumerken at gmail.com
Wed Jan 25 11:35:39 UTC 2023


Am 25.01.23 um 11:21 schrieb Matthew Auld:
> On Wed, 25 Jan 2023 at 10:07, Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>> Am 25.01.23 um 10:56 schrieb Matthew Auld:
>>> On Tue, 24 Jan 2023 at 17:15, Matthew Auld
>>> <matthew.william.auld at gmail.com> wrote:
>>>> On Tue, 24 Jan 2023 at 13:48, Matthew Auld
>>>> <matthew.william.auld at gmail.com> wrote:
>>>>> On Tue, 24 Jan 2023 at 12:57, Christian König
>>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>>> From: Christian König <ckoenig.leichtzumerken at gmail.com>
>>>>>>
>>>>>> Make sure we can at least move and alloc TT objects without backing store.
>>>>>>
>>>>>> v2: clear the tt object even when no resource is allocated.
>>>>>> v3: add Matthews changes for i915 as well.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>> Reviewed-by: Matthew Auld <matthew.auld at intel.com>
>>>> Ofc that assumes intel-gfx CI is now happy with the series.
>>> There are still some nasty failures it seems (in the extended test
>>> list). But it looks like the series is already merged. Can we quickly
>>> revert and try again?
>> Ah, crap. I thought everything would be fine after the CI gave its go.
>>
>> Which patch is causing the fallout?
> I'm not sure. I think all of the patches kind of interact with each
> other, but for sure there is an issue with the first patch. There is
> one splat like:

Well, I would rather revert as little as possible.

Are you sure that this isn't only on some i915-specific branch with
changes that are not yet upstream?

I can't even find the i915_gem_obj_copy_ttm function in drm-misc-next
or drm-next.

Regards,
Christian.

>
> <1>[  109.735148] BUG: kernel NULL pointer dereference, address:
> 0000000000000010
> <1>[  109.735151] #PF: supervisor read access in kernel mode
> <1>[  109.735152] #PF: error_code(0x0000) - not-present page
> <6>[  109.735153] PGD 0 P4D 0
> <4>[  109.735155] Oops: 0000 [#1] PREEMPT SMP NOPTI
> <4>[  109.735157] CPU: 1 PID: 92 Comm: kworker/u12:6 Not tainted
> 6.2.0-rc5-Patchwork_113269v1-gc4d436608c4e+ #1
> <4>[  109.735159] Hardware name: Gigabyte Technology Co., Ltd. GB-Z390
> Garuda/GB-Z390 Garuda-CF, BIOS IG1c 11/19/2019
> <4>[  109.735160] Workqueue: events_unbound async_run_entry_fn
> <4>[  109.735163] RIP: 0010:i915_ttm_resource_mappable+0x4/0x30 [i915]
> <4>[  109.735286] Code: b8 f9 ff ff ff eb c2 e8 aa 5e 52 e1 e9 4f 0f
> 18 00 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> 66 0f 1f 00 <8b> 57 10 b8 01 00 00 00 85 d2 74 15 48 8b 47 08 48 05 ff
> 0f 00 00
> <4>[  109.735288] RSP: 0018:ffffc90000f339a8 EFLAGS: 00010246
> <4>[  109.735289] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> ffff88810cea3a00
> <4>[  109.735290] RDX: 0000000000000000 RSI: ffffc90000f33af0 RDI:
> 0000000000000000
> <4>[  109.735292] RBP: ffff88811645d7c0 R08: 0000000000000000 R09:
> ffff888123afa940
> <4>[  109.735292] R10: 0000000000000001 R11: ffff888104b70040 R12:
> 0000000000000000
> <4>[  109.735293] R13: 0000000000000000 R14: ffffc90000f33b08 R15:
> ffffc90000f33af0
> <4>[  109.735294] FS:  0000000000000000(0000)
> GS:ffff8884ad680000(0000) knlGS:0000000000000000
> <4>[  109.735295] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4>[  109.735296] CR2: 0000000000000010 CR3: 000000011f9c6003 CR4:
> 00000000003706e0
> <4>[  109.735297] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> <4>[  109.735298] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> <4>[  109.735299] Call Trace:
> <4>[  109.735300]  <TASK>
> <4>[  109.735301]  __i915_ttm_move+0x128/0x940 [i915]
> <4>[  109.735408]  ? dma_resv_iter_next+0x91/0xb0
> <4>[  109.735412]  ? dma_resv_iter_first+0x42/0xb0
> <4>[  109.735414]  ? i915_deps_add_resv+0x4c/0xc0 [i915]
> <4>[  109.735520]  i915_gem_obj_copy_ttm+0x12f/0x250 [i915]
> <4>[  109.735625]  i915_ttm_restore+0x167/0x250 [i915]
> <4>[  109.735759]  i915_gem_process_region+0x27a/0x3b0 [i915]
> <4>[  109.735881]  i915_ttm_restore_region+0x4b/0x70 [i915]
> <4>[  109.735999]  lmem_restore+0x3a/0x60 [i915]
> <4>[  109.736101]  i915_gem_resume+0x4c/0x100 [i915]
> <4>[  109.736202]  i915_drm_resume+0xc2/0x170 [i915]
>
> Plus some other less obvious issue(s) with some tests failing.
>
>> Christian.
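
A note for readers decoding the splat quoted above: CR2 is 0x10, RDI is 0,
and the marked opcode bytes at RIP (<8b> 57 10) decode to
"mov 0x10(%rdi),%edx", so i915_ttm_resource_mappable() appears to read a
32-bit field at offset 0x10 through a NULL pointer. The sketch below is a
user-space illustration of that pattern and of one possible guard. It is not
the i915 code: struct fake_resource, FAKE_PL_SYSTEM and both helpers are
hypothetical names, and the field layout is an assumption chosen only so
that the 32-bit member lands at offset 0x10.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the real struct ttm_resource.
 * The layout is assumed so that mem_type sits at offset 0x10,
 * matching the faulting address in the oops.
 */
struct fake_resource {
	uint64_t start;    /* offset 0x00 */
	uint64_t size;     /* offset 0x08 */
	uint32_t mem_type; /* offset 0x10: the 32-bit read that faults on NULL */
};

/* Assumed placement value for "system memory" (illustrative only). */
#define FAKE_PL_SYSTEM 0u

/*
 * Treat a missing resource (no backing store) like system memory:
 * nothing is mapped through an I/O aperture in that case.
 */
static bool fake_cpu_maps_iomem(const struct fake_resource *res)
{
	return res && res->mem_type != FAKE_PL_SYSTEM;
}

/*
 * Guarded variant: the NULL check happens before any field is read,
 * so a caller passing a buffer without a resource does not fault.
 */
static bool fake_resource_mappable(const struct fake_resource *res,
				   uint64_t mappable_end)
{
	if (!fake_cpu_maps_iomem(res))
		return true;

	return res->start + res->size <= mappable_end;
}
```

Whether "no resource" should count as mappable is a design choice made here
for illustration; the actual driver may need different handling at each call
site along the trace.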


