[Intel-gfx] [PATCH] drm/i915/execlists: Backtrack along timeline
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Aug 9 08:48:30 UTC 2019
Chris Wilson <chris at chris-wilson.co.uk> writes:
> After a preempt-to-busy, we may find an active request that is caught
> between execution states. Walk back along the timeline instead of the
> execution list to be safe.
>
> [ 106.417541] i915 0000:00:02.0: Resetting rcs0 for preemption time out
> [ 106.417659] ==================================================================
> [ 106.418041] BUG: KASAN: slab-out-of-bounds in __execlists_reset+0x2f2/0x440 [i915]
> [ 106.418123] Read of size 8 at addr ffff888703506b30 by task swapper/1/0
> [ 106.418194]
> [ 106.418267] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 5.3.0-rc3+ #5
> [ 106.418344] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
> [ 106.418434] Call Trace:
> [ 106.418508] <IRQ>
> [ 106.418585] dump_stack+0x5b/0x90
> [ 106.418941] ? __execlists_reset+0x2f2/0x440 [i915]
> [ 106.419022] print_address_description+0x67/0x32d
> [ 106.419376] ? __execlists_reset+0x2f2/0x440 [i915]
> [ 106.419731] ? __execlists_reset+0x2f2/0x440 [i915]
> [ 106.419810] __kasan_report.cold.6+0x1a/0x3c
> [ 106.419888] ? __trace_bprintk+0xc0/0xd0
> [ 106.420239] ? __execlists_reset+0x2f2/0x440 [i915]
> [ 106.420318] check_memory_region+0x144/0x1c0
> [ 106.420671] __execlists_reset+0x2f2/0x440 [i915]
> [ 106.421029] execlists_reset+0x3d/0x50 [i915]
> [ 106.421387] intel_engine_reset+0x203/0x3a0 [i915]
> [ 106.421744] ? igt_reset_nop+0x2b0/0x2b0 [i915]
> [ 106.421825] ? _raw_spin_trylock_bh+0xe0/0xe0
> [ 106.421901] ? rcu_core+0x1b9/0x6a0
> [ 106.422251] preempt_reset+0x9a/0xf0 [i915]
> [ 106.422333] tasklet_action_common.isra.15+0xc0/0x1e0
> [ 106.422685] ? execlists_submit_request+0x200/0x200 [i915]
> [ 106.422764] __do_softirq+0x106/0x3cf
> [ 106.422840] irq_exit+0xdc/0xf0
> [ 106.422914] smp_apic_timer_interrupt+0x81/0x1c0
> [ 106.422988] apic_timer_interrupt+0xf/0x20
> [ 106.423059] </IRQ>
> [ 106.423144] RIP: 0010:cpuidle_enter_state+0xc3/0x620
> [ 106.423222] Code: 24 0f 1f 44 00 00 31 ff e8 da 87 9c ff 80 7c 24 10 00 74 12 9c 58 f6 c4 02 0f 85 33 05 00 00 31 ff e8 c1 77 a3 ff fb 45 85 e4 <0f> 89 bf 02 00 00 48 8d 7d 10 e8 4e 45 b9 ff c7 45 10 00 00 00 00
> [ 106.423311] RSP: 0018:ffff88881c30fda8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> [ 106.423390] RAX: 0000000000000000 RBX: ffffffff825b4c80 RCX: ffffffff810c8a00
> [ 106.423465] RDX: dffffc0000000000 RSI: 0000000039f89620 RDI: ffff88881f6b00a8
> [ 106.423540] RBP: ffff88881f6b5bf8 R08: 0000000000000002 R09: 000000000002ed80
> [ 106.423616] R10: 0000003fdd956146 R11: ffff88881c2d1e47 R12: 0000000000000008
> [ 106.423691] R13: 0000000000000008 R14: ffffffff825b4f80 R15: ffffffff825b4fc0
> [ 106.423772] ? sched_idle_set_state+0x20/0x30
> [ 106.423851] ? cpuidle_enter_state+0xa6/0x620
> [ 106.423874] ? tick_nohz_idle_stop_tick+0x1d1/0x3f0
> [ 106.423896] cpuidle_enter+0x37/0x60
> [ 106.423919] do_idle+0x246/0x280
> [ 106.423941] ? arch_cpu_idle_exit+0x30/0x30
> [ 106.423964] ? __wake_up_common+0x46/0x240
> [ 106.423986] cpu_startup_entry+0x14/0x20
> [ 106.424009] start_secondary+0x1b0/0x200
> [ 106.424031] ? set_cpu_sibling_map+0x990/0x990
> [ 106.424054] secondary_startup_64+0xa4/0xb0
> [ 106.424075]
> [ 106.424096] Allocated by task 626:
> [ 106.424119] save_stack+0x19/0x80
> [ 106.424143] __kasan_kmalloc.constprop.7+0xc1/0xd0
> [ 106.424165] kmem_cache_alloc+0xb2/0x1d0
> [ 106.424277] i915_sched_lookup_priolist+0x1ab/0x320 [i915]
> [ 106.424385] execlists_submit_request+0x73/0x200 [i915]
> [ 106.424498] submit_notify+0x59/0x60 [i915]
> [ 106.424600] __i915_sw_fence_complete+0x9b/0x330 [i915]
> [ 106.424713] __i915_request_commit+0x4bf/0x570 [i915]
> [ 106.424818] intel_engine_pulse+0x213/0x310 [i915]
> [ 106.424925] context_close+0x22f/0x470 [i915]
> [ 106.425033] i915_gem_context_destroy_ioctl+0x7b/0xa0 [i915]
> [ 106.425058] drm_ioctl_kernel+0x131/0x170
> [ 106.425081] drm_ioctl+0x2d9/0x4f1
> [ 106.425104] do_vfs_ioctl+0x115/0x890
> [ 106.425126] ksys_ioctl+0x35/0x70
> [ 106.425147] __x64_sys_ioctl+0x38/0x40
> [ 106.425169] do_syscall_64+0x66/0x220
> [ 106.425191] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 106.425213]
> [ 106.425234] Freed by task 0:
> [ 106.425255] (stack is not available)
> [ 106.425276]
> [ 106.425297] The buggy address belongs to the object at ffff888703506a40
> [ 106.425297] which belongs to the cache i915_priolist of size 104
> [ 106.425321] The buggy address is located 136 bytes to the right of
> [ 106.425321] 104-byte region [ffff888703506a40, ffff888703506aa8)
> [ 106.425345] The buggy address belongs to the page:
> [ 106.425367] page:ffffea001c0d4180 refcount:1 mapcount:0 mapping:ffff88873e1cf740 index:0xffff888703506e40 compound_mapcount: 0
> [ 106.425391] flags: 0x8000000000010200(slab|head)
> [ 106.425415] raw: 8000000000010200 ffffea0020192b88 ffff8888174b5450 ffff88873e1cf740
> [ 106.425439] raw: ffff888703506e40 000000000010000e 00000001ffffffff 0000000000000000
> [ 106.425464] page dumped because: kasan: bad access detected
> [ 106.425486]
> [ 106.425506] Memory state around the buggy address:
> [ 106.425528] ffff888703506a00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [ 106.425551] ffff888703506a80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
> [ 106.425573] >ffff888703506b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [ 106.425597] ^
> [ 106.425619] ffff888703506b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [ 106.425642] ffff888703506c00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [ 106.425664] ==================================================================
>
> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Agreed that walking the timeline is the safer choice,
Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 191892f7b3a9..645f6b21d8c6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2246,15 +2246,15 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>
> static struct i915_request *active_request(struct i915_request *rq)
> {
> - const struct list_head * const list = &rq->engine->active.requests;
> - const struct intel_context * const context = rq->hw_context;
> + const struct list_head * const list = &rq->timeline->requests;
> + const struct intel_context * const ce = rq->hw_context;
> struct i915_request *active = NULL;
>
> - list_for_each_entry_from_reverse(rq, list, sched.link) {
> + list_for_each_entry_from_reverse(rq, list, link) {
> if (i915_request_completed(rq))
> break;
>
> - if (rq->hw_context != context)
> + if (rq->hw_context != ce)
> break;
>
> active = rq;
> --
> 2.23.0.rc1
>
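For readers following the change, the backward walk that active_request() performs can be sketched in userspace C. This is only an illustrative model, not the driver code: struct sketch_request, its prev pointer, the integer context_id and the completed flag are hypothetical stand-ins for struct i915_request, the timeline's request list, rq->hw_context and i915_request_completed(). It assumes each request can reach the previous request on its own timeline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for struct i915_request.
 * None of these fields are the driver's real layout.
 */
struct sketch_request {
	struct sketch_request *prev; /* older request on the same timeline, or NULL */
	int context_id;              /* stands in for rq->hw_context */
	bool completed;              /* stands in for i915_request_completed(rq) */
};

/*
 * Walk backwards from rq along its timeline and return the oldest
 * request that is still incomplete and belongs to the same context.
 * Returns NULL if rq itself has already completed.
 */
static struct sketch_request *
sketch_active_request(struct sketch_request *rq)
{
	struct sketch_request *active = NULL;
	int ce;

	assert(rq != NULL);
	ce = rq->context_id;

	for (; rq != NULL; rq = rq->prev) {
		/* Everything older than a completed request is also done. */
		if (rq->completed)
			break;

		/* Stop as soon as we leave the context we started in. */
		if (rq->context_id != ce)
			break;

		active = rq;
	}

	return active;
}
```

In this model, given a timeline with one completed request followed by two incomplete ones from the same context, starting the walk at the newest request returns the older of the two incomplete requests. The walk only ever follows links that belong to the timeline, which is the property the patch relies on; how the real lists behave across preempt-to-busy is described in the commit message above, not by this sketch.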