[Intel-gfx] [PATCH] drm/i915/execlists: Backtrack along timeline

Mika Kuoppala mika.kuoppala at linux.intel.com
Fri Aug 9 08:48:30 UTC 2019


Chris Wilson <chris at chris-wilson.co.uk> writes:

> After a preempt-to-busy, we may find an active request that is caught
> between execution states. Walk back along the timeline instead of the
> execution list to be safe.
>
> [  106.417541] i915 0000:00:02.0: Resetting rcs0 for preemption time out
> [  106.417659] ==================================================================
> [  106.418041] BUG: KASAN: slab-out-of-bounds in __execlists_reset+0x2f2/0x440 [i915]
> [  106.418123] Read of size 8 at addr ffff888703506b30 by task swapper/1/0
> [  106.418194]
> [  106.418267] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G     U            5.3.0-rc3+ #5
> [  106.418344] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
> [  106.418434] Call Trace:
> [  106.418508]  <IRQ>
> [  106.418585]  dump_stack+0x5b/0x90
> [  106.418941]  ? __execlists_reset+0x2f2/0x440 [i915]
> [  106.419022]  print_address_description+0x67/0x32d
> [  106.419376]  ? __execlists_reset+0x2f2/0x440 [i915]
> [  106.419731]  ? __execlists_reset+0x2f2/0x440 [i915]
> [  106.419810]  __kasan_report.cold.6+0x1a/0x3c
> [  106.419888]  ? __trace_bprintk+0xc0/0xd0
> [  106.420239]  ? __execlists_reset+0x2f2/0x440 [i915]
> [  106.420318]  check_memory_region+0x144/0x1c0
> [  106.420671]  __execlists_reset+0x2f2/0x440 [i915]
> [  106.421029]  execlists_reset+0x3d/0x50 [i915]
> [  106.421387]  intel_engine_reset+0x203/0x3a0 [i915]
> [  106.421744]  ? igt_reset_nop+0x2b0/0x2b0 [i915]
> [  106.421825]  ? _raw_spin_trylock_bh+0xe0/0xe0
> [  106.421901]  ? rcu_core+0x1b9/0x6a0
> [  106.422251]  preempt_reset+0x9a/0xf0 [i915]
> [  106.422333]  tasklet_action_common.isra.15+0xc0/0x1e0
> [  106.422685]  ? execlists_submit_request+0x200/0x200 [i915]
> [  106.422764]  __do_softirq+0x106/0x3cf
> [  106.422840]  irq_exit+0xdc/0xf0
> [  106.422914]  smp_apic_timer_interrupt+0x81/0x1c0
> [  106.422988]  apic_timer_interrupt+0xf/0x20
> [  106.423059]  </IRQ>
> [  106.423144] RIP: 0010:cpuidle_enter_state+0xc3/0x620
> [  106.423222] Code: 24 0f 1f 44 00 00 31 ff e8 da 87 9c ff 80 7c 24 10 00 74 12 9c 58 f6 c4 02 0f 85 33 05 00 00 31 ff e8 c1 77 a3 ff fb 45 85 e4 <0f> 89 bf 02 00 00 48 8d 7d 10 e8 4e 45 b9 ff c7 45 10 00 00 00 00
> [  106.423311] RSP: 0018:ffff88881c30fda8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> [  106.423390] RAX: 0000000000000000 RBX: ffffffff825b4c80 RCX: ffffffff810c8a00
> [  106.423465] RDX: dffffc0000000000 RSI: 0000000039f89620 RDI: ffff88881f6b00a8
> [  106.423540] RBP: ffff88881f6b5bf8 R08: 0000000000000002 R09: 000000000002ed80
> [  106.423616] R10: 0000003fdd956146 R11: ffff88881c2d1e47 R12: 0000000000000008
> [  106.423691] R13: 0000000000000008 R14: ffffffff825b4f80 R15: ffffffff825b4fc0
> [  106.423772]  ? sched_idle_set_state+0x20/0x30
> [  106.423851]  ? cpuidle_enter_state+0xa6/0x620
> [  106.423874]  ? tick_nohz_idle_stop_tick+0x1d1/0x3f0
> [  106.423896]  cpuidle_enter+0x37/0x60
> [  106.423919]  do_idle+0x246/0x280
> [  106.423941]  ? arch_cpu_idle_exit+0x30/0x30
> [  106.423964]  ? __wake_up_common+0x46/0x240
> [  106.423986]  cpu_startup_entry+0x14/0x20
> [  106.424009]  start_secondary+0x1b0/0x200
> [  106.424031]  ? set_cpu_sibling_map+0x990/0x990
> [  106.424054]  secondary_startup_64+0xa4/0xb0
> [  106.424075]
> [  106.424096] Allocated by task 626:
> [  106.424119]  save_stack+0x19/0x80
> [  106.424143]  __kasan_kmalloc.constprop.7+0xc1/0xd0
> [  106.424165]  kmem_cache_alloc+0xb2/0x1d0
> [  106.424277]  i915_sched_lookup_priolist+0x1ab/0x320 [i915]
> [  106.424385]  execlists_submit_request+0x73/0x200 [i915]
> [  106.424498]  submit_notify+0x59/0x60 [i915]
> [  106.424600]  __i915_sw_fence_complete+0x9b/0x330 [i915]
> [  106.424713]  __i915_request_commit+0x4bf/0x570 [i915]
> [  106.424818]  intel_engine_pulse+0x213/0x310 [i915]
> [  106.424925]  context_close+0x22f/0x470 [i915]
> [  106.425033]  i915_gem_context_destroy_ioctl+0x7b/0xa0 [i915]
> [  106.425058]  drm_ioctl_kernel+0x131/0x170
> [  106.425081]  drm_ioctl+0x2d9/0x4f1
> [  106.425104]  do_vfs_ioctl+0x115/0x890
> [  106.425126]  ksys_ioctl+0x35/0x70
> [  106.425147]  __x64_sys_ioctl+0x38/0x40
> [  106.425169]  do_syscall_64+0x66/0x220
> [  106.425191]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  106.425213]
> [  106.425234] Freed by task 0:
> [  106.425255] (stack is not available)
> [  106.425276]
> [  106.425297] The buggy address belongs to the object at ffff888703506a40
> [  106.425297]  which belongs to the cache i915_priolist of size 104
> [  106.425321] The buggy address is located 136 bytes to the right of
> [  106.425321]  104-byte region [ffff888703506a40, ffff888703506aa8)
> [  106.425345] The buggy address belongs to the page:
> [  106.425367] page:ffffea001c0d4180 refcount:1 mapcount:0 mapping:ffff88873e1cf740 index:0xffff888703506e40 compound_mapcount: 0
> [  106.425391] flags: 0x8000000000010200(slab|head)
> [  106.425415] raw: 8000000000010200 ffffea0020192b88 ffff8888174b5450 ffff88873e1cf740
> [  106.425439] raw: ffff888703506e40 000000000010000e 00000001ffffffff 0000000000000000
> [  106.425464] page dumped because: kasan: bad access detected
> [  106.425486]
> [  106.425506] Memory state around the buggy address:
> [  106.425528]  ffff888703506a00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [  106.425551]  ffff888703506a80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
> [  106.425573] >ffff888703506b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  106.425597]                                      ^
> [  106.425619]  ffff888703506b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  106.425642]  ffff888703506c00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [  106.425664] ==================================================================
>
> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
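
As a sanity check on the KASAN numbers quoted above: the faulting read lands
well past the end of the i915_priolist object, which fits a list walk that
has wandered off the list it was meant to be on. The C sketch below only
redoes the address arithmetic from the report. The variable and function
names are made up for illustration; none of this is driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses copied from the KASAN report; the names are illustrative only. */
static const uint64_t fault_addr   = 0xffff888703506b30ull; /* "Read of size 8 at addr" */
static const uint64_t object_start = 0xffff888703506a40ull; /* start of the priolist object */
static const uint64_t object_end   = 0xffff888703506aa8ull; /* exclusive end of the region */

/* Size of the slab object, as reported for the i915_priolist cache. */
static uint64_t object_size(void)
{
	return object_end - object_start;
}

/* Distance from the end of the object to the faulting address. */
static uint64_t bytes_past_end(void)
{
	return fault_addr - object_end;
}

/* Cross-check both figures against the text of the report. */
static void check_report(void)
{
	assert(object_size() == 104);    /* "cache i915_priolist of size 104" */
	assert(bytes_past_end() == 136); /* "136 bytes to the right of" */
}
```

Both figures agree with the report, so the read is outside the object and
inside slab redzone.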

Agreed that walking the timeline is safer,

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 191892f7b3a9..645f6b21d8c6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2246,15 +2246,15 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>  
>  static struct i915_request *active_request(struct i915_request *rq)
>  {
> -	const struct list_head * const list = &rq->engine->active.requests;
> -	const struct intel_context * const context = rq->hw_context;
> +	const struct list_head * const list = &rq->timeline->requests;
> +	const struct intel_context * const ce = rq->hw_context;
>  	struct i915_request *active = NULL;
>  
> -	list_for_each_entry_from_reverse(rq, list, sched.link) {
> +	list_for_each_entry_from_reverse(rq, list, link) {
>  		if (i915_request_completed(rq))
>  			break;
>  
> -		if (rq->hw_context != context)
> +		if (rq->hw_context != ce)
>  			break;
>  
>  		active = rq;
> -- 
> 2.23.0.rc1
>
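
For anyone reading along, the backwards walk in active_request() can be
modelled in user space roughly as below. This is a simplified sketch, not
the driver code: struct toy_request, its fields and toy_active_request() are
hypothetical stand-ins, and the timeline is a NULL-terminated chain of prev
pointers rather than the kernel's circular list_head. The loop starts at the
given request, steps towards older requests, stops at the first one that has
completed or belongs to a different context, and returns the oldest request
it accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request on a timeline. */
struct toy_request {
	struct toy_request *prev; /* next older request, NULL at the start */
	int context_id;           /* stand-in for rq->hw_context */
	bool completed;           /* stand-in for i915_request_completed() */
};

/*
 * Walk back from rq and return the oldest request that is still
 * incomplete and shares rq's context, or NULL if rq itself is done.
 */
static struct toy_request *toy_active_request(struct toy_request *rq)
{
	struct toy_request *active = NULL;
	int ctx;

	assert(rq != NULL);
	ctx = rq->context_id;

	for (; rq != NULL; rq = rq->prev) {
		if (rq->completed)
			break;

		if (rq->context_id != ctx)
			break;

		active = rq;
	}

	return active;
}
```

With three requests on one timeline, the oldest completed and the two newer
ones not, calling toy_active_request() on the newest returns the middle one.
The point of the patch, as I read it, is that the link walked here must be
the one that is always on the timeline, whatever execution state the request
happens to be in.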