[Intel-gfx] [PATCH] drm/i915: Acquire breadcrumb ref before cancelling

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Mar 4 16:26:56 UTC 2019


On 04/03/2019 11:41, Chris Wilson wrote:
> We may race the interrupt signaling with retirement, in which case the
> order in which we acquire the reference inside the interrupt is vital to
> provide the correct barrier against the request being freed in
> retirement, i.e. we need to acquire our reference before marking the
> breadcrumb as cancelled (as soon as the breadcrumb is cancelled
> retirement may drop its reference to the request without serialisation
> with the interrupt handler).
> 
> <3>[  683.372226] BUG i915_request (Tainted: G     U           ): Object already free
> <3>[  683.372269] -----------------------------------------------------------------------------
> 
> <4>[  683.372323] Disabling lock debugging due to kernel taint
> <3>[  683.372393] INFO: Allocated in i915_request_alloc+0x169/0x810 [i915] age=0 cpu=2 pid=1420
> <3>[  683.372412] 	kmem_cache_alloc+0x21c/0x280
> <3>[  683.372478] 	i915_request_alloc+0x169/0x810 [i915]
> <3>[  683.372540] 	i915_gem_do_execbuffer+0x84e/0x1ae0 [i915]
> <3>[  683.372603] 	i915_gem_execbuffer2_ioctl+0x11b/0x420 [i915]
> <3>[  683.372617] 	drm_ioctl_kernel+0x83/0xf0
> <3>[  683.372626] 	drm_ioctl+0x2f3/0x3b0
> <3>[  683.372636] 	do_vfs_ioctl+0xa0/0x6e0
> <3>[  683.372645] 	ksys_ioctl+0x35/0x60
> <3>[  683.372654] 	__x64_sys_ioctl+0x11/0x20
> <3>[  683.372664] 	do_syscall_64+0x55/0x190
> <3>[  683.372675] 	entry_SYSCALL_64_after_hwframe+0x49/0xbe
> <3>[  683.372740] INFO: Freed in i915_request_retire_upto+0xfb/0x2e0 [i915] age=0 cpu=0 pid=1419
> <3>[  683.372807] 	i915_request_retire_upto+0xfb/0x2e0 [i915]
> <3>[  683.372870] 	i915_request_add+0x3bd/0x9d0 [i915]
> <3>[  683.372931] 	i915_gem_do_execbuffer+0x141c/0x1ae0 [i915]
> <3>[  683.372991] 	i915_gem_execbuffer2_ioctl+0x11b/0x420 [i915]
> <3>[  683.373001] 	drm_ioctl_kernel+0x83/0xf0
> <3>[  683.373008] 	drm_ioctl+0x2f3/0x3b0
> <3>[  683.373015] 	do_vfs_ioctl+0xa0/0x6e0
> <3>[  683.373023] 	ksys_ioctl+0x35/0x60
> <3>[  683.373030] 	__x64_sys_ioctl+0x11/0x20
> <3>[  683.373037] 	do_syscall_64+0x55/0x190
> <3>[  683.373045] 	entry_SYSCALL_64_after_hwframe+0x49/0xbe
> <3>[  683.373054] INFO: Slab 0x0000000079bcdd71 objects=30 used=2 fp=0x000000006d77b8af flags=0x8000000000010201
> <3>[  683.373069] INFO: Object 0x000000006d77b8af @offset=24000 fp=0x000000007b061eab
> 
> <3>[  683.373083] Redzone 00000000ee47ef28: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
> <3>[  683.373097] Redzone 000000000cb91471: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
> <3>[  683.373111] Redzone 00000000cf2b86ee: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
> <3>[  683.373125] Redzone 00000000f1f5a2cd: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
> <3>[  683.373139] Object 000000006d77b8af: 00 00 00 00 5a 5a 5a 5a 00 3c 49 c0 ff ff ff ff  ....ZZZZ.<I.....
> <3>[  683.373153] Object 000000006f9b6204: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373167] Object 0000000091410ffb: e0 dd 6b fa 87 9f ff ff e0 dd 6b fa 87 9f ff ff  ..k.......k.....
> <3>[  683.373181] Object 000000004cdf799d: 20 de 6b fa 87 9f ff ff 3d 00 00 00 00 00 00 00   .k.....=.......
> <3>[  683.373195] Object 00000000545afebc: aa b3 00 00 00 00 00 00 0f 00 00 00 00 00 00 00  ................
> <3>[  683.373209] Object 00000000e4a394a8: 25 bd bd 1b 9f 00 00 00 00 00 00 00 5a 5a 5a 5a  %...........ZZZZ
> <3>[  683.373223] Object 0000000029a7878a: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
> <3>[  683.373237] Object 00000000d37797b3: ff ff ff ff ff ff ff ff e8 6e 57 c0 ff ff ff ff  .........nW.....
> <3>[  683.373251] Object 00000000d50414f6: 00 b3 c8 8e ff ff ff ff 80 b0 c8 8e ff ff ff ff  ................
> <3>[  683.373265] Object 00000000c28e8847: 41 01 4b c0 ff ff ff ff 00 00 88 8e 88 9f ff ff  A.K.............
> <3>[  683.373279] Object 00000000c74212ab: 38 c1 6d 8a 88 9f ff ff 58 21 74 8a 88 9f ff ff  8.m.....X!t.....
> <3>[  683.373293] Object 000000000d8012cf: c0 c1 6d 8a 88 9f ff ff 58 79 dd d9 87 9f ff ff  ..m.....Xy......
> <3>[  683.373306] Object 00000000c9900b91: 98 d0 4e 8a 88 9f ff ff 58 3c e8 9b 88 9f ff ff  ..N.....X<......
> <3>[  683.373320] Object 0000000044bb8c3d: 58 3c e8 9b 88 9f ff ff 64 f5 04 00 00 00 00 00  X<......d.......
> <3>[  683.373334] Object 00000000180c4cca: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
> <3>[  683.373348] Object 00000000c9044498: ff ff ff ff ff ff ff ff e0 6e 57 c0 ff ff ff ff  .........nW.....
> <3>[  683.373362] Object 0000000072d0dfb3: 00 00 00 00 00 00 00 00 c0 b1 c8 8e ff ff ff ff  ................
> <3>[  683.373376] Object 0000000081f198b9: 55 01 4b c0 ff ff ff ff d8 de 6b fa 87 9f ff ff  U.K.......k.....
> <3>[  683.373390] Object 000000006a375a13: d8 de 6b fa 87 9f ff ff cc 05 39 c0 ff ff ff ff  ..k.......9.....
> <3>[  683.373404] Object 00000000b8392dd1: ff ff ff ff 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ....ZZZZZZZZZZZZ
> <3>[  683.373418] Object 00000000e5c1bbcb: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373432] Object 00000000199feccd: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373446] Object 0000000020f5e08b: 20 df 6b fa 87 9f ff ff 20 df 6b fa 87 9f ff ff   .k..... .k.....
> <3>[  683.373460] Object 0000000090591b0f: 30 df 6b fa 87 9f ff ff 30 df 6b fa 87 9f ff ff  0.k.....0.k.....
> <3>[  683.373473] Object 00000000232f7cd0: 40 df 6b fa 87 9f ff ff 40 df 6b fa 87 9f ff ff  @.k..... at .k.....
> <3>[  683.373487] Object 0000000060458027: 50 df 6b fa 87 9f ff ff 50 df 6b fa 87 9f ff ff  P.k.....P.k.....
> <3>[  683.373501] Object 00000000e3c82ce2: 06 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
> <3>[  683.373515] Object 00000000ec804eb8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373529] Object 00000000ce7ccc08: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373543] Object 000000002dbc575c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373557] Object 00000000b86d3417: 5a 5a 5a 5a 5a 5a 5a 5a 00 de 6b fa 87 9f ff ff  ZZZZZZZZ..k.....
> <3>[  683.373571] Object 00000000d1e82276: b8 61 dd d9 87 9f ff ff a0 06 00 00 d0 06 00 00  .a..............
> <3>[  683.373585] Object 00000000cc53f969: e8 06 00 00 20 07 00 00 28 07 00 00 00 00 00 00  .... ...(.......
> <3>[  683.373599] Object 00000000ea2426d2: 40 0c 8c 7b 88 9f ff ff 00 00 00 00 00 00 00 00  @..{............
> <3>[  683.373613] Object 00000000b860c1c3: 68 0d 8c 7b 88 9f ff ff 68 25 8c 7b 88 9f ff ff  h..{....h%.{....
> <3>[  683.373627] Object 0000000016455ea0: 96 d5 05 00 01 00 00 00 00 5a 5a 5a 5a 5a 5a 5a  .........ZZZZZZZ
> <3>[  683.373640] Object 00000000e66ede82: 00 e0 6b fa 87 9f ff ff 00 e0 6b fa 87 9f ff ff  ..k.......k.....
> <3>[  683.373654] Object 0000000080964939: 10 e0 6b fa 87 9f ff ff 10 e0 6b fa 87 9f ff ff  ..k.......k.....
> <3>[  683.373668] Object 00000000e7ffc5dd: 00 00 00 00 00 00 00 00 00 01 00 00 00 00 ad de  ................
> <3>[  683.373682] Object 000000000ce9d6ca: 00 02 00 00 00 00 ad de 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
> <3>[  683.373696] Object 00000000386659d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373710] Redzone 0000000075d2069d: bb bb bb bb bb bb bb bb                          ........
> <3>[  683.373723] Padding 0000000054e14c6b: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373737] Padding 00000000425e5b34: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <3>[  683.373751] Padding 00000000ad3d4db9: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> <4>[  683.373767] CPU: 1 PID: 151 Comm: kworker/1:2 Tainted: G    BU            5.0.0-rc8-g39139489403b-drmtip_236+ #1
> <4>[  683.373769] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
> <4>[  683.373773] Workqueue: events delayed_fput
> <4>[  683.373775] Call Trace:
> <4>[  683.373777]  <IRQ>
> <4>[  683.373781]  dump_stack+0x67/0x9b
> <4>[  683.373783]  free_debug_processing+0x344/0x370
> <4>[  683.373832]  ? intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
> <4>[  683.373836]  __slab_free+0x337/0x4f0
> <4>[  683.373840]  ? _raw_spin_unlock_irqrestore+0x39/0x60
> <4>[  683.373844]  ? debug_check_no_obj_freed+0x132/0x210
> <4>[  683.373889]  ? intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
> <4>[  683.373892]  ? kmem_cache_free+0x275/0x2e0
> <4>[  683.373894]  kmem_cache_free+0x275/0x2e0
> <4>[  683.373939]  intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
> <4>[  683.373984]  gen8_cs_irq_handler+0x4e/0xa0 [i915]
> <4>[  683.374026]  gen11_irq_handler+0x24b/0x330 [i915]
> <4>[  683.374032]  __handle_irq_event_percpu+0x41/0x2d0
> <4>[  683.374034]  ? handle_irq_event+0x27/0x50
> <4>[  683.374038]  handle_irq_event_percpu+0x2b/0x70
> <4>[  683.374040]  handle_irq_event+0x2f/0x50
> <4>[  683.374044]  handle_edge_irq+0xe7/0x190
> <4>[  683.374048]  handle_irq+0x67/0x160
> <4>[  683.374051]  do_IRQ+0x5e/0x130
> <4>[  683.374054]  common_interrupt+0xf/0xf
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109827
> Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 ++++++++----------
>   1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index cacaa1d04d17..09ed90c0ba00 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -106,16 +106,6 @@ bool intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   
>   			GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,
>   					     &rq->fence.flags));
> -			clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
> -
> -			/*
> -			 * We may race with direct invocation of
> -			 * dma_fence_signal(), e.g. i915_request_retire(),
> -			 * in which case we can skip processing it ourselves.
> -			 */
> -			if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> -				     &rq->fence.flags))
> -				continue;
>   
>   			/*
>   			 * Queue for execution after dropping the signaling
> @@ -123,6 +113,14 @@ bool intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   			 * more signalers to the same context or engine.
>   			 */
>   			i915_request_get(rq);
> +
> +			/*
> +			 * We may race with direct invocation of
> +			 * dma_fence_signal(), e.g. i915_request_retire(),
> +			 * so we need to acquire our reference to the request
> +			 * before we cancel the breadcrumb.
> +			 */
> +			clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
>   			list_add_tail(&rq->signal_link, &signal);
>   		}
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko


More information about the Intel-gfx mailing list