[Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

Mika Kuoppala mika.kuoppala at linux.intel.com
Wed Jan 11 14:16:07 UTC 2017


Chris Wilson <chris at chris-wilson.co.uk> writes:

> It is an error to start a new request on the same timeline (ringbuffer)
> as the current one before the current is submitted. If there are two
> requests emitting to the ringbuffer at the same time, the operation is
> undefined. We can catch this by checking for the timeline having a later
> seqno than ours when we come to submit out request.
>
> Currently we have this check at the end of __i915_add_request, but
> having an early check as well isolates a failure in the caller versus a
> failure in sealing the request (i.e. from inside __i915_add_request
> itself). For example, CI is currently tripping over this late assertion
> on ctg/ilk:
>
> [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> [  100.336333] ------------[ cut here ]------------
> [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> [  100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
> [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
> [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2045+ #1
> [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
> [  100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
> [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> [  100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
> [  100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
> [  100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
> [  100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
> [  100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
> [  100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
> [  100.336477] FS:  00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
> [  100.336483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
> [  100.336496] Call Trace:
> [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
> [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
> [  100.336666]  drm_ioctl+0x200/0x450
> [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
> [  100.336708]  do_vfs_ioctl+0x90/0x6e0
> [  100.336716]  ? up_read+0x1a/0x40
> [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
> [  100.336730]  SyS_ioctl+0x3c/0x70
> [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  100.336745] RIP: 0033:0x7f0d187cb357
> [  100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [  100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
> [  100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
> [  100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
> [  100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
> [  100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
> [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
> [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem_request.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 99056b948eda..1ad47f08e1fd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>  	lockdep_assert_held(&request->i915->drm.struct_mutex);
>  	trace_i915_gem_request_add(request);
>  
> +	/* Make sure that no request gazzumped us - if it was allocated after
> +	 * our i915_gem_request_alloc() and called __i915_add_request() before
> +	 * us, the timeline will hold its seqno which is later than ours.
> +	 */
> +	GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno,
> +				     request->fence.seqno));
> +
>  	/*
>  	 * To ensure that this call will not fail, space for its emissions
>  	 * should already have been reserved in the ring buffer. Let the ring
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


More information about the Intel-gfx mailing list