[Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle
Mika Kuoppala
mika.kuoppala at linux.intel.com
Wed Jan 11 14:16:07 UTC 2017
Chris Wilson <chris at chris-wilson.co.uk> writes:
> It is an error to start a new request on the same timeline (ringbuffer)
> as the current one before the current is submitted. If there are two
> requests emitting to the ringbuffer at the same time, the operation is
> undefined. We can catch this by checking for the timeline having a later
> seqno than ours when we come to submit out request.
>
> Currently we have this check at the end of __i915_add_request, but
> having an early check as well isolates a failure in the caller versus a
> failure in sealing the request (i.e. from inside __i915_add_request
> itself). For example, CI is currently tripping over this late assertion
> on ctg/ilk:
>
> [ 100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> [ 100.336333] ------------[ cut here ]------------
> [ 100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> [ 100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
> [ 100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U 4.10.0-rc3-CI-CI_DRM_2045+ #1
> [ 100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
> [ 100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
> [ 100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> [ 100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
> [ 100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
> [ 100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
> [ 100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
> [ 100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
> [ 100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
> [ 100.336477] FS: 00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
> [ 100.336483] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
> [ 100.336496] Call Trace:
> [ 100.336527] i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> [ 100.336559] i915_gem_evict_vm+0x202/0x2b0 [i915]
> [ 100.336590] i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> [ 100.336623] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> [ 100.336656] i915_gem_execbuffer2+0xc0/0x250 [i915]
> [ 100.336666] drm_ioctl+0x200/0x450
> [ 100.336697] ? i915_gem_execbuffer+0x330/0x330 [i915]
> [ 100.336708] do_vfs_ioctl+0x90/0x6e0
> [ 100.336716] ? up_read+0x1a/0x40
> [ 100.336723] ? trace_hardirqs_on_caller+0x122/0x1b0
> [ 100.336730] SyS_ioctl+0x3c/0x70
> [ 100.336738] entry_SYSCALL_64_fastpath+0x1c/0xb1
> [ 100.336745] RIP: 0033:0x7f0d187cb357
> [ 100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
> [ 100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
> [ 100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
> [ 100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
> [ 100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
> [ 100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> [ 100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
> [ 100.336886] ---[ end trace 22b36545479e5eb7 ]---
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_request.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 99056b948eda..1ad47f08e1fd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
> lockdep_assert_held(&request->i915->drm.struct_mutex);
> trace_i915_gem_request_add(request);
>
> + /* Make sure that no request gazzumped us - if it was allocated after
> + * our i915_gem_request_alloc() and called __i915_add_request() before
> + * us, the timeline will hold its seqno which is later than ours.
> + */
> + GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno,
> + request->fence.seqno));
> +
> /*
> * To ensure that this call will not fail, space for its emissions
> * should already have been reserved in the ring buffer. Let the ring
> --
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
More information about the Intel-gfx
mailing list