[Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

Chris Wilson chris at chris-wilson.co.uk
Wed Jan 11 14:32:28 UTC 2017


On Wed, Jan 11, 2017 at 04:19:19PM +0200, Joonas Lahtinen wrote:
> On ke, 2017-01-11 at 14:08 +0000, Chris Wilson wrote:
> > It is an error to start a new request on the same timeline (ringbuffer)
> > as the current one before the current is submitted. If there are two
> > requests emitting to the ringbuffer at the same time, the operation is
> > undefined. We can catch this by checking for the timeline having a later
> > seqno than ours when we come to submit out request.
> > 
> > Currently we have this check at the end of __i915_add_request, but
> > having an early check as well isolates a failure in the caller versus a
> > failure in sealing the request (i.e. from inside __i915_add_request
> > itself). For example, CI is currently tripping over this late assertion
> > on ctg/ilk:
> > 
> > [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> > [  100.336333] ------------[ cut here ]------------
> > [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> > [  100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
> > [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
> > [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2045+ #1
> > [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
> > [  100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
> > [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> > [  100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
> > [  100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
> > [  100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
> > [  100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
> > [  100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
> > [  100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
> > [  100.336477] FS:  00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
> > [  100.336483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
> > [  100.336496] Call Trace:
> > [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> > [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
> > [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> > [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> > [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
> > [  100.336666]  drm_ioctl+0x200/0x450
> > [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
> > [  100.336708]  do_vfs_ioctl+0x90/0x6e0
> > [  100.336716]  ? up_read+0x1a/0x40
> > [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
> > [  100.336730]  SyS_ioctl+0x3c/0x70
> > [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> > [  100.336745] RIP: 0033:0x7f0d187cb357
> > [  100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [  100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
> > [  100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
> > [  100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
> > [  100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
> > [  100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
> > [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> > [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
> > [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> 
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

Thanks, pushed this promptly since I'm trying to understand the CI
failure.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list