[Intel-gfx] [PATCH] drm/i915: Remove assertion of active_rings must be non-empty if active_requests
Chris Wilson
chris at chris-wilson.co.uk
Fri May 4 12:06:51 UTC 2018
Quoting Tvrtko Ursulin (2018-05-04 11:37:25)
>
> On 04/05/2018 11:11, Chris Wilson wrote:
> > "An outstanding request must still be on an active ring somewhere" is
> > only true if we haven't just been interrupted by the shrinker in the
> > middle of allocating the request itself. (At the start of
> > i915_request_alloc() we pin the context and prepare the GT for activity,
> > marking it as active, and then try to allocate the request. If this
> > allocation invokes the shrinker, we try to reclaim some space by calling
> > i915_retire_requests() which may then be confused by the pre-reservation
> > of active_requests.)
> >
> > <3>[ 125.472695] i915_retire_requests:1429 GEM_BUG_ON(list_empty(&i915->gt.active_rings))
> > <2>[ 125.472792] kernel BUG at drivers/gpu/drm/i915/i915_request.c:1429!
> > <4>[ 125.472822] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > <4>[ 125.498764] Modules linked in: snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btusb btrtl btbcm btintel cdc_ether snd_hda_codec_realtek bluetooth i915 snd_hda_codec_generic usbnet r8152 mii ecdh_generic lpc_ich mei_me snd_hda_intel snd_hda_codec mei snd_hwdep snd_hda_core snd_pcm prime_numbers
> > <4>[ 125.498923] CPU: 0 PID: 1115 Comm: gem_exec_create Tainted: G U 4.17.0-rc3-gc49cbe0d1eb8-kasan_32+ #1
> > <4>[ 125.498955] Hardware name: GOOGLE Peppy/Peppy, BIOS MrChromebox 02/04/2018
> > <4>[ 125.499074] RIP: 0010:i915_retire_requests+0x3f2/0x590 [i915]
> > <4>[ 125.499095] RSP: 0018:ffff88004e5dec40 EFLAGS: 00010282
> > <4>[ 125.499117] RAX: 0000000000000010 RBX: ffff8800458f0000 RCX: 0000000000000000
> > <4>[ 125.499140] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff880060c2f6f0
> > <4>[ 125.499164] RBP: ffff88004e5dee30 R08: ffffed000c185ee6 R09: ffffed000c185ee6
> > <4>[ 125.499187] R10: 0000000000000001 R11: ffffed000c185ee5 R12: ffff8800553da160
> > <4>[ 125.499210] R13: dffffc0000000000 R14: 0000000000000000 R15: ffff8800458faed0
> > <4>[ 125.499235] FS: 00007fe18f052980(0000) GS:ffff880065400000(0000) knlGS:0000000000000000
> > <4>[ 125.499262] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[ 125.499282] CR2: 00007f01df11efb8 CR3: 00000000518d4001 CR4: 00000000000606f0
> > <4>[ 125.499304] Call Trace:
> > <4>[ 125.499417] i915_gem_shrink+0x576/0xb50 [i915]
> > <4>[ 125.499532] ? i915_gem_shrinker_count+0x2f0/0x2f0 [i915]
> > <4>[ 125.499561] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[ 125.499671] ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
> > <4>[ 125.499782] ? i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > <4>[ 125.499889] i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > <4>[ 125.499997] ? i915_gem_shrinker_vmap+0x3a0/0x3a0 [i915]
> > <4>[ 125.500021] ? do_raw_spin_unlock+0x4f/0x240
> > <4>[ 125.500042] ? _raw_spin_unlock+0x29/0x40
> > <4>[ 125.500149] ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
> > <4>[ 125.500177] shrink_slab.part.18+0x23e/0x8f0
> > <4>[ 125.500202] ? unregister_shrinker+0x1f0/0x1f0
> > <4>[ 125.500226] ? mem_cgroup_iter+0x379/0xcc0
> > <4>[ 125.500249] shrink_node+0xa7e/0x1180
> > <4>[ 125.500276] ? shrink_node_memcg+0x11f0/0x11f0
> > <4>[ 125.500297] ? __delayacct_freepages_start+0x38/0x80
> > <4>[ 125.500319] ? __is_insn_slot_addr+0xe3/0x1a0
> > <4>[ 125.500342] ? recalibrate_cpu_khz+0x10/0x10
> > <4>[ 125.500361] ? ktime_get+0xb2/0x140
> > <4>[ 125.500382] do_try_to_free_pages+0x2d3/0xe40
> > <4>[ 125.500407] ? allow_direct_reclaim.part.23+0x1e0/0x1e0
> > <4>[ 125.500429] ? shrink_node+0x1180/0x1180
> > <4>[ 125.500450] ? __read_once_size_nocheck.constprop.4+0x10/0x10
> > <4>[ 125.500476] try_to_free_pages+0x1af/0x560
> > <4>[ 125.500497] ? do_try_to_free_pages+0xe40/0xe40
> > <4>[ 125.500525] __alloc_pages_nodemask+0xadc/0x2130
> > <4>[ 125.500553] ? gfp_pfmemalloc_allowed+0x150/0x150
> > <4>[ 125.500654] ? i915_gem_do_execbuffer+0x219d/0x32e0 [i915]
> > <4>[ 125.500678] ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[ 125.500701] ? __debug_object_init+0x322/0xd90
> > <4>[ 125.500722] ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[ 125.500827] ? i915_gem_do_execbuffer+0xdc2/0x32e0 [i915]
> > <4>[ 125.500942] ? i915_request_alloc+0x5b5/0x13f0 [i915]
> > <4>[ 125.500964] ? page_frag_free+0x170/0x170
> > <4>[ 125.500984] ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[ 125.501008] new_slab+0x21d/0x5c0
> > <4>[ 125.501029] ___slab_alloc.constprop.35+0x322/0x3e0
> > <4>[ 125.501052] ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[ 125.501074] ? __ww_mutex_lock.constprop.3+0x1104/0x2cf0
> > <4>[ 125.501097] ? _raw_spin_unlock_irqrestore+0x39/0x60
> > <4>[ 125.501120] ? fs_reclaim_acquire+0x10/0x10
> > <4>[ 125.501138] ? lock_acquire+0x138/0x3c0
> > <4>[ 125.501156] ? lock_acquire+0x3c0/0x3c0
> > <4>[ 125.501176] ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[ 125.501198] ? __slab_alloc.isra.27.constprop.34+0x3d/0x70
> > <4>[ 125.501219] __slab_alloc.isra.27.constprop.34+0x3d/0x70
> > <4>[ 125.501243] ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[ 125.501265] __kmalloc_track_caller+0x313/0x350
> > <4>[ 125.501287] krealloc+0x62/0xb0
> > <4>[ 125.501305] reservation_object_reserve_shared+0x10b/0x250
> > <4>[ 125.501411] i915_gem_do_execbuffer+0x2040/0x32e0 [i915]
> > <4>[ 125.501522] ? eb_relocate_slow+0xad0/0xad0 [i915]
> > <4>[ 125.501544] ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[ 125.501646] ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[ 125.501755] ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[ 125.501779] ? drm_dev_get+0x20/0x20
> > <4>[ 125.501803] ? __might_fault+0xea/0x1a0
> > <4>[ 125.501902] ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[ 125.502012] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502116] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502218] i915_gem_execbuffer2_ioctl+0x3c5/0x770 [i915]
> > <4>[ 125.502243] ? drm_dev_enter+0xe0/0xe0
> > <4>[ 125.502260] ? lock_acquire+0x138/0x3c0
> > <4>[ 125.502362] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502470] ? i915_gem_object_create.part.28+0x570/0x570 [i915]
> > <4>[ 125.502575] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502680] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502702] drm_ioctl_kernel+0x151/0x200
> > <4>[ 125.502721] ? drm_ioctl_permit+0x2a0/0x2a0
> > <4>[ 125.502746] drm_ioctl+0x63a/0x920
> > <4>[ 125.502844] ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[ 125.502868] ? drm_getstats+0x20/0x20
> > <4>[ 125.502886] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[ 125.502919] do_vfs_ioctl+0x173/0xe90
> > <4>[ 125.502936] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[ 125.502957] ? ioctl_preallocate+0x170/0x170
> > <4>[ 125.502978] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[ 125.503002] ? retint_kernel+0x2d/0x2d
> > <4>[ 125.503024] ksys_ioctl+0x35/0x60
> > <4>[ 125.503043] __x64_sys_ioctl+0x6a/0xb0
> > <4>[ 125.503061] do_syscall_64+0x97/0x400
> > <4>[ 125.503081] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > <4>[ 125.503101] RIP: 0033:0x7fe18e4f65d7
> > <4>[ 125.503116] RSP: 002b:00007ffe2ffc06a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > <4>[ 125.503145] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe18e4f65d7
> > <4>[ 125.503168] RDX: 00007ffe2ffc07f0 RSI: 0000000040406469 RDI: 0000000000000003
> > <4>[ 125.503191] RBP: 00007ffe2ffc07f0 R08: 0000000000000004 R09: 00007ffe2ffcf080
> > <4>[ 125.503215] R10: 000000000002c7de R11: 0000000000000246 R12: 0000000040406469
> > <4>[ 125.503238] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
> > <4>[ 125.503268] Code: e8 18 a0 c9 da 48 8b 35 25 3a 47 00 49 c7 c0 a0 3b 88 c0 b9 95 05 00 00 48 c7 c2 e0 49 88 c0 48 c7 c7 8d 3b 5d c0 e8 ee 7e db da <0f> 0b 48 89 ef e8 a4 26 f5 da e9 51 fe ff ff e8 8a 26 f5 da e9
> > <1>[ 125.503548] RIP: i915_retire_requests+0x3f2/0x590 [i915] RSP: ffff88004e5dec40
> >
> > Fixes: 643b450a594e ("drm/i915: Only track live rings for retiring")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_request.c | 3 ---
> > 1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index d68739b94dac..e4cf76ec14a6 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1426,9 +1426,6 @@ void i915_retire_requests(struct drm_i915_private *i915)
> >  	if (!i915->gt.active_requests)
> >  		return;
> >
> > -	/* An outstanding request must be on a still active ring somewhere */
> > -	GEM_BUG_ON(list_empty(&i915->gt.active_rings));
> > -
> >  	list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
> >  		ring_retire_requests(ring);
> >  }
> >
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
And pushed before anyone else notices the oops...
Thanks for the review,
-Chris
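
The ordering described in the commit message can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
driver code: struct gt_model, model_request_alloc() and
model_retire_requests() are hypothetical names, the two counters stand in for
i915->gt.active_requests and the i915->gt.active_rings list, and a single
ring is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the GT bookkeeping; none of this is i915 code. */
struct gt_model {
	unsigned int active_requests; /* reserved before the request exists */
	unsigned int active_rings;    /* rings holding a constructed request */
	bool check_rings;             /* true models the removed GEM_BUG_ON */
	bool oops;                    /* set where the assertion would fire */
};

/* Models i915_retire_requests(); returns how many rings were walked. */
static unsigned int model_retire_requests(struct gt_model *gt)
{
	assert(gt != NULL);

	if (!gt->active_requests)
		return 0;

	/* The removed check: a reserved request implies a non-empty list. */
	if (gt->check_rings && gt->active_rings == 0)
		gt->oops = true;

	/* Walking an empty list is harmless, so nothing else to do. */
	return gt->active_rings;
}

/* Models i915_request_alloc() when the allocation may enter the shrinker. */
static void model_request_alloc(struct gt_model *gt, bool alloc_enters_shrinker)
{
	assert(gt != NULL);

	/* Pre-reservation: the GT is marked active before allocating. */
	gt->active_requests++;

	/* The shrinker runs in the middle of the allocation and retires. */
	if (alloc_enters_shrinker)
		model_retire_requests(gt);

	/* Only now does the request exist and its ring count as active. */
	if (gt->active_rings == 0)
		gt->active_rings = 1;
}
```

With check_rings set and the allocation entering the shrinker, the model
records the oops. With check_rings clear, as after this patch, the retire
pass walks an empty list and the allocation completes normally.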