[Intel-gfx] [PATCH] drm/i915: Remove assertion of active_rings must be non-empty if active_requests

Chris Wilson chris at chris-wilson.co.uk
Fri May 4 12:06:51 UTC 2018


Quoting Tvrtko Ursulin (2018-05-04 11:37:25)
> 
> On 04/05/2018 11:11, Chris Wilson wrote:
> > "An outstanding request must still be on an active ring somewhere" is
> > only true if we haven't just been interrupted by the shrinker in the
> > middle of allocating the request itself. (At the start of
> > i915_request_alloc() we pin the context and prepare the GT for activity,
> > marking it as active, and then try to allocate the request. If this
> > allocation invokes the shrinker, we try to reclaim some space by calling
> > i915_retire_requests() which may then be confused by the pre-reservation
> > of active_requests.)
> > 
> > <3>[  125.472695] i915_retire_requests:1429 GEM_BUG_ON(list_empty(&i915->gt.active_rings))
> > <2>[  125.472792] kernel BUG at drivers/gpu/drm/i915/i915_request.c:1429!
> > <4>[  125.472822] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > <4>[  125.498764] Modules linked in: snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btusb btrtl btbcm btintel cdc_ether snd_hda_codec_realtek bluetooth i915 snd_hda_codec_generic usbnet r8152 mii ecdh_generic lpc_ich mei_me snd_hda_intel snd_hda_codec mei snd_hwdep snd_hda_core snd_pcm prime_numbers
> > <4>[  125.498923] CPU: 0 PID: 1115 Comm: gem_exec_create Tainted: G     U            4.17.0-rc3-gc49cbe0d1eb8-kasan_32+ #1
> > <4>[  125.498955] Hardware name: GOOGLE Peppy/Peppy, BIOS MrChromebox 02/04/2018
> > <4>[  125.499074] RIP: 0010:i915_retire_requests+0x3f2/0x590 [i915]
> > <4>[  125.499095] RSP: 0018:ffff88004e5dec40 EFLAGS: 00010282
> > <4>[  125.499117] RAX: 0000000000000010 RBX: ffff8800458f0000 RCX: 0000000000000000
> > <4>[  125.499140] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff880060c2f6f0
> > <4>[  125.499164] RBP: ffff88004e5dee30 R08: ffffed000c185ee6 R09: ffffed000c185ee6
> > <4>[  125.499187] R10: 0000000000000001 R11: ffffed000c185ee5 R12: ffff8800553da160
> > <4>[  125.499210] R13: dffffc0000000000 R14: 0000000000000000 R15: ffff8800458faed0
> > <4>[  125.499235] FS:  00007fe18f052980(0000) GS:ffff880065400000(0000) knlGS:0000000000000000
> > <4>[  125.499262] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[  125.499282] CR2: 00007f01df11efb8 CR3: 00000000518d4001 CR4: 00000000000606f0
> > <4>[  125.499304] Call Trace:
> > <4>[  125.499417]  i915_gem_shrink+0x576/0xb50 [i915]
> > <4>[  125.499532]  ? i915_gem_shrinker_count+0x2f0/0x2f0 [i915]
> > <4>[  125.499561]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[  125.499671]  ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
> > <4>[  125.499782]  ? i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > <4>[  125.499889]  i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > <4>[  125.499997]  ? i915_gem_shrinker_vmap+0x3a0/0x3a0 [i915]
> > <4>[  125.500021]  ? do_raw_spin_unlock+0x4f/0x240
> > <4>[  125.500042]  ? _raw_spin_unlock+0x29/0x40
> > <4>[  125.500149]  ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
> > <4>[  125.500177]  shrink_slab.part.18+0x23e/0x8f0
> > <4>[  125.500202]  ? unregister_shrinker+0x1f0/0x1f0
> > <4>[  125.500226]  ? mem_cgroup_iter+0x379/0xcc0
> > <4>[  125.500249]  shrink_node+0xa7e/0x1180
> > <4>[  125.500276]  ? shrink_node_memcg+0x11f0/0x11f0
> > <4>[  125.500297]  ? __delayacct_freepages_start+0x38/0x80
> > <4>[  125.500319]  ? __is_insn_slot_addr+0xe3/0x1a0
> > <4>[  125.500342]  ? recalibrate_cpu_khz+0x10/0x10
> > <4>[  125.500361]  ? ktime_get+0xb2/0x140
> > <4>[  125.500382]  do_try_to_free_pages+0x2d3/0xe40
> > <4>[  125.500407]  ? allow_direct_reclaim.part.23+0x1e0/0x1e0
> > <4>[  125.500429]  ? shrink_node+0x1180/0x1180
> > <4>[  125.500450]  ? __read_once_size_nocheck.constprop.4+0x10/0x10
> > <4>[  125.500476]  try_to_free_pages+0x1af/0x560
> > <4>[  125.500497]  ? do_try_to_free_pages+0xe40/0xe40
> > <4>[  125.500525]  __alloc_pages_nodemask+0xadc/0x2130
> > <4>[  125.500553]  ? gfp_pfmemalloc_allowed+0x150/0x150
> > <4>[  125.500654]  ? i915_gem_do_execbuffer+0x219d/0x32e0 [i915]
> > <4>[  125.500678]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[  125.500701]  ? __debug_object_init+0x322/0xd90
> > <4>[  125.500722]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[  125.500827]  ? i915_gem_do_execbuffer+0xdc2/0x32e0 [i915]
> > <4>[  125.500942]  ? i915_request_alloc+0x5b5/0x13f0 [i915]
> > <4>[  125.500964]  ? page_frag_free+0x170/0x170
> > <4>[  125.500984]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[  125.501008]  new_slab+0x21d/0x5c0
> > <4>[  125.501029]  ___slab_alloc.constprop.35+0x322/0x3e0
> > <4>[  125.501052]  ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[  125.501074]  ? __ww_mutex_lock.constprop.3+0x1104/0x2cf0
> > <4>[  125.501097]  ? _raw_spin_unlock_irqrestore+0x39/0x60
> > <4>[  125.501120]  ? fs_reclaim_acquire+0x10/0x10
> > <4>[  125.501138]  ? lock_acquire+0x138/0x3c0
> > <4>[  125.501156]  ? lock_acquire+0x3c0/0x3c0
> > <4>[  125.501176]  ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[  125.501198]  ? __slab_alloc.isra.27.constprop.34+0x3d/0x70
> > <4>[  125.501219]  __slab_alloc.isra.27.constprop.34+0x3d/0x70
> > <4>[  125.501243]  ? reservation_object_reserve_shared+0x10b/0x250
> > <4>[  125.501265]  __kmalloc_track_caller+0x313/0x350
> > <4>[  125.501287]  krealloc+0x62/0xb0
> > <4>[  125.501305]  reservation_object_reserve_shared+0x10b/0x250
> > <4>[  125.501411]  i915_gem_do_execbuffer+0x2040/0x32e0 [i915]
> > <4>[  125.501522]  ? eb_relocate_slow+0xad0/0xad0 [i915]
> > <4>[  125.501544]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > <4>[  125.501646]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[  125.501755]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[  125.501779]  ? drm_dev_get+0x20/0x20
> > <4>[  125.501803]  ? __might_fault+0xea/0x1a0
> > <4>[  125.501902]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
> > <4>[  125.502012]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502116]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502218]  i915_gem_execbuffer2_ioctl+0x3c5/0x770 [i915]
> > <4>[  125.502243]  ? drm_dev_enter+0xe0/0xe0
> > <4>[  125.502260]  ? lock_acquire+0x138/0x3c0
> > <4>[  125.502362]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502470]  ? i915_gem_object_create.part.28+0x570/0x570 [i915]
> > <4>[  125.502575]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502680]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502702]  drm_ioctl_kernel+0x151/0x200
> > <4>[  125.502721]  ? drm_ioctl_permit+0x2a0/0x2a0
> > <4>[  125.502746]  drm_ioctl+0x63a/0x920
> > <4>[  125.502844]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
> > <4>[  125.502868]  ? drm_getstats+0x20/0x20
> > <4>[  125.502886]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[  125.502919]  do_vfs_ioctl+0x173/0xe90
> > <4>[  125.502936]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[  125.502957]  ? ioctl_preallocate+0x170/0x170
> > <4>[  125.502978]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> > <4>[  125.503002]  ? retint_kernel+0x2d/0x2d
> > <4>[  125.503024]  ksys_ioctl+0x35/0x60
> > <4>[  125.503043]  __x64_sys_ioctl+0x6a/0xb0
> > <4>[  125.503061]  do_syscall_64+0x97/0x400
> > <4>[  125.503081]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > <4>[  125.503101] RIP: 0033:0x7fe18e4f65d7
> > <4>[  125.503116] RSP: 002b:00007ffe2ffc06a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > <4>[  125.503145] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe18e4f65d7
> > <4>[  125.503168] RDX: 00007ffe2ffc07f0 RSI: 0000000040406469 RDI: 0000000000000003
> > <4>[  125.503191] RBP: 00007ffe2ffc07f0 R08: 0000000000000004 R09: 00007ffe2ffcf080
> > <4>[  125.503215] R10: 000000000002c7de R11: 0000000000000246 R12: 0000000040406469
> > <4>[  125.503238] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
> > <4>[  125.503268] Code: e8 18 a0 c9 da 48 8b 35 25 3a 47 00 49 c7 c0 a0 3b 88 c0 b9 95 05 00 00 48 c7 c2 e0 49 88 c0 48 c7 c7 8d 3b 5d c0 e8 ee 7e db da <0f> 0b 48 89 ef e8 a4 26 f5 da e9 51 fe ff ff e8 8a 26 f5 da e9
> > <1>[  125.503548] RIP: i915_retire_requests+0x3f2/0x590 [i915] RSP: ffff88004e5dec40
> > 
> > Fixes: 643b450a594e ("drm/i915: Only track live rings for retiring")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 3 ---
> >   1 file changed, 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index d68739b94dac..e4cf76ec14a6 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1426,9 +1426,6 @@ void i915_retire_requests(struct drm_i915_private *i915)
> >       if (!i915->gt.active_requests)
> >               return;
> >   
> > -     /* An outstanding request must be on a still active ring somewhere */
> > -     GEM_BUG_ON(list_empty(&i915->gt.active_rings));
> > -
> >       list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
> >               ring_retire_requests(ring);
> >   }
> > 
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

And pushed before anyone else notices the oops...

Thanks for the review,
-Chris
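
The ordering problem described in the commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the driver code: every toy_* name below is hypothetical, the list helpers merely imitate the kernel's struct list_head, and a direct call stands in for the shrinker. It shows that once active_requests has been pre-reserved while active_rings is still empty, the retire path has nothing to walk, so the removed assertion would have fired.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct toy_list { struct toy_list *next, *prev; };

static void toy_list_init(struct toy_list *h) { h->next = h->prev = h; }
static bool toy_list_empty(const struct toy_list *h) { return h->next == h; }

static void toy_list_add_tail(struct toy_list *n, struct toy_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical, heavily reduced model of the i915->gt bookkeeping. */
struct toy_gt {
	unsigned int active_requests;
	struct toy_list active_rings;
};

static void toy_gt_init(struct toy_gt *gt)
{
	gt->active_requests = 0;
	toy_list_init(&gt->active_rings);
}

/* Modelled on the patched i915_retire_requests(); returns rings visited. */
static int toy_retire_requests(struct toy_gt *gt)
{
	int visited = 0;

	if (!gt->active_requests)
		return 0;

	/*
	 * The removed check would sit here. With active_requests already
	 * pre-reserved and no ring yet on the list, it would trip.
	 */
	for (struct toy_list *pos = gt->active_rings.next;
	     pos != &gt->active_rings; pos = pos->next)
		visited++; /* the real code calls ring_retire_requests() */

	return visited;
}

/*
 * Assumed ordering from the commit message: the GT is marked active
 * first, then the allocation may enter the shrinker, which retires.
 */
static int toy_request_alloc_under_pressure(struct toy_gt *gt)
{
	gt->active_requests++;           /* pre-reservation */
	return toy_retire_requests(gt);  /* simulated shrinker call */
}
```

Calling toy_request_alloc_under_pressure() on a freshly initialised toy_gt returns 0 with active_requests left at 1: the early return is skipped, the list is empty, and the loop is simply a no-op, which is the behaviour the patch relies on.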