[Intel-gfx] [PATCH] drm/i915: Check rq->timeline before deference

Chris Wilson chris at chris-wilson.co.uk
Wed Mar 14 14:33:08 UTC 2018


Quoting Mika Kuoppala (2018-03-14 14:16:08)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Not only is the context suspect to disappearing, but so is it's
> > timeline. Under a lockless inspection of the requests for
> > debugging from intel_engine_dump(), the context may already have been
> > freed and we have to check before chasing the dangling pointer.
> >
> > [28033.681755] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid
> > [28033.681796] CPU: 3 PID: 3058 Comm: gem_exec_schedu Tainted: G     U           4.16.0-rc5+ #9
> > [28033.681804] Hardware name: Acer Aspire E5-575G/Ironman_SK  , BIOS V1.12 08/02/2016
> > [28033.681834] RIP: 0010:print_request+0x2b/0xb0 [i915]
> > [28033.681840] RSP: 0018:ffffc90004afbc18 EFLAGS: 00010202
> > [28033.681847] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801921b5a40 RCX: 0000000000000006
> > [28033.681854] RDX: ffffc90004afbc60 RSI: ffff8801921b5a40 RDI: 0000000000000004
> > [28033.681861] RBP: ffffc90004afbd80 R08: 0000000000000000 R09: 0000000000000001
> > [28033.681868] R10: ffffc90004afbbd0 R11: ffffc90004afbc73 R12: ffffc90004afbc60
> > [28033.681875] R13: ffffc90004afbd80 R14: ffff8801d40ec670 R15: ffff8801921b5a40
> > [28033.681883] FS:  00007fbba5f6c8c0(0000) GS:ffff8801e8400000(0000) knlGS:0000000000000000
> > [28033.681891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [28033.681897] CR2: 00007fbba5f8f000 CR3: 00000001b2efa002 CR4: 00000000003606e0
> > [28033.681904] Call Trace:
> > [28033.681932]  intel_engine_print_registers+0x6a7/0x930 [i915]
> > [28033.681962]  intel_engine_dump+0x30d/0x740 [i915]
> > [28033.681971]  ? seq_printf+0x3a/0x50
> > [28033.681995]  i915_engine_info+0xb8/0xe0 [i915]
> > [28033.682003]  ? drm_get_color_range_name+0x20/0x20
> > [28033.682010]  seq_read+0xe1/0x440
> > [28033.682018]  full_proxy_read+0x51/0x80
> > [28033.682025]  __vfs_read+0x21/0x130
> > [28033.682031]  ? do_sys_open+0x134/0x220
> > [28033.682037]  ? kmem_cache_free+0x177/0x2b0
> > [28033.682043]  vfs_read+0xa1/0x150
> > [28033.682049]  SyS_read+0x40/0xa0
> > [28033.682055]  do_syscall_64+0x6b/0x1b0
> > [28033.682063]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> > [28033.682069] RIP: 0033:0x7fbba4655d11
> > [28033.682074] RSP: 002b:00007ffd8c49da58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> > [28033.682082] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbba4655d11
> > [28033.682089] RDX: 000000000000003f RSI: 00005647bfbfc260 RDI: 0000000000000006
> > [28033.682096] RBP: 000000000000003f R08: 00000000ffffffff R09: 0000000000000000
> > [28033.682104] R10: 0000000000000000 R11: 0000000000000246 R12: 00005647bfbfc260
> > [28033.682111] R13: 0000000000000006 R14: 0000000000000000 R15: 00005647bfbfc260
> > [28033.682119] Code: 41 55 41 54 49 89 d4 55 53 48 89 fd 48 8b 86 c8 00 00 00 48 8b 3d d6 1e 14 e2 48 89 f3 48 2b be a8 02 00 00 48 8b 80 b0 00 00 00 <4c> 8b 68 18 e8 bc 80 02 e1 8b 8b 70 02 00 00 8b b3 28 02 00 00
> > [28033.682206] RIP: print_request+0x2b/0xb0 [i915] RSP: ffffc90004afbc18
> >
> > Reported-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_engine_cs.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index a2b1e9e2c008..f22c5f72df8d 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1716,13 +1716,15 @@ static void print_request(struct drm_printer *m,
> >                         struct i915_request *rq,
> >                         const char *prefix)
> >  {
> > +     const char *name = rq->fence.ops->get_timeline_name(&rq->fence);
> > +
> 
> So we get "signaled"

And as we hold the rcu lock here, context can't also vanish between the
test inside get_timeline_name() and our use in the printf.
(Probably could do with another sentence to in get_timeline_name() to
illustrate the rcu_read_lock() requirement in the caller.)

> Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

Thanks, hopefully you will be able to run gem_ctx_switch all night now!
-Chris


More information about the Intel-gfx mailing list