[Intel-gfx] [RFC PATCH i-g-t] tests/gem_userptr_blits: Refresh map-fixed-invalidate* subtests

Michał Winiarski michal.winiarski at intel.com
Thu Feb 13 14:37:31 UTC 2020


On Thu, Feb 13, 2020 at 01:46:41PM +0100, Janusz Krzysztofik wrote:
> map-fixed-invalidate* subtests utilize gem_set_tiling() which may fail,
> e.g. on hardware with no mappable aperture, due to missing fences.
> Skip those subtests if fences are not available.
> 
> Moreover, those subtests use GEM_MMAP_GTT IOCTL which may also fail,
> e.g. on hardware with no mappable aperture.  Use GEM_MMAP_OFFSET
> instead and iterate MMAP_OFFSET coherent types to find one that works.
> 
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski at intel.com>
> ---
> Hi Michał,
> 
> As you are the author of those subtests, let me ask you a few questions
> I'm not sure about:
> 1. How critical is the use of gem_set_tiling() to those subtests?  Can
>    we just skip those operations if not supported?  If not, can you
>    propose a replacement that should work on hardware with no mappable
>    aperture?

It's a regression test, see:
e4b946bfe1e3 ("drm/i915: Fix userptr deadlock with aliased GTT mmappings")

The idea was to cause a deadlock by triggerring i915_gem_object_release_mmap
from set_tiling underneath struct_mutex, which would then cause invalidate on a
stale userptr (which would also attempt to acquire struct mutex - classic
recursive lock). Original trace which served as an example to write the IGT:

[23106.066196] [drm:i915_gem_open] 
[23279.022598] INFO: task test_api64:1359 blocked for more than 120 seconds.
[23279.032453]       Tainted: G     U          4.1.0-rc7+ #16
[23279.039440] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23279.049191] test_api64      D ffff880146837a18 12760  1359    530 0x00000000
[23279.058101]  ffff880146837a18 ffff8801468379d8 ffffffff8101efcd ffff880147e22e20
[23279.067372]  ffff8800a34b8000 ffff8801468379f8 ffff880146838000 00000000ffffffff
[23279.076599]  ffff8801459911b0 0000000000000246 ffff8801459911a8 ffff880146837a38
[23279.085923] Call Trace:
[23279.089510]  [<ffffffff8101efcd>] ? native_sched_clock+0x2d/0x90
[23279.097223]  [<ffffffff817a34a7>] schedule+0x37/0x90
[23279.103662]  [<ffffffff817a389e>] schedule_preempt_disabled+0xe/0x10
[23279.111665]  [<ffffffff817a5a05>] mutex_lock_nested+0x165/0x3d0
[23279.119287]  [<ffffffffa01f7fac>] ? cancel_userptr+0x2c/0x150 [i915]
[23279.127376]  [<ffffffff817abf88>] ? ftrace_graph_caller+0x78/0xa8
[23279.135204]  [<ffffffffa01f7fac>] ? cancel_userptr+0x2c/0x150 [i915]
[23279.143329]  [<ffffffffa01f7fac>] cancel_userptr+0x2c/0x150 [i915]
[23279.151263]  [<ffffffff817abfb8>] ftrace_graph_caller+0xa8/0xa8
[23279.158801]  [<ffffffffa01f8211>] i915_gem_userptr_mn_invalidate_range_start+0x141/0x200 [i915]
[23279.169559]  [<ffffffffa01f80d5>] ? i915_gem_userptr_mn_invalidate_range_start+0x5/0x200 [i915]
[23279.180364]  [<ffffffff817abfb8>] ftrace_graph_caller+0xa8/0xa8
[23279.187919]  [<ffffffff8120fbb3>] __mmu_notifier_invalidate_range_start+0x83/0xd0
[23279.197342]  [<ffffffff8120fb35>] ? __mmu_notifier_invalidate_range_start+0x5/0xd0
[23279.206804]  [<ffffffff811e9036>] zap_page_range_single+0xf6/0x120
[23279.214727]  [<ffffffff811e9114>] ? unmap_mapping_range+0x64/0x140
[23279.222515]  [<ffffffff811e9114>] ? unmap_mapping_range+0x64/0x140
[23279.230489]  [<ffffffff811e9114>] ? unmap_mapping_range+0x64/0x140
[23279.238442]  [<ffffffff811e91ca>] unmap_mapping_range+0x11a/0x140
[23279.246141]  [<ffffffffa01ecba5>] ? i915_gem_release_mmap+0x5/0x20 [i915]
[23279.254753]  [<ffffffffa01eb866>] i915_gem_release_mmap.part.44+0x46/0x60 [i915]
[23279.264002]  [<ffffffffa01ecbb9>] ? i915_gem_release_mmap+0x19/0x20 [i915]
[23279.272723]  [<ffffffff817abfb8>] ftrace_graph_caller+0xa8/0xa8
[23279.280239]  [<ffffffffa01ecbb9>] i915_gem_release_mmap+0x19/0x20 [i915]
[23279.288780]  [<ffffffff817abfb8>] ftrace_graph_caller+0xa8/0xa8
[23279.296323]  [<ffffffffa01f6df4>] i915_gem_set_tiling+0x204/0x540 [i915]
[23279.304854]  [<ffffffff817abfb8>] ftrace_graph_caller+0xa8/0xa8
[23279.312559]  [<ffffffffa0119a8f>] drm_ioctl+0x12f/0x620 [drm]
[23279.319986]  [<ffffffffa01f6bf0>] ? i915_gem_detect_bit_6_swizzle+0x200/0x200 [i915]
[23279.329744]  [<ffffffff811eac5e>] ? handle_mm_fault+0xe1e/0x1780
[23279.337380]  [<ffffffff8101efcd>] ? native_sched_clock+0x2d/0x90
[23279.345081]  [<ffffffff8101f039>] ? sched_clock+0x9/0x10
[23279.352029]  [<ffffffff810d3fd5>] ? local_clock+0x25/0x30
[23279.359059]  [<ffffffff810ef31f>] ? lock_release_holdtime.part.25+0xf/0x1f0
[23279.367914]  [<ffffffff8124fc96>] do_vfs_ioctl+0x2c6/0x4f0
[23279.375068]  [<ffffffff810edf13>] ? up_read+0x23/0x40
[23279.381794]  [<ffffffff81068cfc>] ? __do_page_fault+0x1bc/0x450
[23279.389356]  [<ffffffff8124ff41>] SyS_ioctl+0x81/0xa0
[23279.396025]  [<ffffffff817a992e>] system_call_fastpath+0x12/0x76
[23279.403823] 4 locks held by test_api64/1359:
[23279.409505]  #0:  (&dev->struct_mutex){......}, at: [<ffffffffa01f6ce6>] i915_gem_set_tiling+0xf6/0x540 [i915]
[23279.421939]  #1:  (&mapping->i_mmap_rwsem){......}, at: [<ffffffff811e9114>] unmap_mapping_range+0x64/0x140
[23279.434071]  #2:  (&srcu){......}, at: [<ffffffff8120fb35>] __mmu_notifier_invalidate_range_start+0x5/0xd0
[23279.446103]  #3:  (&dev->struct_mutex){......}, at: [<ffffffffa01f7fac>] cancel_userptr+0x2c/0x150 [i915]

Since then, i915 has changed a lot... (no more struct mutex), so it's hard to
immediately say whether there's still any deadlock potential there.

-Michał

> 2. Which of MMAP_OFFSET types should do the job if mappable aperture is
>    not available?  Should any of them work, as I've assumed?  Is my
>    approach of succeeding a subtest on first successful MMAP_OFFSET
>    type correct?  Or should we examine all types?
> 
> Thanks,
> Janusz


More information about the Intel-gfx mailing list