[PATCH] dma-resv: lockdep-prime address_space->i_mmap_rwsem for dma-resv
Daniel Vetter
daniel at ffwll.ch
Thu Sep 17 13:19:42 UTC 2020
On Thu, Jul 30, 2020 at 06:45:14PM +0200, Thomas Hellström (Intel) wrote:
>
> On 7/30/20 3:17 PM, Daniel Vetter wrote:
> > On Thu, Jul 30, 2020 at 2:17 PM Thomas Hellström (Intel)
> > <thomas_os at shipmail.org> wrote:
> > >
> > > On 7/28/20 3:58 PM, Daniel Vetter wrote:
> > > > GPU drivers need this in their shrinkers, to be able to throw out
> > > > mmap'ed buffers. Note that we also need dma_resv_lock in shrinkers,
> > > > but that loop is resolved by trylocking in shrinkers.
> > > >
> > > > So full hierarchy is now (ignore some of the other branches we already
> > > > have primed):
> > > >
> > > > mmap_read_lock -> dma_resv -> shrinkers -> i_mmap_lock_write
> > > >
> > > > I hope that's not inconsistent with anything mm or fs does, adding
> > > > relevant people.
> > > >
> > > Looks OK to me. The mapping_dirty_helpers run under the i_mmap_lock, but
> > > don't allocate any memory AFAICT.
> > >
> > > Since huge page-table-entry splitting may happen under the i_mmap_lock
> > > from unmap_mapping_range() it might be worth figuring out how new page
> > > directory pages are allocated, though.
> > ofc I'm not an mm expert at all, but I did try to scroll through all
> > i_mmap_lock_write/read callers. Found the following:
> >
> > - kernel/events/uprobes.c in build_map_info:
> >
> > /*
> > * Needs GFP_NOWAIT to avoid i_mmap_rwsem recursion through
> > * reclaim. This is optimistic, no harm done if it fails.
> > */
> >
> > - I got lost in the hugetlb.c code and couldn't convince myself it's
> > not allocating page directories at various levels with something else
> > than GFP_KERNEL.
> >
> > So looks like the recursion is clearly there and known, but the
> > hugepage code is too complex and flying over my head.
> > -Daniel
>
> OK, so I inverted your annotation and ran a memory hog, and got the below
> splat. So clearly your proposed reclaim->i_mmap_lock locking order is an
> already established one.
>
> So
>
> Reviewed-by: Thomas Hellström <thomas.hellstrom at intel.com>
No one complaining that this is a terrible idea and two reviews from
people who know stuff, so I went ahead and pushed this to drm-misc-next.
Thanks for taking a look at this.
-Daniel
>
> 8<---------------------------------------------------------------------------------------------
>
> [ 308.324654] WARNING: possible circular locking dependency detected
> [ 308.324655] 5.8.0-rc2+ #16 Not tainted
> [ 308.324656] ------------------------------------------------------
> [ 308.324657] kswapd0/98 is trying to acquire lock:
> [ 308.324658] ffff92a16f758428 (&mapping->i_mmap_rwsem){++++}-{3:3}, at:
> rmap_walk_file+0x1c0/0x2f0
> [ 308.324663]
> but task is already holding lock:
> [ 308.324664] ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at:
> __fs_reclaim_acquire+0x5/0x30
> [ 308.324666]
> which lock already depends on the new lock.
>
> [ 308.324667]
> the existing dependency chain (in reverse order) is:
> [ 308.324667]
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 308.324670] fs_reclaim_acquire+0x34/0x40
> [ 308.324672] dma_resv_lockdep+0x186/0x224
> [ 308.324675] do_one_initcall+0x5d/0x2c0
> [ 308.324676] kernel_init_freeable+0x222/0x288
> [ 308.324678] kernel_init+0xa/0x107
> [ 308.324679] ret_from_fork+0x1f/0x30
> [ 308.324680]
> -> #0 (&mapping->i_mmap_rwsem){++++}-{3:3}:
> [ 308.324682] __lock_acquire+0x119f/0x1fc0
> [ 308.324683] lock_acquire+0xa4/0x3b0
> [ 308.324685] down_read+0x2d/0x110
> [ 308.324686] rmap_walk_file+0x1c0/0x2f0
> [ 308.324687] page_referenced+0x133/0x150
> [ 308.324689] shrink_active_list+0x142/0x610
> [ 308.324690] balance_pgdat+0x229/0x620
> [ 308.324691] kswapd+0x200/0x470
> [ 308.324693] kthread+0x11f/0x140
> [ 308.324694] ret_from_fork+0x1f/0x30
> [ 308.324694]
> other info that might help us debug this:
>
> [ 308.324695] Possible unsafe locking scenario:
>
> [ 308.324695] CPU0 CPU1
> [ 308.324696] ---- ----
> [ 308.324696] lock(fs_reclaim);
> [ 308.324697] lock(&mapping->i_mmap_rwsem);
> [ 308.324698] lock(fs_reclaim);
> [ 308.324699] lock(&mapping->i_mmap_rwsem);
> [ 308.324699]
> *** DEADLOCK ***
>
> [ 308.324700] 1 lock held by kswapd0/98:
> [ 308.324701] #0: ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at:
> __fs_reclaim_acquire+0x5/0x30
> [ 308.324702]
> stack backtrace:
> [ 308.324704] CPU: 1 PID: 98 Comm: kswapd0 Not tainted 5.8.0-rc2+ #16
> [ 308.324705] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> Desktop Reference Platform, BIOS 6.00 07/29/2019
> [ 308.324706] Call Trace:
> [ 308.324710] dump_stack+0x92/0xc8
> [ 308.324711] check_noncircular+0x12d/0x150
> [ 308.324713] __lock_acquire+0x119f/0x1fc0
> [ 308.324715] lock_acquire+0xa4/0x3b0
> [ 308.324716] ? rmap_walk_file+0x1c0/0x2f0
> [ 308.324717] ? __lock_acquire+0x394/0x1fc0
> [ 308.324719] down_read+0x2d/0x110
> [ 308.324720] ? rmap_walk_file+0x1c0/0x2f0
> [ 308.324721] rmap_walk_file+0x1c0/0x2f0
> [ 308.324722] page_referenced+0x133/0x150
> [ 308.324724] ? __page_set_anon_rmap+0x70/0x70
> [ 308.324725] ? page_get_anon_vma+0x190/0x190
> [ 308.324726] shrink_active_list+0x142/0x610
> [ 308.324728] balance_pgdat+0x229/0x620
> [ 308.324730] kswapd+0x200/0x470
> [ 308.324731] ? lockdep_hardirqs_on_prepare+0xf5/0x170
> [ 308.324733] ? finish_wait+0x80/0x80
> [ 308.324734] ? balance_pgdat+0x620/0x620
> [ 308.324736] kthread+0x11f/0x140
> [ 308.324737] ? kthread_create_worker_on_cpu+0x40/0x40
> [ 308.324739] ret_from_fork+0x1f/0x30
>
>
>
> > > /Thomas
> > >
> > >
> > >
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the dri-devel
mailing list