[PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking
jgg at mellanox.com
Mon Oct 21 19:06:00 UTC 2019
On Mon, Oct 21, 2019 at 02:40:41PM -0400, Jerome Glisse wrote:
> On Tue, Oct 15, 2019 at 03:12:27PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg at mellanox.com>
> > 8 of the mmu_notifier using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
> > scif_dma, vhost, gntdev, hmm) are using a common pattern where
> > they only use invalidate_range_start/end and immediately check the
> > invalidating range against some driver data structure to tell if the
> > driver is interested. Half of them use an interval_tree, the others are
> > simple linear search lists.
> > Of the ones I checked they largely seem to have various kinds of races,
> > bugs and poor implementation. This is a result of the complexity in how
> > the notifier interacts with get_user_pages(). It is extremely difficult to
> > use it correctly.
> > Consolidate all of this code together into the core mmu_notifier and
> > provide a locking scheme similar to hmm_mirror that allows the user to
> > safely use get_user_pages() and reliably know if the page list still
> > matches the mm.
> > This new arrangement plays nicely with the !blockable mode for
> > OOM. Scanning the interval tree is done such that the intersection test
> > will always succeed, and since there is no invalidate_range_end exposed to
> > drivers the scheme safely allows multiple drivers to be subscribed.
> > Four places are converted as an example of how the new API is used.
> > Four are left for future patches:
> > - i915_gem has complex locking around destruction of a registration,
> > needs more study
> > - hfi1 (2nd user) needs access to the rbtree
> > - scif_dma has a complicated logic flow
> > - vhost's mmu notifiers are already being rewritten
> > This is still being tested, but I figured to send it to start getting help
> > from the xen, amd and hfi drivers which I cannot test here.
> It might be a good opportunity to also switch those users to
> hmm_range_fault() instead of GUP as GUP is pointless for those
> users. In fact the GUP is an impediment to normal mm operations.
I think vhost can use hmm_range_fault
hfi1 does actually need to have the page pin, it doesn't fence DMA
i915_gem feels a lot like amdgpu, so probably it would benefit
No idea about scif_dma
> I will test on nouveau.
Thanks, hopefully it still works, I think Ralph was able to do some
basic checks. But it is a pretty complicated series, I probably made
FWIW, I know that nouveau gets a lockdep splat now from Daniel
Vetter's recent changes, it tries to do GFP_KERNEL allocations under
a lock also held by the invalidate_range_start path.
Thanks for looking at it!
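
For anyone following along, here is a rough userspace sketch of the
pattern the cover letter describes: check an invalidating range against
a tracked range, and let the reader notice afterwards that its page
list may be stale. This is only a model under stated assumptions. Every
name in it (tracked_range, ranges_overlap, invalidate, read_begin,
read_retry) is made up for illustration and is not the API from this
series, and it is single threaded, so it leaves out all of the locking
that makes the real thing hard.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct tracked_range {
	unsigned long start;	/* first address covered */
	unsigned long last;	/* last address covered, inclusive */
	unsigned long seq;	/* bumped on every overlapping invalidation */
};

/* Closed-interval intersection test, the same check a linear list
 * walk or an interval tree lookup has to answer. */
static bool ranges_overlap(const struct tracked_range *r,
			   unsigned long start, unsigned long last)
{
	return r->start <= last && start <= r->last;
}

/* Invalidation side: ignore ranges the "driver" does not care about,
 * otherwise advance the sequence so readers can detect the collision.
 * Returns true if the tracked range was affected. */
static bool invalidate(struct tracked_range *r,
		       unsigned long start, unsigned long last)
{
	if (!ranges_overlap(r, start, last))
		return false;
	r->seq++;
	return true;
}

/* Reader side, step 1: sample the sequence before collecting pages. */
static unsigned long read_begin(const struct tracked_range *r)
{
	return r->seq;
}

/* Reader side, step 2: after collecting pages, report whether an
 * overlapping invalidation happened in between. If so, the page list
 * must be thrown away and the whole sequence repeated. */
static bool read_retry(const struct tracked_range *r, unsigned long seq)
{
	return r->seq != seq;
}
```

In this model a reader calls read_begin(), gathers its pages, then
calls read_retry() and loops if it returns true. An invalidation of
0x4000-0x4fff against a range tracking 0x1000-0x1fff leaves the
sequence alone, while one of 0x1800-0x2800 bumps it and forces the
retry. A real implementation would have to do the final check under
the same lock the invalidation side takes, which the sketch does not
attempt.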