[PATCH v2 02/15] mm/mmu_notifier: add an interval tree notifier

Jason Gunthorpe jgg at mellanox.com
Thu Nov 7 20:11:06 UTC 2019


On Wed, Nov 06, 2019 at 09:08:07PM -0500, Jerome Glisse wrote:

> > 
> > Extra credit: IMHO, this clearly deserves to all be in a new mmu_range_notifier.h
> > header file, but I know that's extra work. Maybe later as a follow-up patch,
> > if anyone has the time.
> 
> The range notifier should get the event too, it would be a waste, i think it is
> an oversight here. The release event is fine so NAK to you separate event. Event
> is really an helper for notifier i had a set of patch for nouveau to leverage
> this i need to resucite them. So no need to split thing, i would just forward
> the event ie add event to mmu_range_notifier_ops.invalidate() i failed to catch
> that in v1 sorry.

I think what you mean is already done?

struct mmu_range_notifier_ops {
	bool (*invalidate)(struct mmu_range_notifier *mrn,
			   const struct mmu_notifier_range *range,
			   unsigned long cur_seq);

> No it is always odd, you must call mmu_range_set_seq() only from the
> op->invalidate_range() callback at which point the seq is odd. As well
> when mrn is added and its seq first set it is set to an odd value
> always. Maybe the comment, should read:
> 
>  * mrn->invalidate_seq is always, yes always, set to an odd value. This ensures
> 
> To stress that it is not an error.

I went with this:

	/*
	 * mrn->invalidate_seq must always be set to an odd value via
	 * mmu_range_set_seq() using the provided cur_seq from
	 * mn_itree_inv_start_range(). This ensures that if seq does wrap we
	 * will always clear the below sleep in some reasonable time as
	 * mmn_mm->invalidate_seq is even in the idle state.
	 */

> > > +	spin_lock(&mmn_mm->lock);
> > > +	if (mmn_mm->active_invalidate_ranges) {
> > > +		if (mn_itree_is_invalidating(mmn_mm))
> > > +			hlist_add_head(&mrn->deferred_item,
> > > +				       &mmn_mm->deferred_list);
> > > +		else {
> > > +			mmn_mm->invalidate_seq |= 1;
> > > +			interval_tree_insert(&mrn->interval_tree,
> > > +					     &mmn_mm->itree);
> > > +		}
> > > +		mrn->invalidate_seq = mmn_mm->invalidate_seq;
> > > +	} else {
> > > +		WARN_ON(mn_itree_is_invalidating(mmn_mm));
> > > +		mrn->invalidate_seq = mmn_mm->invalidate_seq - 1;
> > 
> > Ohhh, checkmate. I lose. Why is *subtracting* the right thing to do
> > for seq numbers here?  I'm acutely unhappy trying to figure this out.
> > I suspect it's another unfortunate side effect of trying to use the
> > lower bit of the seq number (even/odd) for something else.
> 
> If there is no mmn_mm->active_invalidate_ranges then it means that
> mmn_mm->invalidate_seq is even and thus mmn_mm->invalidate_seq - 1
> is an odd number which means that mrn->invalidate_seq is initialized
> to odd value and if you follow the rule for calling mmu_range_set_seq()
> then it will _always_ be an odd number and this close the loop with
> the above comments :)

The key thing is that it is an odd value that will take a long time
before mmn_mm->invalidate seq reaches it

> > > +	might_lock(&mm->mmap_sem);
> > > +
> > > +	mmn_mm = smp_load_acquire(&mm->mmu_notifier_mm);
> > 
> > What does the above pair with? Should have a comment that specifies that.
> 
> It was discussed in v1 but maybe a comment of what was said back then would
> be helpful. Something like:
> 
> /*
>  * We need to insure that all writes to mm->mmu_notifier_mm are visible before
>  * any checks we do on mmn_mm below as otherwise CPU might re-order write done
>  * by another CPU core to mm->mmu_notifier_mm structure fields after the read
>  * belows.
>  */

This comment made it, just at the store side:

	/*
	 * Serialize the update against mmu_notifier_unregister. A
	 * side note: mmu_notifier_release can't run concurrently with
	 * us because we hold the mm_users pin (either implicitly as
	 * current->mm or explicitly with get_task_mm() or similar).
	 * We can't race against any other mmu notifier method either
	 * thanks to mm_take_all_locks().
	 *
	 * release semantics on the initialization of the mmu_notifier_mm's
         * contents are provided for unlocked readers.  acquire can only be
         * used while holding the mmgrab or mmget, and is safe because once
         * created the mmu_notififer_mm is not freed until the mm is
         * destroyed.  As above, users holding the mmap_sem or one of the
         * mm_take_all_locks() do not need to use acquire semantics.
	 */
	if (mmu_notifier_mm)
		smp_store_release(&mm->mmu_notifier_mm, mmu_notifier_mm);

Which I think is really overly belaboring the typical smp
store/release pattern, but people do seem unfamiliar with them...

Thanks,
Jason


More information about the dri-devel mailing list