[RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

Alistair Popple apopple at nvidia.com
Mon Aug 21 09:02:28 UTC 2023


"Kasireddy, Vivek" <vivek.kasireddy at intel.com> writes:

> Hi Jason,
>
>> > >
>> > > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
>> > > > Although, I do not have THP enabled (or built-in), shmem does not evict
>> > > > the pages after hole punch as noted in the comment in
>> shmem_fallocate():
>> > >
>> > > This is the source of all your problems.
>> > >
>> > > Things that are mm-centric are supposed to track the VMAs and changes
>> to
>> > > the PTEs. If you do something in userspace and it doesn't cause the
>> > > CPU page tables to change then it certainly shouldn't cause any mmu
>> > > notifiers or hmm_range_fault changes.
>> > I am not doing anything out of the blue in the userspace. I think the
>> behavior
>> > I am seeing with shmem (where an invalidation event
>> (MMU_NOTIFY_CLEAR)
>> > does occur because of a hole punch but the PTEs don't really get updated)
>> > can arguably be considered an optimization.
>> 
>> Your explanations don't make sense.
>> 
>> If MMU_NOTIFER_CLEAR was sent but the PTEs were left present then:
>> 
>> > > There should still be an invalidation notifier at some point when the
>> > > CPU tables do eventually change, whenever that is. Missing that
>> > > notification would be a bug.
>> > I clearly do not see any notification getting triggered (from both
>> shmem_fault()
>> > and hugetlb_fault()) when the PTEs do get updated as the hole is refilled
>> > due to writes. Are you saying that there needs to be an invalidation event
>> > (MMU_NOTIFY_CLEAR?) dispatched at this point?
>> 
>> You don't get to get shmem_fault in the first place.
> What I am observing is that even after MMU_NOTIFY_CLEAR (hole punch) is sent,
> hmm_range_fault() finds that the PTEs associated with the hole are still pte_present().
> I think it remains this way as long as there are reads on the hole. Once there are
> writes, it triggers shmem_fault() which results in PTEs getting updated but without
> any notification.

Oh wait, this is shmem. The read from hmm_range_fault() (assuming you
specified HMM_PFN_REQ_FAULT) will trigger shmem_fault() due to the
missing PTE. Subsequent writes will just upgrade PTE permissions
assuming the read didn't map them RW to begin with. If you want to
actually see the hole with hmm_range_fault() don't specify
HMM_PFN_REQ_FAULT (or _WRITE).

>> 
>> If they were marked non-prsent during the CLEAR then the shadow side
>> remains non-present until it gets its own fault.
>> 
>> If they were made non-present without an invalidation then that is a
>> bug.
>> 
>> > > hmm_range_fault() is the correct API to use if you are working with
>> > > notifiers. Do not hack something together using pin_user_pages.
>> 
>> > I noticed that hmm_range_fault() does not seem to be working as expected
>> > given that it gets stuck(hangs) while walking hugetlb pages.
>> 
>> You are the first to report that, it sounds like a serious bug. Please
>> try to fix it.
>> 
>> > Regardless, as I mentioned above, the lack of notification when PTEs
>> > do get updated due to writes is the crux of the issue
>> > here. Therefore, AFAIU, triggering an invalidation event or some
>> > other kind of notification would help in fixing this issue.
>> 
>> You seem to be facing some kind of bug in the mm, it sounds pretty
>> serious, and it almost certainly is a missing invalidation.
>> 
>> Basically, anything that changes a PTE must eventually trigger an
>> invalidation. It is illegal to change a PTE from one present value to
>> another present value without invalidation notification.
>> 
>> It is not surprising something would be missed here.
> As you suggest, it looks like the root-cause of this issue is the missing
> invalidation notification when the PTEs are changed from one present

I don't think there's a missing invalidation here. You say you're seeing
the MMU_NOTIFY_CLEAR when hole punching which is when the PTE is
cleared. When else do you expect a notification?

> value to another. I'd like to fix this issue eventually but I first need to
> focus on addressing udmabuf page migration (out of movable zone)
> and also look into the locking concerns Daniel mentioned about pairing
> static and dynamic dmabuf exporters and importers.
>
> Thanks,
> Vivek



More information about the dri-devel mailing list