[RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Jason Gunthorpe
jgg at nvidia.com
Tue Aug 8 12:42:17 UTC 2023
On Tue, Aug 08, 2023 at 07:37:19AM +0000, Kasireddy, Vivek wrote:
> Hi Jason,
>
> >
> > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
> > > Although, I do not have THP enabled (or built-in), shmem does not evict
> > > the pages after hole punch as noted in the comment in shmem_fallocate():
> >
> > This is the source of all your problems.
> >
> > Things that are mm-centric are supposed to track the VMAs and changes to
> > the PTEs. If you do something in userspace and it doesn't cause the
> > CPU page tables to change then it certainly shouldn't cause any mmu
> > notifiers or hmm_range_fault changes.
> I am not doing anything out of the blue in the userspace. I think the behavior
> I am seeing with shmem (where an invalidation event (MMU_NOTIFY_CLEAR)
> does occur because of a hole punch but the PTEs don't really get updated)
> can arguably be considered an optimization.
Your explanations don't make sense.
If MMU_NOTIFER_CLEAR was sent but the PTEs were left present then:
> > There should still be an invalidation notifier at some point when the
> > CPU tables do eventually change, whenever that is. Missing that
> > notification would be a bug.
> I clearly do not see any notification getting triggered (from both shmem_fault()
> and hugetlb_fault()) when the PTEs do get updated as the hole is refilled
> due to writes. Are you saying that there needs to be an invalidation event
> (MMU_NOTIFY_CLEAR?) dispatched at this point?
You don't get to get shmem_fault in the first place.
If they were marked non-prsent during the CLEAR then the shadow side
remains non-present until it gets its own fault.
If they were made non-present without an invalidation then that is a
bug.
> > hmm_range_fault() is the correct API to use if you are working with
> > notifiers. Do not hack something together using pin_user_pages.
> I noticed that hmm_range_fault() does not seem to be working as expected
> given that it gets stuck(hangs) while walking hugetlb pages.
You are the first to report that, it sounds like a serious bug. Please
try to fix it.
> Regardless, as I mentioned above, the lack of notification when PTEs
> do get updated due to writes is the crux of the issue
> here. Therefore, AFAIU, triggering an invalidation event or some
> other kind of notification would help in fixing this issue.
You seem to be facing some kind of bug in the mm, it sounds pretty
serious, and it almost certainly is a missing invalidation.
Basically, anything that changes a PTE must eventually trigger an
invalidation. It is illegal to change a PTE from one present value to
another present value without invalidation notification.
It is not surprising something would be missed here.
Jason
More information about the dri-devel
mailing list