[RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Jason Gunthorpe
jgg at nvidia.com
Thu Jul 27 11:58:03 UTC 2023
On Thu, Jul 27, 2023 at 07:34:30AM +0000, Kasireddy, Vivek wrote:
> Hi Jason,
>
> >
> > On Tue, Jul 25, 2023 at 10:44:09PM +0000, Kasireddy, Vivek wrote:
> > > > If you still need the memory mapped then you re-call hmm_range_fault
> > > > and re-obtain it. hmm_range_fault will resolve all the races and you
> > > > get new pages.
> >
> > > IIUC, for my udmabuf use-case, it looks like calling hmm_range_fault
> > > immediately after an invalidate (range notification) would preemptively
> > > fault in new pages before a write. The problem with that is if a read
> > > occurs on those new pages, then the data is incorrect as a write may
> > > not have happened yet.
> >
> > It cannot be; if you use hmm_range_fault correctly you cannot get
> > corruption no matter what is done to the mmap'd memfd. If there is,
> > it is a hmm_range_fault bug, plain and simple.
> >
> > > Ideally, what I am looking for is to get new pages at the time of or after
> > > a write; until then, it is ok to use the old pages given my use-case.
> >
> > It is wrong; if you are synchronizing the vma then you must use the
> > latest copy. If your use case can tolerate it, then keep a 'not
> > present' indication for the missing pages until you actually need
> > them, but dmabuf doesn't really provide an API for that.
> >
> > > I think the difference comes down to whether we (the udmabuf driver) want
> > > to grab the new pages after getting notified about a PTE update because
> > > of a fault.
> >
> > Why? You still haven't explained why you want this.
> Ok, let me explain using one of the udmabuf selftests (added in patch #3)
> to describe the problem I am trying to solve (sorry, I'll have to use the
> terms memfd, hole, etc.):
> size = MEMFD_SIZE * page_size;
> memfd = create_memfd_with_seals(size, false);
> addr1 = mmap_fd(memfd, size);
> write_to_memfd(addr1, size, 'a');
> buf = create_udmabuf_list(devfd, memfd, size);
> addr2 = mmap_fd(buf, NUM_PAGES * NUM_ENTRIES * getpagesize());
> punch_hole(memfd, MEMFD_SIZE / 2);
> -> At this point, if I were to read addr1, it'd still have "a" in the
> relevant areas because a new write hasn't happened yet. And, since this
> results in an invalidation (notification) of the associated VMA range, I
> could register a callback in the udmabuf driver and get notified, but I am
> not sure how or why that would be useful.

When you get an invalidation you trigger dmabuf move, which revokes
the importer's use of the dmabuf because the underlying memory has
changed. This is exactly the same as a GPU driver migrating memory
to/from CPU memory.
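
To make that concrete, the invalidate side could look roughly like the
sketch below. This is only an illustration, not the posted patch: the
struct udmabuf fields (notifier, buf) and the udmabuf_invalidate() name
are hypothetical, while mmu_interval_set_seq() and dma_buf_move_notify()
are the existing kernel APIs (move_notify must be called with the
reservation lock held):

#include <linux/mmu_notifier.h>
#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

/*
 * Hypothetical driver state: ubuf->notifier is an mmu interval
 * notifier registered over the memfd's VMA range, ubuf->buf is the
 * exported dma_buf.
 */
static bool udmabuf_invalidate(struct mmu_interval_notifier *mni,
			       const struct mmu_notifier_range *range,
			       unsigned long cur_seq)
{
	struct udmabuf *ubuf = container_of(mni, struct udmabuf, notifier);

	if (!mmu_notifier_range_blockable(range))
		return false;

	/* Make any concurrent hmm_range_fault() user retry */
	mmu_interval_set_seq(mni, cur_seq);

	/* Revoke the importers' use of the buffer */
	dma_resv_lock(ubuf->buf->resv, NULL);
	dma_buf_move_notify(ubuf->buf);
	dma_resv_unlock(ubuf->buf->resv);

	return true;
}

static const struct mmu_interval_notifier_ops udmabuf_notifier_ops = {
	.invalidate = udmabuf_invalidate,
};
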
>
> write_to_memfd(addr1, size, 'b');
> -> Here, the hole gets refilled as a result of the above writes, which
> trigger faults, and the PTEs are updated to point to new pages. When this
> happens, the udmabuf driver needs to be made aware of the new pages that
> were faulted in because of the new writes.
You only need this because you are not processing the invalidate.
> Since we don't have a way to get notified when the hole is written to, the
> solution I came up with is to either add a new notifier or add calls to
> change_pte() when the PTEs do get updated. However, considering your
> suggestion to use hmm_range_fault(), it is not clear to me how it would
> help while the hole is being written to, as the writes occur outside of
> the udmabuf driver.
You have the design backwards.

When a dmabuf importer asks for the dmabuf to be present, you call
hmm_range_fault() and get back whatever memory is appropriate. The
importer can then use it.

If the underlying memory changes, you get the invalidation and you
trigger move. The importer stops using the memory and the underlying
pages change.

Later, the importer decides it needs the memory again, so it again asks
for the dmabuf to be present, which does hmm_range_fault() and gets
whatever is appropriate at the time.
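
In code, the "ask for the dmabuf to be present" step is the standard
hmm_range_fault() retry loop from Documentation/mm/hmm.rst. Only the
udmabuf_* names below are hypothetical, and it assumes the driver holds
a reference on the mm:

#include <linux/hmm.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>

static int udmabuf_populate(struct udmabuf *ubuf, unsigned long start,
			    unsigned long end, unsigned long *pfns)
{
	struct hmm_range range = {
		.notifier	= &ubuf->notifier,
		.start		= start,
		.end		= end,
		.hmm_pfns	= pfns,
		.default_flags	= HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
	};
	struct mm_struct *mm = ubuf->notifier.mm;
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(&ubuf->notifier);

	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	mutex_lock(&ubuf->lock);
	if (mmu_interval_read_retry(&ubuf->notifier, range.notifier_seq)) {
		/* Raced with an invalidation; pfns[] may be stale */
		mutex_unlock(&ubuf->lock);
		goto again;
	}
	/* pfns[] is valid until the next invalidate fires */
	mutex_unlock(&ubuf->lock);
	return 0;
}

If punch_hole() and then write_to_memfd() happen in between, the
invalidation bumps the notifier sequence, mmu_interval_read_retry()
fails, and the loop naturally picks up the newly faulted pages, which
is exactly the behavior you were trying to get from a new notifier.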
Jason