[PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout

Jason Gunthorpe jgg at ziepe.ca
Sat Jun 8 01:47:20 UTC 2019


On Fri, Jun 07, 2019 at 03:13:00PM -0700, Ralph Campbell wrote:
> > Do you see a reason why the find_vma() ever needs to be before the
> > 'again' in my above example? range.vma does not need to be set for
> > range_register.
> 
> Yes, for the GPU case, there can be many faults in an event queue
> and the goal is to try to handle more than one page at a time.
> The vma is needed to limit the amount of coalescing and checking
> for pages that could be speculatively migrated or mapped.

I'd need to see an algorithm sketch to see what you are thinking..

But, I guess a driver would have figure out a list of what virtual
pages it wants to fault under the mmap sem (maybe use find_vam, etc),
then drop mmap_sem, and start processing those pages for mirroring
under the hmm side.

ie they are two seperate unrelated tasks.

I looked at the hmm.rst again, and that reference algorithm is already
showing that that upon retry the mmap_sem is released:

      take_lock(driver->update);
      if (!range.valid) {
          release_lock(driver->update);
          up_read(&mm->mmap_sem);
          goto again;

So a driver cannot assume that any VMA related work done under the
mmap before the hmm 'critical section' can carry into the hmm critical
section as the lock can be released upon retry and invalidate all that
data..

Forcing the hmm_range_start_and_lock() to acquire the mmap_sem is a
rough way to force the driver author to realize there are two locking
domains and lock protected information cannot cross between.

> OK, I understand.
> If you come up with a set of changes, I can try testing them.

Thanks, I make a sketch of the patch, I have to get back to it after
this series is done.

Jason


More information about the amd-gfx mailing list