[RFC PATCH] mm/hmm, mm/migrate_device: Allow p2p access and p2p migration

Thomas Hellström thomas.hellstrom at linux.intel.com
Tue Oct 15 12:41:24 UTC 2024


Hi, Jason.

Thanks for the feedback.

On Tue, 2024-10-15 at 09:17 -0300, Jason Gunthorpe wrote:
> On Tue, Oct 15, 2024 at 01:13:22PM +0200, Thomas Hellström wrote:
> > Introduce a way for hmm_range_fault() and migrate_vma_setup() to
> > identify foreign devices with fast interconnect and thereby allow
> > both direct access over the interconnect and p2p migration.
> > 
> > The need for a callback arises because without it, the p2p ability
> > would need to be static and determined at dev_pagemap creation time.
> > With a callback it can be determined dynamically, and in the migrate
> > case the callback could separate out local device pages.
> 
> 
> > +static bool hmm_allow_devmem(struct hmm_range *range, struct page *page)
> > +{
> > +	if (likely(page->pgmap->owner == range->dev_private_owner))
> > +		return true;
> > +	if (likely(!range->p2p))
> > +		return false;
> > +	return range->p2p->ops->p2p_allow(range->p2p, page);
> > +}
> > +
> >  static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> >  			      unsigned long end, pmd_t *pmdp, pte_t *ptep,
> >  			      unsigned long *hmm_pfn)
> > @@ -248,8 +258,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> >  		 * just report the PFN.
> >  		 */
> >  		if (is_device_private_entry(entry) &&
> > -		    pfn_swap_entry_to_page(entry)->pgmap->owner ==
> > -		    range->dev_private_owner) {
> > +		    hmm_allow_devmem(range, pfn_swap_entry_to_page(entry))) {
> >  			cpu_flags = HMM_PFN_VALID;
> >  			if (is_writable_device_private_entry(entry))
> >  				cpu_flags |= HMM_PFN_WRITE;
> 
> This is really misnamed and took me a while to get it.
> 
> It has nothing to do with kernel P2P, you are just allowing more
> selective filtering of dev_private_owner. You should focus on that in
> the naming, not p2p. ie allow_dev_private()
> 
> P2P is stuff that is dealing with MEMORY_DEVICE_PCI_P2PDMA.

Yes, although the intention was for "P2P" to also cover other fast
interconnects, not just "PCIe P2P". I'll definitely take a look at the
naming, though.

> 
> This is just allowing more instances of the same driver to
> co-ordinate their device private memory handle, for whatever purpose.

Exactly, or theoretically even cross-driver.

> 
> Otherwise I don't see a particular problem, though we have talked
> about widening the matching for device_private more broadly using
> some kind of grouping tag or something like that instead of a
> callback. You may consider that as an alternative

Yes, we looked at that, but (if I understand you correctly) that would
be the case mentioned in the commit message, where the group would be
set up statically at dev_pagemap creation time?
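
To make the distinction concrete, here is a minimal userspace sketch of
the two matching styles. It is not kernel code: mock_pagemap,
mock_filter, group_tag, link_up and the helper names are all
hypothetical stand-ins, and only the owner comparison mirrors what the
patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's real structures. */
struct mock_pagemap {
	void *owner;	/* plays the role of dev_pagemap.owner */
	int group_tag;	/* hypothetical static tag, fixed at creation time */
};

struct mock_filter;

struct mock_filter_ops {
	/* Hypothetical callback, evaluated on every lookup. */
	bool (*allow)(struct mock_filter *filter,
		      const struct mock_pagemap *pgmap);
};

struct mock_filter {
	const struct mock_filter_ops *ops;
	bool link_up;	/* runtime state the callback may consult */
};

/* Static variant: the answer is baked in when the pagemap is created. */
static bool allow_by_tag(const struct mock_pagemap *pgmap, void *owner,
			 int tag)
{
	return pgmap->owner == owner || pgmap->group_tag == tag;
}

/* Dynamic variant: owner match first, then ask the callback if present. */
static bool allow_by_callback(const struct mock_pagemap *pgmap, void *owner,
			      struct mock_filter *filter)
{
	if (pgmap->owner == owner)
		return true;
	if (!filter)
		return false;
	return filter->ops->allow(filter, pgmap);
}

/* Example policy: allow foreign pages only while the link is usable. */
static bool allow_if_link_up(struct mock_filter *filter,
			     const struct mock_pagemap *pgmap)
{
	(void)pgmap;
	return filter->link_up;
}

static const struct mock_filter_ops link_ops = { .allow = allow_if_link_up };
```

With the tag, a foreign pagemap either matches or it does not for its
whole lifetime; with the callback, the same pagemap can be accepted or
rejected depending on state that changes after creation.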

> 
> I would also probably try to have less indirection, you can embed
> the hmm_range struct inside a caller private data struct and use that
> instead of inventing a whole new struct and pointer.

Our first attempt was based on that, but it wouldn't be reusable in the
migrate_device.c code. Hence the extra indirection.
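
As a rough userspace illustration of that trade-off (again with
hypothetical names: mock_range, driver_walk, mock_filter, mock_range_b
and mock_migrate are stand-ins, not the real hmm_range or migrate_vma
layouts), embedding ties the lookup to one container type, while a
separate filter object can be pointed to from both paths:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same idea as the kernel macro, written with plain offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the range structure handed to the page walk. */
struct mock_range {
	void *dev_private_owner;
};

/* Option A: the caller embeds the range in its own private struct. */
struct driver_walk {
	struct mock_range range;
	bool peer_ok;	/* caller-private state */
};

static bool embedded_allow(struct mock_range *range)
{
	/* Only valid if the range really lives inside a driver_walk. */
	struct driver_walk *w = container_of(range, struct driver_walk, range);

	return w->peer_ok;
}

/* Option B: a separate filter object referenced by pointer. */
struct mock_filter {
	bool peer_ok;
};

struct mock_range_b {
	void *dev_private_owner;
	struct mock_filter *filter;	/* extra indirection */
};

struct mock_migrate {
	void *pgmap_owner;
	struct mock_filter *filter;	/* same object, different path */
};

static bool filter_allow(const struct mock_filter *filter)
{
	return filter && filter->peer_ok;
}
```

Option A needs no extra pointer, but the helper has to know which
struct the range is embedded in; option B costs a pointer per structure
and lets one filter serve both the fault and the migrate side.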

Thanks,
Thomas


> 
> Jason


