[PATCH v1 1/9] mm: add zone device coherent type memory support

Alistair Popple apopple at nvidia.com
Mon Nov 22 02:40:40 UTC 2021


> >> diff --git a/mm/migrate.c b/mm/migrate.c
> >> index 1852d787e6ab..f74422a42192 100644
> >> --- a/mm/migrate.c
> >> +++ b/mm/migrate.c
> >> @@ -362,7 +362,7 @@ static int expected_page_refs(struct address_space *mapping, struct page *page)
> >>  	 * Device private pages have an extra refcount as they are
> >>  	 * ZONE_DEVICE pages.
> >>  	 */
> >> -	expected_count += is_device_private_page(page);
> >> +	expected_count += is_device_page(page);
> >>  	if (mapping)
> >>  		expected_count += thp_nr_pages(page) + page_has_private(page);
> >>  
> >> @@ -2503,7 +2503,7 @@ static bool migrate_vma_check_page(struct page *page)
> >>  		 * FIXME proper solution is to rework migration_entry_wait() so
> >>  		 * it does not need to take a reference on page.
> >>  		 */
> > Note that I have posted a patch to fix this - see
> > https://lore.kernel.org/all/20211118020754.954425-1-apopple@nvidia.com/
> > This looks OK for now, assuming coherent pages can never be pinned.
> >
> > However, that raises a question: what happens when something calls
> > get_user_pages() on a pfn pointing to a coherent device page? I can't see
> > anything in this series that prevents pinning of coherent device pages, so we
> > can't just assume they aren't pinned.
> 
> I agree. I think this series needs to depend on your patch going in first.
> 
> I'm also wondering if we need to do something to prevent get_user_pages
> from pinning device pages. And by "pin", I think migrate_vma_check_page
> is not talking about FOLL_PIN specifically, but about any get_user_pages
> call. As far as I
> can tell, there should be nothing fundamentally wrong with pinning
> device pages for a short time. But I think we'll want to avoid
> FOLL_LONGTERM because that would affect our memory manager's ability to
> evict device memory.

Right, so long as my fix goes in I don't think there is anything wrong with
short-term pinning of device coherent pages. I agree that we should avoid
FOLL_LONGTERM pins for device memory though. I think the way to do that is
to update is_pinnable_page() so that we treat device pages the same as other
unpinnable pages, i.e. long-term pins will migrate the page to system memory
first.
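
Something like this, maybe (untested sketch; is_device_page() is the helper
this series adds, and the ZONE_MOVABLE/CMA part is just the existing check):

    static inline bool is_pinnable_page(struct page *page)
    {
            /*
             * Device coherent pages can always be migrated back to
             * system memory, so never allow a long-term pin on one.
             */
            if (is_device_page(page))
                    return false;

            /* Existing rule: ZONE_MOVABLE and CMA pages are unpinnable */
            return !(page_zonenum(page) == ZONE_MOVABLE ||
                     is_migrate_cma_page(page));
    }

check_and_migrate_movable_pages() in mm/gup.c should then migrate coherent
pages to system memory before a FOLL_LONGTERM pin is allowed to succeed.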

> >
> > In the case of device-private pages this is enforced by the fact that they
> > never have present ptes, so any attempt to GUP them results in a fault. But
> > if I'm understanding this series correctly that won't be the case for
> > coherent device pages, right?
> 
> Right.
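
For context, the reason GUP always faults on device-private pages is that
they are mapped with non-present swap entries, so every CPU access bounces
through do_swap_page() and the pgmap's migrate_to_ram() callback. Heavily
simplified:

    /* mm/memory.c:do_swap_page() */
    entry = pte_to_swp_entry(vmf->orig_pte);
    if (is_device_private_entry(entry)) {
            vmf->page = pfn_swap_entry_to_page(entry);
            ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
    }

Coherent device pages will instead have ordinary present ptes, so GUP can
take references (and pins) on them without ever hitting that path.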
> 
> Regards,
>   Felix
> 
> 
> >
> >> -		return is_device_private_page(page);
> >> +		return is_device_page(page);
> >>  	}
> >>  
> >>  	/* For file back page */
> >> @@ -2791,7 +2791,7 @@ EXPORT_SYMBOL(migrate_vma_setup);
> >>   *     handle_pte_fault()
> >>   *       do_anonymous_page()
> >>   * to map in an anonymous zero page but the struct page will be a ZONE_DEVICE
> >> - * private page.
> >> + * private or coherent page.
> >>   */
> >>  static void migrate_vma_insert_page(struct migrate_vma *migrate,
> >>  				    unsigned long addr,
> >> @@ -2867,10 +2867,15 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
> >>  				swp_entry = make_readable_device_private_entry(
> >>  							page_to_pfn(page));
> >>  			entry = swp_entry_to_pte(swp_entry);
> >> +		} else if (is_device_page(page)) {
> > How about adding an explicit `is_device_coherent_page()` helper? It would make
> > the test more explicit that this is expected to handle just coherent pages and
> > I bet there will be future changes that need to differentiate between private
> > and coherent pages anyway.
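
To expand on my own suggestion here, I was thinking of something along these
lines (sketch only, assuming the MEMORY_DEVICE_COHERENT pgmap type this patch
introduces, mirroring how is_device_private_page() is defined):

    static inline bool is_device_coherent_page(const struct page *page)
    {
            return is_zone_device_page(page) &&
                    page->pgmap->type == MEMORY_DEVICE_COHERENT;
    }

The else-if above could then test is_device_coherent_page(page) directly
instead of the broader is_device_page().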
> >
> >> +			entry = pte_mkold(mk_pte(page,
> >> +						 READ_ONCE(vma->vm_page_prot)));
> >> +			if (vma->vm_flags & VM_WRITE)
> >> +				entry = pte_mkwrite(pte_mkdirty(entry));
> >>  		} else {
> >>  			/*
> >> -			 * For now we only support migrating to un-addressable
> >> -			 * device memory.
> >> +			 * We support migrating to private and coherent types
> >> +			 * for device zone memory.
> >>  			 */
> >>  			pr_warn_once("Unsupported ZONE_DEVICE page type.\n");
> >>  			goto abort;
> >> @@ -2976,10 +2981,10 @@ void migrate_vma_pages(struct migrate_vma *migrate)
> >>  		mapping = page_mapping(page);
> >>  
> >>  		if (is_zone_device_page(newpage)) {
> >> -			if (is_device_private_page(newpage)) {
> >> +			if (is_device_page(newpage)) {
> >>  				/*
> >> -				 * For now only support private anonymous when
> >> -				 * migrating to un-addressable device memory.
> >> +				 * For now only support private and coherent
> >> +				 * anonymous when migrating to device memory.
> >>  				 */
> >>  				if (mapping) {
> >>  					migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
> >>
> >
> >
> 
