[PATCH v1 5/5] drm/pagemap: Allocate folios when possible

Matthew Brost matthew.brost at intel.com
Sun Jul 20 20:53:29 UTC 2025


On Thu, Jul 17, 2025 at 10:49:48PM -0700, Matthew Brost wrote:
> On Thu, Jul 17, 2025 at 09:41:23PM -0700, Matthew Brost wrote:
> > On Thu, Jul 17, 2025 at 03:38:27PM +0200, Francois Dugast wrote:
> > > If the order is greater than zero, allocate a folio when populating the
> > > RAM PFNs instead of allocating individual pages one after the other. For
> > > example, if 2MB folios are used instead of 4KB pages, this reduces the
> > > number of calls to the allocation API by a factor of 512.
> > > 
> > > Signed-off-by: Francois Dugast <francois.dugast at intel.com>
> > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > ---
> > >  drivers/gpu/drm/drm_pagemap.c | 33 ++++++++++++++++++++++-----------
> > >  1 file changed, 22 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> > > index de15d96f6393..4f67c6173ee5 100644
> > > --- a/drivers/gpu/drm/drm_pagemap.c
> > > +++ b/drivers/gpu/drm/drm_pagemap.c
> > > @@ -438,6 +438,7 @@ EXPORT_SYMBOL_GPL(drm_pagemap_migrate_to_devmem);
> > >   * @src_mpfn: Source array of migrate PFNs
> > >   * @mpfn: Array of migrate PFNs to populate
> > >   * @addr: Start address for PFN allocation
> > > + * @order: Page order
> > >   *
> > >   * This function populates the RAM migrate page frame numbers (PFNs) for the
> > >   * specified VM area structure. It allocates and locks pages in the VM area for
> > > @@ -452,35 +453,45 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
> > >  						unsigned long *mpages,
> > >  						unsigned long *src_mpfn,
> > >  						unsigned long *mpfn,
> > > -						unsigned long addr)
> > > +						unsigned long addr,
> > > +						unsigned int order)
> > 
> > I don't think an order argument is needed. A better approach would be to
> > look at the order of the src_mpfn (device) page and allocate based on
> > that. This would maintain congruence between the initial GPU fault—which
> > creates device pages either as THP or not—and the migration path back,
> > where we'd insert a THP or not accordingly. In other words, it would
> > preserve consistency throughout the entire flow.
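
Something like this (untested) could derive the order from the source
device page rather than passing it in. folio_order() returns 0 for a
single page, so the non-THP case falls out naturally:

        unsigned int order = 0;

        if (src_page)
                order = folio_order(page_folio(src_page));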
> > 
> > Also, if you look at the migrate_vma_* functions for THP, it's never
> > allowed to upgrade from non-THP to THP—only a downgrade from THP to
> > non-THP is permitted.
> > 
> > >  {
> > >  	unsigned long i;
> > >  
> > > -	for (i = 0; i < npages; ++i, addr += PAGE_SIZE) {
> > > +	for (i = 0; i < npages;) {
> > >  		struct page *page, *src_page;
> > >  
> > >  		if (!(src_mpfn[i] & MIGRATE_PFN_MIGRATE))
> > > -			continue;
> > > +			goto next;
> > >  
> > >  		src_page = migrate_pfn_to_page(src_mpfn[i]);
> > >  		if (!src_page)
> > > -			continue;
> > > +			goto next;
> > >  
> > >  		if (fault_page) {
> > >  			if (src_page->zone_device_data !=
> > >  			    fault_page->zone_device_data)
> > > -				continue;
> > > +				goto next;
> > >  		}
> > >  
> > > -		if (vas)
> > > -			page = alloc_page_vma(GFP_HIGHUSER, vas, addr);
> > > -		else
> > > -			page = alloc_page(GFP_HIGHUSER);
> > > +		if (order) {
> > > +			page = folio_page(vma_alloc_folio(GFP_HIGHUSER | __GFP_ZERO,
> > > +							  order, vas, addr), 0);
> > 
> >                         if (vas)
> >                                 page = folio_page(vma_alloc_folio(GFP_HIGHUSER,
> >                                                                   order, vas, addr), 0);
> >                         else
> >                                 page = alloc_pages(GFP_HIGHUSER, order);

s/alloc_pages/folio_alloc actually.

Also I think calling folio_page before checking the return of
vma_alloc_folio is dangerous, as vma_alloc_folio can return NULL on
failure.
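
Something along these lines (untested) would address both:

        struct folio *folio;

        if (vas)
                folio = vma_alloc_folio(GFP_HIGHUSER, order, vas, addr);
        else
                folio = folio_alloc(GFP_HIGHUSER, order);
        if (!folio)
                goto free_pages;
        page = folio_page(folio, 0);

While here, addr += page_size(page) at the next: label also looks
unsafe on the skip paths (e.g. !MIGRATE_PFN_MIGRATE), since page is
read before it has been assigned.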

Matt 

> > 
> > We may also want to consider a downgrade path: for example, if THP
> > allocation fails here, we could fall back to allocating single pages.
> > That would complicate things across GPU SVM and Xe, so maybe we table it
> > for now. But eventually we'll need to handle this; as per Nvidia's
> > comments, THP allocation failure seems possible.
> > 
> > Maybe add a comment indicating that, something like:
> > 
> > /* TODO: Support fallback to single pages if THP allocation fails */
> > 
> > > +		} else {
> > > +			if (vas)
> > > +				page = alloc_page_vma(GFP_HIGHUSER, vas, addr);
> > > +			else
> > > +				page = alloc_page(GFP_HIGHUSER);
> > > +		}
> > >  
> > >  		if (!page)
> > >  			goto free_pages;
> > >  
> > >  		mpfn[i] = migrate_pfn(page_to_pfn(page));
> > > +
> > > +next:
> > > +		i += 0x1 << order;
> > > +		addr += page_size(page);
> > >  	}
> > >  
> > 
> > The loops below need to be updated to loop based on order too.
> > 
> 
> Also the mpages return based on order.
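
Roughly (untested), the later loops would then step by the order of
each populated entry and account for it in mpages, e.g.:

        for (i = 0; i < npages;) {
                struct page *page = migrate_pfn_to_page(mpfn[i]);
                unsigned int order = 0;

                if (page) {
                        order = folio_order(page_folio(page));
                        *mpages += 1UL << order;
                }
                i += 1UL << order;
        }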
> 
> Matt
> 
> > Matt
> > 
> > >  	for (i = 0; i < npages; ++i) {
> > > @@ -554,7 +565,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> > >  		goto err_free;
> > >  
> > >  	err = drm_pagemap_migrate_populate_ram_pfn(NULL, NULL, npages, &mpages,
> > > -						   src, dst, 0);
> > > +						   src, dst, 0, 0);
> > >  	if (err || !mpages)
> > >  		goto err_finalize;
> > >  
> > > @@ -690,7 +701,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> > >  
> > >  	err = drm_pagemap_migrate_populate_ram_pfn(vas, page, npages, &mpages,
> > >  						   migrate.src, migrate.dst,
> > > -						   start);
> > > +						   start, 0);
> > >  	if (err)
> > >  		goto err_finalize;
> > >  
> > > -- 
> > > 2.43.0
> > > 

