[PATCH v2 6/6] drm/xe/svm: Migrate folios when possible

Matthew Brost matthew.brost at intel.com
Mon Jul 28 20:09:35 UTC 2025


On Mon, Jul 28, 2025 at 04:10:10PM +0200, Francois Dugast wrote:
> On Fri, Jul 25, 2025 at 04:14:17PM -0700, Matthew Brost wrote:
> > On Fri, Jul 25, 2025 at 05:39:30PM +0200, Francois Dugast wrote:
> > > The DMA mapping can now correspond to a folio (order > 0), so move the
> > > iterator by the number of pages in the folio in order to migrate all
> > > pages at once. This will improve efficiency compared to migrating pages
> > > one by one.
> > > 
> > > For this to work, the BOs must be contiguous in memory.
> > > 
> > 
> > I'd mention that since SVM BOs are a max of 2M, forcing them to be
> > contiguous is very unlikely to have any negative effects (e.g., extra
> > eviction), it greatly simplifies the code, and 2M contiguous memory
> > will enable 2M device pages, which is a huge perf win.
> 
> Sure, will add.
> 
> > 
> > We might need a small adjustment to populate_devmem_pfn to communicate
> > to drm_pagemap that contiguous memory was found, as our driver can
> > easily allocate contiguous memory, but we shouldn't assume other
> > drivers can do so. This can be done in a follow-up which adds the 2M
> > device page support.
> 
> Do I understand correctly: change populate_devmem_pfn() to provide the
> caller in drm_pagemap with the information whether the allocated memory
> is contiguous or not, so that it can let the driver know when calling
> copy_to_devmem()? This way a fallback can be used in the driver when
> the memory is not contiguous (not implemented below because this seems
> unnecessary in Xe).
> 

More so to know whether it can create folios out of the pages returned
from populate_devmem_pfn() ahead of the copy_to_devmem() call. Yes, the
contiguity information can be inferred by examining all of them, but
this would provide a quick short circuit to know it can immediately
create folios.

The pages in copy_to_devmem() should provide the order, so there is no
need for an extra argument there.

Matt

> > 
> > > Signed-off-by: Francois Dugast <francois.dugast at intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_bo.c  | 2 ++
> > >  drivers/gpu/drm/xe/xe_svm.c | 5 +++++
> > >  2 files changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index ffca1cea5585..59994a978a8c 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -200,6 +200,8 @@ static bool force_contiguous(u32 bo_flags)
> > >  	else if (bo_flags & XE_BO_FLAG_PINNED &&
> > >  		 !(bo_flags & XE_BO_FLAG_PINNED_LATE_RESTORE))
> > >  		return true; /* needs vmap */
> > > +	else if (bo_flags & XE_BO_FLAG_CPU_ADDR_MIRROR)
> > > +		return true;
> > >  
> > >  	/*
> > >  	 * For eviction / restore on suspend / resume objects pinned in VRAM
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > > index 1d097e76aabc..2759db5f7407 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > @@ -382,6 +382,11 @@ static int xe_svm_copy(struct page **pages,
> > >  				pos = i;
> > >  			}
> > >  
> > > +			if (pagemap_addr[i].order) {
> > > +				i += NR_PAGES(pagemap_addr[i].order);
> > 
> > I think this needs to be NR_PAGES(pagemap_addr[i].order) - 1, right?
> 
> Yes, correct, because i is already incremented by 1 at this point.
> 
> > 
> > > +				last = (i + 1) == npages;
> > 
> > Then set the chunk here too?
> > 
> > > +			}
> > > +
> > >  			match = vram_addr + PAGE_SIZE * (i - pos) == __vram_addr;
> > 
> > I think the match can actually be dropped, and just assumed to be
> > true now that we are allocating contiguous memory.
> 
> True, let me try this.
> 
> Francois
> 
> > 
> > Matt
> > 
> > >  		}
> > >  
> > > -- 
> > > 2.43.0
> > > 
