[PATCH v5 0/3] drm/gpusvm, drm/pagemap, drm/xe: Restructure migration in preparation for multi-device

Thomas Hellström thomas.hellstrom at linux.intel.com
Thu Jun 19 10:07:10 UTC 2025


Hi, Alistair, Thanks for having a look!

On Thu, 2025-06-19 at 14:52 +1000, Alistair Popple wrote:
> On Wed, Jun 18, 2025 at 10:16:14PM +0200, Thomas Hellström wrote:
> > This patchset modifies the migration part of drm_gpusvm to
> > drm_pagemap and
> > adds a populate_mm() op to drm_pagemap.
> > 
> > The idea is that the device that receives a pagefault determines if
> > it wants to
> > migrate content and to where. It then calls the populate_mm()
> > method of relevant
> > drm_pagemap.
> > 
> > This functionality was mostly already in place, but hard-coded for
> > xe only without
> > going through a pagemap op. Since we might be dealing with separate
> > devices moving
> > forward, it also now becomes the responsibilit of the populate_mm()
> > op to
> > grab any necessary local device runtime pm references and keep them
> > held while
> > its pages are present in an mm (struct mm_struct).
> > 
> > On thing to decide here is whether the populate_mm() callback
> > should sit on a
> > struct drm_pagemap for now while we sort multi-device usability out
> > or whether
> > we should add it (or something equivalent) to struct dev_pagemap.
> 
> I'm still looking at this series (sorry it took until v5 for me to
> notice
> it!) but my immediate reaction here is why do/would you need to add
> anything
> to struct dev_pagemap? The common approach here has been to embed
> struct
> dev_pagemap in some driver defined struct and use container_of to go
> from the
> page to the driver (or in this case DRM) specific pagemap.
> 
> See for example dmirror_page_to_chunk() in the HMM self test or
> nouveau_page_to_chunk(). Is there some reason something like that
> would work
> here?

Future patches will, as they are currently written, do something like
this for embedding:

struct xe_pagemap {
	struct dev_pagemap pagemap;
	struct drm_pagemap dpagemap;
};

So that the drm_pagemap can be obtained that way. The reason for that
is to avoid diamond inheritance if the driver wanted to embed a
pcie_p2p pagemap instead of a struct dev_pagemap. But if that becomes
unlikely we could of course embed the dev_pagemap directly into the
drm_pagemap.


> 
> Actually I notice the Xe driver currently does use this to point to a
> struct
> xe_vram_region which contains drm_pagemap pointer. If I understand
> correctly
> we're trying to move a lot of the SVM functionality into a generic
> DRM layer,
> so would it make more sense to have dev_pgmap embeded in drm_pgmap
> and have that
> contain the pointer to any required driver-specific data?
> 
> Also FWIW I don't think zone_device_data is strictly required. It's
> convenient,
> but I suspect it only exists because it could be easily provided
> within the
> footprint of the existing struct page due to not using all the fields
> for
> ZONE_DEVICE pages. I can imagine we might eventually remove it, once
> we no
> longer need struct pages and move to folios/memdescs.

It looks like, correct me if I'm wrong, like the nouveau version has
one dev_pagemap per bo. The code at hand here use multiple bos for the
memory of a single pagemap so in essence the zone_device_data provides
a pointer to the bo used for that particular page. If we lose the
zone_device_data there is probably other ways we can do that backwards
lookup, but it may become nasty.

So while I fully agree we should use some form of embedding of the
dev_pagemap, and that could be the authoritative way to go from a page
to a drm_pagemap, that is not really sufficient to go from a page to a
bo.

But more on this in upcoming multi-device patches. This series is
mostly about separating the drm_gpusvm (GPU mapping) and
drm_pagemap(migration) functionality that is already i place.

But please let me know if there are any concerns with this.

Thanks,
Thomas


> 
> > v2:
> > - Rebase.
> > v3:
> > - Documentation updates (CI, Matt Brost)
> > - Don't change TTM buffer object type for VRAM allocations (Matt
> > Brost)
> > v4:
> > - Documentation Updates (Himal Ghimiray, Matt Brost)
> > - Add an assert (Matt Brost)
> > v5:
> > - Rebase
> > - Add R-Bs and SOBs.
> > 
> > Matthew Brost (1):
> >   drm/gpusvm, drm/pagemap: Move migration functionality to
> > drm_pagemap
> > 
> > Thomas Hellström (2):
> >   drm/pagemap: Add a populate_mm op
> >   drm/xe: Implement and use the drm_pagemap populate_mm op
> > 
> >  Documentation/gpu/rfc/gpusvm.rst     |  12 +-
> >  drivers/gpu/drm/Makefile             |   6 +-
> >  drivers/gpu/drm/drm_gpusvm.c         | 761 +----------------------
> > -
> >  drivers/gpu/drm/drm_pagemap.c        | 838
> > +++++++++++++++++++++++++++
> >  drivers/gpu/drm/xe/Kconfig           |  10 +-
> >  drivers/gpu/drm/xe/xe_bo_types.h     |   2 +-
> >  drivers/gpu/drm/xe/xe_device_types.h |   2 +-
> >  drivers/gpu/drm/xe/xe_svm.c          | 125 ++--
> >  drivers/gpu/drm/xe/xe_svm.h          |  10 +-
> >  drivers/gpu/drm/xe/xe_tile.h         |  11 +
> >  drivers/gpu/drm/xe/xe_vm.c           |   2 +-
> >  include/drm/drm_gpusvm.h             |  96 ---
> >  include/drm/drm_pagemap.h            | 135 +++++
> >  13 files changed, 1098 insertions(+), 912 deletions(-)
> >  create mode 100644 drivers/gpu/drm/drm_pagemap.c
> > 
> > -- 
> > 2.49.0
> > 



More information about the Intel-xe mailing list