[PATCH v5 0/3] drm/gpusvm, drm/pagemap, drm/xe: Restructure migration in preparation for multi-device
Thomas Hellström
thomas.hellstrom at linux.intel.com
Thu Jun 19 10:07:10 UTC 2025
Hi, Alistair, Thanks for having a look!
On Thu, 2025-06-19 at 14:52 +1000, Alistair Popple wrote:
> On Wed, Jun 18, 2025 at 10:16:14PM +0200, Thomas Hellström wrote:
> > This patchset modifies the migration part of drm_gpusvm to
> > drm_pagemap and
> > adds a populate_mm() op to drm_pagemap.
> >
> > The idea is that the device that receives a pagefault determines if
> > it wants to
> > migrate content and to where. It then calls the populate_mm()
> > method of relevant
> > drm_pagemap.
> >
> > This functionality was mostly already in place, but hard-coded for
> > xe only without
> > going through a pagemap op. Since we might be dealing with separate
> > devices moving
> > forward, it also now becomes the responsibilit of the populate_mm()
> > op to
> > grab any necessary local device runtime pm references and keep them
> > held while
> > its pages are present in an mm (struct mm_struct).
> >
> > On thing to decide here is whether the populate_mm() callback
> > should sit on a
> > struct drm_pagemap for now while we sort multi-device usability out
> > or whether
> > we should add it (or something equivalent) to struct dev_pagemap.
>
> I'm still looking at this series (sorry it took until v5 for me to
> notice
> it!) but my immediate reaction here is why do/would you need to add
> anything
> to struct dev_pagemap? The common approach here has been to embed
> struct
> dev_pagemap in some driver defined struct and use container_of to go
> from the
> page to the driver (or in this case DRM) specific pagemap.
>
> See for example dmirror_page_to_chunk() in the HMM self test or
> nouveau_page_to_chunk(). Is there some reason something like that
> would work
> here?
Future patches will, as they are currently written, do something like
this for embedding:
struct xe_pagemap {
struct dev_pagemap pagemap;
struct drm_pagemap dpagemap;
};
So that the drm_pagemap can be obtained that way. The reason for that
is to avoid diamond inheritance if the driver wanted to embed a
pcie_p2p pagemap instead of a struct dev_pagemap. But if that becomes
unlikely we could of course embed the dev_pagemap directly into the
drm_pagemap.
>
> Actually I notice the Xe driver currently does use this to point to a
> struct
> xe_vram_region which contains drm_pagemap pointer. If I understand
> correctly
> we're trying to move a lot of the SVM functionality into a generic
> DRM layer,
> so would it make more sense to have dev_pgmap embeded in drm_pgmap
> and have that
> contain the pointer to any required driver-specific data?
>
> Also FWIW I don't think zone_device_data is strictly required. It's
> convenient,
> but I suspect it only exists because it could be easily provided
> within the
> footprint of the existing struct page due to not using all the fields
> for
> ZONE_DEVICE pages. I can imagine we might eventually remove it, once
> we no
> longer need struct pages and move to folios/memdescs.
It looks like, correct me if I'm wrong, like the nouveau version has
one dev_pagemap per bo. The code at hand here use multiple bos for the
memory of a single pagemap so in essence the zone_device_data provides
a pointer to the bo used for that particular page. If we lose the
zone_device_data there is probably other ways we can do that backwards
lookup, but it may become nasty.
So while I fully agree we should use some form of embedding of the
dev_pagemap, and that could be the authoritative way to go from a page
to a drm_pagemap, that is not really sufficient to go from a page to a
bo.
But more on this in upcoming multi-device patches. This series is
mostly about separating the drm_gpusvm (GPU mapping) and
drm_pagemap(migration) functionality that is already i place.
But please let me know if there are any concerns with this.
Thanks,
Thomas
>
> > v2:
> > - Rebase.
> > v3:
> > - Documentation updates (CI, Matt Brost)
> > - Don't change TTM buffer object type for VRAM allocations (Matt
> > Brost)
> > v4:
> > - Documentation Updates (Himal Ghimiray, Matt Brost)
> > - Add an assert (Matt Brost)
> > v5:
> > - Rebase
> > - Add R-Bs and SOBs.
> >
> > Matthew Brost (1):
> > drm/gpusvm, drm/pagemap: Move migration functionality to
> > drm_pagemap
> >
> > Thomas Hellström (2):
> > drm/pagemap: Add a populate_mm op
> > drm/xe: Implement and use the drm_pagemap populate_mm op
> >
> > Documentation/gpu/rfc/gpusvm.rst | 12 +-
> > drivers/gpu/drm/Makefile | 6 +-
> > drivers/gpu/drm/drm_gpusvm.c | 761 +----------------------
> > -
> > drivers/gpu/drm/drm_pagemap.c | 838
> > +++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/Kconfig | 10 +-
> > drivers/gpu/drm/xe/xe_bo_types.h | 2 +-
> > drivers/gpu/drm/xe/xe_device_types.h | 2 +-
> > drivers/gpu/drm/xe/xe_svm.c | 125 ++--
> > drivers/gpu/drm/xe/xe_svm.h | 10 +-
> > drivers/gpu/drm/xe/xe_tile.h | 11 +
> > drivers/gpu/drm/xe/xe_vm.c | 2 +-
> > include/drm/drm_gpusvm.h | 96 ---
> > include/drm/drm_pagemap.h | 135 +++++
> > 13 files changed, 1098 insertions(+), 912 deletions(-)
> > create mode 100644 drivers/gpu/drm/drm_pagemap.c
> >
> > --
> > 2.49.0
> >
More information about the Intel-xe
mailing list