[PATCH 1/2] drm/xe_migrate: Switch from drm to dev managed actions
Lucas De Marchi
lucas.demarchi at intel.com
Mon Mar 3 20:27:03 UTC 2025
On Fri, Feb 28, 2025 at 11:11:59AM +0000, Thomas Hellstrom wrote:
>Hi, Matthew
>
>On Fri, 2025-02-28 at 10:21 +0000, Matthew Auld wrote:
>> On 28/02/2025 06:52, Aradhya Bhatia wrote:
>> > Change the scope of the migrate subsystem to be dev managed instead
>> > of drm managed.
>> >
>> > The parent pci struct &device, which the xe struct &drm_device is a
>> > part of, gets removed when a hot unplug is triggered, which causes
>> > the underlying iommu group to get destroyed as well.
>>
>> Nice find. But if that's the case then the migrate BO here is just
>> one of many where we will see this. Basically all system memory BOs
>> can suffer from this AFAICT, including userspace-owned ones. So I
>> think we rather need to solve this for all, and I don't think we can
>> really tie the lifetime of all BOs to devm, so we likely need a
>> different approach.
>>
>> I think we might instead need to tear down all dma mappings when
>> removing the device, leaving the BOs intact. It looks like there is
>> already a helper for this, so maybe something roughly like this:
>>
>> @@ -980,6 +980,8 @@ void xe_device_remove(struct xe_device *xe)
>>
>>  	drm_dev_unplug(&xe->drm);
>>
>> +	ttm_device_clear_dma_mappings(&xe->ttm);
>> +
>>  	xe_display_driver_remove(xe);
>>
>>  	xe_heci_gsc_fini(xe);
>>
>> But I don't think userptr will be covered by that, just all BOs, so
>> we likely need an extra step to also nuke all userptr dma mappings
>> somewhere.
>
>I have been discussing this a bit with Aradhya, and the problem is a
>bit complex.
>
>Really if the PCI device goes away, we need to move *all* bos to
>system. That should clear out any dma mappings, since we map dma when
>moving to TT. Also VRAM bos should be moved to system which will also
>handle things like SVM migration.
If the PCI device really goes away, how are you going to move anything
without HW access? I think we just don't reproduce this issue more often
because it's not a real hotplug, but just a driver unbind. I wonder what
would happen with our current logic if it were an actual hotplug
triggering this code path:

	pci_stop_and_remove_bus_device() -> pci_stop_bus_device() ->
		pci_stop_dev() -> device_release_driver()
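
For reference, a minimal sketch of the timing difference the patch
relies on (the function names other than devm_add_action_or_reset()
and drmm_add_action_or_reset() are made up for illustration): devm
actions run from devres during driver unbind, while drmm actions only
run once the last reference to the drm_device is dropped, which can be
long after the device itself is gone if userspace still holds the DRM
file open.

/* Sketch only: contrasts when each flavor of managed action fires. */
static void example_devm_fini(void *arg)
{
	/* runs at driver unbind (device_release_driver()), while the
	 * struct device is still around */
}

static void example_drmm_fini(struct drm_device *drm, void *arg)
{
	/* runs at the final drm_dev_put(), potentially after the PCI
	 * device and its iommu group are already gone */
}

static int example_init(struct xe_device *xe, void *data)
{
	int err;

	/* tied to the lifetime of the bound driver */
	err = devm_add_action_or_reset(xe->drm.dev, example_devm_fini, data);
	if (err)
		return err;

	/* tied to the lifetime of the drm_device refcount */
	return drmm_add_action_or_reset(&xe->drm, example_drmm_fini, data);
}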
Lucas De Marchi
>
>But that doesn't really help with pinned bos. So IMO subsystems that
>allocate pinned bos need to be devm managed, and this is a step in that
>direction. But we probably need to deal with some fallout. For example
>when we take down the vram managers using drmm_, they'd want to call
>evict_all(), but the migrate subsystem is already down.
>
>So I think a natural place to deal with such fallouts is the remove()
>callback, while subsystems with pinned bos use devm_
>
>But admittedly to me it's not really clear how to *best* handle this
>situation. I suspect if we really stress-test device unbinding on a
>running app we're going to hit a lot of problems.
>
>As for the userptrs, I just posted a series that introduces a per-
>userptr dedicated unmap function. We could probably put them on a
>device list or something similar that calls that function (or just
>general userptr invalidation) for all userptrs.
>
>/Thomas
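
On the userptr side, something along the lines of the sketch below
could work; all names here are hypothetical stand-ins for whatever the
posted series actually provides (a per-device list plus the dedicated
per-userptr unmap function):

/*
 * Sketch only, hypothetical names throughout: walk a per-device list
 * of userptrs at remove() time and tear down their DMA mappings via a
 * dedicated per-userptr unmap.
 */
#include <linux/list.h>
#include <linux/mutex.h>

struct example_userptr {
	struct list_head link;	/* entry on the per-device list */
	/* sg table, mmu notifier, ... */
};

/* stand-in for the per-userptr unmap from the series mentioned above */
static void example_userptr_unmap(struct example_userptr *uptr)
{
	/* dma_unmap_sgtable(), etc. */
}

static void example_userptr_unmap_all(struct mutex *lock,
				      struct list_head *userptr_list)
{
	struct example_userptr *uptr;

	mutex_lock(lock);
	list_for_each_entry(uptr, userptr_list, link)
		example_userptr_unmap(uptr);
	mutex_unlock(lock);
}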
>
>>
>> >
>> > The migrate subsystem, which handles the lifetime of the page-table
>> > tree (pt) BO, doesn't get a chance to put the BO back during the
>> > hot unplug, as all the references to DRM haven't been put back yet.
>> > When all the references to DRM are indeed put back later, the
>> > migrate subsystem tries to put back the pt BO. Since the underlying
>> > iommu group has already been destroyed, a kernel NULL ptr
>> > dereference takes place while attempting to put back the pt BO.
>> >
>> > Signed-off-by: Aradhya Bhatia <aradhya.bhatia at intel.com>
>> > ---
>> > drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
>> > 1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>> > index 278bc96cf593..4e23adfa208a 100644
>> > --- a/drivers/gpu/drm/xe/xe_migrate.c
>> > +++ b/drivers/gpu/drm/xe/xe_migrate.c
>> > @@ -97,7 +97,7 @@ struct xe_exec_queue *xe_tile_migrate_exec_queue(struct xe_tile *tile)
>> >  	return tile->migrate->q;
>> >  }
>> >
>> > -static void xe_migrate_fini(struct drm_device *dev, void *arg)
>> > +static void xe_migrate_fini(void *arg)
>> >  {
>> >  	struct xe_migrate *m = arg;
>> >
>> > @@ -401,7 +401,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
>> >  	struct xe_vm *vm;
>> >  	int err;
>> >
>> > -	m = drmm_kzalloc(&xe->drm, sizeof(*m), GFP_KERNEL);
>> > +	m = devm_kzalloc(xe->drm.dev, sizeof(*m), GFP_KERNEL);
>> >  	if (!m)
>> >  		return ERR_PTR(-ENOMEM);
>> >
>> > @@ -455,7 +455,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
>> >  	might_lock(&m->job_mutex);
>> >  	fs_reclaim_release(GFP_KERNEL);
>> >
>> > -	err = drmm_add_action_or_reset(&xe->drm, xe_migrate_fini, m);
>> > +	err = devm_add_action_or_reset(xe->drm.dev, xe_migrate_fini, m);
>> >  	if (err)
>> >  		return ERR_PTR(err);
>> >
>>
>