[PATCH v3 3/5] drm/xe/bo: Add a bo remove callback
Thomas Hellström
thomas.hellstrom at linux.intel.com
Tue Mar 25 16:45:42 UTC 2025
On Tue, 2025-03-25 at 10:08 +0000, Matthew Auld wrote:
> On 25/03/2025 09:07, Thomas Hellström wrote:
> > On Tue, 2025-03-25 at 09:02 +0000, Matthew Auld wrote:
> > > On 24/03/2025 16:54, Thomas Hellström wrote:
> > > > On device unbind, migrate exported bos, including pagemap bos
> > > > to
> > > > system. This allows importers to take proper action without
> > > > disruption. In particular, SVM clients on remote devices may
> > > > continue as if nothing happened, and can choose a different
> > > > placement.
> > > >
> > > > The evict_flags() placement is chosen in such a way that bos
> > > > that
> > > > aren't exported are purged.
> > > >
> > > > For pinned bos, we unmap DMA, but their pages are not freed yet
> > > > since we can't be 100% sure they are not accessed.
> > > >
> > > > All pinned external bos (not just the VRAM ones) are put on the
> > > > pinned.external list with this patch. But this only affects the
> > > > xe_bo_pci_dev_remove_pinned() function since !VRAM bos are
> > > > ignored by the suspend / resume functionality. As a follow-up
> > > > we
> > > > could look at removing the suspend / resume iteration over
> > > > pinned external bos since we currently don't allow pinning
> > > > external bos in VRAM, and other external bos don't need any
> > > > special treatment at suspend / resume.
> > > >
> > > > v2:
> > > > - Address review comments. (Matthew Auld).
> > > > v3:
> > > > - Don't introduce an external_evicted list (Matthew Auld)
> > > > - Add a discussion around suspend / resume behaviour to the
> > > > commit message.
> > > > - Formatting fixes.
> > > >
> > > > Signed-off-by: Thomas Hellström
> > > > <thomas.hellstrom at linux.intel.com>
> > >
> > > Reviewed-by: Matthew Auld <matthew.auld at intel.com>
> > >
> >
> > Actually, there is a CI failure on LNL indicating that the pinned
> > kernel-bo dma-maps are actually needed at devm-managed release.
>
> Hmm, do you have a link? The failure I see looks to be more probe
> related? Once we do unplug(), outside the special evict all we do
> here
> we should pretty much not need dma-maps?
https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-146383v5/shard-lnl-2/igt@xe_module_load@reload-no-display.html#dmesg-warnings381
Ideally not. But again, since not all xe subsystems that use pinned
maps are properly finished at that point, I figure it's hard to tell.
Since the xe_hw_fence warning happened at devm_action_release() it
caught my attention.
However I can't repro on LNL, even with IOMMU turned on.
>
> What about moving the evict_all into a well placed devm action during
> probe? Basically at the point at which we think it is reasonable to
> get
> rid of the dma-maps? Or is that what you mean below?
I still think we should devm_-free all pinned kernel bos, so that none
exist once the pci-device is gone. But as a catch-all, yeah, we should
probably move the traversal of pinned kernel bos so that it is
executed as part of a final devm_ action. That'd be usable for
debugging as well if we were to attempt cleaning up all pinned bos on
unplugging.
I'll do a quick respin of that.
/Thomas
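
For illustration only, and not taken from the patch: the "final devm_
action" idea relies on devres running release actions in reverse order
of registration, so an action registered early in probe runs late at
unbind. The sketch below is a small user-space model of that ordering.
Every name in it (fake_dev, fake_devm_add_action, fake_devm_release_all,
unmap_pinned, fence_fini) is hypothetical and only stands in for the
kernel's devm_add_action() machinery and the xe callbacks discussed
above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

struct action {
	action_fn fn;
	void *data;
};

/* Hypothetical stand-in for a struct device with a devres action stack. */
struct fake_dev {
	struct action actions[MAX_ACTIONS];
	size_t count;
	char log[64];	/* records the order in which release actions ran */
};

/* Register a release action; returns 0 on success, -1 if the table is full. */
static int fake_devm_add_action(struct fake_dev *dev, action_fn fn, void *data)
{
	if (dev->count >= MAX_ACTIONS)
		return -1;
	dev->actions[dev->count].fn = fn;
	dev->actions[dev->count].data = data;
	dev->count++;
	return 0;
}

/* Run all actions in reverse registration order (LIFO), as devres does. */
static void fake_devm_release_all(struct fake_dev *dev)
{
	while (dev->count > 0) {
		struct action *a = &dev->actions[--dev->count];

		a->fn(a->data);
	}
}

static void log_step(struct fake_dev *dev, const char *step)
{
	strncat(dev->log, step, sizeof(dev->log) - strlen(dev->log) - 1);
}

/* Assumed catch-all: drop dma-maps of whatever pinned bos remain. */
static void unmap_pinned(void *data)
{
	log_step(data, "unmap_pinned;");
}

/* Assumed subsystem teardown that may still need the pinned maps. */
static void fence_fini(void *data)
{
	log_step(data, "fence_fini;");
}

/* Model probe followed by unbind; returns the recorded release order. */
static const char *demo(struct fake_dev *dev)
{
	memset(dev, 0, sizeof(*dev));
	/* Registered first during "probe", so it runs last at release. */
	fake_devm_add_action(dev, unmap_pinned, dev);
	fake_devm_add_action(dev, fence_fini, dev);
	fake_devm_release_all(dev);
	return dev->log;
}
```

In this model the unmap step only runs after the later-registered
teardown has finished, which is the property a well placed devm_ action
would need. Whether every xe user of pinned maps is actually devm_
managed is the open question in this thread, and the model says nothing
about that.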
>
> >
> > I'm in the process of testing this out on LNL, and if so I'll drop
> > these dma-unmaps and we'd continue down the route of ensuring that
> > these subsystems are indeed devm_ managed and not drmm_ managed.
> >
> > Thanks,
> > Thomas
> >
> >
> >
>