[RFC PATCH 05/28] drm/gpusvm: Add support for GPU Shared Virtual Memory
Daniel Vetter
daniel.vetter at ffwll.ch
Mon Sep 2 11:40:23 UTC 2024
On Thu, Aug 29, 2024 at 04:49:15PM +0000, Matthew Brost wrote:
> On Wed, Aug 28, 2024 at 08:50:02PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 27, 2024 at 07:48:38PM -0700, Matthew Brost wrote:
> > > +int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
> > > + struct drm_gpusvm_range *range,
> > > + const struct drm_gpusvm_ctx *ctx)
> > > +{
> > > + u64 start = range->va.start, end = range->va.end;
> > > + struct mm_struct *mm = gpusvm->mm;
> > > + struct vm_area_struct *vas;
> > > + int err;
> > > + bool retry = false;
> > > +
> > > + if (!ctx->mmap_locked) {
> > > + if (!mmget_not_zero(mm)) {
> > > + err = -EFAULT;
> > > + goto err_out;
> > > + }
> > > + if (ctx->trylock_mmap) {
> > > + if (!mmap_read_trylock(mm)) {
> > > + err = drm_gpusvm_evict_to_sram(gpusvm, range);
> > > + goto err_mmput;
> > > + }
> > > + } else {
> > > + mmap_read_lock(mm);
> > > + }
> > > + }
> > > +
> > > + mmap_assert_locked(mm);
> > > +
> > > + /*
> > > + * Loop required to find all VMA area structs for the corner case when
> > > + * VRAM backing has been partially unmapped from MM's address space.
> > > + */
> > > +again:
> > > + vas = find_vma(mm, start);
> >
> > So a hiliarous case that amdkfd gets a bit better but still not entirely
> > is that the original vma might entirely gone. Even when you can still get
> > at the mm of that process. This happens with cow (or shared too I think)
> > mappings in forked child processes, or also if you play fun mremap games.
> >
> > I think that outside of the ->migrate_to_ram callback migration/eviction
> > to sram cannot assume there's any reasonable vma around and has to
> > unconditionally go with the drm_gpusvm_evict_to_sram path.
> >
>
> See my response here [1]. Let me drop the whole trylock thing and
> convert to an 'evict' flag which calls drm_gpusvm_evict_to_sram in
> places where Xe needs to evict VRAM. Or maybe just export that function
> and call it directly. That way the only place the VMA is looked up for
> SRAM -> VRAM is upon CPU page fault.
Yeah I think a dedicated path for migrate_to_ram hook that goes directly
into your evict_to_sram path is the design-clean approach here imo.
> [1] https://patchwork.freedesktop.org/patch/610955/?series=137870&rev=1#comment_1111164
>
> > Also in the migrate_to_ram case the vma is essentially nothing else that
> > informational about which ranges we might need if we prefault a bit (in
> > case the child changed the vma compared to the original one). So it's good
> > to as parameter for migrate_vma_setup, but absolutely nothing else.
> >
> > amdkfd almost gets this right by being entirely based on their svm_range
> > structures, except they still have the lingering check that the orignal mm
> > is still alive. Of course you cannot ever use that memory on the gpu
> > anymore, but the child process could get very pissed if their memory is
> > suddenly gone. Also the eviction code has the same issue as yours and
> > limits itself to vma that still exist in the original mm, leaving anything
> > that's orphaned in children or remaps stuck in vram. At least that's my
> > understanding, I might very well be wrong.
> >
> > So probably want a bunch of these testcases too to make sure that all
> > works, and we're not stuck with memory allocations in vram that we can't
> > move out.
>
> When writing some additional test cases, let me add hooks in my IGTs to
> be able to verify we are not orphaning VRAM too.
So maybe apply caution, I'm honestly not sure whether core mm makes any
guarantees about not orphaning stuff, at least for a little bit.
Over the w/e my brain tossed me the "so are we sure we can tear down our
zone_device data, the page array specifically" brain teaser. And I think
the answer is that we have to wait until all page references disappear,
which might take a long time. Core mm makes no guarantee about elevated
page references disappearing in a timely manner, at least as far as I
know. Which is also why migration is a best effort thing only.
Cheers, Sima
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-xe
mailing list