[PATCH] drm/gpuvm: merge adjacent gpuva range during a map operation
Matthew Brost
matthew.brost at intel.com
Tue Sep 24 02:00:53 UTC 2024
On Tue, Sep 24, 2024 at 01:41:31AM +0200, Danilo Krummrich wrote:
> (adding dri-devel)
>
> On Mon, Sep 23, 2024 at 02:24:02PM +0000, Zeng, Oak wrote:
> > > > This patch is an old one in my backlog. I roughly remember I ran into
> > > > a situation where two duplicated VMAs covering the same virtual address
> > > > range were kept in gpuvm's RB-tree. One VMA was actually already
> > > > destroyed. This further caused issues as the destroyed VMA was found
> > > > during a GPUVM RB-tree walk. This triggered me to look into the gpuvm
> > > > merge/split logic and end up with this patch. This patch did fix that
> > > > issue.
> > >
> > > That would indeed be a big issue. As Matt suggests, is there a
> > > reproducer?
> > >
> > > Either way, adding merge support can't be the fix for this; we need a
> > > separate one that's back-portable.
> > >
> >
> > The discussion went on when you were away. See https://patchwork.freedesktop.org/patch/614941/?series=138835&rev=1
>
> Yes, I'm aware. But I don't see how this is related to what I said above?
>
> >
> > Matt and I agreed to implement merge logic in gpuvm, but gpuvm needs to check a driver cookie/callback to decide whether to merge or not.
> > We reached this conclusion based on requirements from the system allocator design. See more details in the above link.
> >
> > Can you take a look and let us know whether you agree?
>
> Generally, I'm fine with that; one of my early versions of GPUVM had this. But I
> dropped it because I didn't see an immediate benefit.
>
> From my old change log:
>
> "Remove merging of GPUVAs; the kernel has limited to none knowlege about
> the semantics of mapping sequences. Hence, merging is purely speculative.
> It seems that gaining a significant (or at least a measurable) performance
> increase through merging is way more likely to happen when userspace is
> responsible for merging mappings up to the next larger page size if
> possible."
>
> If the pure number of GPUVA structures is a concern though, I think it's fair to
> add it. So, feel free to send a patch.
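
As a purely illustrative sketch of the driver cookie/callback idea mentioned
above (the va_can_merge name and its placement in drm_gpuvm_ops are
assumptions, not existing drm_gpuvm API):

    /*
     * Hypothetical sketch: a driver-provided hook GPUVM could consult
     * before merging two adjacent VAs, so drivers (e.g. a system
     * allocator) can veto the merge. Not existing drm_gpuvm API.
     */
    struct drm_gpuvm_ops {
            /* ... existing callbacks ... */

            /* return true if @prev and @next may be merged into one VA */
            bool (*va_can_merge)(struct drm_gpuva *prev,
                                 struct drm_gpuva *next);
    };

    static bool gpuva_should_merge(struct drm_gpuvm *gpuvm,
                                   struct drm_gpuva *prev,
                                   struct drm_gpuva *next)
    {
            if (!gpuvm->ops->va_can_merge)
                    return false;   /* no driver opt-in, keep VAs split */
            return gpuvm->ops->va_can_merge(prev, next);
    }
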
>
> It's probably also a good idea to double check with my old merge implementation
> [1]. It's pretty easy to get this wrong. I'm not saying I got it right, but if
> we both ended up with the same logic, it's a good indicator. :)
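
A rough sketch of the adjacency check such merge logic would have to get
right (the helper is hypothetical; field names assume struct drm_gpuva's
va.addr/va.range, gem.obj/gem.offset and flags members):

    /*
     * Hypothetical helper: two VAs are merge candidates only if they are
     * virtually contiguous, map the same GEM object at contiguous offsets,
     * and carry the same mapping flags.
     */
    static bool gpuva_is_mergeable(const struct drm_gpuva *prev,
                                   const struct drm_gpuva *next)
    {
            if (prev->va.addr + prev->va.range != next->va.addr)
                    return false;           /* not VA-contiguous */
            if (prev->gem.obj != next->gem.obj)
                    return false;           /* backed by different BOs */
            if (prev->gem.offset + prev->va.range != next->gem.offset)
                    return false;           /* BO offsets not contiguous */
            return prev->flags == next->flags;
    }
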
>
> However, this should *not* be a solution for an existing bug. Above you mention
> a bug related to "duplicated VMAs covering the same virtual address range". This
> is unrelated and must be fixed separately. Do you have a way to reproduce this?
>
I 100% agree with Danilo that if this is a bug, merging is not the solution.
Merging is an additional, optional feature, not a bug fix.
AFAIK there is not a bug here, as our CI or Mesa would likely report a
memory bug. A reproducer for this would be good, if one exists.
Matt
> [1] https://lore.kernel.org/dri-devel/20230217134422.14116-6-dakr@redhat.com/
>
> >
> > > Also, can we move this on DRI-devel please?
> >
> > Yes will do.
> >
> > Oak