[Nouveau] [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Danilo Krummrich dakr at redhat.com
Thu Mar 16 16:39:29 UTC 2023


Hi Oded,

sorry for the late response, somehow this mail slipped through.

On 2/6/23 15:48, Oded Gabbay wrote:
> On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <matthew.brost at intel.com> wrote:
>> Is this not an application issue? Millions of mappings seems a bit
>> absurd to me.
> If I look at the most extreme case for AI, assuming 256GB of HBM
> memory and page mapping of 2MB, we get to 128K of mappings. But that's
> really the extreme case imo. I assume most mappings will be much
> larger. In fact, in the most realistic scenario of large-scale
> training, a single user will probably map the entire HBM memory using
> 1GB pages.
> 
> I have also a question, could this GPUVA code manage VA ranges
> mappings for userptr mappings, assuming we work without svm/uva/usm
> (pointer-is-a-pointer) ? Because then we are talking about possible
> 4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is
> the assumption this can be used also for non-VK use-cases. Please tell
> me if I'm totally wrong here).

In V2 I switched from drm_mm to maple tree, which should improve 
handling of lots of entries. I also dropped the requirement for GPUVA 
entries to be backed by a valid GEM object.

I think it can be used for non-VK use-cases. It basically just keeps 
track of mappings (not allocating them in the sense of finding a hole 
and providing a base address for a given size). There are basic 
functions to insert and remove entries. For those basic functions it is 
ensured that colliding entries can't be inserted and only a specific 
given entry can be removed, rather than e.g. an arbitrary range.

There are also more advanced functions where users of the GPUVA manager 
can request to "force map" a new mapping and to unmap a given range. The 
GPUVA manager will figure out the (sub-)operations to make this happen 
(.e.g. remove mappings in the way, split up mappings, etc.) and either 
provide these operations (or steps) through callbacks or though a list 
of operations to the caller to process them.

Are there any other use-cases or features you could think of that would 
be beneficial for accelerators?

- Danilo

> 
> Thanks,
> Oded
> 



More information about the Nouveau mailing list