[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau

Jerome Glisse jglisse at redhat.com
Mon Mar 12 17:50:58 UTC 2018


On Mon, Mar 12, 2018 at 06:30:09PM +0100, Daniel Vetter wrote:
> On Sat, Mar 10, 2018 at 04:01:58PM +0100, Christian König wrote:

[...]

> > > There is work underway to revamp nouveau channel creation with a new
> > > userspace API, so we might want to delay upstreaming until that lands.
> > > We can still discuss one aspect specific to HMM here, namely the issue
> > > around GEM objects used by some specific parts of the GPU. Some engines
> > > inside the GPU (an engine is a GPU block, like the display block which
> > > is responsible for scanning memory to send out a picture through some
> > > connector, for instance HDMI or DisplayPort) can only access memory
> > > with virtual addresses below (1 << 40). To accommodate those we need to
> > > create a "hole" inside the process address space. This patchset has
> > > a hack for that (patch 13 HACK FOR HMM AREA): it reserves a range of
> > > device file offsets so that a process can mmap this range with PROT_NONE
> > > to create a hole (the process must make sure the hole is below 1 << 40).
> > > I feel uneasy doing it this way, but maybe it is ok with other folks.
> > 
> > Well we have essentially the same problem with pre-gfx9 AMD hardware.
> > Felix might have some advice on how it was solved for HSA.
> 
> Couldn't we do an in-kernel address space for those special gpu blocks? As
> long as it's display the kernel needs to manage it anyway, and adding a
> 2nd mapping when you pin/unpin for scanout usage shouldn't really matter
> (as long as you cache the mapping until the buffer gets thrown out of
> vram). More-or-less what we do for i915 (where we have an entirely
> separate address space for these things which is 4G on the latest chips).
> -Daniel

We cannot use an in-kernel address space for those. We already have an
in-kernel address space, but it does not apply to the objects considered
here.

For NVIDIA (and I believe the same holds for AMD) the objects we are
talking about must live in the same address space as the one against
which the process's shaders/DMA/... are executed.

For instance, a command buffer submitted by userspace must be inside a
GEM object mapped into the GPU address space of the process against
which the commands are executed. My understanding is that PFIFO (the
engine on NVIDIA GPUs that fetches commands) first context-switches to
the address space associated with the channel and then starts fetching
commands, with every address interpreted against that channel's address
space.
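
To illustrate (this is only a conceptual sketch, not real nouveau code;
gpu_channel, pfifo_fetch and vm_read are made-up names), think of PFIFO
fetching every word, including the pushbuffer itself, through the
channel's own page tables:

        #include <stdint.h>

        struct gpu_channel {
                uint64_t pushbuf_gpu_va; /* GPU VA of the ring, in this channel's VM */
                uint32_t get;            /* next word PFIFO will fetch */
        };

        /*
         * Conceptual model of one PFIFO fetch step. vm_read() stands in
         * for the hardware page-table walk: it resolves GPU virtual
         * addresses through the channel's page tables, so a pushbuffer
         * living outside that address space simply cannot be fetched.
         */
        static uint32_t pfifo_fetch(struct gpu_channel *ch,
                                    uint32_t (*vm_read)(uint64_t gpu_va))
        {
                return vm_read(ch->pushbuf_gpu_va + 4ULL * ch->get++);
        }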

This is why we need to reserve a range in the process virtual address
space if we want to do SVM in a sane way. The alternative would be to
just map the buffer into the GPU page table and then cross fingers and
toes, hoping that the process never gets any of its mmaps overlapping
those mappings :)
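
As a rough sketch of the patch 13 trick (HMM_HOLE_OFFSET, HMM_HOLE_SIZE
and the device path below are hypothetical; the real values would come
from the nouveau UAPI), userspace would do something like:

        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <unistd.h>

        #define HMM_HOLE_OFFSET 0ULL         /* hypothetical reserved file offset */
        #define HMM_HOLE_SIZE   (1ULL << 30) /* hypothetical hole size: 1GiB */
        #define GPU_VA_LIMIT    (1ULL << 40) /* engines cannot address above this */

        int main(void)
        {
                int fd = open("/dev/dri/renderD128", O_RDWR);
                if (fd < 0)
                        return 1;

                /*
                 * PROT_NONE keeps the CPU from ever touching the range;
                 * its only purpose is to carve these virtual addresses
                 * out of the process address space so the kernel can
                 * hand them to the GPU.
                 */
                void *hole = mmap(NULL, HMM_HOLE_SIZE, PROT_NONE,
                                  MAP_SHARED, fd, HMM_HOLE_OFFSET);
                if (hole == MAP_FAILED)
                        return 1;

                /* The process must check the hole landed below 1 << 40. */
                if ((uintptr_t)hole + HMM_HOLE_SIZE > GPU_VA_LIMIT) {
                        munmap(hole, HMM_HOLE_SIZE);
                        close(fd);
                        return 1;
                }

                printf("hole reserved at %p\n", hole);
                return 0;
        }

Once the hole is reserved the kernel is free to back that range with
the GEM mappings that must sit below 1 << 40, while HMM can use the
rest of the process address space for SVM.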

Cheers,
Jérôme

