[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau

Tue Mar 13 10:46:44 UTC 2018

On Mon, Mar 12, 2018 at 01:50:58PM -0400, Jerome Glisse wrote:
> On Mon, Mar 12, 2018 at 06:30:09PM +0100, Daniel Vetter wrote:
> > On Sat, Mar 10, 2018 at 04:01:58PM +0100, Christian König wrote:
> 
> [...]
> 
> > > > There is work underway to revamp nouveau channel creation with a new
> > > > userspace API. So we might want to delay upstreaming until this lands.
> > > > We can still discuss one aspect specific to HMM here, namely the issue
> > > > around GEM objects used for some specific parts of the GPU. Some engines
> > > > inside the GPU (an engine is a GPU block, like the display block which
> > > > is responsible for scanning memory to send out a picture through some
> > > > connector, for instance HDMI or DisplayPort) can only access memory
> > > > with virtual addresses below (1 << 40). To accommodate those we need to
> > > > create a "hole" inside the process address space. This patchset has
> > > > a hack for that (patch 13 HACK FOR HMM AREA): it reserves a range of
> > > > device file offsets so that a process can mmap this range with PROT_NONE
> > > > to create a hole (the process must make sure the hole is below 1 << 40).
> > > > I feel uneasy about doing it this way but maybe it is ok with other
> > > > folks.
> > > 
> > > Well, we have essentially the same problem with pre-gfx9 AMD hardware. Felix
> > > might have some advice on how it was solved for HSA.
> > 
> > Couldn't we do an in-kernel address space for those special gpu blocks? As
> > long as it's display the kernel needs to manage it anyway, and adding a
> > 2nd mapping when you pin/unpin for scanout usage shouldn't really matter
> > (as long as you cache the mapping until the buffer gets thrown out of
> > vram). More-or-less what we do for i915 (where we have an entirely
> > separate address space for these things which is 4G on the latest chips).
> > -Daniel
> 
> We cannot do an in-kernel address space for those. We already have an
> in-kernel address space but it does not apply to the objects considered
> here.
> 
> For NVidia (I believe this is the same for AMD) the objects we
> are talking about are objects that must be in the same address space
> as the one against which the process's shader/dma/... get executed.
> 
> For instance, a command buffer submitted by userspace must be inside a
> GEM object mapped inside the GPU process address space against which the
> commands are executed. My understanding is that the PFIFO (the engine
> on nv GPUs that fetches commands) first context switches to the address
> space associated with the channel and then starts fetching commands with
> all addresses being interpreted against the channel address space.
> 
> Hence why we need to reserve some range in the process virtual address
> space if we want to do SVM in a sane way. I mean we could just map
> buffers into the GPU page table and then cross fingers and toes hoping that
> the process will never get any of its mmaps overlapping those mappings :)

Ah, from the example I got the impression it's just the display engine
that has this restriction. CS/PFIFO having the same restriction is indeed
more fun.
-Daniel
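
As a rough illustration of the hole reservation discussed above (mmap with
PROT_NONE, kept below 1 << 40), here is a minimal userspace sketch in C. It
is not taken from the patchset: the patch mmaps a reserved range of device
file offsets, whereas this sketch uses an anonymous mapping as a stand-in,
and the names GPU_VA_LIMIT, range_below_limit and reserve_hole are made up
for the example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption for this sketch: the restricted engines can only address
 * virtual addresses strictly below 1 << 40, as stated in the thread. */
#define GPU_VA_LIMIT (1ULL << 40)

/* Return nonzero if [addr, addr + len) lies entirely below limit.
 * Written to avoid overflow in addr + len. */
static int range_below_limit(uintptr_t addr, size_t len, uint64_t limit)
{
    return (uint64_t)len <= limit && (uint64_t)addr <= limit - (uint64_t)len;
}

/* Reserve an inaccessible hole of len bytes, preferably at hint.
 * Hypothetical helper: an anonymous PROT_NONE mapping stands in for the
 * device-file mmap used by the patch. Returns NULL on failure or if the
 * kernel placed the mapping at or above GPU_VA_LIMIT. */
static void *reserve_hole(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* The hint is only advisory, so verify where the mapping landed. */
    if (!range_below_limit((uintptr_t)p, len, GPU_VA_LIMIT)) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A caller would pass a hint below 1 << 40, for example (void *)(1ULL << 32).
The kernel may ignore the hint, which is why the result is checked and the
mapping is dropped if it lands too high.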
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch