[Nouveau] CUDA fixed VA allocations and sparse mappings

Andrew Chew achew at nvidia.com
Tue Jul 7 17:15:59 PDT 2015


On Tue, Jul 07, 2015 at 08:13:28PM -0400, Ilia Mirkin wrote:
> On Tue, Jul 7, 2015 at 8:11 PM, C Bergström <cbergstrom at pathscale.com> wrote:
> > On Wed, Jul 8, 2015 at 7:08 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> >> On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote:
> >>> On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote:
> >>>> On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote:
> >>>>> regarding
> >>>>> --------
> >>>>> Fixed address allocations weren't going to be part of that, but I see
> >>>>> that it makes sense for a variety of use cases.  One question I have
> >>>>> here is how this is intended to work where the RM needs to make some
> >>>>> of these allocations itself (for graphics context mapping, etc), how
> >>>>> should potential conflicts with user mappings be handled?
> >>>>> --------
> >>>>> As an initial implementation you can probably assume that the GPU
> >>>>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC
> >>>>> code has full ownership of the card. The Tesla cards don't even have a
> >>>>> video out on them. To complicate this even more - some offloading code
> >>>>> has very long running kernels and even worse - may critically depend
> >>>>> on using the full available GPU ram. (Large matrix sizes and soon big
> >>>>> Fortran arrays or complex data types)
> >>>> This doesn't change that, to setup the graphics engine, the driver
> >>>> needs to map various system-use data structures into the channel's
> >>>> address space *somewhere* :)
> >>>
> >>> I'm not sure I follow exactly what you mean, but I think the answer is
> >>> - don't setup the graphics engine if you're in "compute" mode. Doing
> >>> that, iiuc, will at least provide a start to support for compute.
> >>> Anyone who argues that graphics+compute is critical to have working at
> >>> the same time is probably in the 1%.
> >>
> >> On NVIDIA GPUs, compute _is_ part of the graphics engine... aka PGRAPH.
> >
> > You can afaik setup PGRAPH without mapping memory for graphics. You
> > just init the engine and get out of the way.
> 
> But... you need to map memory to set up the engine. Not a lot, but
> it's gotta go *somewhere*.

There's some minimal state that needs to be mapped into GPU address space.
One thing that comes to mind is pushbuffers, which are needed to submit
work to any engine.
