[Nouveau] [RFC PATCH 00/13] SVM (share virtual memory) with HMM in nouveau

Tue Mar 13 15:32:24 UTC 2018

On 2018-03-13 10:28 AM, Jerome Glisse wrote:
> On Mon, Mar 12, 2018 at 02:28:42PM -0400, Felix Kuehling wrote:
>> On 2018-03-10 10:01 AM, Christian König wrote:
>>>> To accommodate those we need to
>>>> create a "hole" inside the process address space. This patchset has
>>>> a hack for that (patch 13 HACK FOR HMM AREA), it reserves a range of
>>>> device file offsets so that a process can mmap this range with PROT_NONE
>>>> to create a hole (the process must make sure the hole is below 1 << 40).
>>>> I feel uneasy about doing it this way but maybe it is ok with other
>>>> folks.
>>> Well we have essentially the same problem with pre-gfx9 AMD hardware.
>>> Felix might have some advice on how it was solved for HSA.
>> For pre-gfx9 hardware we reserve address space in user mode using a big
>> mmap PROT_NONE call at application start. Then we manage the address
>> space in user mode and use MAP_FIXED to map buffers at specific
>> addresses within the reserved range.
>>
>> The big address space reservation causes issues for some debugging tools
>> (clang-sanitizer was mentioned to me), so with gfx9 we're going to get
>> rid of this address space reservation.
> What do you need those mappings for? What kind of objects (pm4 packet
> command buffer, GPU semaphore | fence, ...)? Kernel private objects?
> On nv we need it for the main command buffer ring which we do not want
> to expose to the application.

On pre-gfx9 hardware the GPU virtual address space is limited to 40 bits
for all hardware blocks. So all GPU-accessible memory must be mapped at
addresses below 1 << 40.
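
To illustrate what that constraint means for user mode, here is a minimal
sketch of the reservation scheme described earlier in this thread (one big
PROT_NONE mmap, then MAP_FIXED for buffers inside it). The function names,
the GPU_VA_LIMIT macro and the idea of passing a placement hint are made up
for this example and are not taken from our runtime; it assumes 64-bit Linux.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* 40-bit GPU virtual address limit (hypothetical macro name). */
#define GPU_VA_LIMIT (1ULL << 40)

/*
 * Reserve `size` bytes of inaccessible address space near `hint`.
 * Without MAP_FIXED the hint is only advisory, so the result is checked
 * against the 40-bit limit. Returns NULL on failure.
 */
void *reserve_below_limit(void *hint, size_t size)
{
    void *p = mmap(hint, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if ((uint64_t)(uintptr_t)p + size > GPU_VA_LIMIT) {
        /* The kernel placed the mapping too high; give it back. */
        munmap(p, size);
        return NULL;
    }
    return p;
}

/*
 * Map a usable anonymous buffer at a fixed offset inside the reservation.
 * MAP_FIXED silently replaces whatever was mapped there, which is what we
 * want inside our own reserved range. Returns NULL on failure.
 */
void *map_in_reservation(void *base, size_t offset, size_t len)
{
    void *p = mmap((char *)base + offset, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would reserve the range once at start-up with a hint well below the
limit (for example 1 << 37), and then hand out page-aligned offsets from its
own user-mode allocator to map_in_reservation().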

> Thus for nv gpu we need the kernel to monitor this PROT_NONE region to make
> sure that it never gets unmapped, resized, moved ... this is me fearing a
> rogue userspace that munmaps and tries to abuse some bug in the SVM/GPU
> driver to abuse objects mapped behind those fixed mappings.

We mmap PROT_NONE anonymous memory and we don't have any safeguards
against rogue code unmapping it or modifying the mappings. The same
argument made by John Hubbard applies. If applications mess with
existing memory mappings, they are broken anyway. Why do our mappings
need special protections, but a mapping of e.g. libc doesn't?
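
To make the "no safeguards" point concrete, here is a small sketch showing
that a PROT_NONE anonymous reservation can be removed by any code in the
process with a plain munmap. The function name is hypothetical, and the
mprotect probe is only one way to check whether a range is still mapped
(mprotect fails with ENOMEM on unmapped pages).

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Create a PROT_NONE hole, unmap it the way rogue code could, then probe it.
 * Returns 1 if the range is still mapped afterwards, 0 if it is gone,
 * and -1 if the setup itself failed.
 */
int hole_survives_rogue_unmap(size_t len)
{
    void *hole = mmap(NULL, len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (hole == MAP_FAILED)
        return -1;

    /* Nothing stops any thread in the process from doing this. */
    if (munmap(hole, len) != 0)
        return -1;

    /* mprotect() only succeeds if every page in the range is mapped. */
    int still_mapped = (mprotect(hole, len, PROT_NONE) == 0);
    if (still_mapped)
        munmap(hole, len);
    return still_mapped;
}
```

On Linux this is expected to return 0: the kernel treats the reservation like
any other anonymous mapping.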

In our case, we don't have HMM (yet), so in most cases changing the
memory mapping on the CPU side won't affect the GPU mappings. The
exception to that would be userptr mappings where a rogue unmap would
trigger an MMU notifier and result in updating the GPU mapping, which
could lead to a GPU VM fault later on.

Regards,
  Felix

>
> Cheers,
> Jérôme