[PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

Fri Jan 14 16:44:06 UTC 2022

Top post because I tried to catch up on the entire discussion here.

So fundamentally I'm not opposed to just close this fork() hole once and
for all. The thing that worries me from a upstream/platform pov is really
only if we don't do it consistently across all drivers.

So maybe as an idea:
- Do the original patch, but not just for ttm but all gem rendernode
  drivers at least (or maybe even all gem drivers, no idea), with the
  below discussion cleaned up as justification.

- Get acks from userspace driver folks who know real-world gl/vk usage and
  khr specs in-depth enough to be able to tell us how much we'll regret
  this.

- Merge it. Or come up with something new. Or maybe stick to the Nack, but
  then maybe it would be good to document that somewhere in-tree?

This entire can of worms just feels way too tricky to have it be handled
inconsistently across drivers. And trying to fix these kind of low-level
things across drivers once divergences exists is just really painful (e.g.
trying to make all dma-buf mmap VM_SPECIAL or the herculeian task
Christian is doing to get our dma_resv rules consistent across drivers).

Cheers, Daniel

On Fri, Jan 07, 2022 at 12:47:45PM -0500, Felix Kuehling wrote:
> Am 2022-01-07 um 3:56 a.m. schrieb Christian König:
> > Am 06.01.22 um 17:51 schrieb Felix Kuehling:
> >> Am 2022-01-06 um 11:48 a.m. schrieb Christian König:
> >>> Am 06.01.22 um 17:45 schrieb Felix Kuehling:
> >>>> Am 2022-01-06 um 4:05 a.m. schrieb Christian König:
> >>>> [SNIP]
> >>>> Also, why does your ACK or NAK depend on this at all. If it's the
> >>>> right
> >>>> thing to do, it's the right thing to do regardless of who benefits
> >>>> from
> >>>> it. In addition, how can a child process that doesn't even use the GPU
> >>>> be in violation of any GPU-driver related specifications.
> >>> The argument is that the application is broken and needs to be fixed
> >>> instead of worked around inside the kernel.
> >> I still don't get how they the application is broken. Like I said, the
> >> child process is not using the GPU. How can the application be fixed in
> >> this case?
> >
> > Sounds like I'm still missing some important puzzle pieces for the
> > full picture to figure out why this doesn't work.
> >
> >> Are you saying, any application that forks and doesn't immediately call
> >> exec is broken?
> >
> > More or less yes. We essentially have three possible cases here:
> >
> > 1. Application is already using (for example) OpenGL or OpenCL to do
> > some rendering on the GPU and then calls fork() and expects to use
> > OpenGL both in the parent and the child at the same time.
> >     As far as I know that's illegal from the Khronos specification
> > point of view and working around inside the kernel for something not
> > allowed in the first place is seen as futile effort.
> >
> > 2. Application opened the file descriptor, forks and then initializes
> > OpenGL/Vulkan/OpenCL.
> >     That's what some compositors are doing to drop into the backround
> > and is explicitly legal.
> >
> > 3. Application calls fork() and then doesn't use the GPU in the child.
> > Either by not using it or calling exec.
> >     That should be legal and not cause any problems in the first place. 
> >
> > But from your description I still don't get why we are running into
> > problems here.
> >
> > I was assuming that you have case #1 because we previously had some
> > examples of this with this python library, but from your description
> > it seems to be case #3.
> 
> Correct. #3 has at least one issue we previously worked around in the
> Thunk: The inherited VMAs prevent BOs from being freed in the parent
> process. This manifests as an apparent memory leak. Therefore the Thunk
> calls madvise(..., MADV_DONTFORK) on all its VRAM allocation. The BOs
> that are causing problems with CRIU are GTT BOs that weren't covered by
> this previous workaround.
> 
> The new issue with CRIU is, that CRIU saves and restores all the VMAs.
> When trying to restore the inherited VMAs in the child process, the mmap
> call fails because the child process' render node FD is no longer
> inherited from the parent, but is instead created by its own "open"
> system call. The mmap call in the child fails for at least two reasons:
> 
>   * The child process' render node FD doesn't have permission to map the
>     parent process' BO any more
>   * The parent BO doesn't get restored in the child process, so its mmap
>     offset doesn't get updated to the new mmap offset of the restored
>     parent BO by the amdgpu CRIU plugin
> 
> We could maybe add a whole bunch of complexity in CRIU and our CRIU
> plugin to fix this. But it's pointless because like you said, actually
> doing anything with the BO in the child process is illegal anyway
> (scenario #1 above). The easiest solution seems to be, to just not
> inherit the VMA in the first place.
> 
> Regards,
>   Felix
> 
> 
> >
> >> Or does an application that forks need to be aware that some other part
> >> of the application used the GPU and explicitly free any GPU resources?
> >
> > Others might fill that information in, but I think that was the plan
> > for newer APIs like Vulkan.
> >
> > Regards,
> > Christian.
> >
> >>
> >> Thanks,
> >>    Felix
> >>
> >>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Regards,
> >>>>     Felix
> >>>>
> >>>>
> >>>>> Let's talk about this on Mondays call. Thanks for giving the whole
> >>>>> context.
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>> Regards,
> >>>>>>      Felix
> >>>>>>
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch