[PATCH] drm/ttm: Don't inherit GEM object VMAs in child process
Christian König
christian.koenig at amd.com
Mon Jan 17 16:23:37 UTC 2022
Am 17.01.22 um 15:50 schrieb Marek Olšák:
> I don't think fork() would work with userspace where all buffers are
> shared. It certainly doesn't work now. The driver needs to be notified
> that a buffer or texture is shared to ensure data coherency between
> processes, and the driver must execute decompression and other render
> passes when a buffer or texture is being shared for the first time.
> Those aren't called when fork() is called.
Yeah, that's why you can install handlers which run before/after fork()
is executed. But to summarize it is illegal for OpenGL, so we don't
really need to worry about it.
For compute there are a couple of use cases though, but even those are
not real world ones as far as I know.
But see below.
>
> Marek
>
> On Mon, Jan 17, 2022 at 9:34 AM Felix Kuehling <felix.kuehling at amd.com
> <mailto:felix.kuehling at amd.com>> wrote:
>
> Am 2022-01-17 um 9:21 a.m. schrieb Christian König:
> > Am 17.01.22 um 15:17 schrieb Felix Kuehling:
> >> Am 2022-01-17 um 6:44 a.m. schrieb Christian König:
> >>> Am 14.01.22 um 18:40 schrieb Felix Kuehling:
> >>>> Am 2022-01-14 um 12:26 p.m. schrieb Christian König:
> >>>>> Am 14.01.22 um 17:44 schrieb Daniel Vetter:
> >>>>>> Top post because I tried to catch up on the entire
> discussion here.
> >>>>>>
> >>>>>> So fundamentally I'm not opposed to just close this fork() hole
> >>>>>> once and
> >>>>>> for all. The thing that worries me from a upstream/platform
> pov is
> >>>>>> really
> >>>>>> only if we don't do it consistently across all drivers.
> >>>>>>
> >>>>>> So maybe as an idea:
> >>>>>> - Do the original patch, but not just for ttm but all gem
> rendernode
> >>>>>> drivers at least (or maybe even all gem drivers, no
> idea), with
> >>>>>> the
> >>>>>> below discussion cleaned up as justification.
> >>>>> I know of at least one use case which this will break.
> >>>>>
> >>>>> A couple of years back we had a discussion on the Mesa
> mailing list
> >>>>> because (IIRC) Marek introduced a background thread to push
> command
> >>>>> submissions to the kernel.
> >>>>>
> >>>>> That broke because some compositor used to initialize OpenGL
> and then
> >>>>> do a fork(). This indeed worked previously (no GPUVM at that
> time),
> >>>>> but with the addition of the backround thread obviously broke.
> >>>>>
> >>>>> The conclusion back then was that the compositor is broken
> and needs
> >>>>> fixing, but it still essentially means that there could be
> people out
> >>>>> there with really old userspace where this setting would
> just break
> >>>>> the desktop.
> >>>>>
> >>>>> I'm not really against that change either, but at least in
> theory we
> >>>>> could make fork() work perfectly fine even with VMs and
> background
> >>>>> threads.
> >>>> You may regret this if you ever try to build a shared virtual
> address
> >>>> space between GPU and CPU. Then you have two processes
> (parent and
> >>>> child) sharing the same render context and GPU VM address space.
> >>>> But the
> >>>> CPU address spaces are different. You can't maintain
> consistent shared
> >>>> virtual address spaces for both processes when the GPU address
> >>>> space is
> >>>> shared between them.
> >>> That's actually not much of a problem.
> >>>
> >>> All you need to do is to use pthread_atfork() and do the
> appropriate
> >>> action in parent/child to clean up your context:
> >>> https://man7.org/linux/man-pages/man3/pthread_atfork.3.html
> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fman7.org%2Flinux%2Fman-pages%2Fman3%2Fpthread_atfork.3.html&data=04%7C01%7Cchristian.koenig%40amd.com%7Cd917b56904c64bcb501a08d9d9c8c05e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637780278519496422%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=4%2FEATucZoZnlP4t0FI6bYtCdThxC3HTOtkIcTU8G%2FqY%3D&reserved=0>
> >> Thunk already does that. However, it's not foolproof.
> pthread_atfork
> >> hanlders aren't called when the process is forked with a clone
> call.
> >
> > Yeah, but that's perfectly intentional. clone() is usually used to
> > create threads.
>
> Clone can be used to create new processes. Maybe not the common
> use today.
>
>
> >
> >>> The rest is just to make sure that all shared and all private
> data are
> >>> kept separate all the time. Sharing virtual memory is already
> done for
> >>> decades this way, it's just that nobody ever did it with a
> statefull
> >>> device like GPUs.
> >> My concern is not with sharing or not sharing data. It's with
> sharing
> >> the address space itself. If you share the render node, you
> share GPU
> >> virtual address space. However CPU address space is not shared
> between
> >> parent and child. That's a fundamental mismatch between the CPU
> world
> >> and current GPU driver implementation.
> >
> > Correct, but even that is easily solvable. As I said before you can
> > hang this state on a VMA and let it be cloned together with the CPU
> > address space.
>
> I'm not following. The address space I'm talking about is struct
> amdgpu_vm. It's associated with the render node file descriptor.
> Inheriting and using that file descriptor in the child inherits the
> amdgpu_vm. I don't see how you can hang that state on any one VMA.
>
But you don't really need that. You can bind the VM to your VMA mapping
and clone that as necessary.
I'm not sure how else I should describe that, as far as I know the
kernel that would be rather trivial to do.
Cloning all the userspace state like Marek described above is the much
harder part.
Regards,
Christian.
>
> To be consistent with the CPU, you'd need to clone the GPU address
> space
> (struct amdgpu_vm) in the child process. That means you need a new
> render node file descriptor that imports all the BOs from the parent
> address space. It's a bunch of extra work to fork a process, that
> you're
> proposing to immediately undo with an atfork handler. So I really
> don't
> see the point.
>
> Regards,
> Felix
>
>
> >
> > Since VMAs are informed about their cloning (in opposite to file
> > descriptors) it's trivial to even just clone kernel data on
> first access.
> >
> > Regards,
> > Christian.
> >
> >>
> >> Regards,
> >> Felix
> >>
> >>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Regards,
> >>>> Felix
> >>>>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20220117/10e983d0/attachment-0001.htm>
More information about the dri-devel
mailing list