[PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

Mon Jan 17 14:21:55 UTC 2022

Am 17.01.22 um 15:17 schrieb Felix Kuehling:
> Am 2022-01-17 um 6:44 a.m. schrieb Christian König:
>> Am 14.01.22 um 18:40 schrieb Felix Kuehling:
>>> Am 2022-01-14 um 12:26 p.m. schrieb Christian König:
>>>> Am 14.01.22 um 17:44 schrieb Daniel Vetter:
>>>>> Top post because I tried to catch up on the entire discussion here.
>>>>>
>>>>> So fundamentally I'm not opposed to just close this fork() hole
>>>>> once and
>>>>> for all. The thing that worries me from a upstream/platform pov is
>>>>> really
>>>>> only if we don't do it consistently across all drivers.
>>>>>
>>>>> So maybe as an idea:
>>>>> - Do the original patch, but not just for ttm but all gem rendernode
>>>>>      drivers at least (or maybe even all gem drivers, no idea), with
>>>>> the
>>>>>      below discussion cleaned up as justification.
>>>> I know of at least one use case which this will break.
>>>>
>>>> A couple of years back we had a discussion on the Mesa mailing list
>>>> because (IIRC) Marek introduced a background thread to push command
>>>> submissions to the kernel.
>>>>
>>>> That broke because some compositor used to initialize OpenGL and then
>>>> do a fork(). This indeed worked previously (no GPUVM at that time),
>>>> but with the addition of the backround thread obviously broke.
>>>>
>>>> The conclusion back then was that the compositor is broken and needs
>>>> fixing, but it still essentially means that there could be people out
>>>> there with really old userspace where this setting would just break
>>>> the desktop.
>>>>
>>>> I'm not really against that change either, but at least in theory we
>>>> could make fork() work perfectly fine even with VMs and background
>>>> threads.
>>> You may regret this if you ever try to build a shared virtual address
>>> space between GPU and CPU. Then you have two processes (parent and
>>> child) sharing the same render context and GPU VM address space. But the
>>> CPU address spaces are different. You can't maintain consistent shared
>>> virtual address spaces for both processes when the GPU address space is
>>> shared between them.
>> That's actually not much of a problem.
>>
>> All you need to do is to use pthread_atfork() and do the appropriate
>> action in parent/child to clean up your context:
>> https://man7.org/linux/man-pages/man3/pthread_atfork.3.html
> Thunk already does that. However, it's not foolproof. pthread_atfork
> hanlders aren't called when the process is forked with a clone call.

Yeah, but that's perfectly intentional. clone() is usually used to 
create threads.

>> The rest is just to make sure that all shared and all private data are
>> kept separate all the time. Sharing virtual memory is already done for
>> decades this way, it's just that nobody ever did it with a statefull
>> device like GPUs.
> My concern is not with sharing or not sharing data. It's with sharing
> the address space itself. If you share the render node, you share GPU
> virtual address space. However CPU address space is not shared between
> parent and child. That's a fundamental mismatch between the CPU world
> and current GPU driver implementation.

Correct, but even that is easily solvable. As I said before you can hang 
this state on a VMA and let it be cloned together with the CPU address 
space.

Since VMAs are informed about their cloning (in opposite to file 
descriptors) it's trivial to even just clone kernel data on first access.

Regards,
Christian.

>
> Regards,
>    Felix
>
>
>> Regards,
>> Christian.
>>
>>> Regards,
>>>     Felix
>>>