[PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

Felix Kuehling felix.kuehling at amd.com
Thu Jan 6 16:45:35 UTC 2022


On 2022-01-06 at 4:05 a.m., Christian König wrote:
> On 05.01.22 at 17:16, Felix Kuehling wrote:
>> [SNIP]
>>>> But KFD doesn't know anything about the inherited BOs
>>>> from the parent process.
>>> Ok, why is that? When the KFD is reinitializing its context, why
>>> shouldn't it clean up those VMAs?
>> That cleanup has to be initiated by user mode. Basically closing the old
>> KFD and DRM file descriptors, cleaning up all the user mode VM state,
>> unmapping all the VMAs, etc. Then it reopens KFD and the render nodes
>> and starts from scratch.
>>
>> User mode will do this automatically when it tries to reinitialize ROCm.
>> However, in this case the child process doesn't do that (e.g. a Python
>> application using the multiprocessing package). The child process does
>> not use ROCm. But you're left with all the dangling VMAs in the child
>> process indefinitely.
>
> Oh, not that one again. I'm unfortunately pretty sure that this is a
> clear NAK then.
>
> This Python multiprocessing package is violating various
> specifications by doing this fork() and we already had multiple
> discussions about that.

Well, it's in widespread use. We can't just throw up our hands and say
it's buggy and not supported.

Also, why does your ACK or NAK depend on this at all? If it's the right
thing to do, it's the right thing to do regardless of who benefits from
it. In addition, how can a child process that doesn't even use the GPU
be in violation of any GPU-driver related specifications?
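
As a rough user-space illustration of what "not inheriting a VMA" means
across fork(), here is a minimal sketch using an anonymous mapping and
madvise(MADV_DONTFORK). It is not the patch itself, and it does not touch
KFD or DRM at all. The helper name child_inherits_mapping() is
hypothetical, and probing the range with msync() in the child is just one
assumed way to check whether the mapping is still there.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, optionally mark it
 * MADV_DONTFORK, fork, and report whether the child still has the mapping.
 * Returns 1 if the child inherited it, 0 if not, -1 on error.
 */
int child_inherits_mapping(int dontfork)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    /* Ask the kernel not to copy this VMA into children created by fork(). */
    if (dontfork && madvise(addr, len, MADV_DONTFORK) != 0) {
        munmap(addr, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(addr, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: msync() fails (ENOMEM) if the range is not mapped here. */
        int mapped = (msync(addr, len, MS_ASYNC) == 0);
        _exit(mapped ? 1 : 0);
    }

    /* Parent: collect the child's answer, then drop our own mapping. */
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(addr, len);
    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling child_inherits_mapping(0) should report that the child kept the
mapping, and child_inherits_mapping(1) that it did not. A child that never
touches the mapping is unaffected either way, which is the situation of a
forked process that doesn't use ROCm.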

Regards,
  Felix


>
> Let's talk about this on Monday's call. Thanks for giving the whole
> context.
>
> Regards,
> Christian.
>
>>
>> Regards,
>>    Felix
>>
>

