[PATCH v2 1/2] drm: Add GPU reset sysfs event

Daniel Stone daniel at fooishbar.org
Thu Mar 17 10:31:37 UTC 2022


Hi,

On Thu, 17 Mar 2022 at 09:21, Christian König <christian.koenig at amd.com> wrote:
> On 17.03.22 at 09:42, Sharma, Shashank wrote:
> >> AFAIU you probably want to be passing around a `struct pid *`, and
> >> then somehow use pid_vnr() in the context of the process reading the
> >> event to get the numeric pid.  Otherwise things will not do what you
> >> expect if the process triggering the crash is in a different pid
> >> namespace from the compositor.
> >
> > I am not sure if it is a good idea to add the pid extraction
> > complexity in here, it is left up to the driver to extract this
> > information and pass it to the work queue. In case of AMDGPU, it's
> > extracted from GPU VM. It would be then more flexible for the drivers
> > as well.
>
> Yeah, but that is just used for debugging.
>
> If we want to use the pid for housekeeping, like for a daemon which
> kills/restarts processes, we absolutely need that or otherwise won't be
> able to work with containers.

100% this.
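
To make the namespace problem concrete, here is a minimal userspace
sketch. It is not the proposed kernel interface, only an illustration
of why a bare number is ambiguous. It relies on the NSpid: field of
/proc/<pid>/status (Linux 4.1 and later), which lists a task's pid in
each nested pid namespace, outermost first. The sample values and the
function names are made up for the example.

```python
"""Illustration only: one process, several numeric pids."""

# Hypothetical excerpt of /proc/<pid>/status for a containerised client.
# The values are invented; only the NSpid: field layout is real.
SAMPLE_STATUS = "Name:\tglxgears\nPid:\t48211\nNSpid:\t48211\t17\n"


def parse_nspid(status_text):
    """Return the pids from the NSpid: line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # First field is the label, the rest are one pid per namespace.
            return [int(field) for field in line.split()[1:]]
    return []  # field absent (kernel older than 4.1, or not a status file)


def pid_in_innermost_namespace(status_text):
    """Pid as the process itself would see it, or None if unknown."""
    pids = parse_nspid(status_text)
    return pids[-1] if pids else None


if __name__ == "__main__":
    print(parse_nspid(SAMPLE_STATUS))
    try:
        # On a real system, show the calling process's own view.
        with open("/proc/self/status") as status_file:
            print(parse_nspid(status_file.read()))
    except OSError:
        pass  # no procfs available; the sample above is enough
```

In the sample, the same task is 48211 to a reader in the outer
namespace and 17 inside the container, so an event carrying just one
of those numbers means different things to different readers. That is
the gap that resolving a `struct pid *` with pid_vnr() in the reader's
context is meant to close.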

Pushing back to the compositor is a red herring. The compositor is
just a service which tries to handle window management and input. If
you're looking to kill the offending process or whatever, then that
should go through the session manager - be it systemd or something
container-centric or whatever. At least that way it can deal with
cgroups at the same time, unlike the compositor which is not really
aware of what the thing on the other end of the socket is doing. This
ties in with the support they already have for things like coredump
analysis, and would also be useful for other devices.

Some environments combine compositor and session manager, and a lot of
them have them strongly related, but they're very definitely not the
same thing ...

Cheers,
Daniel

