[PATCH v2 1/2] drm: Add GPU reset sysfs event

Rob Clark robdclark at gmail.com
Thu Mar 10 17:40:57 UTC 2022


On Thu, Mar 10, 2022 at 9:19 AM Sharma, Shashank
<shashank.sharma at amd.com> wrote:
>
>
>
> On 3/10/2022 6:10 PM, Rob Clark wrote:
> > On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
> > <shashank.sharma at amd.com> wrote:
> >>
> >>
> >>
> >> On 3/10/2022 4:24 PM, Rob Clark wrote:
> >>> On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >>> <ckoenig.leichtzumerken at gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>>> <contactshashanksharma at gmail.com> wrote:
> >>>>>> From: Shashank Sharma <shashank.sharma at amd.com>
> >>>>>>
> >>>>>> This patch adds a new sysfs event, which will notify
> >>>>>> userland about a GPU reset, and can also provide
> >>>>>> some information like:
> >>>>>> - process ID of the process involved with the GPU reset
> >>>>>> - process name of the involved process
> >>>>>> - the GPU status info (using flags)
> >>>>>>
> >>>>>> This patch also introduces the first flag of the flags
> >>>>>> bitmap, which can be appended as and when required.
> >>>>> Why invent something new, rather than using the already existing devcoredump?
> >>>>
> >>>> Yeah, that's a really valid question.
> >>>>
> >>>>> I don't think we need (or should encourage/allow) something drm
> >>>>> specific when there is already an existing solution used by both drm
> >>>>> and non-drm drivers.  Userspace should not have to learn to support
> >>>>> yet another mechanism to do the same thing.
> >>>>
> >>>> Question is how is userspace notified about new available core dumps?
> >>>
> >>> I haven't looked into it too closely, as the CrOS userspace
> >>> crash-reporter already had support for devcoredump, so it "just
> >>> worked" out of the box[1].  I believe a udev event is what triggers
> >>> the crash-reporter to go read the devcoredump out of sysfs.
> >>
> >> I had a quick look at the devcoredump code, and it doesn't look like
> >> that is sending an event to the user, so we still need an event to
> >> indicate a GPU reset.
> >
> > There definitely is an event to userspace, I suspect somewhere down
> > the device_add() path?
> >
>
> Let me check that out as well, hope that is not due to a driver-private
> event for GPU reset, coz I think I have seen some of those in a few DRM
> drivers.
>

Definitely no driver-private event for drm/msm. I haven't dug
through it all, but this is the collector for devcoredump, triggered
somehow via udev, most likely from an event triggered by device_add():

https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/crash-reporter/udev_collector.cc
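
For illustration, here is a minimal Python sketch of how a userspace
listener could notice a new devcoredump device from the kernel's "add"
uevent. It is not taken from the CrOS collector. The helper names
(parse_uevent, is_new_devcoredump, dump_data_path, listen) are
hypothetical, and it assumes the raw kernel uevent layout
("action@devpath" followed by NUL-separated KEY=VALUE pairs) and that
the dump is exposed as a "data" attribute under the devcdN device in
sysfs.

```python
import socket

# Netlink protocol number for kobject uevents (from linux/netlink.h).
NETLINK_KOBJECT_UEVENT = 15
# Multicast group 1 carries the raw kernel uevents.
KERNEL_UEVENT_GROUP = 1


def parse_uevent(payload: bytes) -> dict:
    """Turn a raw kernel uevent message into a dict of KEY=VALUE fields."""
    parts = [p for p in payload.split(b"\0") if p]
    event = {}
    if not parts:
        return event
    # The first part is the "action@devpath" summary line.
    event["_header"] = parts[0].decode(errors="replace")
    for field in parts[1:]:
        key, sep, value = field.decode(errors="replace").partition("=")
        if sep:
            event[key] = value
    return event


def is_new_devcoredump(event: dict) -> bool:
    """True for the 'add' event of a device in the devcoredump class."""
    return (event.get("ACTION") == "add"
            and event.get("SUBSYSTEM") == "devcoredump")


def dump_data_path(event: dict) -> str:
    """sysfs path of the dump contents (assumed 'data' attribute)."""
    return "/sys" + event["DEVPATH"] + "/data"


def listen():
    """Block on the uevent socket and print where each new dump can be read.

    Linux only; not exercised by the parsing helpers above.
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_UEVENT_GROUP))
    while True:
        event = parse_uevent(sock.recv(8192))
        if is_new_devcoredump(event):
            print("new devcoredump:", dump_data_path(event))
```

A real collector would more likely go through libudev or a udev rule
than open the netlink socket itself; the point is only that the
notification arrives as an ordinary "add" uevent, with nothing
drm-specific involved.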

BR,
-R

