[PATCH v2 1/2] drm: Add GPU reset sysfs event

Abhinav Kumar quic_abhinavk at quicinc.com
Thu Mar 10 18:33:24 UTC 2022



On 3/10/2022 9:40 AM, Rob Clark wrote:
> On Thu, Mar 10, 2022 at 9:19 AM Sharma, Shashank
> <shashank.sharma at amd.com> wrote:
>>
>>
>>
>> On 3/10/2022 6:10 PM, Rob Clark wrote:
>>> On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
>>> <shashank.sharma at amd.com> wrote:
>>>>
>>>>
>>>>
>>>> On 3/10/2022 4:24 PM, Rob Clark wrote:
>>>>> On Thu, Mar 10, 2022 at 1:55 AM Christian König
>>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am 09.03.22 um 19:12 schrieb Rob Clark:
>>>>>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
>>>>>>> <contactshashanksharma at gmail.com> wrote:
>>>>>>>> From: Shashank Sharma <shashank.sharma at amd.com>
>>>>>>>>
>>>>>>>> This patch adds a new sysfs event, which will indicate
>>>>>>>> the userland about a GPU reset, and can also provide
>>>>>>>> some information like:
>>>>>>>> - process ID of the process involved with the GPU reset
>>>>>>>> - process name of the involved process
>>>>>>>> - the GPU status info (using flags)
>>>>>>>>
>>>>>>>> This patch also introduces the first flag of the flags
>>>>>>>> bitmap, which can be appended as and when required.
>>>>>>> Why invent something new, rather than using the already existing devcoredump?
>>>>>>
>>>>>> Yeah, that's a really valid question.
>>>>>>
>>>>>>> I don't think we need (or should encourage/allow) something drm
>>>>>>> specific when there is already an existing solution used by both drm
>>>>>>> and non-drm drivers.  Userspace should not have to learn to support
>>>>>>> yet another mechanism to do the same thing.
>>>>>>
>>>>>> Question is how is userspace notified about new available core dumps?
>>>>>
>>>>> I haven't looked into it too closely, as the CrOS userspace
>>>>> crash-reporter already had support for devcoredump, so it "just
>>>>> worked" out of the box[1].  I believe a udev event is what triggers
>>>>> the crash-reporter to go read the devcore dump out of sysfs.
>>>>
>>>> I had a quick look at the devcoredump code, and it doesn't look like
>>>> that is sending an event to the user, so we still need an event to
>>>> indicate a GPU reset.
>>>
>>> There definitely is an event to userspace, I suspect somewhere down
>>> the device_add() path?
>>>
>>
>> Let me check that out as well, hope that is not due to a driver-private
>> event for GPU reset, coz I think I have seen some of those in a few DRM
>> drivers.
>>
> 
> Definitely no driver private event for drm/msm .. I haven't dug
> through it all but this is the collector for devcoredump, triggered
> somehow via udev.  Most likely from event triggered by device_add()
> 
> https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/crash-reporter/udev_collector.cc

Yes, that is correct. The uevent for devcoredump comes from device_add():

https://github.com/torvalds/linux/blob/master/drivers/base/core.c#L3340
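
For anyone who wants to observe this from userspace without going through
udev, here is a rough Python sketch, not taken from any of the projects
mentioned in this thread. It assumes the usual kernel uevent wire format (an
"action@devpath" header followed by NUL-separated KEY=VALUE pairs), that
devcoredump devices show up with SUBSYSTEM=devcoredump, and that the dump is
readable from the "data" attribute under the device's sysfs directory. The
function names are made up for illustration.

```python
import socket

# Value of NETLINK_KOBJECT_UEVENT from <linux/netlink.h>; hard-coded here
# rather than assuming the Python socket module exports it.
NETLINK_KOBJECT_UEVENT = 15
# Multicast group 1 carries uevents sent directly by the kernel.
KERNEL_UEVENT_GROUP = 1


def parse_uevent(payload: bytes) -> dict:
    """Split a raw kernel uevent into a dict of its KEY=VALUE fields.

    The leading "action@devpath" header has no '=' and is skipped.
    """
    fields = {}
    for part in payload.split(b"\0"):
        key, sep, value = part.partition(b"=")
        if sep:
            fields[key.decode()] = value.decode(errors="replace")
    return fields


def devcoredump_data_path(fields: dict):
    """Return the sysfs path of a new coredump's data file, or None.

    Only "add" events from the devcoredump subsystem are of interest.
    """
    if fields.get("ACTION") != "add":
        return None
    if fields.get("SUBSYSTEM") != "devcoredump":
        return None
    devpath = fields.get("DEVPATH")
    if not devpath:
        return None
    return "/sys" + devpath + "/data"


def listen():
    """Block forever, printing the data path of each new device coredump."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    # pid 0 lets the kernel assign the netlink port id for this socket.
    sock.bind((0, KERNEL_UEVENT_GROUP))
    while True:
        path = devcoredump_data_path(parse_uevent(sock.recv(65536)))
        if path:
            print("new device coredump:", path)
```

Calling listen() (Linux only, and it may need elevated privileges depending on
the system) and then triggering a GPU crash dump should print one line per new
devcd device. A real collector would more likely use a udev rule or libudev,
as the CrOS crash-reporter does.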

> 
> BR,
> -R

