[PATCH v10 1/4] drm: Introduce device wedged event

Raag Jadav raag.jadav at intel.com
Tue Dec 3 05:00:02 UTC 2024


On Mon, Dec 02, 2024 at 10:07:59AM +0200, Raag Jadav wrote:
> On Fri, Nov 29, 2024 at 10:40:14AM -0300, André Almeida wrote:
> > Hi Raag,
> > 
> > On 28/11/2024 12:37, Raag Jadav wrote:
> > > Introduce device wedged event, which notifies userspace of 'wedged'
> > > (hung/unusable) state of the DRM device through a uevent. This is
> > > useful especially in cases where the device is no longer operating as
> > > expected and has become unrecoverable from driver context. Purpose of
> > > this implementation is to provide drivers a generic way to recover with
> > > the help of userspace intervention without taking any drastic measures
> > > in the driver.
> > > 
> > > A 'wedged' device is basically a dead device that needs attention. The
> > > uevent is the notification that is sent to userspace along with a hint
> > > about what could possibly be attempted to recover the device and bring
> > > it back to usable state. Different drivers may have different ideas of
> > > a 'wedged' device depending on their hardware implementation, and hence
> > > the vendor agnostic nature of the event. It is up to the drivers to
> > > decide when they see the need for device recovery and how they want to
> > > recover from the available methods.
> > > 
> > 
> > Thank you for your work. Do you think you can add the optional PID
> > parameter, as the PID of the app that caused the reset? For the SteamOS use
> > case it has proven useful to kill the faulty app as well. If the reset
> > was caused by a kthread, no PID can be provided, hence it's an optional
> > parameter.
> 
> Hmm, I'm not sure if it really fits here since it doesn't seem like
> a generic use case.
> 
> I'd still be open for it if found useful by the drivers but perhaps
> as an extended feature in a separate series.

What do you think, Chris? Are we good to go with v10?

Raag

