[PATCH v2 0/1] drm: Add doc about GPU reset

André Almeida andrealmeid at igalia.com
Mon Feb 27 20:39:59 UTC 2023


Hi,

Thanks everyone that gave feedback. v2 Changes:
- This new version is a section of drm-uapi instead of a new file
- Drop requirement for KMD to kill applications
- Drop role of init systems on compositors recover
- Drop assumption that robust apps creates new contexts

Original cover letter bellow:

Due to the complexity of its stack and the apps that we run on it, GPU resets
are for granted. What's left for driver developers is how to make resets a
smooth experience as possible. While some OS's can recover or show an error
message in such cases, Linux is more a hit-and-miss due to its lack of
standardization and guidelines of what to do in such cases.

This is the goal of this document, to proper define what should happen after a
GPU reset so developers can start acting on top of this. An IGT test should be
created to validate this for each driver.

Initially my approach was to expose an uevent for GPU resets, as it can be seen
here[1]. However, even if an uevent can be useful for some use cases (e.g.
telemetry and error reporting), for the "OS integration" case of GPU resets
it would be more productive to have something defined through the stack.

Thanks,
	André

[1] https://lore.kernel.org/amd-gfx/20221125175203.52481-1-andrealmeid@igalia.com/

André Almeida (1):
  drm/doc: Document DRM device reset expectations

 Documentation/gpu/drm-uapi.rst | 51 ++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

-- 
2.39.2



More information about the amd-gfx mailing list