[PATCH 1/4] drm/xe: Introduce a simple wedged state
Lucas De Marchi
lucas.demarchi at intel.com
Tue Apr 16 17:03:14 UTC 2024
On Tue, Apr 09, 2024 at 06:15:04PM GMT, Rodrigo Vivi wrote:
>Introduce a very simple 'wedged' state where any attempt
>to access the GPU is entirely blocked.
>
>In some critical cases, such as a gt_reset failure, we need to
>block any further attempt to use the GPU. Otherwise we risk
>reaching cases that would force us to reboot the machine.
>
>So, when these cases are identified, we corner and block any GPU
>access. No IOCTL and not even another GT reset should be attempted.
>
>The 'wedged' state in Xe is an end state with no way back.
>Only a device "re-probe" (unbind + bind) can restore the GPU access.
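
For readers following along, the behaviour described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the patch: `struct fake_device`, `fake_ioctl()`, `fake_gt_reset()` and the `-ECANCELED` return value are hypothetical names and choices made up for the example.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the device structure; not the real xe_device. */
struct fake_device {
	atomic_bool wedged;	/* one-way flag: set once, never cleared */
};

static bool fake_device_wedged(struct fake_device *dev)
{
	return atomic_load(&dev->wedged);
}

/* Enter the end state. There is deliberately no function to leave it. */
static void fake_device_declare_wedged(struct fake_device *dev)
{
	/* atomic_exchange returns the old value, so the message prints once. */
	if (!atomic_exchange(&dev->wedged, true))
		fprintf(stderr, "device wedged: unbind + bind needed to recover\n");
}

/* Every entry point checks the flag first (error code is an assumption). */
static int fake_ioctl(struct fake_device *dev)
{
	if (fake_device_wedged(dev))
		return -ECANCELED;
	return 0;	/* normal ioctl handling would go here */
}

/* hw_reset_ok simulates whether the hardware reset succeeded. */
static int fake_gt_reset(struct fake_device *dev, bool hw_reset_ok)
{
	if (fake_device_wedged(dev))
		return -ECANCELED;	/* not even another reset is attempted */

	if (!hw_reset_ok) {
		fake_device_declare_wedged(dev);
		return -EIO;
	}
	return 0;
}
```

In this sketch a failed `fake_gt_reset()` flips the flag, after which both `fake_ioctl()` and `fake_gt_reset()` bail out early. Only re-creating the device object, the analogue of unbind + bind, gives a usable device again.
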
>
>v2: - s/wedged/busted (Lucas)
>    - use unbind+bind instead of module reload (Lucas)
>    - added more info on unbind operations and instruction on bug report
>    - only print the message once.
>
>v3: - s/busted/wedged (Ashutosh, Tvrtko, Thomas)
>    - don't assume user has sudo and tee available (Lucas)
>
>v4: - remove unnecessary cases around ct communication or migration.
>
>Cc: Ashutosh Dixit <ashutosh.dixit at intel.com>
>Cc: Tvrtko Ursulin <tursulin at ursulin.net>
>Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>Cc: Anshuman Gupta <anshuman.gupta at intel.com>
>Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com> #v2
>Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com> #v2
my r-b remains for this version.
thanks
Lucas De Marchi