[Intel-gfx] [PATCH] drm/i915/selftests: Try to recover from a wedged GPU during reset tests

Tahvanainen, Jari jari.tahvanainen at intel.com
Tue Sep 19 14:24:22 UTC 2017


-----Original Message-----
From: Chris Wilson [mailto:chris at chris-wilson.co.uk] 
Sent: Tuesday, September 19, 2017 5:19 PM
To: intel-gfx at lists.freedesktop.org
Cc: Tahvanainen, Jari <jari.tahvanainen at intel.com>; Mika Kuoppala <mika.kuoppala at linux.intel.com>
Subject: Re: [PATCH] drm/i915/selftests: Try to recover from a wedged GPU during reset tests

Quoting Chris Wilson (2017-09-15 14:09:29)
> If we see the seqno stop progressing, we abandon the test for fear 
> that the GPU died following the reset. However, during test teardown 
> we still wait for the GPU to idle before continuing, but we have 
> already confirmed that the GPU is dead. Furthermore, since we are 
> inside a reset test, we have disabled the hangchecker, and so there is 
> no safety net and we wait indefinitely. Detect the stuck GPU and 
> declare it wedged as a state of emergency so we can escape.
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Jari Tahvanainen <jari.tahvanainen at intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
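
For readers following the thread, the idea in the commit message above (stop waiting once the seqno no longer advances, and mark the GPU wedged instead) can be sketched roughly as below. This is a userspace illustration only, not the code from the patch: struct fake_gpu, fake_gpu_tick(), fake_gpu_is_idle(), wait_for_idle_or_wedge() and the max_stalls parameter are all hypothetical names invented for this sketch, and returning -EIO is an assumption about how such a failure might be reported.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for GPU state; these are not i915 structures. */
struct fake_gpu {
	uint32_t seqno;        /* last sequence number the GPU completed */
	uint32_t last_request; /* sequence number of the last submitted request */
	bool advances;         /* false models a GPU that died after the reset */
	bool wedged;           /* set once we give up on the hardware */
};

/* One polling step: a healthy GPU completes one more request. */
static void fake_gpu_tick(struct fake_gpu *gpu)
{
	if (gpu->advances && gpu->seqno < gpu->last_request)
		gpu->seqno++;
}

/* The GPU is idle once every submitted request has completed. */
static bool fake_gpu_is_idle(const struct fake_gpu *gpu)
{
	return gpu->seqno == gpu->last_request;
}

/*
 * Bounded wait for idle. If max_stalls consecutive polls show no seqno
 * progress, declare the GPU wedged and return -EIO instead of waiting
 * forever (there is no hangcheck to rescue us in this model).
 */
static int wait_for_idle_or_wedge(struct fake_gpu *gpu, unsigned int max_stalls)
{
	unsigned int stalls = 0;

	while (!fake_gpu_is_idle(gpu)) {
		uint32_t before = gpu->seqno;

		fake_gpu_tick(gpu);
		if (gpu->seqno != before) {
			stalls = 0; /* progress was made, reset the stall counter */
			continue;
		}
		if (++stalls >= max_stalls) {
			gpu->wedged = true; /* emergency escape from the wait */
			return -EIO;
		}
	}
	return 0;
}
```

With a healthy fake_gpu the function returns 0 once seqno reaches last_request; with advances set to false it gives up after max_stalls polls, sets wedged and returns -EIO, so teardown can continue rather than block.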

>Ping?

Sorry for the late answer, Chris. I tried to get in touch with you earlier through IRC.
I merged the series on top of drm-tip and ran it on HSW: the hang is gone, but the live_hangcheck subtest now reports FAIL.

(drv_selftest:6304) igt-kmod-CRITICAL: Test assertion failure function igt_kselftest_execute, file igt_kmod.c:513:
(drv_selftest:6304) igt-kmod-CRITICAL: Failed assertion: err == 0
(drv_selftest:6304) igt-kmod-CRITICAL: kselftest "i915 igt__19__live_hangcheck=1 live_selftests=-1" failed: Input/output error [5]
(drv_selftest:6304) igt-core-INFO: Stack trace:
(drv_selftest:6304) igt-core-INFO:   #0 [__igt_fail_assert+0x101]
(drv_selftest:6304) igt-core-INFO:   #1 [igt_kselftest_execute+0x296]
(drv_selftest:6304) igt-core-INFO:   #2 [igt_kselftests+0x295]
(drv_selftest:6304) igt-core-INFO:   #3 [main+0x5f]
(drv_selftest:6304) igt-core-INFO:   #4 [__libc_start_main+0xf1]
(drv_selftest:6304) igt-core-INFO:   #5 [_start+0x2a]
(drv_selftest:6304) igt-core-INFO:   #6 [<unknown>+0x2a]
****  END  ****
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [igt_kselftest_execute+0x296]
  #2 [igt_kselftests+0x295]
  #3 [main+0x5f]
  #4 [__libc_start_main+0xf1]
  #5 [_start+0x2a]
  #6 [<unknown>+0x2a]
Subtest live_hangcheck: FAIL (1.911s)


