[Intel-gfx] [PATCH i-g-t v11 0/1] tests: Add a new test for device hot unplug

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Fri Jun 7 11:51:41 UTC 2019


The test should help resolve driver bugs which show up when a device
is unplugged, or its driver is unbound from it, while the device is
busy (as opposed to a simple module unload, which requires device
references to be put first).
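
For readers unfamiliar with the two operations, here is a minimal sketch
of how a hot unbind and a hot unplug can be triggered from user space
through the standard PCI sysfs attributes.  This is not the code from
tests/core_hotunplug.c; the helper names are made up for illustration,
and root privileges are assumed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a string to a sysfs attribute; return 0 on success, -1 on error. */
int sysfs_write(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t ret;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = write(fd, value, len);
	close(fd);

	return ret == (ssize_t)len ? 0 : -1;
}

/* Build "/sys/bus/pci/drivers/<driver>/unbind"; -1 if buf is too small. */
int unbind_path(char *buf, size_t size, const char *driver)
{
	int n = snprintf(buf, size, "/sys/bus/pci/drivers/%s/unbind", driver);

	return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Build "/sys/bus/pci/devices/<addr>/remove"; -1 if buf is too small. */
int remove_path(char *buf, size_t size, const char *pci_addr)
{
	int n = snprintf(buf, size, "/sys/bus/pci/devices/%s/remove", pci_addr);

	return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Hot unbind: detach the driver, the device stays on the bus. */
int driver_unbind(const char *driver, const char *pci_addr)
{
	char path[256];

	if (unbind_path(path, sizeof(path), driver))
		return -1;

	return sysfs_write(path, pci_addr);
}

/* Hot unplug: remove the device from the bus altogether. */
int device_unplug(const char *pci_addr)
{
	char path[256];

	if (remove_path(path, sizeof(path), pci_addr))
		return -1;

	return sysfs_write(path, "1");
}

/* Recovery: ask the PCI core to rediscover removed devices. */
int bus_rescan(void)
{
	return sysfs_write("/sys/bus/pci/rescan", "1");
}
```

A call could look like driver_unbind("i915", "0000:00:02.0"), where both
the driver name and the PCI address are assumptions about the system
under test.  Neither operation waits for open device file descriptors
to be closed first, which is what distinguishes them from a module
unload.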

A kernel patch resolving kernel panics on driver hot unbind [1] was
verified on trybot with v10 of this test before it was submitted
upstream.  The current version (v11) has also been tested on trybot
with the kernel patch already included upstream.  Hence, no kernel
panics are expected; however, some kernel WARNs and driver error
messages may still need to be resolved before CI is happy with this
new test.

[1] https://cgit.freedesktop.org/drm/drm-tip/commit/?id=47bc28d7ee6d8378ba4451c43885cb3241302243

Janusz Krzysztofik (1):
  tests: Add a new test for device hot unplug

 tests/Makefile.sources |   1 +
 tests/core_hotunplug.c | 222 +++++++++++++++++++++++++++++++++++++++++
 tests/meson.build      |   1 +
 3 files changed, 224 insertions(+)
 create mode 100644 tests/core_hotunplug.c

Changelog:
v10->v11:
- fix typos in some comments,
- use SPDX license identifier,
- include a per-patch changelog in the commit message (Daniel).

v9->v10 (submitted only to trybot):
- rename variables and function arguments to something that indicates
  they're file descriptors (Daniel),
- introduce a data structure that contains various file descriptors
  and a helper function to set them all (Daniel),
- fix strange indenting (Daniel),
- limit scope to first three subtests as the first set of tests to
  merge (Daniel).

v8->v9:
All changes after Daniel's comments - thanks!
- flatten the code, don't try to create a midlayer,
- provide minimal subtests that don't even keep the device open,
- don't use driver unbind in more advanced subtests,
- provide subtests with different levels of resources allocated
  during device unplug,
- provide subtests which check driver behavior after device hot
  unplug.

v7->v8:
- move workload functions back from fixture to subtests,
- register different actions and different workloads in respective
  tables and iterate over those tables while enumerating subtests,
- introduce new subtest flavors by simply omitting module unload step,
- instead of simply requesting bus rescan or not, introduce action
  specific device recovery helpers, required specifically with those
  new subtests not touching the module,
- split workload functions in two parts, one spawning the workload,
  the other waiting for its completion,
- for the new subtests not requiring module unload, run workload
  functions directly from the test process and use new workload
  completion wait functions in place of subprocess completion wait,
- take more control over logging, longjumps and exit codes in
  workload subprocesses,
- add some debug messages for easy progress watching,
- move function API descriptions on top of respective typedefs,
- drop patch 2/2 with external workload command again, still nobody
  likes it.

v6->v7:
- add missing igt_exit() needed with the second patch.

v5->v6 (third public submission, incorrectly marked as v5, sorry):
- run workload inside an igt helper subprocess so resources consumed
  by the workload are cleaned up automatically on workload subprocess
  crash, without affecting test results,
- move the igt helper with workload back from subtests to initial
  fixture so workload crash also does not affect test results,
- re-add the second patch which extends the test with an option for
  using an external command as a workload,
- other cleanups suggested by Kasia and Chris.

v4->v5 (second public submission, marked as v2):
- try to restore the device to a working state after each subtest
  (Petri, Daniel).

v3->v4 (first public submission, not marked with any version number):
- run dummy_load from inside subtests (Antonio).

v2->v3 (internal submission):
- run dummy_load from the test process directly (Antonio),
- drop the patch for running external workload (Antonio).

v1->v2 (internal submission):
- run a subprocess with dummy_load instead of external command
  (Antonio),
- keep use of external workload command as an option, move that to a
  separate patch.

-- 
2.21.0


