[Intel-gfx] [PATCH i-g-t v15] tests: Add a test for device hot unplug

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Wed Apr 15 13:15:15 UTC 2020


From: Janusz Krzysztofik <janusz.krzysztofik at intel.com>

There is a test which verifies unloading of i915 driver module but no
test exists that checks how a driver behaves when it gets unbound from
a device or when the device gets unplugged.  Implement such test using
sysfs interface.

Two minimalistic subtests - "unbind-rebind" and "unplug-rescan" -
perform the named operations on a DRM device which is believed to be
not in use.  Another pair of subtests named "hotunbind-lateclose" and
hotunplug-lateclose" do the same on a DRM device while keeping its file
descriptor open and close it thereafter.

v2: Run a subprocess with dummy_load instead of external command
    (Antonio).
v3: Run dummy_load from the test process directly (Antonio).
v4: Run dummy_load from inside subtests (Antonio).
v5: Try to restore the device to a working state after each subtest
    (Petri, Daniel).
v6: Run workload inside an igt helper subprocess so resources consumed
    by the workload are cleaned up automatically on workload subprocess
    crash, without affecting test results,
  - move the igt helper with workload back from subtests to initial
    fixture so workload crash also does not affect test results,
  - other cleanups suggested by Katarzyna and Chris.
v7: No changes.
v8: Move workload functions back from fixture to subtests,
  - register different actions and different workloads in respective
    tables and iterate over those tables while enumerating subtests,
  - introduce new subtest flavors by simply omitting module unload step,
  - instead of simply requesting bus rescan or not, introduce action
    specific device recovery helpers, required specifically with those
    new subtests not touching the module,
  - split workload functions in two parts, one spawning the workload,
    the other waiting for its completion,
  - for the new subtests not requiring module unload, run workload
    functions directly from the test process and use new workload
    completion wait functions in place of subprocess completion wait,
  - take more control over logging, longjumps and exit codes in
    workload subprocesses,
  - add some debug messages for easy progress watching,
  - move function API descriptions on top of respective typedefs.
v9: All changes after Daniel's comments - thanks!
  - flatten the code, don't try to create a midlayer (Daniel),
  - provide minimal subtests that even don't keep device open (Daniel),
  - don't use driver unbind in more advanced subtests (Daniel),
  - provide subtests with different level of resources allocated
    during device unplug (Daniel),
  - provide subtests which check driver behavior after device hot
    unplug (Daniel).
v10 Rename variables and function arguments to something that
    indicates they're file descriptors (Daniel),
  - introduce a data structure that contains various file descriptors
    and a helper function to set them all (Daniel),
  - fix strange indentation (Daniel),
  - limit scope to first three subtests as the initial set of tests to
    merge (Daniel).
v11 Fix typos in some comments,
  - use SPDX license identifier,
  - include a per-patch changelog in the commit message (Daniel).
v12 We don't use SPDX license identifiers nor GPL-2.0 in IGT (Petri),
  - avoid chipset, make sure we reopen the same device (Chris),
  - rename subtest "drm_open-hotunplug" to "hotunplug-lateclose",
  - add subtest "hotunbind-lateclose" (less affected by IOMMU issues),
  - move some redundant code to helpers,
  - reorder some helpers,
  - reword some messages and comments,
  - clean up headers.
v13 Add test / subtest descriptions (patchwork).
v14 Extract redundant device rescan and reopen code to a 'healthcheck'
    helper,
  - call igt_abort_on_f() on device reopen failure (Petri),
  - if any timeout set with igt_set_timeout() inside a subtest expires,
    call igt_abort_on_f() from a follow-up igt_fixture (Petri),
  - when running on a i915 device, require GEM and call
    igt_abort_on_f() if no usable GEM is detected on device reopen.
v15 Add the test to CI blacklist (Martin).

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at intel.com>
Cc: Antonio Argenziano <antonio.argenziano at intel.com>
Cc: Petri Latvala <petri.latvala at intel.com>
Cc: Daniel Vetter <daniel at ffwll.ch>
Cc: Katarzyna Dec <katarzyna.dec at intel.com>
Cc: Martin Peres <martin.peres at linux.intel.com>
Acked-by: Chris Wilson <chris at chris-wilson.co.uk>
---
 tests/Makefile.sources       |   1 +
 tests/core_hotunplug.c       | 300 +++++++++++++++++++++++++++++++++++
 tests/intel-ci/blacklist.txt |   1 +
 tests/meson.build            |   1 +
 4 files changed, 303 insertions(+)
 create mode 100644 tests/core_hotunplug.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 4e44c98c2..32cbbf4f9 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -18,6 +18,7 @@ TESTS_progs = \
 	core_getclient \
 	core_getstats \
 	core_getversion \
+	core_hotunplug \
 	core_setmaster \
 	core_setmaster_vs_auth \
 	debugfs_test \
diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
new file mode 100644
index 000000000..f9cfc8c3c
--- /dev/null
+++ b/tests/core_hotunplug.c
@@ -0,0 +1,300 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <fcntl.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "igt.h"
+#include "igt_device_scan.h"
+#include "igt_kmod.h"
+#include "igt_sysfs.h"
+
+IGT_TEST_DESCRIPTION("Examine behavior of a driver on device hot unplug");
+
+struct hotunplug {
+	struct {
+		int drm;
+		int sysfs_dev;
+		int sysfs_bus;
+		int sysfs_drv;
+	} fd;
+	char *dev_bus_addr;
+};
+
+/* Helpers */
+
+static void prepare_for_unbind(struct hotunplug *priv, char *buf, int buflen)
+{
+	int len;
+
+	igt_assert(buflen);
+
+	priv->fd.sysfs_drv = openat(priv->fd.sysfs_dev, "device/driver",
+				    O_DIRECTORY);
+	igt_assert(priv->fd.sysfs_drv >= 0);
+
+	len = readlinkat(priv->fd.sysfs_dev, "device", buf, buflen - 1);
+	buf[len] = '\0';
+	priv->dev_bus_addr = strrchr(buf, '/');
+	igt_assert(priv->dev_bus_addr++);
+
+	/* sysfs_dev no longer needed */
+	close(priv->fd.sysfs_dev);
+}
+
+static void prepare(struct hotunplug *priv, char *buf, int buflen)
+{
+	igt_debug("opening device\n");
+	priv->fd.drm = __drm_open_driver(DRIVER_ANY);
+	igt_assert(priv->fd.drm >= 0);
+
+	priv->fd.sysfs_dev = igt_sysfs_open(priv->fd.drm);
+	igt_assert(priv->fd.sysfs_dev >= 0);
+
+	if (buf) {
+		prepare_for_unbind(priv, buf, buflen);
+	} else {
+		/* prepare for bus rescan */
+		priv->fd.sysfs_bus = openat(priv->fd.sysfs_dev,
+					    "device/subsystem", O_DIRECTORY);
+		igt_assert(priv->fd.sysfs_bus >= 0);
+	}
+}
+
+static const char *failure;
+
+/* Unbind the driver from the device */
+static void driver_unbind(int fd_sysfs_drv, const char *dev_bus_addr)
+{
+	failure = "Driver unbind timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_drv, "unbind", dev_bus_addr);
+	igt_reset_timeout();
+	failure = NULL;
+
+	/* don't close fd_sysfs_drv, it will be used for driver rebinding */
+}
+
+/* Re-bind the driver to the device */
+static void driver_bind(int fd_sysfs_drv, const char *dev_bus_addr)
+{
+	failure = "Driver re-bind timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_drv, "bind", dev_bus_addr);
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_drv);
+}
+
+/* Remove (virtually unplug) the device from its bus */
+static void device_unplug(int fd_sysfs_dev)
+{
+	failure = "Device unplug timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_dev, "device/remove", "1");
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_dev);
+}
+
+/* Re-discover the device by rescanning its bus */
+static void bus_rescan(int fd_sysfs_bus)
+{
+	failure = "Bus rescan timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_bus, "rescan", "1");
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_bus);
+}
+
+static void healthcheck(void)
+{
+	int fd_drm;
+
+	/* device name may have changed, rebuild IGT device list */
+	igt_devices_scan(true);
+
+	igt_debug("reopening the device\n");
+	fd_drm = __drm_open_driver(DRIVER_ANY);
+	igt_abort_on_f(fd_drm < 0, "Device reopen failure");
+
+	if (is_i915_device(fd_drm)) {
+		failure = "GEM failure";
+		igt_require_gem(fd_drm);
+		failure = NULL;
+	}
+
+	close(fd_drm);
+}
+
+static void set_filter_from_device(int fd)
+{
+	const char *filter_type = "sys:";
+	char filter[strlen(filter_type) + PATH_MAX + 1];
+	char *dst = stpcpy(filter, filter_type);
+	char path[PATH_MAX + 1];
+
+	igt_assert(igt_sysfs_path(fd, path, PATH_MAX));
+	strncat(path, "/device", PATH_MAX - strlen(path));
+	igt_assert(realpath(path, dst));
+
+	igt_device_filter_set(filter);
+}
+
+/* Subtests */
+
+static void unbind_rebind(void)
+{
+	struct hotunplug priv;
+	char buf[PATH_MAX];
+
+	prepare(&priv, buf, sizeof(buf));
+
+	igt_debug("closing the device\n");
+	close(priv.fd.drm);
+
+	igt_debug("unbinding the driver from the device\n");
+	driver_unbind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("rebinding the driver to the device\n");
+	driver_bind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	healthcheck();
+}
+
+static void unplug_rescan(void)
+{
+	struct hotunplug priv;
+
+	prepare(&priv, NULL, 0);
+
+	igt_debug("closing the device\n");
+	close(priv.fd.drm);
+
+	igt_debug("unplugging the device\n");
+	device_unplug(priv.fd.sysfs_dev);
+
+	igt_debug("recovering the device\n");
+	bus_rescan(priv.fd.sysfs_bus);
+
+	healthcheck();
+}
+
+static void hotunbind_lateclose(void)
+{
+	struct hotunplug priv;
+	char buf[PATH_MAX];
+
+	prepare(&priv, buf, sizeof(buf));
+
+	igt_debug("hot unbinding the driver from the device\n");
+	driver_unbind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("rebinding the driver to the device\n");
+	driver_bind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("late closing the unbound device instance\n");
+	close(priv.fd.drm);
+
+	healthcheck();
+}
+
+static void hotunplug_lateclose(void)
+{
+	struct hotunplug priv;
+
+	prepare(&priv, NULL, 0);
+
+	igt_debug("hot unplugging the device\n");
+	device_unplug(priv.fd.sysfs_dev);
+
+	igt_debug("recovering the device\n");
+	bus_rescan(priv.fd.sysfs_bus);
+
+	igt_debug("late closing the removed device instance\n");
+	close(priv.fd.drm);
+
+	healthcheck();
+}
+
+/* Main */
+
+igt_main
+{
+	igt_fixture {
+		int fd_drm;
+
+		/**
+		 * As subtests must be able to close examined devices
+		 * completely, don't use drm_open_driver() as it keeps
+		 * a device file descriptor open for exit handler use.
+		 */
+		fd_drm = __drm_open_driver(DRIVER_ANY);
+		igt_assert(fd_drm >= 0);
+
+		if (is_i915_device(fd_drm))
+			igt_require_gem(fd_drm);
+
+		/* Make sure subtests always reopen the same device */
+		set_filter_from_device(fd_drm);
+
+		close(fd_drm);
+	}
+
+	igt_describe("Check if the driver can be cleanly unbound from a device believed to be closed");
+	igt_subtest("unbind-rebind")
+		unbind_rebind();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if a device believed to be closed can be cleanly unplugged");
+	igt_subtest("unplug-rescan")
+		unplug_rescan();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if the driver can be cleanly unbound from a still open device, then released");
+	igt_subtest("hotunbind-lateclose")
+		hotunbind_lateclose();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if a still open device can be cleanly unplugged, then released");
+	igt_subtest("hotunplug-lateclose")
+		hotunplug_lateclose();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+}
diff --git a/tests/intel-ci/blacklist.txt b/tests/intel-ci/blacklist.txt
index ee7045f03..201f4b1b4 100644
--- a/tests/intel-ci/blacklist.txt
+++ b/tests/intel-ci/blacklist.txt
@@ -117,3 +117,4 @@ igt at .*@.*pipe-f($|-.*)
 # Since 5.7-rc1, this test has produced tens of megabytes of kernel
 # logs.
 igt at perf_pmu@cpu-hotplug
+igt at core_hotunplug@.*
diff --git a/tests/meson.build b/tests/meson.build
index e882f4dcd..0bdcfbe4c 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -3,6 +3,7 @@ test_progs = [
 	'core_getclient',
 	'core_getstats',
 	'core_getversion',
+	'core_hotunplug',
 	'core_setmaster',
 	'core_setmaster_vs_auth',
 	'debugfs_test',
-- 
2.21.1



More information about the Intel-gfx mailing list