[igt-dev] [PATCH i-g-t v9 1/1] tests: Add a new test for device hot unplug

Daniel Vetter daniel at ffwll.ch
Mon May 13 09:19:37 UTC 2019


On Thu, May 09, 2019 at 10:09:14AM +0200, Janusz Krzysztofik wrote:
> From: Janusz Krzysztofik <janusz.krzysztofik at intel.com>
> 
> There is a test which verifies unloading of i915 driver module but no test
> exists that checks how a driver behaves when it gets unbound from a device
> or when the device gets unplugged.  Provide such test using sysfs
> interface.
> 
> Two minimalistic subtests - "unbind-rebind" and "unplug-rescan" - perform
> desired operations on a DRM device which is beleived to be not in use.
> 
> A subtest named "drm_open-hotunplug" unplugs a DRM device while keeping
> a file descriptor open.
> 
> A "gem_buffer-hotunplug" subtest performs device unplug while keepiug a
> GEM buffer object allocated, while a "gem_mmap-hotunplug" subtest does the
> same with the GEM buffer object additionally mmapped.
> 
> Next two subtests work only with i915 or Intel driver.  One of them -
> "i915_context-hotunplug" - performs device unplug while keeping an extra
> GEM context allocated.  The other one - "i915_spin-hotunplug" - tries to
> unplug a device with an active spin batch.
> 
> The last two subtests verify driver behavior after a device  has been hot
> unplugged.  First of them, called "drm-hotunplug-unload", checks if an
> open file descriptor of an unplugged device still protects the driver
> module from being unloaded.  The last one - "i915-hotunplug-write" -
> checks how the i915 driver pointing to an unplugged device responds to a
> GEM write IOCTL operation.
> 
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at intel.com>

Thanks for reworking, this looks good to me and we're going the right
direction I think.

When resending patches please always include a per-patch changelog in the
commit message, explaining what you've changed, why (e.g. suggested by
$reviewer_name), compared to the previous version.

> ---
>  tests/Makefile.sources |   1 +
>  tests/core_hotunplug.c | 462 +++++++++++++++++++++++++++++++++++++++++
>  tests/meson.build      |   1 +
>  3 files changed, 464 insertions(+)
>  create mode 100644 tests/core_hotunplug.c
> 
> diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> index 7f921f6c..08942f00 100644
> --- a/tests/Makefile.sources
> +++ b/tests/Makefile.sources
> @@ -16,6 +16,7 @@ TESTS_progs = \
>  	core_getclient \
>  	core_getstats \
>  	core_getversion \
> +	core_hotunplug \
>  	core_setmaster_vs_auth \
>  	debugfs_test \
>  	drm_import_export \
> diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
> new file mode 100644
> index 00000000..08e7b2ab
> --- /dev/null
> +++ b/tests/core_hotunplug.c
> @@ -0,0 +1,462 @@
> +/*
> + * Copyright © 2019 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "igt.h"
> +#include "igt_device.h"
> +#include "igt_dummyload.h"
> +#include "igt_kmod.h"
> +#include "igt_sysfs.h"
> +
> +#include <limits.h>
> +#include <string.h>
> +#include <unistd.h>
> +
> +/* Re-bind the driver to the device */
> +static void driver_bind(int drv, const char *addr)
> +{
> +	igt_set_timeout(60, "Driver re-bind timeout!");
> +	igt_sysfs_set(drv, "bind", addr);
> +	igt_reset_timeout();
> +
> +	close(drv);
> +}
> +
> +/* Unbind the driver from the device */
> +static void driver_unbind(int drv, const char *addr)
> +{
> +	igt_set_timeout(60, "Driver unbind timeout!");
> +	igt_sysfs_set(drv, "unbind", addr);
> +	igt_reset_timeout();
> +
> +	/* don't close drv, it will be used for driver rebinding */
> +}
> +
> +/* Re-discover the device by rescanning its bus */
> +static void bus_rescan(int bus)
> +{
> +	igt_set_timeout(60, "Bus rescan timeout!");
> +	igt_sysfs_set(bus, "rescan", "1");
> +	igt_reset_timeout();
> +
> +	close(bus);
> +}
> +
> +/* Remove (virtually unplug) the device from its bus */
> +static void device_unplug(int dev)
> +{
> +	igt_set_timeout(60, "Device unplug timeout!");
> +	igt_sysfs_set(dev, "device/remove", "1");
> +	igt_reset_timeout();
> +
> +	close(dev);
> +}
> +
> +static bool module_unload(int chipset, const char *module)
> +{
> +	if (chipset == DRIVER_INTEL)
> +		return igt_i915_driver_unload() == IGT_EXIT_SUCCESS;
> +	else
> +		return igt_kmod_unload(module, 0) == 0;
> +}
> +
> +static void unbind_rebind(int chipset)
> +{
> +	int fd, dev, drv, len;

Please rename these to fd_drm, fd_sysfs_dev and fd_sysf_drv, or something
else that indicates they're file descriptors. Same for function arguments.
The code gets very confusing to read otherwise.

btw I think this went a tiny bit too much into the "no framework at all"
direction. I think a little data structure that contains the various fds
we need, plus a helper function to set them all up would be useful. Then
you can call that as needed, plus we have a place to add other
test-relevant data to pass around. But as-is the code looks ok too, up to
you.

But otherwise I like the open-coded test structure a lot, makes it much
easier to follow along and understand what's going on.

> +	char path[PATH_MAX];
> +	const char *addr;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	close(fd);
> +
> +	/* collect information required for driver bind/unbind */
> +	drv = openat(dev, "device/driver", O_DIRECTORY);
> +	igt_assert(drv >= 0);
> +
> +	len = readlinkat(dev, "device", path, sizeof(path) - 1);
> +	path[len] = '\0';
> +	addr = strrchr(path, '/') + 1;
> +
> +	close(dev);
> +
> +	igt_debug("unbinding driver\n");
> +        driver_unbind(drv, addr);

Strange indenting here. Some other places like this too.

> +
> +	igt_debug("rebinding driver\n");
> +	driver_bind(drv, addr);
> +
> +	igt_debug("reopening device\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}
> +
> +static void unplug_rescan(int chipset)
> +{
> +	int fd, dev, bus;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	close(fd);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}
> +
> +static void drm_open_hotunplug(int chipset)
> +{
> +	int fd, dev, bus;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}

Testcases lgtm up to here. I think this would make sense as the first set
of tests to merge (together with the basic hotunplug kernel fixes).

> +
> +static void gem_buffer_hotunplug(int chipset)
> +{
> +	int fd, dev, bus;
> +	struct igt_fb fb;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("creating GEM object\n");
> +	igt_create_bo_for_fb(fd, 1, 1, DRM_FORMAT_XRGB8888, 0, &fb);

This is a kms function, and fairly high-level. I think for the unplug
tests we should have more low-level control over what we're doing. Here's
the testcases I have in mind:

- a generic gem_dumb_mmap-hotunplug testcase which uses the dumb ioctls.
  Needs to check that those exist (e.g. if we have i915 without kms
  enabled, atm doesn't exist yet but might happen).

- i915 specific testcases for the various mmap modes i915 supports: gtt
  mmap, wc and cached cpu mmap.

Note that for all of these you must first write into the mmap region, in
many cases the mmap won't be fully instantiated without that (i.e. kernel
will not allocate actually gpu memory, only some metadata to describe what
you want). Because of this delayed allocation just testing with a buffer
allocated is probably not that interesting, I'd drop that subtest. Ofc if
allocating itself yields bugs in testing, then definitely keep it!

> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}
> +
> +static void gem_mmap_hotunplug(int chipset)
> +{
> +	int fd, dev, bus;
> +	struct igt_fb fb;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("mmapping GEM object\n");
> +	igt_create_bo_for_fb(fd, 1, 1, DRM_FORMAT_XRGB8888, 0, &fb);
> +	igt_ignore_warn(igt_fb_map_buffer(fd, &fb));
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}
> +
> +static void i915_context_hotunplug(int chipset)
> +{
> +	int fd, dev, bus;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	gem_require_contexts(fd);
> +
> +	igt_debug("creating GEM context\n");
> +	igt_ignore_warn(gem_context_create(fd));
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("performing healthcheck\n");
> +	gem_test_engine(fd, ALL_ENGINES);
> +
> +	close(fd);

I think for context we again have the issue that allocating one doesn't do
much, we need to use it first. But for a simple context test we want to
make sure that the batch has stopped. Also, you don't need to allocate a
context, there's a default context allocated when you open an fd. Here's
my testcase idea:

- open fd
- run a simple batch (this means you need some buffers and stuff, and the
  kernel will allocate contexts, buffers and map them all into your
  context). gem_test_engine might do this, minus the "keep the buffer
  mapped" part.
- make sure you don't release your batchbuffer (we want the kernel to do
  the cleanup for us).
- hotunplug and usual tail of the testcase

> +}
> +
> +static void i915_spin_hotunplug(int chipset)
> +{
> +	int fd, dev, bus;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("submitting dummy load\n");
> +	igt_ignore_warn(igt_spin_new(fd));
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("performing healthcheck\n");
> +	gem_test_engine(fd, ALL_ENGINES);
> +
> +	close(fd);
> +}
> +
> +static void drm_hotunplug_unload(int chipset, const char *module)
> +{
> +	int fd, dev, bus;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("trying to unload module\n");
> +	igt_assert(!module_unload(chipset, module));
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("unloading module\n");
> +	igt_assert(module_unload(chipset, module));
> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	close(fd);
> +}
> +
> +static void i915_hotunplug_write(int chipset)
> +{
> +	int fd, dev, bus, gem;
> +	const uint32_t bbe = MI_BATCH_BUFFER_END;
> +
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_require_gem(fd);
> +
> +	igt_debug("creating GEM buffer object\n");
> +	gem = gem_create(fd, 1);
> +
> +	dev = igt_sysfs_open(fd);
> +	igt_assert(dev >= 0);
> +
> +	/* collect information required for bus rescan */
> +	bus = openat(dev, "device/subsystem", O_DIRECTORY);
> +	igt_assert(bus >= 0);
> +
> +	igt_debug("unplugging device\n");
> +        device_unplug(dev);
> +
> +	igt_debug("trying to write to GEM buffer\n");
> +	igt_assert_neq(__gem_write(fd, gem, 0, &bbe, sizeof(bbe)), 0);

I think we could rename this to our general "test ioctls after hotunplug"
testcase, and go through the entire list of ioctls.

Also I think testing for an explicit errno would be good here, I think
-EIO is the right one for "oops the device is gone". Need to double check
with Chris Wilson, he tends to have good insights on these uapi details.

These hotunplug ioctl testcase might be a good 2nd step of landing this.
The module unload testcase should also be fairly easy to get all fixed up
and landed.

Then we can start looking at the nastier ones, maybe start with mmap, and
the used-but-now-idle context. Then busybatch. And even later (we're
talking months here I think) we can start looking at dma-buf/dma-fence
sharing and stuff like that. Even later then in-flight kms requests.

This is all going to be lots of fun :-)

Cheers, Daniel

> +
> +	igt_debug("recovering device\n");
> +	bus_rescan(bus);
> +
> +	igt_debug("closing device\n");
> +	close(fd);
> +
> +	igt_debug("reopening driver\n");
> +	fd = __drm_open_driver(chipset);
> +	igt_assert(fd >= 0);
> +
> +	igt_debug("performing healthcheck\n");
> +	gem_test_engine(fd, ALL_ENGINES);
> +
> +	close(fd);
> +}
> +
> +igt_main {
> +	int chipset;
> +	char *module;
> +
> +	igt_fixture {
> +		char path[PATH_MAX];
> +		int fd, dev, len;
> +
> +		/**
> +		 * Since some subtests depend on successful unload of a driver
> +		 * module, don't use drm_open_driver() as it keeps a device file
> +		 * descriptor open for exit handler use and that effectively
> +		 * prevents the module from being unloaded.
> +		 */
> +		fd = __drm_open_driver(DRIVER_ANY);
> +		igt_assert(fd >= 0);
> +
> +		if (is_i915_device(fd)) {
> +			chipset = DRIVER_INTEL;
> +			module = strdup("i915");
> +		} else {
> +			chipset = DRIVER_ANY;
> +
> +			/* Capture module name to be unloaded */
> +			dev = igt_sysfs_open(fd);
> +			len = readlinkat(dev, "device/driver/module", path,
> +					 sizeof(path) - 1);
> +			close(dev);
> +			path[len] = '\0';
> +			module = strdup(strrchr(path, '/') + 1);
> +		}
> +		close(fd);
> +
> +		igt_info("Running the test on driver \"%s\", chipset mask %#0x\n",
> +			 module, chipset);
> +	}
> +
> +	igt_subtest("unbind-rebind")
> +		unbind_rebind(chipset);
> +
> +	igt_subtest("unplug-rescan")
> +		unplug_rescan(chipset);
> +
> +	igt_subtest("drm_open-hotunplug")
> +		drm_open_hotunplug(chipset);
> +
> +	igt_subtest("gem_buffer-hotunplug")
> +		gem_buffer_hotunplug(chipset);
> +
> +	igt_subtest("gem_mmap-hotunplug")
> +		gem_mmap_hotunplug(chipset);
> +
> +	igt_subtest("i915_context-hotunplug")
> +		i915_context_hotunplug(chipset);
> +
> +	igt_subtest("i915_spin-hotunplug")
> +		i915_spin_hotunplug(chipset);
> +
> +	igt_subtest("drm-hotunplug-unload")
> +		drm_hotunplug_unload(chipset, module);
> +
> +	igt_subtest("i915-hotunplug-write")
> +		i915_hotunplug_write(chipset);
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 711979b4..ff391c94 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -3,6 +3,7 @@ test_progs = [
>  	'core_getclient',
>  	'core_getstats',
>  	'core_getversion',
> +	'core_hotunplug',
>  	'core_setmaster_vs_auth',
>  	'debugfs_test',
>  	'drm_import_export',
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the igt-dev mailing list