[Intel-gfx] [PATCH igt] lib: Check and report if a subtest triggers a new kernel taint
Petri Latvala
petri.latvala at intel.com
Tue Dec 5 13:01:59 UTC 2017
On Tue, Dec 05, 2017 at 12:38:19PM +0000, Chris Wilson wrote:
> Quoting Petri Latvala (2017-12-05 12:32:36)
> > On Tue, Dec 05, 2017 at 12:24:53PM +0000, Chris Wilson wrote:
> > > Checking for a tainted kernel is a convenient way to see if the test
> > > generated a critical error such as a oops, or machine check.
> > >
> > > v2: Docs?
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg at intel.com>
> > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > ---
> > > lib/Makefile.sources | 2 +
> > > lib/igt_core.c | 19 +++++++-
> > > lib/igt_kernel_taint.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > lib/igt_kernel_taint.h | 34 ++++++++++++++
> > > 4 files changed, 174 insertions(+), 1 deletion(-)
> > > create mode 100644 lib/igt_kernel_taint.c
> > > create mode 100644 lib/igt_kernel_taint.h
> > >
> > > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > > index 6e968d9f7..152153908 100644
> > > --- a/lib/Makefile.sources
> > > +++ b/lib/Makefile.sources
> > > @@ -22,6 +22,8 @@ lib_source_list = \
> > > igt_gt.h \
> > > igt_gvt.c \
> > > igt_gvt.h \
> > > + igt_kernel_taint.c \
> > > + igt_kernel_taint.h \
> > > igt_primes.c \
> > > igt_primes.h \
> > > igt_rand.c \
> > > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > > index 03fa6e4e8..486c5989d 100644
> > > --- a/lib/igt_core.c
> > > +++ b/lib/igt_core.c
> > > @@ -63,6 +63,7 @@
> > > #include "intel_chipset.h"
> > > #include "intel_io.h"
> > > #include "igt_debugfs.h"
> > > +#include "igt_kernel_taint.h"
> > > #include "version.h"
> > > #include "config.h"
> > >
> > > @@ -261,6 +262,7 @@ static bool list_subtests = false;
> > > static char *run_single_subtest = NULL;
> > > static bool run_single_subtest_found = false;
> > > static const char *in_subtest = NULL;
> > > +static unsigned long saved_kernel_taint;
> > > static struct timespec subtest_time;
> > > static clockid_t igt_clock = (clockid_t)-1;
> > > static bool in_fixture = false;
> > > @@ -937,6 +939,8 @@ bool __igt_run_subtest(const char *subtest_name)
> > > return false;
> > > }
> > >
> > > + saved_kernel_taint = igt_read_kernel_taint();
> > > +
> > > kmsg(KERN_INFO "[IGT] %s: starting subtest %s\n", command_str, subtest_name);
> > > igt_debug("Starting subtest: %s\n", subtest_name);
> > >
> > > @@ -1083,8 +1087,21 @@ void __igt_skip_check(const char *file, const int line,
> > > void igt_success(void)
> > > {
> > > succeeded_one = true;
> > > - if (in_subtest)
> > > + if (in_subtest) {
> > > + unsigned long new_kernel_taints =
> > > + igt_read_kernel_taint() & ~saved_kernel_taint;
> > > + unsigned int tainted = igt_kernel_tainted(new_kernel_taints);
> > > +
> > > + if (tainted) {
> > > + igt_kernel_taint_print(new_kernel_taints);
> > > + if (tainted & TAINT_ERROR)
> > > + exit_subtest("FAIL");
> > > + else
> > > + exit_subtest("WARN");
> > > + }
> > > +
> > > exit_subtest("SUCCESS");
> > > + }
> > > }
> >
> >
> > If you change the result to FAIL or WARN here, succeeded_one should not be changed.
> >
> > What of tests that don't have subtests?
>
> I don't know where they are tracked. If there is a location we can place
> such a before/after check. Or even if we do want to change test status
> at all, and just make it a warn if the kernel becomes tainted.
>
> The ambition here isn't just to flag oops reliably, but to respond when
> we know the HW is broken and requires rebooting.
Still, the behaviour should be consistent whether it's a simple_test
or a subtest, right?
igt_simple_init_parse_opts would be a good place for storing the old
taint status, and checking generated taints / remapping the result
could be done in igt_exit, when test_with_subtests == false.
--
Petri Latvala
More information about the Intel-gfx
mailing list