[Intel-gfx] [PATCH igt] lib: Check and report if a subtest triggers a new kernel taint

Petri Latvala petri.latvala at intel.com
Tue Dec 5 13:01:59 UTC 2017


On Tue, Dec 05, 2017 at 12:38:19PM +0000, Chris Wilson wrote:
> Quoting Petri Latvala (2017-12-05 12:32:36)
> > On Tue, Dec 05, 2017 at 12:24:53PM +0000, Chris Wilson wrote:
> > > Checking for a tainted kernel is a convenient way to see if the test
> > > generated a critical error such as a oops, or machine check.
> > > 
> > > v2: Docs?
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg at intel.com>
> > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > ---
> > >  lib/Makefile.sources   |   2 +
> > >  lib/igt_core.c         |  19 +++++++-
> > >  lib/igt_kernel_taint.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >  lib/igt_kernel_taint.h |  34 ++++++++++++++
> > >  4 files changed, 174 insertions(+), 1 deletion(-)
> > >  create mode 100644 lib/igt_kernel_taint.c
> > >  create mode 100644 lib/igt_kernel_taint.h
> > > 
> > > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > > index 6e968d9f7..152153908 100644
> > > --- a/lib/Makefile.sources
> > > +++ b/lib/Makefile.sources
> > > @@ -22,6 +22,8 @@ lib_source_list =           \
> > >       igt_gt.h                \
> > >       igt_gvt.c               \
> > >       igt_gvt.h               \
> > > +     igt_kernel_taint.c      \
> > > +     igt_kernel_taint.h      \
> > >       igt_primes.c            \
> > >       igt_primes.h            \
> > >       igt_rand.c              \
> > > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > > index 03fa6e4e8..486c5989d 100644
> > > --- a/lib/igt_core.c
> > > +++ b/lib/igt_core.c
> > > @@ -63,6 +63,7 @@
> > >  #include "intel_chipset.h"
> > >  #include "intel_io.h"
> > >  #include "igt_debugfs.h"
> > > +#include "igt_kernel_taint.h"
> > >  #include "version.h"
> > >  #include "config.h"
> > >  
> > > @@ -261,6 +262,7 @@ static bool list_subtests = false;
> > >  static char *run_single_subtest = NULL;
> > >  static bool run_single_subtest_found = false;
> > >  static const char *in_subtest = NULL;
> > > +static unsigned long saved_kernel_taint;
> > >  static struct timespec subtest_time;
> > >  static clockid_t igt_clock = (clockid_t)-1;
> > >  static bool in_fixture = false;
> > > @@ -937,6 +939,8 @@ bool __igt_run_subtest(const char *subtest_name)
> > >               return false;
> > >       }
> > >  
> > > +     saved_kernel_taint = igt_read_kernel_taint();
> > > +
> > >       kmsg(KERN_INFO "[IGT] %s: starting subtest %s\n", command_str, subtest_name);
> > >       igt_debug("Starting subtest: %s\n", subtest_name);
> > >  
> > > @@ -1083,8 +1087,21 @@ void __igt_skip_check(const char *file, const int line,
> > >  void igt_success(void)
> > >  {
> > >       succeeded_one = true;
> > > -     if (in_subtest)
> > > +     if (in_subtest) {
> > > +             unsigned long new_kernel_taints =
> > > +                     igt_read_kernel_taint() & ~saved_kernel_taint;
> > > +             unsigned int tainted = igt_kernel_tainted(new_kernel_taints);
> > > +
> > > +             if (tainted) {
> > > +                     igt_kernel_taint_print(new_kernel_taints);
> > > +                     if (tainted & TAINT_ERROR)
> > > +                             exit_subtest("FAIL");
> > > +                     else
> > > +                             exit_subtest("WARN");
> > > +             }
> > > +
> > >               exit_subtest("SUCCESS");
> > > +     }
> > >  }
> > 
> > 
> > If you change the result to FAIL or WARN here, succeeded_one should not be changed.
> > 
> > What of tests that don't have subtests?
> 
> I don't know where they are tracked. If there is a location we can place
> such a before/after check. Or even if we do want to change test status
> at all, and just make it a warn if the kernel becomes tainted.
> 
> The ambition here isn't just to flag oops reliably, but to respond when
> we know the HW is broken and requires rebooting.

Still, the behaviour should be consistent whether it's a simple_test
or a subtest, right?

igt_simple_init_parse_opts would be a good place for storing the old
taint status, and checking generated taints / remapping the result
could be done in igt_exit, when test_with_subtests == false.


-- 
Petri Latvala


More information about the Intel-gfx mailing list