[Intel-gfx] [PATCH igt] lib: Check and report if a subtest triggers a new kernel taint

Chris Wilson chris at chris-wilson.co.uk
Tue Dec 5 12:38:19 UTC 2017


Quoting Petri Latvala (2017-12-05 12:32:36)
> On Tue, Dec 05, 2017 at 12:24:53PM +0000, Chris Wilson wrote:
> > Checking for a tainted kernel is a convenient way to see if the test
> > generated a critical error such as an oops or a machine check.
> > 
> > v2: Docs?
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg at intel.com>
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > ---
> >  lib/Makefile.sources   |   2 +
> >  lib/igt_core.c         |  19 +++++++-
> >  lib/igt_kernel_taint.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  lib/igt_kernel_taint.h |  34 ++++++++++++++
> >  4 files changed, 174 insertions(+), 1 deletion(-)
> >  create mode 100644 lib/igt_kernel_taint.c
> >  create mode 100644 lib/igt_kernel_taint.h
> > 
> > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > index 6e968d9f7..152153908 100644
> > --- a/lib/Makefile.sources
> > +++ b/lib/Makefile.sources
> > @@ -22,6 +22,8 @@ lib_source_list =           \
> >       igt_gt.h                \
> >       igt_gvt.c               \
> >       igt_gvt.h               \
> > +     igt_kernel_taint.c      \
> > +     igt_kernel_taint.h      \
> >       igt_primes.c            \
> >       igt_primes.h            \
> >       igt_rand.c              \
> > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > index 03fa6e4e8..486c5989d 100644
> > --- a/lib/igt_core.c
> > +++ b/lib/igt_core.c
> > @@ -63,6 +63,7 @@
> >  #include "intel_chipset.h"
> >  #include "intel_io.h"
> >  #include "igt_debugfs.h"
> > +#include "igt_kernel_taint.h"
> >  #include "version.h"
> >  #include "config.h"
> >  
> > @@ -261,6 +262,7 @@ static bool list_subtests = false;
> >  static char *run_single_subtest = NULL;
> >  static bool run_single_subtest_found = false;
> >  static const char *in_subtest = NULL;
> > +static unsigned long saved_kernel_taint;
> >  static struct timespec subtest_time;
> >  static clockid_t igt_clock = (clockid_t)-1;
> >  static bool in_fixture = false;
> > @@ -937,6 +939,8 @@ bool __igt_run_subtest(const char *subtest_name)
> >               return false;
> >       }
> >  
> > +     saved_kernel_taint = igt_read_kernel_taint();
> > +
> >       kmsg(KERN_INFO "[IGT] %s: starting subtest %s\n", command_str, subtest_name);
> >       igt_debug("Starting subtest: %s\n", subtest_name);
> >  
> > @@ -1083,8 +1087,21 @@ void __igt_skip_check(const char *file, const int line,
> >  void igt_success(void)
> >  {
> >       succeeded_one = true;
> > -     if (in_subtest)
> > +     if (in_subtest) {
> > +             unsigned long new_kernel_taints =
> > +                     igt_read_kernel_taint() & ~saved_kernel_taint;
> > +             unsigned int tainted = igt_kernel_tainted(new_kernel_taints);
> > +
> > +             if (tainted) {
> > +                     igt_kernel_taint_print(new_kernel_taints);
> > +                     if (tainted & TAINT_ERROR)
> > +                             exit_subtest("FAIL");
> > +                     else
> > +                             exit_subtest("WARN");
> > +             }
> > +
> >               exit_subtest("SUCCESS");
> > +     }
> >  }
> 
> 
> If you change the result to FAIL or WARN here, succeeded_one should not be changed.
> 
> What of tests that don't have subtests?

I don't know where they are tracked, or whether there is a single location
where we can place such a before/after check. It is also an open question
whether we want to change the test status at all, or just emit a warning
if the kernel becomes tainted.

The ambition here isn't just to flag oops reliably, but to respond when
we know the HW is broken and requires rebooting.
-Chris
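
For readers following the thread, the before/after check and the review
point about succeeded_one can be sketched roughly as below. This is a
minimal standalone illustration, not the code from the patch: every
sketch_* name and both SKETCH_TAINT_* masks are hypothetical, since the
contents of lib/igt_kernel_taint.c are not quoted here. The bit
positions are the ones the kernel documents for /proc/sys/kernel/tainted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Bit positions documented for /proc/sys/kernel/tainted. */
#define TAINT_BIT_MACHINE_CHECK  4 /* 'M': machine check exception */
#define TAINT_BIT_BAD_PAGE       5 /* 'B': bad page state */
#define TAINT_BIT_DIE            7 /* 'D': kernel oops or BUG */
#define TAINT_BIT_WARN           9 /* 'W': WARN() was triggered */
#define TAINT_BIT_SOFTLOCKUP    14 /* 'L': soft lockup detected */

/* Hypothetical severity split; the masks used by the patch may differ. */
#define SKETCH_TAINT_ERROR ((1ul << TAINT_BIT_MACHINE_CHECK) | \
			    (1ul << TAINT_BIT_BAD_PAGE) | \
			    (1ul << TAINT_BIT_DIE) | \
			    (1ul << TAINT_BIT_SOFTLOCKUP))
#define SKETCH_TAINT_WARN  (1ul << TAINT_BIT_WARN)

enum sketch_result { SKETCH_SUCCESS, SKETCH_WARN, SKETCH_FAIL };

/* Read the current taint mask; returns 0 if the file cannot be read. */
static unsigned long sketch_read_kernel_taint(void)
{
	unsigned long taint = 0;
	FILE *f = fopen("/proc/sys/kernel/tainted", "r");

	if (!f)
		return 0;
	if (fscanf(f, "%lu", &taint) != 1)
		taint = 0;
	fclose(f);
	return taint;
}

/* Keep only the taint bits that appeared while the test was running. */
static unsigned long sketch_new_taints(unsigned long before,
				       unsigned long after)
{
	return after & ~before;
}

/* Map newly set taint bits onto a test result. */
static enum sketch_result sketch_classify(unsigned long new_taints)
{
	if (new_taints & SKETCH_TAINT_ERROR)
		return SKETCH_FAIL;
	if (new_taints & SKETCH_TAINT_WARN)
		return SKETCH_WARN;
	return SKETCH_SUCCESS;
}

/*
 * Decide the result first and record a success only when no new taint
 * was seen, so a FAIL or WARN never leaves *succeeded_one set.
 */
static enum sketch_result sketch_finish(unsigned long before,
					unsigned long after,
					bool *succeeded_one)
{
	enum sketch_result res =
		sketch_classify(sketch_new_taints(before, after));

	if (res == SKETCH_SUCCESS)
		*succeeded_one = true;
	return res;
}
```

The same pair of calls (sketch_read_kernel_taint() before, then
sketch_finish() after) could in principle wrap a whole test that has no
subtests, provided a suitable hook exists in the test runner.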
