[igt-dev] [PATCH i-g-t v2] runner: Make taint abort messages more verbose
Arkadiusz Hiler
arkadiusz.hiler at intel.com
Mon Dec 3 15:45:39 UTC 2018
On Mon, Dec 03, 2018 at 03:09:29PM +0000, Chris Wilson wrote:
> Quoting Arkadiusz Hiler (2018-12-03 15:00:12)
> > Since not everyone is familiar with kernel taints, and it is easy to get
> > confused and attribute an abort to an innocent TAINT_USER caused by an
> > unsafe module option, which is usually the first thing people find
> > greping dmesg for "taint", we should provide more guidance.
> >
> > This patch extends the abort log by printing the taint names, as found
> > in the kernel, along with a short explanation, so people know what to
> > look for in the dmesg.
> >
> > v2: rebase, reword
> >
> > Cc: Petri Latvala <petri.latvala at intel.com>
> > Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler at intel.com>
> > ---
> > runner/executor.c | 38 ++++++++++++++++++++++++++++++--------
> > 1 file changed, 30 insertions(+), 8 deletions(-)
> >
> > diff --git a/runner/executor.c b/runner/executor.c
> > index 54c530b7..c0639a66 100644
> > --- a/runner/executor.c
> > +++ b/runner/executor.c
> > @@ -137,13 +137,23 @@ static char *handle_lockdep(void)
> > return NULL;
> > }
> >
> > +/* see Linux's include/linux/kernel.h */
> > +static const struct {
> > + unsigned long bit;
> > + const char *explanation;
> > +} abort_taints[] = {
> > + {(1 << 5), "TAINT_BAD_PAGE: Bad page reference or an unexpected page flags."},
> > + {(1 << 7), "TAINT_DIE: Kernel has died - BUG/OOPS."},
> > + {(1 << 9), "TAINT_WARN: WARN_ON has happened."},
> > + {0, 0}};
> > +
> > static unsigned long tainted(unsigned long *taints)
> > {
> > - const unsigned long bad_taints =
> > - 0x20 | /* TAINT_PAGE */
> > - 0x80 | /* TAINT_DIE */
> > - 0x200; /* TAINT_OOPS */
> > FILE *f;
> > + unsigned long bad_taints = 0;
> > +
> > + for (typeof(*abort_taints) *taint = abort_taints; taint->bit; taint++)
> > + bad_taints |= taint->bit;
> >
> > *taints = 0;
> >
> > @@ -158,14 +168,26 @@ static unsigned long tainted(unsigned long *taints)
> >
> > static char *handle_taint(void)
> > {
> > - unsigned long taints, bad_taints;
> > + unsigned long taints;
> > char *reason;
> >
> > - bad_taints = tainted(&taints);
> > - if (!bad_taints)
> > + if (!tainted(&taints))
> > return NULL;
> >
> > - asprintf(&reason, "Kernel tainted (%#lx -- %lx)", taints, bad_taints);
> > + asprintf(&reason, "Kernel badly tainted (%#lx) (check dmesg for details):\n",
> > + taints);
> > +
> > + for (typeof(*abort_taints) *taint = abort_taints; taint->bit; taint++) {
> > + if (taint->bit & taints) {
> > + char *old_reason = reason;
> > + asprintf(&reason, "%s\t(%#lx) %s\n",
> > + old_reason,
> > + taint->bit,
> > + taint->explanation);
> > + free(old_reason);
>
> Still this gives no more information, than having to manually check
> dmesg for the explanation.
I still stand by what has been said last week:
https://lists.freedesktop.org/archives/igt-dev/2018-November/007128.html
I have got two more questions on why are we aborting on unsafe module
options since then.
> "TAINT_BAD_PAGE: Bad page reference or an unexpected page flags."},
> "TAINT_DIE: Kernel has died - BUG/OOPS."},
> "TAINT_WARN: WARN_ON has happened."},
>
> are quite frankly misleading, and not informative.
Any suggestions on how to make them less misleading?
> And since the runner is handling the dmesg, do the work to present the
> relevant information.
> -Chris
WIP by Petri, but that is going to take a while.
--
Cheers,
Arek
More information about the igt-dev
mailing list