[Intel-gfx] [PATCH i-g-t v2] runner: Don't kill a test on taint if watching timeouts
Petri Latvala
petri.latvala at intel.com
Mon Dec 7 13:09:51 UTC 2020
On Fri, Dec 04, 2020 at 08:50:07PM +0100, Janusz Krzysztofik wrote:
> We may still be interested in results of a test even if it has tainted
> the kernel. On the other hand, we need to kill the test on taint if no
> other means of killing it on a jam is active.
>
> If abort on both kernel taint or a timeout is requested, decrease all
> potential timeouts significantly while the taint is detected instead of
> aborting immediately. However, report the taint as the reason of the
> abort if a timeout decreased by the taint expires.
>
> v2: Fix missing show_kernel_task_state() lost on rebase conflict
> resolution (Chris - thanks!)
>
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
The effect of this is that we now sometimes get more logs from a test,
at the cost of it not directly showing up as an incomplete. We would
still get the igt@runner@aborted result for it, so overall we still
catch tainting cases.
Chris's comments have been clarified off-list as not directly opposing
this patch, so
Reviewed-by: Petri Latvala <petri.latvala at intel.com>
> ---
> runner/executor.c | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/runner/executor.c b/runner/executor.c
> index 1688ae41d..faf272d85 100644
> --- a/runner/executor.c
> +++ b/runner/executor.c
> @@ -726,6 +726,8 @@ static const char *need_to_timeout(struct settings *settings,
> double time_since_kill,
> size_t disk_usage)
> {
> + int decrease = 1;
> +
> if (killed) {
> /*
> * Timeout after being killed is a hardcoded amount
> @@ -753,20 +755,32 @@ static const char *need_to_timeout(struct settings *settings,
> }
>
> /*
> - * If we're configured to care about taints, kill the
> - * test if there's a taint.
> + * If we're configured to care about taints,
> + * decrease timeouts in use if there's a taint,
> + * or kill the test if no timeouts have been requested.
> */
> if (settings->abort_mask & ABORT_TAINT &&
> - is_tainted(taints))
> - return "Killing the test because the kernel is tainted.\n";
> + is_tainted(taints)) {
> + /* list of timeouts that may postpone immediate kill on taint */
> + if (settings->per_test_timeout || settings->inactivity_timeout)
> + decrease = 10;
> + else
> + return "Killing the test because the kernel is tainted.\n";
> + }
>
> if (settings->per_test_timeout != 0 &&
> - time_since_subtest > settings->per_test_timeout)
> + time_since_subtest > settings->per_test_timeout / decrease) {
> + if (decrease > 1)
> + return "Killing the test because the kernel is tainted.\n";
> return show_kernel_task_state("Per-test timeout exceeded. Killing the current test with SIGQUIT.\n");
> + }
>
> if (settings->inactivity_timeout != 0 &&
> - time_since_activity > settings->inactivity_timeout)
> + time_since_activity > settings->inactivity_timeout / decrease ) {
> + if (decrease > 1)
> + return "Killing the test because the kernel is tainted.\n";
> return show_kernel_task_state("Inactivity timeout exceeded. Killing the current test with SIGQUIT.\n");
> + }
>
> if (disk_usage_limit_exceeded(settings, disk_usage))
> return "Disk usage limit exceeded.\n";
> --
> 2.21.1
>
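For readers skimming the diff, the decision logic it introduces can be sketched in isolation. This is a simplified illustration, not the runner's code: `struct sketch_settings`, `enum sketch_verdict` and `sketch_need_to_timeout()` are hypothetical names, the timeout fields are assumed to be integer seconds, and the real function returns message strings and also handles the killed state and disk usage.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the runner's settings. */
struct sketch_settings {
	bool abort_on_taint;     /* stands in for abort_mask & ABORT_TAINT */
	int per_test_timeout;    /* seconds, 0 = disabled (assumed integer) */
	int inactivity_timeout;  /* seconds, 0 = disabled (assumed integer) */
};

/* Hypothetical verdicts replacing the message strings in the patch. */
enum sketch_verdict {
	KEEP_RUNNING,
	KILL_TAINTED,
	KILL_PER_TEST_TIMEOUT,
	KILL_INACTIVITY_TIMEOUT,
};

static enum sketch_verdict
sketch_need_to_timeout(const struct sketch_settings *s, bool tainted,
		       double time_since_subtest, double time_since_activity)
{
	int decrease = 1;

	if (s->abort_on_taint && tainted) {
		/* A timeout is being watched: shorten it instead of killing now. */
		if (s->per_test_timeout || s->inactivity_timeout)
			decrease = 10;
		else
			return KILL_TAINTED;
	}

	/* A timeout shortened by a taint reports the taint as the reason. */
	if (s->per_test_timeout != 0 &&
	    time_since_subtest > s->per_test_timeout / decrease)
		return decrease > 1 ? KILL_TAINTED : KILL_PER_TEST_TIMEOUT;

	if (s->inactivity_timeout != 0 &&
	    time_since_activity > s->inactivity_timeout / decrease)
		return decrease > 1 ? KILL_TAINTED : KILL_INACTIVITY_TIMEOUT;

	return KEEP_RUNNING;
}
```

With a 60 second per-test timeout in this sketch, a tainted test keeps running until it has spent more than 6 seconds in the subtest, and is then reported as killed because of the taint. Under the integer assumption made here, a timeout below 10 seconds divides down to 0, so a taint would end the test on the next check.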