[PATCH i-g-t v4] runner/executor: Detect when child process is killed by a signal

Peter Senna Tschudin peter.senna at linux.intel.com
Tue Sep 3 06:19:23 UTC 2024


Make igt-runner aware of tests being killed by signals. Before this
patch, manually killing a test process would result in igt-runner silently
marking the test as incomplete.

Now igt-runner aborts the run verbosely. As an example, the following was
extracted from results.json:
 This test caused an abort condition: Test terminated by a signal Killed (-9): Killed

 v4: improve abort code path to not interfere with igt-runner timeouts
 v3: do not interfere with igt-runner killing tests due to timeout and disk space
 v2: fix race condition

Cc: Petri Latvala <adrinael at adrinael.net>
Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
Signed-off-by: Peter Senna Tschudin <peter.senna at intel.com>
---
 runner/executor.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/runner/executor.c b/runner/executor.c
index ac73e1dde..4466461c1 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -888,6 +888,8 @@ static int monitor_output(pid_t child,
 	const int interval_length = 1;
 	int wd_timeout;
 	int killed = 0; /* 0 if not killed, signal number otherwise */
+	bool child_reaped = false;
+	bool child_killed_by_signal = false;
 	struct timespec time_beg, time_now, time_last_activity, time_last_subtest, time_killed;
 	unsigned long taints = 0;
 	bool aborting = false;
@@ -960,6 +962,25 @@ static int monitor_output(pid_t child,
 
 		igt_gettime(&time_now);
 
+		/* Testing for !killed to prevent aborting too early after igt-runner
+		 * decides to kill a process.
+		 */
+		if (!killed && (child == waitpid(child, &status, WNOHANG))) {
+			child_reaped = true;
+			if (WIFSIGNALED(status)) {
+				child_killed_by_signal = true;
+				killed = WTERMSIG(status);
+
+				/*
+				 * Do not abort just yet, because igt-runner can kill the test
+				 * due to a timeout for example. Aborting here prevents
+				 * igt-runner from reporting a timeout. The code that aborts
+				 * the run after the test was killed is at the end of the
+				 * while() loop.
+				 */
+			}
+		}
+
 		/* TODO: Refactor these handlers to their own functions */
 		if (outfd >= 0 && FD_ISSET(outfd, &set)) {
 			char *newline;
@@ -1241,7 +1262,11 @@ static int monitor_output(pid_t child,
 				errf("Error reading from signalfd: %m\n");
 				continue;
 			} else if (siginfo.ssi_signo == SIGCHLD) {
-				if (child != waitpid(child, &status, WNOHANG)) {
+				if (!child_reaped) {
+					if (child == waitpid(child, &status, WNOHANG))
+						child_reaped = true;
+				}
+				if (!child_reaped) {
 					errf("Failed to reap child\n");
 					status = 9999;
 				} else if (WIFEXITED(status)) {
@@ -1483,6 +1508,17 @@ static int monitor_output(pid_t child,
 				return -1;
 			time_killed = time_now;
 		}
+
+		if (child_killed_by_signal) {
+			aborting = true;
+
+			sprintf(buf, "Test terminated by a signal %s (%d): %s\n",
+					strsignal(killed), -killed, sigdescr_np(killed));
+			errf("%s", buf);
+			*abortreason = strdup(buf);
+
+			break;
+		}
 	}
 
 	dump_dmesg(kmsgfd, outputs[_F_DMESG]);
-- 
2.34.1


