[PATCH i-g-t v10] igt-runner fact checking

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Mon Dec 9 09:17:03 UTC 2024


On Friday, 6 December 2024 06:45:31 CET Peter Senna Tschudin wrote:
> Hi Janusz,
> 
> Thank you for your detailed comments. I appreciate the opportunity
> to clarify and address your concerns.
> 
> On 05.12.2024 15:05, Janusz Krzysztofik wrote:
> > Hi Peter,
> > 
> > On Wednesday, 4 December 2024 19:44:53 CET Peter Senna Tschudin wrote:
> >> When using igt-runner, collect facts before each test and after the
> >> last test, and report when facts change. The facts are:
> >>  - GPUs on PCI bus: hardware.pci.gpu_at_addr.0000:03:00.0: 8086:e20b Intel Battlemage (Gen20)
> >>  - Associations between PCI GPU and DRM card: hardware.pci.drm_card_at_addr.0000:03:00.0: card1
> >>  - Kernel taints: kernel.is_tainted.taint_warn: true
> >>  - GPU kernel modules loaded: kernel.kmod_is_loaded.i915: true
> >>
> >> This change imposes little execution overhead and adds just a few
> >> lines of logging. The facts will be printed on normal igt-runner
> >> output. Here is a real example from our CI shwoing
> >> hotreplug-lateclose changing the DRM card number 
> > 
> > Since you give that as an example of how helpful your facts can be, and follow 
> > that with a kernel taint example, that may indicate you think, and users of 
> > your facts may then be mislead having that read, that the taint was related to 
> > the change of card number, while both had nothing to do with each other.
> 
> Let’s take a step back to define the purpose and scope of igt-facts:
>  - Definition of a fact from the dictionary: A fact is an objectively verifiable
>    piece of information.
>  - Purpose of igt-facts: Track which tests cause changes to the facts.
> 
> The operation is straightforward: facts are collected before and after each test,
> and any differences are logged. Here’s an example showing a fact change and a new
> fact after running hotreplug-lateclose:
> 
>  [249.858249] [FACT core_hotunplug (hotreplug-lateclose)] changed: hardware.pci.drm_card_at_addr.0000:00:02.0: card0 -> card1
>  [249.858392] [FACT core_hotunplug (hotreplug-lateclose)] new: kernel.is_tainted.taint_die: true
> 
> This output highlights the facts without implying causation between them. The
> tool(and my commit message) neither explains relationships between facts nor
> misleads users into assuming causation.

For me your commit message does.

Can you please provide a full list of "facts" your code is supposed to handle?  
Can you please explain why you selected just those "facts", not others?

Thanks,
Janusz

> 
> > 
> > Please add something like ', which is expected,' to your description.  Changed 
> > card number is expected, and that's nothing wrong with that.  The old, still 
> > open instance of the driver still exists.  It is expected to be already 
> > decoupled from hardware, but it still occupies its minor device number. Then, 
> > new instance of the driver, attached to the same hardware, gets first free 
> > minor number.  Refresh of IGT device filter before health check can perfectly 
> > handle that case.
> 
> The igt-facts tool reports changes without qualifying them as expected or
> unexpected. For example, the change in the DRM card number is logged as a fact,
> irrespective of its expected nature.
> 
> > 
> >> and tainting the
> >> kernel on the abort path:
> > 
> > No, hotreplug-lateclose doesn't taint the kernel. That's wrong conclusion from 
> > what was reported, or wrong wording at least.  Kernel taint (or lockdep not 
> > active) must have been a result of driver issues (or maybe well known 
> > limitations) exposed by the test.  Then igt_runner took the abort path since 
> > it detected those unhealthy conditions.  Again, users of your facts may be 
> > mislead if your messages can really suggest what you tell us about what you 
> > think has happened.
> 
> The kernel taint reported is a fact observed after the test. While its root cause
> lies within the driver or other system components, this is outside the scope of
> igt-facts. The report simply reflects what changed during the test.
> 
> To summarize:
>  - igt-facts is a factual reporting tool. It does not establish causation or
>    interpret changes.
>  - Both the DRM card number change and kernel taint were factual observations post
>    hotreplug-lateclose.
> 
> I hope this clarifies the intent and operation of igt-facts. Please let me know if
> further discussion is needed.
> 
> Thank you,
> 
> Peter
> > 
> > The whole idea of testing i915 resistance to hot unplug scenarios came from 
> > perceived cases of VFs potentially disappearing from VMs.  Some significant 
> > effort was put on that feature of the driver, but since the 'hot' test 
> > variants were never unblocked in CI, things tended to get worse and worse over 
> > time while new driver features that didn't care for hot unplug capability were 
> > added.
> > 
> > Thanks,
> > Janusz
> > 
> >>
> >>  [245.316207] [056/121] (816s left) core_hotunplug (hotreplug-lateclose)
> >>  [245.383596] Starting subtest: hotreplug-lateclose
> >>  [249.843361] Aborting: Lockdep not active
> >>  [249.858249] [FACT core_hotunplug (hotreplug-lateclose)] changed: hardware.pci.drm_card_at_addr.0000:00:02.0: card0 -> card1
> >>  [249.858392] [FACT core_hotunplug (hotreplug-lateclose)] new: kernel.is_tainted.taint_die: true
> >>  [249.859075] Closing watchdogs
> >>
> >> CC: Ryszard Knop <ryszard.knop at intel.com>
> >> CC: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> >> CC: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> >> CC: Lucas De Marchi <lucas.demarchi at intel.com>
> >> CC: luciano.coelho at intel.com
> >> CC: nirmoy.das at intel.com
> >> CC: stuart.summers at intel.com
> >> CC: himal.prasad.ghimiray at intel.com
> >> CC: dominik.karol.piatkowski at intel.com
> >> CC: katarzyna.piecielska at intel.com
> >> Signed-off-by: Peter Senna Tschudin <peter.senna at linux.intel.com>
> >> ---
> >> v10:
> >>  - fix memory leaks from asprintf (Thank you Dominik Karol!)
> >>  - fix comments for consistency (Thank you Dominik Karol!)
> >>
> >> v9:
> >>  - do not report new hardware when loading/unloading kmod changes
> >>    the string of the GPU name. I accidentally reintroduced this
> >>    issue when refactoring to use linked lists.
> >>  - add tools/lsfacts: 9 lines of code that print either the facts
> >>    or that no facts were found.
> >>  - fix code comments describing functions
> >>  - fix white space issues
> >>
> >> v8:
> >>  - fix white space issues
> >>
> >> v7:
> >>  - refactor to use linked lists provided by igt_lists
> >>  - Added function arguments to code comments
> >>  - updated commit message
> >>
> >> v6:
> >>  - sort includes in igt_facts.c alphabetically
> >>  - add facts for kernel taints using igt_kernel_tainted() and
> >>    igt_explain_taints()
> >>
> >> v5:
> >>  - fix the broken patch format from v4
> >>
> >> v4:
> >>  - fix a bug on delete_fact()
> >>  - drop glib and calls to g_ functions
> >>  - change commit message to indicate that report only on fact changes
> >>  - use consistent format for reporting changes
> >>  - fix SPDX header format
> >>
> >> v3:
> >>  - refreshed commit message
> >>  - changed format SPDX string
> >>  - removed license text
> >>  - replace last_test assignment when null by two ternary operators
> >>  - added function descriptions following example found elsewhere in
> >>    the code
> >>  - added igt_assert to catch failures to realloc()
> >>
> >> v2:
> >>  - add lib/tests/igt_facts.c for basic unit testing
> >>  - bugfix: do not report a new gpu when the driver changes the gpu name
> >>  - bugfix: do not report the pci_id twice on the gpu name
> >>
> >>  lib/igt_facts.c       | 755 ++++++++++++++++++++++++++++++++++++++++++
> >>  lib/igt_facts.h       |  47 +++
> >>  lib/meson.build       |   1 +
> >>  lib/tests/igt_facts.c |  15 +
> >>  lib/tests/meson.build |   1 +
> >>  runner/executor.c     |  10 +
> >>  tools/lsfacts.c       |  25 ++
> >>  tools/meson.build     |   1 +
> >>  8 files changed, 855 insertions(+)
> >>  create mode 100644 lib/igt_facts.c
> >>  create mode 100644 lib/igt_facts.h
> >>  create mode 100644 lib/tests/igt_facts.c
> >>  create mode 100644 tools/lsfacts.c
> >>
> >> diff --git a/lib/igt_facts.c b/lib/igt_facts.c
> >> new file mode 100644
> >> index 000000000..4749d3417
> >> --- /dev/null
> >> +++ b/lib/igt_facts.c
> >> @@ -0,0 +1,755 @@
> >> +// SPDX-License-Identifier: MIT
> >> +// Copyright © 2024 Intel Corporation
> >> +
> >> +#include <ctype.h>
> >> +#include <libudev.h>
> >> +#include <stdio.h>
> >> +#include <sys/time.h>
> >> +#include <time.h>
> >> +
> >> +#include "igt_core.h"
> >> +#include "igt_device_scan.h"
> >> +#include "igt_facts.h"
> >> +#include "igt_kmod.h"
> >> +#include "igt_list.h"
> >> +#include "igt_taints.h"
> >> +
> >> +static struct igt_list_head igt_facts_list_drm_card_head;
> >> +static struct igt_list_head igt_facts_list_kmod_head;
> >> +static struct igt_list_head igt_facts_list_ktaint_head;
> >> +static struct igt_list_head igt_facts_list_pci_gpu_head;
> >> +
> >> +
> >> +/**
> >> + * igt_facts_lists_init:
> >> + *
> >> + * Initialize igt_facts linked lists.
> >> + *
> >> + * Returns: void
> >> + */
> >> +void igt_facts_lists_init(void)
> >> +{
> >> +	IGT_INIT_LIST_HEAD(&igt_facts_list_drm_card_head);
> >> +	IGT_INIT_LIST_HEAD(&igt_facts_list_kmod_head);
> >> +	IGT_INIT_LIST_HEAD(&igt_facts_list_ktaint_head);
> >> +	IGT_INIT_LIST_HEAD(&igt_facts_list_pci_gpu_head);
> >> +}
> >> +
> >> +
> >> +/**
> >> + * igt_facts_log:
> >> + * @last_test: name of the test that triggered the fact
> >> + * @name: name of the fact
> >> + * @new_value: new value of the fact
> >> + * @old_value: old value of the fact
> >> + *
> >> + * Reports fact changes:
> >> + * - new fact: if old_value is NULL and new_value is not NULL
> >> + * - deleted fact: if new_value is NULL and old_value is not NULL
> >> + * - changed fact: if new_value is different from old_value
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_log(const char *last_test, const char *name,
> >> +			  const char *new_value, const char *old_value)
> >> +{
> >> +	struct timespec uptime_ts;
> >> +	char *uptime = NULL;
> >> +	const char *before_tests = "before any test";
> >> +
> >> +	if (old_value == NULL && new_value == NULL)
> >> +		return;
> >> +
> >> +	if (clock_gettime(CLOCK_BOOTTIME, &uptime_ts) != 0)
> >> +		return;
> >> +
> >> +	asprintf(&uptime,
> >> +		 "%ld.%06ld",
> >> +		 uptime_ts.tv_sec,
> >> +		 uptime_ts.tv_nsec / 1000);
> >> +
> >> +	/* New fact */
> >> +	if (old_value == NULL && new_value != NULL) {
> >> +		igt_info("[%s] [FACT %s] new: %s: %s\n",
> >> +			 uptime,
> >> +			 last_test ? last_test : before_tests,
> >> +			 name,
> >> +			 new_value);
> >> +		goto out;
> >> +	}
> >> +
> >> +	/* Update fact */
> >> +	if (old_value != NULL && new_value != NULL) {
> >> +		igt_info("[%s] [FACT %s] changed: %s: %s -> %s\n",
> >> +			 uptime,
> >> +			 last_test ? last_test : before_tests,
> >> +			 name,
> >> +			 old_value,
> >> +			 new_value);
> >> +		goto out;
> >> +	}
> >> +
> >> +	/* Deleted fact */
> >> +	if (old_value != NULL && new_value == NULL) {
> >> +		igt_info("[%s] [FACT %s] deleted: %s: %s\n",
> >> +			 uptime,
> >> +			 last_test ? last_test : before_tests,
> >> +			 name,
> >> +			 old_value);
> >> +		goto out;
> >> +	}
> >> +
> >> +out:
> >> +	free(uptime);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_get:
> >> + * @name: name of the fact to be added
> >> + * @head: head of the list
> >> + *
> >> + * Get a fact from the list.
> >> + *
> >> + * Returns: pointer to the fact if found, NULL otherwise
> >> + *
> >> + */
> >> +static igt_fact *igt_facts_list_get(const char *name,
> >> +				    struct igt_list_head *head)
> >> +{
> >> +	igt_fact *fact = NULL;
> >> +
> >> +	if (igt_list_empty(head))
> >> +		return NULL;
> >> +
> >> +	igt_list_for_each_entry(fact, head, link) {
> >> +		if (strcmp(fact->name, name) == 0)
> >> +			return fact;
> >> +	}
> >> +	return NULL;
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_del:
> >> + * @name: name of the fact to be added
> >> + * @head: head of the list
> >> + * @last_test: name of the last test
> >> + * @log: bool indicating if the delete operation should be logged
> >> + *
> >> + * Delete a fact from the list.
> >> + *
> >> + * Returns: bool indicating if fact was deleted from the list
> >> + *
> >> + */
> >> +static bool igt_facts_list_del(const char *name,
> >> +			       struct igt_list_head *head,
> >> +			       const char *last_test,
> >> +			       bool log)
> >> +{
> >> +	igt_fact *fact = NULL;
> >> +
> >> +	if (igt_list_empty(head))
> >> +		return false;
> >> +
> >> +	igt_list_for_each_entry(fact, head, link) {
> >> +		if (strcmp(fact->name, name) == 0) {
> >> +			if (log)
> >> +				igt_facts_log(last_test, fact->name,
> >> +					      NULL, fact->value);
> >> +
> >> +			igt_list_del(&fact->link);
> >> +			free(fact->name);
> >> +			free(fact->value);
> >> +			free(fact->last_test);
> >> +			free(fact);
> >> +			return true;
> >> +		}
> >> +	}
> >> +	return false;
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_add:
> >> + * @name: name of the fact to be added
> >> + * @value: value of the fact to be added
> >> + * @last_test: name of the last test
> >> + * @head: head of the list
> >> + *
> >> + * Returns: bool indicating if fact was added to the list
> >> + *
> >> + */
> >> +static bool igt_facts_list_add(const char *name,
> >> +			       const char *value,
> >> +			       const char *last_test,
> >> +			       struct igt_list_head *head)
> >> +{
> >> +	igt_fact *new_fact = NULL, *old_fact = NULL;
> >> +	bool logged = false;
> >> +
> >> +	if (name == NULL || value == NULL)
> >> +		return false;
> >> +
> >> +	old_fact = igt_facts_list_get(name, head);
> >> +	if (old_fact) {
> >> +		if (strcmp(old_fact->value, value) == 0) {
> >> +			old_fact->present = true;
> >> +			return false;
> >> +		}
> >> +		igt_facts_log(last_test, name, value, old_fact->value);
> >> +		logged = true;
> >> +		igt_facts_list_del(name, head, last_test, false);
> >> +	}
> >> +
> >> +	new_fact = malloc(sizeof(igt_fact));
> >> +	if (new_fact == NULL)
> >> +		return false;
> >> +
> >> +	new_fact->name = strdup(name);
> >> +	new_fact->value = strdup(value);
> >> +	new_fact->last_test = last_test ? strdup(last_test) : NULL;
> >> +	new_fact->present = true;
> >> +
> >> +	if (!logged)
> >> +		igt_facts_log(last_test, name, value, NULL);
> >> +
> >> +	igt_list_add(&new_fact->link, head);
> >> +
> >> +	return true;
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_mark:
> >> + * @head: head of the list
> >> + *
> >> + * Mark all facts in the list as not present. Opted for the mark and sweep
> >> + * design pattern due to its simplicity and efficiency.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_list_mark(struct igt_list_head *head)
> >> +{
> >> +	igt_fact *fact = NULL;
> >> +
> >> +	if (igt_list_empty(head))
> >> +		return;
> >> +
> >> +	igt_list_for_each_entry(fact, head, link)
> >> +		fact->present = false;
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_sweep:
> >> + * @head: head of the list
> >> + * @last_test: name of the last test
> >> + *
> >> + * Sweep the list and delete all facts that are not present. Opted for the mark
> >> + * and sweep design pattern due to its simplicity and efficiency.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_list_sweep(struct igt_list_head *head,
> >> +				 const char *last_test)
> >> +{
> >> +	igt_fact *fact = NULL, *tmp = NULL;
> >> +
> >> +	if (igt_list_empty(head))
> >> +		return;
> >> +
> >> +	igt_list_for_each_entry_safe(fact, tmp, head, link)
> >> +		if (!fact->present)
> >> +			igt_facts_list_del(fact->name, head, last_test, true);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_list_mark_and_sweep:
> >> + * @head: head of the list
> >> + *
> >> + * Clean up the list using mark and sweep. Opted for the mark and sweep
> >> + * design pattern due to its simplicity and efficiency.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_list_mark_and_sweep(struct igt_list_head *head)
> >> +{
> >> +	igt_facts_list_mark(head);
> >> +	igt_facts_list_sweep(head, NULL);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_are_all_lists_empty:
> >> + *
> >> + * Returns true if all lists are empty. Used by the tool lsfacts.
> >> + *
> >> + * Returns: bool
> >> + */
> >> +bool igt_facts_are_all_lists_empty(void)
> >> +{
> >> +	return igt_list_empty(&igt_facts_list_drm_card_head) &&
> >> +	       igt_list_empty(&igt_facts_list_kmod_head) &&
> >> +	       igt_list_empty(&igt_facts_list_ktaint_head) &&
> >> +	       igt_list_empty(&igt_facts_list_pci_gpu_head);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_scan_pci_gpus:
> >> + * @last_test: name of the last test
> >> + *
> >> + * This function scans the pci bus for gpus using udev. It uses
> >> + * igt_facts_list_mark(), igt_facts_list_add() and igt_facts_list_sweep() to
> >> + * update igt_facts_list_pci_gpu_head.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_scan_pci_gpus(const char *last_test)
> >> +{
> >> +	static struct igt_list_head *head = &igt_facts_list_pci_gpu_head;
> >> +	struct udev *udev = NULL;
> >> +	struct udev_enumerate *enumerate = NULL;
> >> +	struct udev_list_entry *devices, *dev_list_entry;
> >> +	struct igt_device_card card;
> >> +	char pcistr[10];
> >> +	int ret;
> >> +	char *factname = NULL;
> >> +	char *factvalue = NULL;
> >> +
> >> +	udev = udev_new();
> >> +	if (!udev) {
> >> +		igt_warn("Failed to create udev context\n");
> >> +		return;
> >> +	}
> >> +
> >> +	enumerate = udev_enumerate_new(udev);
> >> +	if (!enumerate) {
> >> +		igt_warn("Failed to create udev enumerate\n");
> >> +		udev_unref(udev);
> >> +		return;
> >> +	}
> >> +
> >> +	ret = udev_enumerate_add_match_subsystem(enumerate, "pci");
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_add_match_property(enumerate,
> >> +						"PCI_CLASS",
> >> +						"30000");
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_add_match_property(enumerate,
> >> +						"PCI_CLASS",
> >> +						"38000");
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_scan_devices(enumerate);
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	devices = udev_enumerate_get_list_entry(enumerate);
> >> +	if (!devices)
> >> +		goto out;
> >> +
> >> +	igt_facts_list_mark(head);
> >> +
> >> +	udev_list_entry_foreach(dev_list_entry, devices) {
> >> +		const char *path;
> >> +		struct udev_device *udev_dev;
> >> +		struct udev_list_entry *entry;
> >> +		char *model = NULL;
> >> +		char *codename = NULL;
> >> +		igt_fact *old_fact = NULL;
> >> +
> >> +		path = udev_list_entry_get_name(dev_list_entry);
> >> +		udev_dev = udev_device_new_from_syspath(udev, path);
> >> +		if (!udev_dev)
> >> +			continue;
> >> +
> >> +		/* Strip path to only the content after the last / */
> >> +		path = strrchr(path, '/');
> >> +		if (path)
> >> +			path++;
> >> +		else
> >> +			path = "unknown";
> >> +
> >> +		strcpy(card.pci_slot_name, "-");
> >> +
> >> +		entry = udev_device_get_properties_list_entry(udev_dev);
> >> +		while (entry) {
> >> +			const char *name = udev_list_entry_get_name(entry);
> >> +			const char *value = udev_list_entry_get_value(entry);
> >> +
> >> +			entry = udev_list_entry_get_next(entry);
> >> +			if (!strcmp(name, "ID_MODEL_FROM_DATABASE"))
> >> +				model = strdup(value);
> >> +			else if (!strcmp(name, "PCI_ID"))
> >> +				igt_assert_eq(sscanf(value, "%hx:%hx",
> >> +						     &card.pci_vendor,
> >> +						     &card.pci_device), 2);
> >> +		}
> >> +		snprintf(pcistr, sizeof(pcistr), "%04x:%04x",
> >> +			 card.pci_vendor, card.pci_device);
> >> +		codename = igt_device_get_pretty_name(&card, false);
> >> +
> >> +		/* Set codename to null if it is the same string as pci_id */
> >> +		if (codename && strcmp(pcistr, codename) == 0) {
> >> +			free(codename);
> >> +			codename = NULL;
> >> +		}
> >> +		asprintf(&factname, "%s.%s", pci_gpu_fact, path);
> >> +		asprintf(&factvalue,
> >> +			"%s %s %s",
> >> +			pcistr,
> >> +			codename ? codename : "",
> >> +			model ? model : "");
> >> +
> >> +		/**
> >> +		 * Loading and unloading the kmod may change the human
> >> +		 * readeable string in value. Do not change value if the
> >> +		 * pci id is the same.
> >> +		 */
> >> +		old_fact = igt_facts_list_get(factname, head);
> >> +		if (old_fact && strncmp(old_fact->value, factvalue, 9) == 0)
> >> +			old_fact->present = true;
> >> +		else
> >> +			igt_facts_list_add(factname, factvalue, last_test, head);
> >> +
> >> +		free(codename);
> >> +		free(model);
> >> +		free(factname);
> >> +		free(factvalue);
> >> +		udev_device_unref(udev_dev);
> >> +	}
> >> +
> >> +	igt_facts_list_sweep(head, last_test);
> >> +
> >> +out:
> >> +	udev_enumerate_unref(enumerate);
> >> +	udev_unref(udev);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_scan_pci_drm_cards:
> >> + * @last_test: name of the last test
> >> + *
> >> + * This function scans the pci bus for drm cards using udev. It uses the
> >> + * igt_facts_list_mark(), igt_facts_list_add() and igt_facts_list_sweep() to
> >> + * update igt_facts_list_drm_card_head.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_scan_pci_drm_cards(const char *last_test)
> >> +{
> >> +	static struct igt_list_head *head = &igt_facts_list_drm_card_head;
> >> +	struct udev *udev = NULL;
> >> +	struct udev_enumerate *enumerate = NULL;
> >> +	struct udev_list_entry *devices, *dev_list_entry;
> >> +	int ret;
> >> +	char *factname = NULL;
> >> +	char *factvalue = NULL;
> >> +
> >> +	udev = udev_new();
> >> +	if (!udev)
> >> +		return;
> >> +
> >> +	enumerate = udev_enumerate_new(udev);
> >> +	if (!enumerate) {
> >> +		udev_unref(udev);
> >> +		return;
> >> +	}
> >> +
> >> +	ret = udev_enumerate_add_match_subsystem(enumerate, "drm");
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_scan_devices(enumerate);
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	devices = udev_enumerate_get_list_entry(enumerate);
> >> +	if (!devices)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_add_match_subsystem(enumerate, "drm");
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	ret = udev_enumerate_scan_devices(enumerate);
> >> +	if (ret < 0)
> >> +		goto out;
> >> +
> >> +	devices = udev_enumerate_get_list_entry(enumerate);
> >> +	if (!devices)
> >> +		goto out;
> >> +
> >> +	igt_facts_list_mark(head);
> >> +
> >> +	udev_list_entry_foreach(dev_list_entry, devices) {
> >> +		const char *path;
> >> +		struct udev_device *drm_dev, *pci_dev;
> >> +		const char *drm_name, *pci_addr;
> >> +
> >> +		path = udev_list_entry_get_name(dev_list_entry);
> >> +		drm_dev = udev_device_new_from_syspath(udev, path);
> >> +		if (!drm_dev)
> >> +			continue;
> >> +
> >> +		drm_name = udev_device_get_sysname(drm_dev);
> >> +		/* Filter the device by name. Want devices such as card0 and card1.
> >> +		 * If the device has '-' in the name, contine
> >> +		 */
> >> +		if (strncmp(drm_name, "card", 4) != 0 ||
> >> +		    strchr(drm_name, '-') != NULL) {
> >> +			udev_device_unref(drm_dev);
> >> +			continue;
> >> +		}
> >> +
> >> +		/* Get the pci address of the gpu associated with the drm_dev*/
> >> +		pci_dev = udev_device_get_parent_with_subsystem_devtype(drm_dev,
> >> +									"pci",
> >> +									NULL);
> >> +		if (pci_dev) {
> >> +			pci_addr = udev_device_get_sysattr_value(pci_dev,
> >> +								 "address");
> >> +			if (!pci_addr)
> >> +				pci_addr = udev_device_get_sysname(pci_dev);
> >> +		} else {
> >> +			/* Some GPUs are platform devices. Ignore them. */
> >> +			pci_addr = NULL;
> >> +			udev_device_unref(drm_dev);
> >> +			continue;
> >> +		}
> >> +
> >> +		asprintf(&factname, "%s.%s", drm_card_fact, pci_addr);
> >> +		asprintf(&factvalue, "%s", drm_name);
> >> +
> >> +		igt_facts_list_add(factname, factvalue, last_test, head);
> >> +
> >> +		free(factname);
> >> +		free(factvalue);
> >> +		udev_device_unref(drm_dev);
> >> +	}
> >> +
> >> +	igt_facts_list_sweep(head, last_test);
> >> +
> >> +out:
> >> +	udev_enumerate_unref(enumerate);
> >> +	udev_unref(udev);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_scan_kernel_taints:
> >> + * @last_test: name of the last test
> >> + *
> >> + * This function scans for kernel taints using igt_kernel_tainted() and
> >> + * igt_explain_taints(). It will cut off the explanation keeping only the
> >> + * taint name.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_scan_kernel_taints(const char *last_test)
> >> +{
> >> +	static struct igt_list_head *head = &igt_facts_list_ktaint_head;
> >> +	unsigned long taints = 0;
> >> +	const char *reason = NULL;
> >> +	char *taint_name = NULL;
> >> +	char *fact_name = NULL;
> >> +
> >> +	taints = igt_kernel_tainted(&taints);
> >> +	/* For testing, set all bits to 1
> >> +	 * taints = 0xFFFFFFFF;
> >> +	 */
> >> +
> >> +
> >> +	igt_facts_list_mark(head);
> >> +
> >> +	while ((reason = igt_explain_taints(&taints)) != NULL) {
> >> +		/* Cut at the ':' to get only the taint name */
> >> +		taint_name = strtok(strdup(reason), ":");
> >> +		if (!taint_name)
> >> +			continue;
> >> +
> >> +		/* Lowercase taint_name */
> >> +		for (int i = 0; taint_name[i]; i++)
> >> +			taint_name[i] = tolower(taint_name[i]);
> >> +
> >> +		asprintf(&fact_name, "%s.%s", ktaint_fact, taint_name);
> >> +		igt_facts_list_add(fact_name, "true", last_test, head);
> >> +
> >> +		free(taint_name);
> >> +		free(fact_name);
> >> +	}
> >> +
> >> +	igt_facts_list_sweep(head, last_test);
> >> +}
> >> +
> >> +
> >> +/**
> >> + * igt_facts_scan_kernel_loaded_kmods:
> >> + * @last_test: name of the last test
> >> + *
> >> + * This function scans for loaded kmods using igt_fact_kmod_list and
> >> + * igt_kmod_is_loaded().
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_scan_kernel_loaded_kmods(const char *last_test)
> >> +{
> >> +	static struct igt_list_head *head = &igt_facts_list_kmod_head;
> >> +	char *name = NULL;
> >> +
> >> +	igt_facts_list_mark(head);
> >> +
> >> +	/* Iterate over igt_fact_kmod_list[] until the element contains "\0" */
> >> +	for (int i = 0; strcmp(igt_fact_kmod_list[i], "\0") != 0; i++) {
> >> +		asprintf(&name, "%s.%s", kmod_fact, igt_fact_kmod_list[i]);
> >> +		if (igt_kmod_is_loaded(igt_fact_kmod_list[i]))
> >> +			igt_facts_list_add(name, "true", last_test, head);
> >> +
> >> +		free(name);
> >> +	}
> >> +
> >> +	igt_facts_list_sweep(head, last_test);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts:
> >> + * @last_test: name of the last test
> >> + *
> >> + * Call this function where you want to gather and report facts.
> >> + *
> >> + * Returns: void
> >> + */
> >> +void igt_facts(const char *last_test)
> >> +{
> >> +	igt_facts_scan_pci_gpus(last_test);
> >> +	igt_facts_scan_pci_drm_cards(last_test);
> >> +	igt_facts_scan_kernel_taints(last_test);
> >> +	igt_facts_scan_kernel_loaded_kmods(last_test);
> >> +
> >> +	fflush(stdout);
> >> +	fflush(stderr);
> >> +}
> >> +
> >> +/*
> >> + * Testing
> >> + *
> >> + * Defined here to keep most of the functions static
> >> + *
> >> + */
> >> +
> >> +/**
> >> + * igt_facts_test_add_get:
> >> + * @head: head of the list
> >> + *
> >> + * Tests igt_facts_list_add and igt_facts_list_get.
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_test_add_get(struct igt_list_head *head)
> >> +{
> >> +	igt_fact *fact = NULL;
> >> +	bool ret;
> >> +	const char *name = "hardware.pci.gpu_at_addr.0000:00:02.0";
> >> +	const char *value = "8086:64a0 Intel Lunarlake (Gen20)";
> >> +	const char *last_test = NULL;
> >> +
> >> +	ret = igt_facts_list_add(name, value, last_test, head);
> >> +	igt_assert(ret == true);
> >> +
> >> +	// Assert that there is one element in the linked list
> >> +	igt_assert_eq(igt_list_length(head), 1);
> >> +
> >> +	// Assert that the element in the linked list is the one we added
> >> +	fact = igt_facts_list_get(name, head);
> >> +	igt_assert(fact != NULL);
> >> +	igt_assert_eq(strcmp(fact->name, name), 0);
> >> +	igt_assert_eq(strcmp(fact->value, value), 0);
> >> +	igt_assert(fact->present == true);
> >> +	igt_assert(fact->last_test == NULL);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_test_mark_and_sweep:
> >> + * @head: head of the list
> >> + *
> >> + * - Add 3 elements to the list and mark them as not present.
> >> + * - Update two of the elements and mark them as present.
> >> + * - Sweep the list and assert that
> >> + *   - Only the two updated elements are present
> >> + *   - The third element was deleted
> >> + *
> >> + * Returns: void
> >> + */
> >> +static void igt_facts_test_mark_and_sweep(struct igt_list_head *head)
> >> +{
> >> +	igt_fact *fact = NULL;
> >> +	const char *name1 = "hardware.pci.gpu_at_addr.0000:00:02.0";
> >> +	const char *value1 = "8086:64a0 Intel Lunarlake (Gen20)";
> >> +	const char *name2 = "hardware.pci.gpu_at_addr.0000:00:03.0";
> >> +	const char *value2 = "8086:64a1 Intel Lunarlake (Gen21)";
> >> +	const char *name3 = "hardware.pci.gpu_at_addr.0000:00:04.0";
> >> +	const char *value3 = "8086:64a2 Intel Lunarlake (Gen22)";
> >> +
> >> +	igt_facts_list_add(name1, value1, NULL, head);
> >> +	igt_facts_list_add(name2, value2, NULL, head);
> >> +	igt_facts_list_add(name3, value3, NULL, head);
> >> +
> >> +	igt_facts_list_mark(head);
> >> +
> >> +	igt_facts_list_add(name1, value1, NULL, head);
> >> +	igt_facts_list_add(name2, value2, NULL, head);
> >> +
> >> +	igt_facts_list_sweep(head, NULL);
> >> +
> >> +	// Assert that there are two elements in the linked list
> >> +	igt_assert_eq(igt_list_length(head), 2);
> >> +
> >> +	// Assert that the two updated elements are present
> >> +	fact = igt_facts_list_get(name1, head);
> >> +	igt_assert(fact != NULL);
> >> +	igt_assert(fact->present == true);
> >> +
> >> +	fact = igt_facts_list_get(name2, head);
> >> +	igt_assert(fact != NULL);
> >> +	igt_assert(fact->present == true);
> >> +
> >> +	// Assert that the third element was deleted
> >> +	fact = igt_facts_list_get(name3, head);
> >> +	igt_assert(fact == NULL);
> >> +}
> >> +
> >> +/**
> >> + * igt_facts_test:
> >> + *
> >> + * Main function for testing the igt_facts module
> >> + *
> >> + * Returns: bool indicating if the tests passed
> >> + */
> >> +void igt_facts_test(void)
> >> +{
> >> +	const char *last_test = "Unit Testing";
> >> +
> >> +	igt_facts_lists_init();
> >> +
> >> +	/* Assert that all lists are empty */
> >> +	igt_assert(igt_list_empty(&igt_facts_list_kmod_head));
> >> +	igt_assert(igt_list_empty(&igt_facts_list_ktaint_head));
> >> +	igt_assert(igt_list_empty(&igt_facts_list_pci_gpu_head));
> >> +	igt_assert(igt_list_empty(&igt_facts_list_drm_card_head));
> >> +
> >> +	/* Assert that add and get work. Will add one element to the list */
> >> +	igt_facts_test_add_get(&igt_facts_list_pci_gpu_head);
> >> +
> >> +	/* Assert that igt_facts_list_mark_and_sweep() cleans up the list */
> >> +	igt_assert(igt_list_empty(&igt_facts_list_pci_gpu_head) == false);
> >> +	igt_facts_list_mark_and_sweep(&igt_facts_list_pci_gpu_head);
> >> +	igt_assert(igt_list_empty(&igt_facts_list_pci_gpu_head) == true);
> >> +
> >> +	/* Test the mark and sweep pattern used to delete elements
> >> +	 * from the list
> >> +	 */
> >> +	igt_facts_test_mark_and_sweep(&igt_facts_list_pci_gpu_head);
> >> +
> >> +	/* Clean up the list and call igt_facts(). This should not crash */
> >> +	igt_facts_list_mark_and_sweep(&igt_facts_list_pci_gpu_head);
> >> +	igt_facts(last_test);
> >> +}
> >> diff --git a/lib/igt_facts.h b/lib/igt_facts.h
> >> new file mode 100644
> >> index 000000000..e4adca3fb
> >> --- /dev/null
> >> +++ b/lib/igt_facts.h
> >> @@ -0,0 +1,47 @@
> >> +/* SPDX-License-Identifier: MIT
> >> + * Copyright © 2024 Intel Corporation
> >> + */
> >> +
> >> +#include <stdbool.h>
> >> +
> >> +#include "igt_list.h"
> >> +
> >> +
> >> +/* igt_fact:
> >> + * @name: name of the fact
> >> + * @value: value of the fact
> >> + * @last_test: name of the test that triggered the fact
> >> + * @present: bool indicating if fact is present. Used for deleting facts from
> >> + * the list.
> >> + * @link: link to the next fact
> >> + *
> >> + * A fact is a piece of information that can be used to determine the state of
> >> + * the system.
> >> + *
> >> + */
> >> +typedef struct {
> >> +	char *name;
> >> +	char *value;
> >> +	char *last_test;
> >> +	bool present; /* For mark and seep */
> >> +	struct igt_list_head link;
> >> +} igt_fact;
> >> +
> >> +const char *igt_fact_kmod_list[] = {
> >> +	"amdgpu",
> >> +	"i915",
> >> +	"nouveau",
> >> +	"radeon",
> >> +	"xe",
> >> +	"\0"
> >> +};
> >> +
> >> +const char *kmod_fact     = "kernel.kmod_is_loaded"; /* true or false */
> >> +const char *ktaint_fact   = "kernel.is_tainted"; /* taint name: taint_warn */
> >> +const char *pci_gpu_fact  = "hardware.pci.gpu_at_addr"; /* id vendor model */
> >> +const char *drm_card_fact = "hardware.pci.drm_card_at_addr"; /* cardX */
> >> +
> >> +void igt_facts_lists_init(void);
> >> +void igt_facts(const char *last_test);
> >> +bool igt_facts_are_all_lists_empty(void);
> >> +void igt_facts_test(void); /* For unit testing only */
> >> diff --git a/lib/meson.build b/lib/meson.build
> >> index c3556a921..c44ca2b5a 100644
> >> --- a/lib/meson.build
> >> +++ b/lib/meson.build
> >> @@ -18,6 +18,7 @@ lib_sources = [
> >>  	'i915/i915_crc.c',
> >>  	'igt_collection.c',
> >>  	'igt_color_encoding.c',
> >> +	'igt_facts.c',
> >>  	'igt_crc.c',
> >>  	'igt_debugfs.c',
> >>  	'igt_device.c',
> >> diff --git a/lib/tests/igt_facts.c b/lib/tests/igt_facts.c
> >> new file mode 100644
> >> index 000000000..7fa9d0f22
> >> --- /dev/null
> >> +++ b/lib/tests/igt_facts.c
> >> @@ -0,0 +1,15 @@
> >> +// SPDX-License-Identifier: MIT
> >> +// Copyright © 2024 Intel Corporation
> >> +
> >> +#include <stdbool.h>
> >> +
> >> +#include "igt_core.h"
> >> +#include "igt_facts.h"
> >> +
> >> +/* Tests are not defined here so we can keep most of the functions static */
> >> +
> >> +igt_simple_main
> >> +{
> >> +	igt_info("Running igt_facts_test\n");
> >> +	igt_facts_test();
> >> +}
> >> diff --git a/lib/tests/meson.build b/lib/tests/meson.build
> >> index df8092638..1ce19f63c 100644
> >> --- a/lib/tests/meson.build
> >> +++ b/lib/tests/meson.build
> >> @@ -8,6 +8,7 @@ lib_tests = [
> >>  	'igt_dynamic_subtests',
> >>  	'igt_edid',
> >>  	'igt_exit_handler',
> >> +	'igt_facts',
> >>  	'igt_fork',
> >>  	'igt_fork_helper',
> >>  	'igt_hook',
> >> diff --git a/runner/executor.c b/runner/executor.c
> >> index ac73e1dde..d1eca3c05 100644
> >> --- a/runner/executor.c
> >> +++ b/runner/executor.c
> >> @@ -30,6 +30,7 @@
> >>  
> >>  #include "igt_aux.h"
> >>  #include "igt_core.h"
> >> +#include "igt_facts.h"
> >>  #include "igt_taints.h"
> >>  #include "igt_vec.h"
> >>  #include "executor.h"
> >> @@ -2306,6 +2307,9 @@ bool execute(struct execute_state *state,
> >>  	sigset_t sigmask;
> >>  	double time_spent = 0.0;
> >>  	bool status = true;
> >> +	char *last_test = NULL;
> >> +
> >> +	igt_facts_lists_init();
> >>  
> >>  	if (state->dry) {
> >>  		outf("Dry run, not executing. Invoke igt_resume if you want to execute.\n");
> >> @@ -2438,6 +2442,10 @@ bool execute(struct execute_state *state,
> >>  		int result;
> >>  		bool already_written = false;
> >>  
> >> +		/* Calls before running each test */
> >> +		igt_facts(last_test);
> >> +		last_test = entry_display_name(&job_list->entries[state->next]);
> >> +
> >>  		if (should_die_because_signal(sigfd)) {
> >>  			status = false;
> >>  			goto end;
> >> @@ -2526,6 +2534,8 @@ bool execute(struct execute_state *state,
> >>  			return execute(state, settings, job_list);
> >>  		}
> >>  	}
> >> +	/* Last call to collect facts after the last test runs */
> >> +	igt_facts(last_test);
> >>  
> >>  	if ((timefd = openat(resdirfd, "endtime.txt", O_CREAT | O_WRONLY | O_EXCL, 0666)) >= 0) {
> >>  		dprintf(timefd, "%f\n", timeofday_double());
> >> diff --git a/tools/lsfacts.c b/tools/lsfacts.c
> >> new file mode 100644
> >> index 000000000..10dee0317
> >> --- /dev/null
> >> +++ b/tools/lsfacts.c
> >> @@ -0,0 +1,25 @@
> >> +// SPDX-License-Identifier: MIT
> >> +// Copyright © 2024 Intel Corporation
> >> +
> >> +#include "igt.h"
> >> +#include "igt_facts.h"
> >> +
> >> +/**
> >> + * SECTION:lsfacts
> >> + * @short_description: lsfacts
> >> + * @title: lsfacts
> >> + * @include: lsfacts.c
> >> + *
> >> + * # lsfacts
> >> + *
> >> + * Scan for igt-facts and print them on screen. Indicate if no facts are found.
> >> + */
> >> +int main(int argc, char *argv[])
> >> +{
> >> +	igt_facts_lists_init();
> >> +
> >> +	igt_facts("lsfacts");
> >> +
> >> +	if (igt_facts_are_all_lists_empty())
> >> +		igt_info("No facts found...\n");
> >> +}
> >> diff --git a/tools/meson.build b/tools/meson.build
> >> index 48c9a4b50..ff1b0ef90 100644
> >> --- a/tools/meson.build
> >> +++ b/tools/meson.build
> >> @@ -42,6 +42,7 @@ tools_progs = [
> >>  	'intel_gem_info',
> >>  	'intel_gvtg_test',
> >>  	'dpcd_reg',
> >> +	'lsfacts',
> >>  	'lsgpu',
> >>  	'power',
> >>  ]
> >>
> > 
> > 
> > 
> > 
> 
> 






More information about the igt-dev mailing list