[PATCH i-g-t] igt-runner fact checking
Lucas De Marchi
lucas.demarchi at intel.com
Wed Nov 6 14:13:59 UTC 2024
On Sat, Nov 02, 2024 at 12:37:59PM +0100, Peter Senna Tschudin wrote:
>On igt-runner, before each test, and after the last test collect and report
>facts, such as:
> - hardware.pci.gpu_at_addr.0000:03:00.0 = 8086:e20b Intel Battlemage (Gen20)
> - hardware.pci.drm_card_at_addr.0000:03:00.0 = card1
> - kernel.kmod_is_loaded.i915 = true
>
>Reports new, deleted and changed facts. Here is an example:
> $ cat test-list
> igt at core_hotunplug@hotrebind
> igt at xe_module_load@reload-no-display
>
> $ ... ./build/runner/igt_runner ...
> [51.225490] [FACT before any test] new hardware.pci.gpu_at_addr.0000:03:00.0 = 8086:e20b Intel Battlemage (Gen20)
> [51.225700] [FACT before any test] new hardware.pci.gpu_at_addr.0000:00:02.0 = 8086:a782 Intel Raptorlake_s (Gen12)
> [51.227334] [1/2] core_hotunplug (hotrebind)
> [59.429470] [FACT core_hotunplug (hotrebind)] new hardware.pci.drm_card_at_addr.0000:03:00.0 = card1
> [59.429486] [FACT core_hotunplug (hotrebind)] new hardware.pci.drm_card_at_addr.0000:00:02.0 = card2
> [59.430002] [FACT core_hotunplug (hotrebind)] new kernel.kmod_is_loaded.amdgpu = true
> [59.430011] [FACT core_hotunplug (hotrebind)] new kernel.kmod_is_loaded.i915 = true
> [59.430014] [FACT core_hotunplug (hotrebind)] new kernel.kmod_is_loaded.xe = true
> [59.430456] [2/2] xe_module_load (reload-no-display)
> [61.521348] [FACT xe_module_load (reload-no-display)] deleted hardware.pci.drm_card_at_addr.0000:03:00.0 = card1
> [61.521834] [FACT xe_module_load (reload-no-display)] deleted kernel.kmod_is_loaded.xe = true
> Done.
>
>As igt at core_hotunplug@hotrebind makes changes to the system, running the same
>test-list again produces different results:
>
> $ ... ./build/runner/igt_runner ...
> [141.870776] [FACT before any test] new hardware.pci.gpu_at_addr.0000:03:00.0 = 8086:e20b Intel Battlemage (Gen20)
> [141.870820] [FACT before any test] new hardware.pci.gpu_at_addr.0000:00:02.0 = 8086:a782 Intel Raptorlake_s (Gen12)
> [141.872987] [FACT before any test] new hardware.pci.drm_card_at_addr.0000:00:02.0 = card2
> [141.874137] [FACT before any test] new kernel.kmod_is_loaded.amdgpu = true
> [141.874144] [FACT before any test] new kernel.kmod_is_loaded.i915 = true
> [141.874596] [1/2] core_hotunplug (hotrebind)
> [146.662913] [FACT core_hotunplug (hotrebind)] changed hardware.pci.drm_card_at_addr.0000:00:02.0: card2 -> card0
> [146.663850] [2/2] xe_module_load (reload-no-display)
> Done.
>
>TODO
> - Save logs to disk.
>
>Signed-off-by: Peter Senna Tschudin <peter.senna at linux.intel.com>
>---
> lib/igt_facts.c | 443 ++++++++++++++++++++++++++++++++++++++++++++++
> lib/igt_facts.h | 51 ++++++
> lib/meson.build | 1 +
> runner/executor.c | 9 +
> 4 files changed, 504 insertions(+)
> create mode 100644 lib/igt_facts.c
> create mode 100644 lib/igt_facts.h
>
>diff --git a/lib/igt_facts.c b/lib/igt_facts.c
>new file mode 100644
>index 000000000..90b362c5e
>--- /dev/null
>+++ b/lib/igt_facts.c
>@@ -0,0 +1,443 @@
>+// SPDX-License-Identifier: MIT
>+
>+/*
>+ * Copyright © 2024 Intel Corporation
>+ *
>+ * Permission is hereby granted, free of charge, to any person obtaining a
>+ * copy of this software and associated documentation files (the "Software"),
>+ * to deal in the Software without restriction, including without limitation
>+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>+ * and/or sell copies of the Software, and to permit persons to whom the
>+ * Software is furnished to do so, subject to the following conditions:
>+ *
>+ * The above copyright notice and this permission notice (including the next
>+ * paragraph) shall be included in all copies or substantial portions of the
>+ * Software.
>+ *
>+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>+ * IN THE SOFTWARE.
we shouldn't be copying this in new files. Just SPDX + Copyright is
sufficient.
>+ *
>+ */
>+
>+#include <stdio.h>
>+#include <glib.h>
>+#include <libudev.h>
>+#include "igt_core.h"
>+#include "igt_kmod.h"
>+#include "igt_facts.h"
>+#include "igt_device_scan.h"
>+
>+
>+/* Report new facts and fact changes
>+ * - new: new_value is NULL
>+ * - change: new_value is not NULL
>+ * - delete: delete is true
>+ */
>+static void report_fact_change(igt_fact *fact, const char *new_value,
>+ const char *last_test, bool delete)
>+{
>+ struct timespec uptime_ts;
>+ char *uptime = NULL;
>+
>+ if (last_test == NULL)
>+ last_test = g_strdup("before any test");
why are we using glib? I only see functions that could be replaced by
libc like strdup() and asprintf()
>+
>+ if (clock_gettime(CLOCK_BOOTTIME, &uptime_ts) != 0)
>+ return;
>+
>+ uptime = g_strdup_printf("%ld.%06ld",
>+ uptime_ts.tv_sec,
>+ uptime_ts.tv_nsec / 1000);
>+
>+ /* If delete is true, the fact was deleted */
>+ if (delete) {
>+ igt_info("[%s] [FACT %s] deleted %s = %s\n",
>+ uptime, last_test, fact->name, fact->value);
>+ return;
>+ }
>+
>+ /* If new_value is NULL, it is a new fact */
>+ if (new_value == NULL) {
>+ igt_info("[%s] [FACT %s] new %s = %s\n",
>+ uptime, last_test, fact->name, fact->value);
>+ return;
>+ }
>+
>+ /* Check if the value changed */
>+ if (strcmp(fact->value, new_value) != 0) {
>+ igt_info("[%s] [FACT %s] changed %s: %s -> %s\n",
>+ uptime, last_test, fact->name, fact->value, new_value);
>+ }
>+ g_free(uptime);
>+}
>+
>+/* Get a fact by name. Return NULL if fact is not found. */
>+static igt_fact *get_fact(igt_fact_list *list, const char *name)
>+{
>+ if (!list || list->facts == NULL || list->num_facts == 0)
>+ return NULL;
>+
>+ for (int i = 0; i < list->num_facts; i++) {
>+ if (list->facts[i].name == NULL)
>+ continue;
>+
>+ if (strcmp(list->facts[i].name, name) == 0)
>+ return &list->facts[i];
>+ }
>+ return NULL;
>+}
>+
>+static bool delete_fact(igt_fact_list *list, const char *name, const char *last_test)
>+{
>+ igt_fact *fact = NULL;
>+ int i;
>+
>+ fact = get_fact(list, name);
>+ if (fact == NULL)
>+ return false;
>+
>+ /* Report the deletion */
>+ report_fact_change(fact, NULL, last_test, true);
>+
>+ /* Move all facts after the one to be deleted one step back */
>+ for (i = 0; i < list->num_facts; i++) {
>+ if (strcmp(list->facts[i].name, name) == 0) {
>+ g_free(list->facts[i].name);
>+ g_free(list->facts[i].value);
>+
>+ for (int j = i; j < list->num_facts - 1; j++)
>+ list->facts[j] = list->facts[j + 1];
>+ break;
>+ }
>+ }
>+ list->num_facts--;
>+ list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>+
>+ return true;
>+}
>+
>+/* Only used while creating the new_list. Not intended to add facts to the
>+ * global fact list.
>+ */
>+static bool set_fact(igt_fact_list *list, const char *name,
>+ const char *value, const char *last_test)
>+{
>+ igt_fact *fact = NULL;
>+
>+ /* Check that name and value are not null */
>+ if (name == NULL || value == NULL)
>+ return false;
>+
>+ fact = get_fact(list, name);
>+ if (fact != NULL)
>+ return false; /* Should not happen */
>+
>+ /* Add a new fact */
>+ list->num_facts++;
>+ list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>+ list->facts[list->num_facts - 1].name = g_strdup(name);
>+ list->facts[list->num_facts - 1].value = g_strdup(value);
>+
>+ if (last_test)
>+ list->facts[list->num_facts - 1].last_test = g_strdup(last_test);
>+ else
>+ list->facts[list->num_facts - 1].last_test = NULL;
>+
>+ return true;
>+}
>+
>+/* Add a new fact or update the value of an existing fact. Return false if
>+ * the value is the same as the existing one.
>+ */
>+static bool update_list(igt_fact_list *list, const char *name,
>+ const char *value, const char *last_test)
>+{
>+ igt_fact *fact = NULL;
>+
>+ /* Check that name and value are not null */
>+ if (name == NULL || value == NULL)
>+ return false;
>+
>+ fact = get_fact(list, name);
>+ if (fact != NULL) {
>+ /* Return false if fact->value equals value */
>+ if (strcmp(fact->value, value) == 0)
>+ return false; /* Not changed */
>+
>+ report_fact_change(fact, value, last_test, false);
>+
>+ /* Update the value */
>+ fact->value = g_strdup(value);
>+ return true;
>+ }
>+
>+ /* Add a new fact */
>+ list->num_facts++;
>+ list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>+ list->facts[list->num_facts - 1].name = g_strdup(name);
>+ list->facts[list->num_facts - 1].value = g_strdup(value);
>+
>+ /* Report new fact */
>+ report_fact_change(&list->facts[list->num_facts - 1], NULL,
>+ last_test, false);
>+
>+ return true;
>+}
>+
>+/* Merge list and new_list. fact_group is used to filer entries on list
>+ * that should be deleted. For example, if fact_group is "pci_gpu", we
>+ * can delete the fact hardware.pci.gpu_at.0000:00:02.0 if it doesn't
>+ * exist in new_list.
>+ */
>+static void merge_facts(igt_fact_list *list, igt_fact_list *new_list,
>+ const char *fact_group, const char *last_test)
>+{
>+ /* Filter by fact_group and delete facts that exist on list
>+ * but not on new_list
>+ */
>+ for (int i = 0; i < list->num_facts; i++) {
>+ if (strstr(list->facts[i].name, fact_group) == NULL)
>+ continue;
>+ if (get_fact(new_list, list->facts[i].name) == NULL)
>+ delete_fact(list, list->facts[i].name, last_test);
>+ }
>+
>+ /* Add and update facts from new_list to list */
>+ for (int i = 0; i < new_list->num_facts; i++) {
>+ if (strstr(new_list->facts[i].name, fact_group) == NULL)
>+ continue; /* Should never happen */
>+ update_list(list, new_list->facts[i].name, new_list->facts[i].value,
>+ new_list->facts[i].last_test);
>+ }
>+}
>+
>+static void scan_pci_for_gpus(igt_fact_list *list, const char *last_test)
>+{
>+ struct udev *udev = NULL;
>+ struct udev_enumerate *enumerate = NULL;
>+ struct udev_list_entry *devices, *dev_list_entry;
>+ struct igt_device_card card;
>+ char pcistr[10];
>+ int ret;
>+ char *factname = NULL;
>+ char *factvalue = NULL;
>+ igt_fact_list new_list = {0};
>+
>+ udev = udev_new();
>+ if (!udev) {
>+ igt_warn("Failed to create udev context\n");
>+ return;
>+ }
>+
>+ enumerate = udev_enumerate_new(udev);
>+ if (!enumerate) {
>+ igt_warn("Failed to create udev enumerate\n");
>+ udev_unref(udev);
>+ return;
>+ }
>+
>+ ret = udev_enumerate_add_match_subsystem(enumerate, "pci");
>+ if (ret < 0)
>+ goto out;
>+
>+ ret = udev_enumerate_add_match_property(enumerate, "PCI_CLASS", "30000");
>+ if (ret < 0)
>+ goto out;
>+
>+ ret = udev_enumerate_add_match_property(enumerate, "PCI_CLASS", "38000");
>+ if (ret < 0)
>+ goto out;
>+
>+ ret = udev_enumerate_scan_devices(enumerate);
>+ if (ret < 0)
>+ goto out;
>+
>+ devices = udev_enumerate_get_list_entry(enumerate);
>+ if (!devices)
>+ goto out;
>+
>+ udev_list_entry_foreach(dev_list_entry, devices) {
>+ const char *path;
>+ struct udev_device *udev_dev;
>+ struct udev_list_entry *entry;
>+ char *model = NULL;
>+ char *codename = NULL;
>+
>+ path = udev_list_entry_get_name(dev_list_entry);
>+ udev_dev = udev_device_new_from_syspath(udev, path);
>+ if (!udev_dev)
>+ continue;
>+
>+ /* Strip path to only the content after the last / */
>+ path = strrchr(path, '/');
>+ if (path)
>+ path++;
>+ else
>+ path = "unknown";
>+
>+ strcpy(card.pci_slot_name, "-");
>+
>+ entry = udev_device_get_properties_list_entry(udev_dev);
>+ while (entry) {
>+ const char *name = udev_list_entry_get_name(entry);
>+ const char *value = udev_list_entry_get_value(entry);
>+
>+ entry = udev_list_entry_get_next(entry);
>+ if (!strcmp(name, "ID_MODEL_FROM_DATABASE"))
>+ model = g_strdup(value);
>+ else if (!strcmp(name, "PCI_ID"))
>+ igt_assert_eq(sscanf(value, "%hx:%hx",
>+ &card.pci_vendor, &card.pci_device), 2);
>+ }
>+ snprintf(pcistr, sizeof(pcistr), "%04x:%04x",
>+ card.pci_vendor, card.pci_device);
>+ codename = igt_device_get_pretty_name(&card, false);
>+
>+ factname = g_strdup_printf("%s.%s", pci_gpu_fact, path);
>+ factvalue = g_strdup_printf("%s %s %s",
>+ pcistr,
>+ codename ? codename : "",
>+ model ? model : "");
>+
>+ set_fact(&new_list, factname, factvalue, last_test);
>+
>+ free(codename);
>+ free(model);
>+ udev_device_unref(udev_dev);
>+ }
>+ merge_facts(list, &new_list, pci_gpu_fact, last_test);
>+
>+out:
>+ udev_enumerate_unref(enumerate);
>+ udev_unref(udev);
>+}
>+
>+static void scan_pci_drm_cards(igt_fact_list *list, const char *last_test)
>+{
>+ struct udev *udev = NULL;
>+ struct udev_enumerate *enumerate = NULL;
>+ struct udev_list_entry *devices, *dev_list_entry;
>+ int ret;
>+ char *factname = NULL;
>+ char *factvalue = NULL;
>+ igt_fact_list new_list = {0};
>+
>+ udev = udev_new();
>+ if (!udev)
>+ return;
>+
>+ enumerate = udev_enumerate_new(udev);
>+ if (!enumerate) {
>+ udev_unref(udev);
>+ return;
>+ }
>+
>+ ret = udev_enumerate_add_match_subsystem(enumerate, "drm");
>+ if (ret < 0)
>+ goto out;
>+
>+ ret = udev_enumerate_scan_devices(enumerate);
>+ if (ret < 0)
>+ goto out;
>+
>+ devices = udev_enumerate_get_list_entry(enumerate);
>+ if (!devices)
>+ goto out;
>+
>+ ret = udev_enumerate_add_match_subsystem(enumerate, "drm");
>+ if (ret < 0)
>+ goto out;
>+
>+ ret = udev_enumerate_scan_devices(enumerate);
>+ if (ret < 0)
>+ goto out;
>+
>+ devices = udev_enumerate_get_list_entry(enumerate);
>+ if (!devices)
>+ goto out;
>+
>+ udev_list_entry_foreach(dev_list_entry, devices) {
>+ const char *path;
>+ struct udev_device *drm_dev, *pci_dev;
>+ const char *drm_name, *pci_addr;
>+
>+ path = udev_list_entry_get_name(dev_list_entry);
>+ drm_dev = udev_device_new_from_syspath(udev, path);
>+ if (!drm_dev)
>+ continue;
>+
>+ drm_name = udev_device_get_sysname(drm_dev);
>+ /* Filter the device by name. Want devices such as card0 and card1.
>+ * If the device has '-' in the name, contine
>+ */
>+ if (strncmp(drm_name, "card", 4) != 0 || strchr(drm_name, '-') != NULL) {
>+ udev_device_unref(drm_dev);
>+ continue;
>+ }
>+
>+ /* Get the pci address of the gpu associated with the drm_dev*/
>+ pci_dev = udev_device_get_parent_with_subsystem_devtype(drm_dev, "pci", NULL);
>+ if (pci_dev) {
>+ pci_addr = udev_device_get_sysattr_value(pci_dev, "address");
>+ if (!pci_addr)
>+ pci_addr = udev_device_get_sysname(pci_dev);
>+ } else {
>+ /* Some GPUs are platform devices. Ignore them. */
>+ pci_addr = NULL;
>+ udev_device_unref(drm_dev);
>+ continue;
>+ }
>+
>+ factname = g_strdup_printf("%s.%s", drm_card_fact, pci_addr);
>+ factvalue = g_strdup_printf("%s", drm_name);
>+ set_fact(&new_list, factname, factvalue, last_test);
>+
>+ udev_device_unref(drm_dev);
>+ }
>+ if (new_list.num_facts > 0)
>+ merge_facts(list, &new_list, drm_card_fact, last_test);
>+
>+out:
>+ udev_enumerate_unref(enumerate);
>+ udev_unref(udev);
>+}
>+
>+static void scan_kernel_loaded_kmods(igt_fact_list *list, const char *last_test)
>+{
>+ char *name = NULL;
>+ igt_fact_list new_list = {0};
>+
>+ /* Iterate over igt_fact_kmod_list[] until the element contains "\0" */
>+ for (int i = 0; strcmp(igt_fact_kmod_list[i], "\0") != 0; i++) {
>+ name = g_strdup_printf("%s.%s", kmod_fact, igt_fact_kmod_list[i]);
>+ if (igt_kmod_is_loaded(igt_fact_kmod_list[i]))
>+ set_fact(&new_list, name, "true", last_test);
>+
>+ g_free(name);
>+ }
>+ merge_facts(list, &new_list, kmod_fact, last_test);
>+}
>+
>+void igt_facts(igt_fact_list *list, const char *last_test)
>+{
>+ scan_pci_for_gpus(list, last_test);
ok, this uses udev to scan for vga/display
>+ scan_pci_drm_cards(list, last_test);
this seems rather to be scan_drm_cards, i.e. any device that has a
loaded driver registered with drm subsystem. This would also show
vgem, usb, etc
>+ scan_kernel_loaded_kmods(list, last_test);
So... igt_facts() hardcodes what is the info collected. Maybe we
should rather plug this into the hooks framework as part of
"built-in hooks".
This way it's easier to control what and when info is collected,
allowing developers (and CI) to pick and choose what gets executed.
+Gustavo
Lucas De Marchi
More information about the igt-dev
mailing list