[PATCH i-g-t v10] igt-runner fact checking

Peter Senna Tschudin peter.senna at linux.intel.com
Fri Dec 6 13:16:27 UTC 2024


Hi Kamil,

Thank you for your comments. I appreciate the opportunity to clarify and address
your concerns.

On 06.12.2024 12:42, Kamil Konieczny wrote:
> Hi Peter,
> On 2024-12-05 at 05:54:26 +0100, Peter Senna Tschudin wrote:
> 
> overall looks good, I have few nits, see below.

Thank you! Our CI tested it thoroughly, it creates value, and  it works well.
It is time to merge it.

> 
> As for subject, it usally informs what changed, so "igt-runner fact checking"
> is a little misleading. Please see my ask about that below.

Please explain what is misleading about the subject. It looks good to me.

> 
>> When using igt-runner, collect facts before each test and after the
>> last test, and report when facts change. The facts are:
>>  - GPUs on PCI bus: hardware.pci.gpu_at_addr.0000:03:00.0: 8086:e20b Intel Battlemage (Gen20)
>>  - Associations between PCI GPU and DRM card: hardware.pci.drm_card_at_addr.0000:03:00.0: card1
>>  - Kernel taints: kernel.is_tainted.taint_warn: true
>>  - GPU kernel modules loaded: kernel.kmod_is_loaded.i915: true
>>
>> This change imposes little execution overhead and adds just a few
> 
> imho 'little overhead' is beause facts are only four but I agree
> we could change implementation when its number grows.

Little overhead is because of how I implemented the code. The execution
is efficient and produces as little text output as possible.

> 
>> lines of logging. The facts will be printed on normal igt-runner
>> output. Here is a real example from our CI showing
>> hotreplug-lateclose changing the DRM card number and tainting the
>> kernel on the abort path:
>>
>>  [245.316207] [056/121] (816s left) core_hotunplug (hotreplug-lateclose)
>>  [245.383596] Starting subtest: hotreplug-lateclose
>>  [249.843361] Aborting: Lockdep not active
>>  [249.858249] [FACT core_hotunplug (hotreplug-lateclose)] changed: hardware.pci.drm_card_at_addr.0000:00:02.0: card0 -> card1
>>  [249.858392] [FACT core_hotunplug (hotreplug-lateclose)] new: kernel.is_tainted.taint_die: true
>>  [249.859075] Closing watchdogs
>>
>> CC: Ryszard Knop <ryszard.knop at intel.com>
>> CC: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
>> CC: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
>> CC: Lucas De Marchi <lucas.demarchi at intel.com>
>> CC: luciano.coelho at intel.com
>> CC: nirmoy.das at intel.com
>> CC: stuart.summers at intel.com
>> CC: himal.prasad.ghimiray at intel.com
>> CC: dominik.karol.piatkowski at intel.com
>> CC: katarzyna.piecielska at intel.com
> 
> +cc Petri + kms devs
> Cc: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
> Cc: Juha-Pekka Heikkila <juha-pekka.heikkila at intel.com>
> Cc: Swati Sharma <swati2.sharma at intel.com>
> Cc: Jani Saarinen <jani.saarinen at intel.com>
> Cc: Jani Nikula <jani.nikula at linux.intel.com>
> 
> +other gpu devs
> Cc: Rob Clark <robdclark at gmail.com>
> Cc: Rob Clark <robdclark at chromium.org>
> Cc: Helen Koike <helen.koike at collabora.com>
> 
> +drm ci dev
> Cc: Melissa Wen <mwen at igalia.com>
> Cc: Maíra Canal <mcanal at igalia.com>

That is a lot of people to CC, but ok.

> 
>> Signed-off-by: Peter Senna Tschudin <peter.senna at linux.intel.com>
>> ---
>> Re-sending as the mailing list server rejected the message
>> yesterday due to a probably full disk...
>>
>> v10:
>>  - fix memory leaks from asprintf (Thank you Dominik Karol!)
>>  - fix comments for consistency (Thank you Dominik Karol!)
>>
>> v9:
>>  - do not report new hardware when loading/unloading kmod changes
>>    the string of the GPU name. I accidentally reintroduced this
>>    issue when refactoring to use linked lists.
>>  - add tools/lsfacts: 9 lines of code that print either the facts
>>    or that no facts were found.
>>  - fix code comments describing functions
>>  - fix white space issues
>>
>> v8:
>>  - fix white space issues
>>
>> v7:
>>  - refactor to use linked lists provided by igt_lists
>>  - Added function arguments to code comments
>>  - updated commit message
>>
>> v6:
>>  - sort includes in igt_facts.c alphabetically
>>  - add facts for kernel taints using igt_kernel_tainted() and
>>    igt_explain_taints()
>>
>> v5:
>>  - fix the broken patch format from v4
>>
>> v4:
>>  - fix a bug on delete_fact()
>>  - drop glib and calls to g_ functions
>>  - change commit message to indicate that report only on fact changes
>>  - use consistent format for reporting changes
>>  - fix SPDX header format
>>
>> v3:
>>  - refreshed commit message
>>  - changed format SPDX string
>>  - removed license text
>>  - replace last_test assignment when null by two ternary operators
>>  - added function descriptions following example found elsewhere in
>>    the code
>>  - added igt_assert to catch failures to realloc()
>>
>> v2:
>>  - add lib/tests/igt_facts.c for basic unit testing
>>  - bugfix: do not report a new gpu when the driver changes the gpu name
>>  - bugfix: do not report the pci_id twice on the gpu name
>>
>>  lib/igt_facts.c       | 755 ++++++++++++++++++++++++++++++++++++++++++
>>  lib/igt_facts.h       |  47 +++
>>  lib/meson.build       |   1 +
>>  lib/tests/igt_facts.c |  15 +
>>  lib/tests/meson.build |   1 +
> 
> Please split your patchset into three, add above files to
> first patch:
> 
> [PATCH i-g-t v11 1/3] lib/igt_facts: add fact checking lib
> 
>>  runner/executor.c     |  10 +
> 
> [PATCH i-g-t v11 2/3] runner/executor: Check facts at tests run
> 
>>  tools/lsfacts.c       |  25 ++
>>  tools/meson.build     |   1 +
> 
> [PATCH i-g-t v11 3/3] tools/lsfacts: Add facts listing tool
> 
> and it all with cover letter.

No. Why would I do that? Both the change to runner/executor.c and
the new tool are very small, not justifying a separate patch. They
also match well in functionality also not justifying a different
patch. What gain are you trying to achieve here? I really like a
single patch in this case.

> 
>>  8 files changed, 855 insertions(+)
>>  create mode 100644 lib/igt_facts.c
>>  create mode 100644 lib/igt_facts.h
>>  create mode 100644 lib/tests/igt_facts.c
>>  create mode 100644 tools/lsfacts.c
>>
>> diff --git a/lib/igt_facts.c b/lib/igt_facts.c
>> new file mode 100644
>> index 000000000..4749d3417
>> --- /dev/null
>> +++ b/lib/igt_facts.c
>> @@ -0,0 +1,755 @@
>> +// SPDX-License-Identifier: MIT
>> +// Copyright © 2024 Intel Corporation
> 
> Copyright is usally formatted in C-comment style /* comment */
> to allow extending it (or adding Author: fields, etc) See other
> C files for example tests/device_reset.c
> So here:
> 
> /*
>  * Copyright © 2024 Intel Corporation
>  */
> 
> Do the same in other new files.

No sorry, this is because of you asked me to please kernel's
checkpatch.pl. I am not sending another revision for this nit
that changes the comment style. Have you seen how the comment
style varies in our code base, including recent commits? Why
being so picky with my patch?

That being said, if you want to implement these changes yourself
please go for it.

Thank you,

Peter

> 
> Regards,
> Kamil
> 
>> +
>> +#include <ctype.h>
>> +#include <libudev.h>
>> +#include <stdio.h>
>> +#include <sys/time.h>
>> +#include <time.h>
>> +
>> +#include "igt_core.h"
>> +#include "igt_device_scan.h"
>> +#include "igt_facts.h"
>> +#include "igt_kmod.h"
>> +#include "igt_list.h"
>> +#include "igt_taints.h"
>> +
>> +static struct igt_list_head igt_facts_list_drm_card_head;
>> +static struct igt_list_head igt_facts_list_kmod_head;
>> +static struct igt_list_head igt_facts_list_ktaint_head;
>> +static struct igt_list_head igt_facts_list_pci_gpu_head;
>> +
>> +
>> +/**
>> + * igt_facts_lists_init:
>> + *
>> + * Initialize igt_facts linked lists.
>> + *
>> + * Returns: void
>> + */
>> +void igt_facts_lists_init(void)
>> +{
>> +	IGT_INIT_LIST_HEAD(&igt_facts_list_drm_card_head);
>> +	IGT_INIT_LIST_HEAD(&igt_facts_list_kmod_head);
>> +	IGT_INIT_LIST_HEAD(&igt_facts_list_ktaint_head);
>> +	IGT_INIT_LIST_HEAD(&igt_facts_list_pci_gpu_head);
>> +}
>> +
>> +
>> +/**
>> + * igt_facts_log:
>> + * @last_test: name of the test that triggered the fact
>> + * @name: name of the fact
>> + * @new_value: new value of the fact
>> + * @old_value: old value of the fact
>> + *
>> + * Reports fact changes:
>> + * - new fact: if old_value is NULL and new_value is not NULL
>> + * - deleted fact: if new_value is NULL and old_value is not NULL
>> + * - changed fact: if new_value is different from old_value
>> + *
>> + * Returns: void
>> + */
>> +static void igt_facts_log(const char *last_test, const char *name,
>> +			  const char *new_value, const char *old_value)
>> +{
> 
> ...cut...
> 
>> -- 
>> 2.34.1
>>



More information about the igt-dev mailing list