Add env info to igt_runner (was: Re: [PATCH i-g-t 4/4] lib/igt_device_scan: Fix scan vs bind/unbind/reload)

Thu Dec 19 18:38:56 UTC 2024

Quoting Lucas De Marchi (2024-12-19 14:24:09-03:00)
>hijacking the thread and adding some people to Cc for the igt_runner question.
>Previously In-Reply-To: <rnw3q6mhthnwyvowvszr2gllyjtbb2mozk4em272xlmkvm7pyl at szbhtg3sd7d7>
>
>On Thu, Dec 19, 2024 at 10:35:00AM -0600, Lucas De Marchi wrote:
>>On Wed, Dec 18, 2024 at 07:34:19AM +0100, Zbigniew Kempczyński wrote:
>>>On Tue, Dec 17, 2024 at 09:13:24PM -0800, Lucas De Marchi wrote:
>>>>There's no guarantee a card will end up with the same device node when
>>>>modules are loaded/unloaded and drivers bound/unbound. There's some
>>>>fundamental issue with the way igt handles this, and it's also
>>>>puzzling that, from the logs, it looks like the device vanished from
>>>>the bus, when in reality it is just the SW state out of sync with
>>>>what the kernel is exporting.
>>>>
>>>>Re-scanning when trying to match a device is not expensive compared to
>>>>what most tests are doing, so simply force it to occur whenever trying
>>>>to match a card.
>>>
>>>I should also comment on the above. It is generally true, but I've
>>>noticed that getting attributes might be expensive. It may even take
>>>up to a few seconds, which is why I've added a list of attributes we
>>>don't fetch from udev (see is_on_blacklist()). If I'm not wrong,
>>>getting 'config' was the reason to limit the attributes we fetch.
>>
>>Why would we get all attributes and exclude some? Shouldn't we get only
>>the attributes we actually use? AFAIK this logic is basically used by
>>--device/IGT_DEVICE, right? What filters do we normally use?
>>
>>I usually pass the pci slot (because I know that won't change
>>dynamically and cause surprises). Apparently CI passes vendor/devid:
>>
>>        export IGT_DEVICE=pci:vendor=$1,device=$2
>>
>>(but it seems to vary depending on pipeline)
>>
>>Some devs pass the device node directly too as in a lot of places
>>there's only ever card0 possible.
>
>
>Could we dump the env and args somewhere so we know how igt_runner or
>individual tests are being called without looking at the CI pipeline
>sources? I was thinking about either having that info in the stdout
>output of igt_runner or in the json. Another possibility would be in
>dmesg, but I'm not sure it's a good option. Thoughts?
>
>My preferred option would be to have e.g.:
>
>{
>   "__type__": "TestrunResult",
>   "results_version": 10,
>   "name": "xe-2403-995cd30a4e222b6a7b4b40c36219e4937fd7109e\/bat-bmg-1\/0",
>   "uname": "Linux bat-bmg-1 6.13.0-rc3-xe+ #1 SMP PREEMPT_DYNAMIC Thu Dec 19 14:40:51 UTC 2024 x86_64",
>   "time_elapsed": {
>     "__type__": "TimeAttribute",
>     "start": 1734621126.8734231,
>     "end": 1734621288.5994539
>   },
>   "environment": {

Nitpick: maybe "env"? :-)

>     "IGT_DEVICE": ...
>     <any IGT_* env var>
>   },
>   "argv": [ ... ]
>

I like the idea and I also prefer putting that info in the results.json,
which makes general info about the execution more self-contained.
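
To make the idea concrete, here is a rough sketch of the data gathering,
written in Python only for illustration (igt_runner itself is C). The
helper names collect_run_info() and add_run_info() are hypothetical, the
"IGT_" prefix filter follows the "<any IGT_* env var>" line above, and
the key names are just the ones from the proposal:

```python
import json
import os
import sys


def collect_run_info(environ=None, argv=None, prefix="IGT_"):
    """Return the IGT_* environment variables and the command line.

    Hypothetical helper: only variables whose name starts with
    `prefix` are kept, so unrelated (and possibly sensitive)
    environment entries never reach the results file.
    """
    environ = os.environ if environ is None else environ
    argv = sys.argv if argv is None else argv
    env = {k: v for k, v in sorted(environ.items()) if k.startswith(prefix)}
    return {"environment": env, "argv": list(argv)}


def add_run_info(results, environ=None, argv=None):
    """Return a copy of `results` with "environment" and "argv" added."""
    merged = dict(results)
    merged.update(collect_run_info(environ, argv))
    return merged


if __name__ == "__main__":
    # Minimal stand-in for the top-level results.json object.
    results = {"__type__": "TestrunResult", "results_version": 10}
    print(json.dumps(add_run_info(results), indent=2))
```

Filtering by prefix rather than dumping the whole environment keeps the
file small and avoids leaking unrelated variables from the CI machines.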

Does CI always make results.json available and easy to find? Depending
on the answer, we might consider additionally dumping that info in some
user-visible log on the CI results pages.

--
Gustavo Sousa