[igt-dev] ✗ Fi.CI.BAT: failure for new engine discovery interface (rev7)
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue Feb 12 11:56:55 UTC 2019
On 12/02/2019 11:39, Arkadiusz Hiler wrote:
> On Tue, Feb 12, 2019 at 08:43:24AM +0000, Tvrtko Ursulin via igt-dev wrote:
>>
>> On 07/02/2019 08:58, Petri Latvala wrote:
>>> On Wed, Feb 06, 2019 at 09:25:38AM +0000, Patchwork wrote:
>>>> == Series Details ==
>>>>
>>>> Series: new engine discovery interface (rev7)
>>>> URL : https://patchwork.freedesktop.org/series/52699/
>>>> State : failure
>>>>
>>>> == Summary ==
>>>>
>>>> IGT patchset build failed on latest successful build
>>>> 592b854fead32c2b0dac7198edfb9a6bffd66932 tools/intel_watermark: Clean up the platform checks in the ilk+ code
>>>>
>>>>
>>>>
>>>> The output from the failed tests:
>>>>
>>>> 138/266 testcase check: gem_ctx_isolation FAIL 0.36 s
>>>>
>>>> --- command ---
>>>> /home/cidrm/igt-gpu-tools/tests/igt_command_line.sh gem_ctx_isolation
>>>> --- stdout ---
>>>> tests/gem_ctx_isolation:
>>>> Checking invalid option handling...
>>>> Checking valid option handling...
>>>> Checking subtest enumeration...
>>>> FAIL: tests/gem_ctx_isolation
>>>> --- stderr ---
>>>> Received signal SIGSEGV.
>>>> Stack trace:
>>>> #0 [fatal_sig_handler+0xd5]
>>>> #1 [killpg+0x40]
>>>> #2 [__real_main687+0x810]
>>>> #3 [main+0x44]
>>>> #4 [__libc_start_main+0xe7]
>>>> #5 [_start+0x2a]
>>>> -------
>>>>
>>>> 248/266 testcase check: perf_pmu FAIL 0.21 s
>>>>
>>>> --- command ---
>>>> /home/cidrm/igt-gpu-tools/tests/igt_command_line.sh perf_pmu
>>>> --- stdout ---
>>>> tests/perf_pmu:
>>>> Checking invalid option handling...
>>>> Checking valid option handling...
>>>> Checking subtest enumeration...
>>>> FAIL: tests/perf_pmu
>>>> --- stderr ---
>>>> Received signal SIGSEGV.
>>>> Stack trace:
>>>> #0 [fatal_sig_handler+0xd5]
>>>> #1 [killpg+0x40]
>>>> #2 [__real_main1671+0xf1f]
>>>> #3 [main+0x44]
>>>> #4 [__libc_start_main+0xe7]
>>>> #5 [_start+0x2a]
>>>> -------
>>>>
>>>
>>>
>>> These tests construct subtests for each known engine. Your
>>> engine loop changes make it impossible to enumerate any names
>>> statically, without accessing the driver.
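For context, the static pattern being broken here looks roughly like
this (a sketch against the classic intel_execution_engines[] table from
lib/igt_gt.h, so take the details with a grain of salt):

    #include "igt.h"

    igt_main
    {
    	const struct intel_execution_engine *e;
    	int fd = -1;

    	igt_fixture {
    		fd = drm_open_driver(DRIVER_INTEL);
    	}

    	/*
    	 * Subtest names come from the static table, so enumeration
    	 * needs nothing from the driver; engines the platform lacks
    	 * are skipped at run time by the require below.
    	 */
    	for (e = intel_execution_engines; e->name; e++) {
    		igt_subtest_f("basic-%s", e->name) {
    			gem_require_ring(fd, e->exec_id | e->flags);
    			/* ... exercise the engine ... */
    		}
    	}

    	igt_fixture {
    		close(fd);
    	}
    }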
>>
>> Is generating subtest names without accessing the driver an absolute
>> requirement?
>>
>> This particular failure would be fixable if the engine list were
>> initialized on the first call to __for_each_engine_class_instance, as
>> called by perf_pmu above. And the fixture block is already run during
>> test enumeration, which is where the driver is opened. I think a lot of
>> tests are like this. So in this context a simple query ioctl on top
>> doesn't sound bad.
>>
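To illustrate the lazy initialization I mean, a minimal sketch only,
where __query_engines() and struct engine_entry are made-up stand-ins
for the series' real interface:

    #include <stdint.h>

    /* Made-up stand-ins, not the series' actual interface. */
    struct engine_entry {
    	uint16_t class, instance;
    };
    struct engine_entry *__query_engines(int fd, unsigned int *count);

    static struct engine_entry *engines; /* filled on first use */
    static unsigned int num_engines;

    static void __engines_init(int fd)
    {
    	if (engines)
    		return;

    	/* One trivial query ioctl against the fd the test already opened. */
    	engines = __query_engines(fd, &num_engines);
    }

    #define __for_each_engine_class_instance(fd__, e__) \
    	for (__engines_init(fd__), (e__) = engines; \
    	     (e__) < engines + num_engines; (e__)++)

Usage stays unchanged for the tests, and the cost is one extra ioctl
the first time the macro is entered.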
>> But I also remember, and I think it was due to bug tracking
>> limitations, that in the past the desire was for test enumeration to be
>> constant regardless of the execution platform.
>>
>> I think it would be simpler if we didn't have to maintain a separate
>> static list of all possible engines. To make that work we would need to
>> detect the mode of execution (list vs. execute) in the engine iterator
>> as well. So again, to me it sounds preferable that tests be allowed to
>> enumerate dynamically.
>>
>> So the question is: can CI/bug tracking cope with that, and are we okay
>> with doing a trivial ioctl from the for_each_engine_<something>?
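For reference, the static-list alternative would need something like
the following in the iterator. Again a sketch, reusing the stand-ins
from above; all_possible_engines[] is a hypothetical static superset
table, and only igt_only_list_subtests() is existing igt_core API:

    /* Hypothetical superset of every engine any platform might have. */
    static struct engine_entry all_possible_engines[16];

    static struct engine_entry *get_engines(int fd, unsigned int *count)
    {
    	if (igt_only_list_subtests()) {
    		/* Listing: hand out the static superset, no ioctl. */
    		*count = ARRAY_SIZE(all_possible_engines);
    		return all_possible_engines;
    	}

    	/* Executing: ask the driver what is actually there. */
    	__engines_init(fd);
    	*count = num_engines;
    	return engines;
    }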
>
> For quite some time now it has been enforced that, in IGT, listing
> tests is exhaustive and independent of the machine. CI heavily relies
> on this assumption. We generate and manage the testlists on machines
> that are GT-less.
>
> Especially for 'shards':
>
> 1. list all tests,
>    split them into 35 'randomized' chunks of roughly equal execution time
>
> 2. current_chunk = 35;
>    while (current_chunk)
>        if (any(machines, is_free))
>            schedule(current_chunk--, first_free(machines))
>
> I don't really see how we can improve upon that.
>
> Overall, keeping a static list of all the possible engines for listing
> and then "skipping" the actual execution if the engine is not there
> seems like a simpler problem to tackle.
The downside there is that it creates a coupling between the test code
and the CI implementation.
Would it be feasible to generate the test list on the sharded platforms
themselves? It could be a "list tests" job which you send to each of the
shard platforms (one machine per platform) before randomizing and
chunking.
The advantage there would be that you don't send chunks containing tests
which a particular platform cannot run.
And of course another advantage is that we remove unnecessary coupling
between the two components, so IGT maintenance becomes easier. (No two
sets of engines, no two ways to iterate those sets, and no need to
remember when to use each.)
Regards,
Tvrtko