[Piglit] Nearly finished: shader_runner running THOUSANDS of tests per process

Fri May 27 01:18:22 UTC 2016

Marek Olšák <maraeo at gmail.com> writes:

> On Mon, Apr 18, 2016 at 6:45 PM, Dylan Baker <baker.dylan.c at gmail.com> wrote:
>> Quoting Marek Olšák (2016-04-16 15:16:34)
>>> Hi,
>>>
>>> This makes shader_runner very fast. The expected result is 40%
>>> decrease in quick.py running time, or a 12x faster piglit run if you
>>> run shader tests alone.
>>>
>>> Branch:
>>> https://cgit.freedesktop.org/~mareko/piglit/log/?h=shader-runner
>>>
>>> Changes:
>>>
>>> 1) Any number of test files can be specified as command-line
>>> parameters. Those command lines can be insanely long.
>>>
>>> 2) shader_runner can re-create the window & GL context if test
>>> requirements demand different settings when going from one test to
>>> another.
>>>
>>> 3) all.py generates one shader_runner instance per group of tests
>>> (usually one or two directories - tests and generated_tests).
>>> Individual tests are reported as subtests.
>>>
>>> The shader_runner part is done. The python part needs more work.
>>>
>>>
>>> What's missing:
>>>
>>> Handling of crashes. If shader_runner crashes:
>>> - The crash is not shown in piglit results (other tests with subtests
>>> already have the same behavior)
>>> - The remaining tests will not be run.
>>>
>>> The ShaderTest python class has the list of all files and should be
>>> able to catch a crash, check how many test results have been written,
>>> and restart shader_runner with the remaining tests.
>>>
>>> shader_runner prints TEST %i: and then the subtest result. %i is the
>>> i-th file in the list. Python can parse that and re-run shader_runner
>>> with the first %i tests removed. (0..%i-1 -> parse subtest results; %i
>>> -> crash; %i+1.. -> run again)
>>>
>>> I'm by no means a python expert, so here's an alternative solution (for me):
>>> - Catch crash signals in shader_runner.
>>> - In the single handler, re-run shader_runner with the remaining tests.
>>>
>>> Opinions welcome,
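
The restart scheme proposed above (parse the `TEST %i:` lines, treat the first unreported file as the crash, re-run with the rest) might look roughly like the following in Python. This is only a sketch under assumptions: the exact output format `TEST <index>: <result>`, the names `parse_results` and `run_all`, and the `run_batch` callback standing in for launching shader_runner are all hypothetical and are not taken from the branch.

```python
import re

# Assumed output format: one "TEST <index>: <result>" line per finished file.
TEST_LINE = re.compile(r"^TEST (\d+): *(\w+)", re.MULTILINE)


def parse_results(stdout):
    """Return {index: result} for every 'TEST i: result' line found."""
    return {int(m.group(1)): m.group(2) for m in TEST_LINE.finditer(stdout)}


def run_all(files, run_batch):
    """Run all files, restarting with the remaining ones after a crash.

    run_batch(files) -> (stdout, crashed) is a hypothetical stand-in for
    starting one shader_runner process on the given list of files.
    """
    results = {}
    remaining = list(files)
    while remaining:
        stdout, crashed = run_batch(remaining)
        reported = parse_results(stdout)
        # Files 0..done-1 reported a result before the process ended.
        done = 0
        while done < len(remaining) and done in reported:
            results[remaining[done]] = reported[done]
            done += 1
        if not crashed:
            break
        # The first file without a result is blamed for the crash.
        if done < len(remaining):
            results[remaining[done]] = "crash"
        # Restart with everything after the crashing file.
        remaining = remaining[done + 1:]
    return results
```

On POSIX, a real `run_batch` could wrap `subprocess.run` and treat a negative `returncode` as death by signal. Note that this sketch never re-runs the crashing file itself, so an intermittent crash is reported once as `crash` rather than retried.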

Per-test process isolation is a key feature of Piglit that the Intel CI
relies upon.  If non-crash errors bleed into separate tests, results
will be unusable.

In fact, we wrap all other test suites in piglit primarily to provide
them with per-test process isolation.

For limiting test run-time, we shard tests into groups and run them on
parallel systems.  Currently sharding is supported only through dEQP
features, but it can make test time arbitrarily low if you have adequate
hardware.

For test suites that don't support sharding, I think it would be useful
to generate suites from recorded per-test start/end times, so that each
suite runs the maximal set of tests that fits in the targeted duration.
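
As a rough illustration of that idea, a suite could be picked from per-test durations (end time minus start time from an earlier run) by taking the shortest tests first until the budget is used up, which maximizes the number of tests that fit. The function name `select_for_budget` and the shape of the `durations` mapping are assumptions for this sketch, not an existing piglit interface.

```python
def select_for_budget(durations, budget_seconds):
    """Pick the largest number of tests whose total time fits the budget.

    durations: {test_name: seconds}, assumed to come from a previous run.
    Returns (selected_names, total_seconds).
    """
    selected = []
    total = 0.0
    # Shortest first: taking the cheapest tests maximizes the test count.
    # Ties are broken by name so the selection is deterministic.
    ordered = sorted(durations.items(), key=lambda item: (item[1], item[0]))
    for name, seconds in ordered:
        if total + seconds > budget_seconds:
            break
        selected.append(name)
        total += seconds
    return selected, total
```

This maximizes the count of tests, not their value; a real selection would probably also want to weight tests by how often they catch regressions.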

I would be worried about complex handling of crashes.  It would be
preferable if separate suites were available to run with/without
shader_runner process isolation.

Users desiring faster execution can spend the saved time figuring out
which test crashed.

>>> Marek
>>> _______________________________________________
>>> Piglit mailing list
>>> Piglit at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/piglit
>>
>> Thanks for working on this Marek,
>>
>> This has been discussed here several times amongst the intel group, and
>> the recurring problem to solve is crashing. I don't have a strong
>> opinion on python vs catching a fail in the signal handler, except that
>> handling in the python might be more robust, but I'm not really familiar
>> with what a C signal handler can recover from, so it may not.
>
> I can catch signals like exceptions and report 'crash'. Then I can
> open a new process from the handler to run the remaining tests, wait
> and exit.

Will a test that crashes intermittently be run again until it passes?

> The signal catching won't work on Windows.
>
> Also, there are piglit GL framework changes that have only been tested
> with Waffle and may break other backends.
>
>>
>> The one concern I have is using subtests. There are a couple of
>> limitations to them, first we'll lose all of the per-test stdout/stderr
>> data, and that seems less than optimal. I wonder if it would be better
>> to have shader runner print some sort of scissor to stdout and stderr
>> when it starts a test and when it finishes one, and then report results
>> as normal without the subtest. That would maintain the output of each
>> test file, which seems like what we want, otherwise the output will be
>
> That can be done easily in C.
>
>> jumbled. The other problem with subtests is that the JUnit backend
>> doesn't have a way to represent subtests at the moment. That would be
>> problematic for both us and for VMware.
>
> I can't help with anything related to python.
>
> The goal is to make piglit faster for general regression testing.
> Other use cases can be affected negatively, but the time savings are
> worth it.
>
>>
>> Looking at the last patch, the python isn't all correct there. It will
>> run in some cases and fail in others; in particular, it will do something
>> odd if fast skipping is enabled, but I'm not sure exactly what. I think
>> it's worth measuring whether the fast skipping path is even an
>> optimization with your enhancements. If it's not, we should just disable
>> it for shader_runner or remove it entirely, which would remove a lot of
>> complexity.
>
> If the fast skipping is the only issue, I can remove it.
>
>>
>> I'd be more than happy to help get the python work done and running,
>> since this would be really useful for us in our CI system.
>
> What else needs to be done in python?
>
> Marek
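
The "scissor" idea quoted above (shader_runner prints a marker when it starts and finishes each test, and the framework cuts the stream apart) could be handled on the python side along these lines. The marker strings `=== BEGIN ` and `=== END` and the function name `split_output` are made up for this sketch; nothing in the branch defines them.

```python
def split_output(stream_text, begin="=== BEGIN ", end="=== END"):
    """Split one process's captured output into per-test sections.

    The begin/end marker strings are hypothetical. Lines outside any
    BEGIN/END pair are ignored. Returns {test_name: captured_text}.
    """
    sections = {}
    current = None
    lines = []
    for line in stream_text.splitlines():
        if line.startswith(begin):
            # A new test starts; its name follows the marker.
            current = line[len(begin):].strip()
            lines = []
        elif line.startswith(end) and current is not None:
            sections[current] = "\n".join(lines)
            current = None
        elif current is not None:
            lines.append(line)
    if current is not None:
        # Stream ended without an END marker, e.g. the process crashed.
        sections[current] = "\n".join(lines)
    return sections
```

The same function would be applied separately to stdout and stderr, so each test file keeps its own output and can be reported as a normal test instead of a subtest.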