[Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

Jose Fonseca jfonseca at vmware.com
Tue Mar 3 12:45:36 PST 2015


Thanks for the reviews.


My setup is not as complicated as yours, mostly because my main focus 
has been llvmpipe, and we're not actively adding support for new OpenGL 
extensions to it at the moment, so most of the new piglit tests either 
skip or pass.  New failures are relatively rare.

Previously I used 
https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it 
gives great flexibility in controlling if/when emails are sent.  In 
particular, it can send emails for new regressions but not for tests 
that were already failing.

Still, nothing beats being able to look at a bunch of test jobs and 
immediately tell that all blue == all good.  Although I have been using 
Jenkins for many years now, this is a lesson I only learned recently -- 
it's better to mask out expected failures somehow and get a boolean 
"all pass" or "fail" for the whole test suite than to try to track 
pass/fail for individual tests.  The latter just doesn't scale.
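
Just to make that concrete, a toy sketch (not actual piglit or Jenkins 
code; the helper name is made up) of what "mask expected failures, get 
one boolean" means:

# Toy illustration: collapse per-test results into a single boolean,
# treating a known set of expected failures as non-fatal.
def suite_passes(results, expected_failures):
    # results: dict of test name -> status ('pass', 'fail', 'crash', 'skip')
    # expected_failures: test names that are allowed to fail
    for name, status in results.items():
        if status in ('fail', 'crash') and name not in expected_failures:
            return False
    return True

# Only the masked test fails, so the whole job is still "blue":
print(suite_passes({'a': 'pass', 'b': 'fail'}, expected_failures={'b'}))  # True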



We do have internal branches where we run piglit.  I confess I don't 
have a good solution for them yet.  Your trick of maintaining a 
database of the git commit in which each test was added is quite neat.  
Another thing worth considering would be to branch or tag piglit 
whenever Mesa is branched, and keep using a matching (and unchanging) 
piglit commit.
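
Roughly, I imagine that post-processing step looks something like the 
untested sketch below (assuming a config of test name -> commit sha and 
a pre-computed set of shas that landed after the branch point; all the 
names are made up):

# Untested sketch, not anyone's actual tooling: rewrite junit failures
# whose recorded commit landed after the branch point as skips, so a
# release branch can stay "all pass".
import subprocess
import xml.etree.ElementTree as ET

def shas_after_branch_point(repo, branch_point, tip='origin/master'):
    # Commits reachable from the tip but not from the branch point.
    out = subprocess.check_output(
        ['git', '-C', repo, 'rev-list', '%s..%s' % (branch_point, tip)])
    return set(out.decode().split())

def mask_post_branch_failures(junit_path, test_to_sha, newer_shas):
    # test_to_sha: dict of test name -> sha recorded when the test was
    # added (the key/value config you describe).
    tree = ET.parse(junit_path)
    for case in tree.iter('testcase'):
        failure = case.find('failure')
        sha = test_to_sha.get(case.get('name'))
        if failure is not None and sha in newer_shas:
            case.remove(failure)
            ET.SubElement(case, 'skipped',
                          message='test introduced after the branch point')
    tree.write(junit_path)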



We also run testsuites for different APIs (namely D3D9/10).  These 
testsuites rarely get updated, and llvmpipe conformance is actually 
quite good to start with, so it's easy to get "all pass" there.


Piglit, being continuously updated and extended, is indeed more of a 
challenge than other testsuites.


We also use piglit for testing our OpenGL guest driver, but we use an 
internal testing infrastructure to drive it, not Jenkins, so our 
experiences there don't apply.


I also have a few benchmarks on Jenkins.  Again, I only keep track of 
performance metrics via the Jenkins Plots and Measurements plugins, but 
I don't produce pass/fail results based on those metrics.  I am, 
however, considering doing something of the sort -- e.g., getting the 
history of the metrics via the Jenkins JSON API, fitting it to a 
probability distribution, and failing when performance goes below a 
given percentile.
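
Something along the lines of this untested sketch (the numbers and the 
gate are made up, and fetching the real history from the JSON API 
depends on the plotting plugin, so that part is only hinted at):

# Untested sketch: given a metric's recent history, fit a normal
# distribution and fail the build when the new measurement falls below
# a chosen percentile (higher == better here).
import statistics
import sys

def passes_gate(history, latest, percentile=0.05):
    dist = statistics.NormalDist(statistics.mean(history),
                                 statistics.stdev(history))
    return latest >= dist.inv_cdf(percentile)

if __name__ == '__main__':
    # The history would come from the Jenkins JSON API (e.g. a GET on
    # <job>/api/json); hard-coded numbers here just to show the gate.
    history = [60.2, 59.8, 61.0, 60.5, 59.9, 60.7]
    latest = 55.0
    if not passes_gate(history, latest):
        sys.exit('performance regression: %.1f is below the 5th percentile'
                 % latest)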


Jose


On 03/03/15 18:34, Mark Janes wrote:
> Thanks Jose!  This is an improvement.
>
> In my experience, broken tests are introduced and fixed in mesa on a
> daily basis.  This has a few consequences:
>
>   - On a daily basis, I look at failures and update the expected
>     pass/fails depending on whether it is a new test or a regression.
>     Much of this process is automated.
>
>   - Branches quickly diverge on the basis of passing/failing tests.
>     Having separate pass/fail configs on release branches is
>     unmanageable.  To account for this, my automation records the
>     relevant commit sha as the value in the config file (the key is the
>     test name).  I post-process the junit xml to filter out test failures
>     with commits that occurred after the branch point.
>
>   - for platforms that are too slow to build each checkin, I run an
>     automated bisect which builds/tests in jenkins, then updates config
>     files.
>
>   - Our platform matrix generates over 350k unskipped tests for each
>     build.  We filter out skipped tests due to the memory consumption on
>     jenkins when displaying this many tests.
>
> I am interested in learning more about your test system, and sharing
> lessons learned / techniques.
>
> -Mark
>
> Reviewed-by: Mark Janes <mark.a.janes at intel.com>
>
> Jose Fonseca <jfonseca at vmware.com> writes:
>
>> I recently tried the junit backend's ability to ignore expected
>> failures/crashes and found it a godsend -- instead of having to look at
>> test graph results periodically, I can just tell jenkins to email me
>> when things go south.
>>
>> The only drawback is that by reporting the expected issues as passing it
>> makes it too easy to forget about them and misinterpret the pass-rates.
>> So this change modifies the junit backend to report the expected issues
>> as skipped, making it more obvious when looking at the test graphs that
>> these tests are not really passing, and that whatever functionality they
>> target is not being fully covered.
>>
>> This change also makes use of the junit `message` attribute to explain
>> the reason for the skip.  (In fact, we could consider using the `message`
>> attribute on other kinds of failures to convey the piglit result, instead
>> of using the non-standard `type`.)
>> ---
>>   framework/backends/junit.py | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/framework/backends/junit.py b/framework/backends/junit.py
>> index 82f9c29..53b6086 100644
>> --- a/framework/backends/junit.py
>> +++ b/framework/backends/junit.py
>> @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
>>               # Add relevant result value, if the result is pass then it doesn't
>>               # need one of these statuses
>>               if data['result'] == 'skip':
>> -                etree.SubElement(element, 'skipped')
>> +                res = etree.SubElement(element, 'skipped')
>>
>>               elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
>>                   if expected_result == "failure":
>>                       err.text += "\n\nWARN: passing test as an expected failure"
>> +                    res = etree.SubElement(element, 'skipped', message='expected failure')
>>                   else:
>>                       res = etree.SubElement(element, 'failure')
>>
>>               elif data['result'] == 'crash':
>>                   if expected_result == "error":
>>                       err.text += "\n\nWARN: passing test as an expected crash"
>> +                    res = etree.SubElement(element, 'skipped', message='expected crash')
>>                   else:
>>                       res = etree.SubElement(element, 'error')
>>
>> --
>> 2.1.0


