[Piglit] [PATCH] summary: fix support for old results file with duplicated subtests

Thu May 29 05:37:57 PDT 2014

On Thu, May 29, 2014 at 12:59 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On 05/28/2014 07:17 PM, Ilia Mirkin wrote:
>> Old files have duplicated entries for each subtest, in addition to a
>> filled subtest dictionary. Detect that the current test name is also a
>> subtest and treat it as though it were a complete test. This may have
>> false-negatives, but they're unlikely given test/subtest naming
>> convention.
>>
>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>> ---
>>
>> Dylan, I'm sure you hate this, but it does seem to work for me. Not sure where
>> you are with your fix, but this is a tool that lots of people use, so it does
>> need to be addressed. And keep in mind that by now there are both the "old"
>> and "new" formats running around, so just slapping a version number in there
>> won't be enough.
>
> Seriously?  We fixed a bug.  Subtests were *broken* - it stored insane
> amounts of duplicate data in the files, to work around a bug in the
> tools that processed those data files.  This caused huge amounts of
> wasted space and confusion.
>
> I don't understand the whole "let's not re-run Piglit to get a proper
> baseline unless something breaks" thinking.  It only takes 10-15 minutes
> to do a full Piglit run here.  Taking a proper baseline allows you to
> have confidence that any changes you see were caused by your patches,
> and not by other people's changes to Mesa or Piglit.  It just seems like
> good practice.
>
> Have things gotten to the point where we can't even fix a bug without
> people requesting reverts or workarounds?  It's bad enough that people
> keep insisting that we have to make this software work on 4 year old
> Python versions.
>
> Dylan's patches were on the list waiting for over a month, and bumped
> after two weeks, and AFAICS fix a long-standing bug.  All people have to
> do is re-run Piglit to get data files that aren't *broken*.  If the
> Piglit community won't even let us commit bug fixes, I don't know why I
> should continue contributing to this project.
>
> (Ilia - this isn't complaining about you specifically - it's just the
> attitude of the community in general I've observed over the last few
> months that frustrates me.  It seems like any time we commit anything,
> there are very vocal objections and people calling for reverts.  And
> that really frustrates me.)

Hi Ken,

First of all, I'd like to point out that at no point in time did I
complain about something being checked in or call for a revert. I was
merely pointing out that certain use-cases should be supported, and had
been, but were recently broken. Bugs happen, but I'm surprised that not
everyone here agrees that this _is_ a bug. I don't have the
bandwidth/time/desire to review and test every piglit change, and this
seemed like a particularly nasty one, so I skipped it. I'm very happy
that the fix was done; I had noticed the subtests insanity myself and
it also annoyed me (although not enough for me to actually try to fix
it... xz is really good at compression).
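
To make the heuristic from the patch description quoted above concrete,
here is a rough sketch of the detection, not the patch itself. It assumes
each result entry is a dict with an optional "subtest" dictionary, and
that an old-style duplicate is stored under "<test name>/<subtest name>".
The function names and the choice to fold duplicates back under the
parent name are made up for illustration.

```python
import posixpath


def is_duplicated_subtest(name, entry):
    """Guess whether `name` is an old-style duplicated subtest entry.

    Assumption: the entry is a duplicate when the last path component of
    its name is also a key in its own "subtest" dictionary.
    """
    subtests = entry.get("subtest") or {}
    return posixpath.basename(name) in subtests


def collapse_duplicates(tests):
    """Fold duplicated subtest entries back into a single parent entry.

    `tests` maps test names to result dicts. The first duplicate seen
    wins; later ones carry the same data, so they are dropped.
    """
    collapsed = {}
    for name, entry in tests.items():
        if is_duplicated_subtest(name, entry):
            # Strip the trailing subtest name to recover the real test name.
            name = posixpath.dirname(name)
        collapsed.setdefault(name, entry)
    return collapsed


old_style = {
    "spec/foo/a": {"result": "fail", "subtest": {"a": "pass", "b": "fail"}},
    "spec/foo/b": {"result": "fail", "subtest": {"a": "pass", "b": "fail"}},
    "spec/bar": {"result": "pass"},
}
print(sorted(collapse_duplicates(old_style)))
```

A real test whose last name component happens to match one of its own
subtest names would be misdetected here, which is the unlikely case the
patch description mentions.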

At Intel, there are 2 relevant chips that anyone cares about (gen7 and
gen7.5 from the looks of it), and maybe 3 more that are borderline
(gen6, gen5, gen4), but there are a lot more NVIDIA chips out there.
You all have easy access to all of these chips (perhaps not at your
desk, but if you really wanted to find a gen4 chip, I suspect you
could without too big of a hassle). I personally have access to a very
limited selection and have to ask others to run the tests, or swap in
cards, or whatever. There can even be kernel interactions, which adds
a whole dimension to the testing matrix. The vast, vast, *vast*
majority of piglit tests don't change names/etc, so outside of a few
oddities, piglit runs are comparable across different piglit
checkouts.

Each piglit run takes upwards of 40-60 minutes and has the potential
to crash the machine. This is only counting the tests/gpu.py tests
(since tests/quick.py includes tons of tests I don't touch the code
for, like compiler/etc). It is this slow in large part because they're
run single-threaded and capture dmesg, but even if I didn't care about
dmesg, nouveau definitely can't handle multithreaded runs. You could say
"fix your driver!" but it's not quite that easy.

Anyways, if I'm the only one who cares about being able to compare
across piglit runs from different times, I'll drop the issue and stop
trying to track failures on nouveau. I'm relatively certain that it
would reverse a recent trend of improving piglit results on nouveau
though.

  -ilia