[Piglit] Speeding up QA: Printing test names and command lines

Wed May 9 00:05:29 PDT 2012

Recently, Gordon informed us that it took QA over two hours to do a full
Piglit run (with all.tests) on Sandybridge, and over 8.5 hours on Pineview.
This was a huge shock to us, as I am able to do the same run in 8 minutes
on my Sandybridge laptop.  With a touch of insight and profiling, we were
able to find the discrepancy---and it turned out to be something both
hilarious and strange.

Piglit has nearly 10,000 tests.  Developers run them like so:

$ ./piglit-run.py tests/all.tests results/snb-master-1

Piglit's runner starts up, executes all.tests (which generates the list
of available tests), and then runs all 10,000 tests, writing the JSON
results file.

Starting piglit-run.py has a small cost: it takes about 1 second (on my
laptop) to process all.tests, as it walks over the filesystem to find
shaders and builds the full test list.  Most Piglit users don't mind, because
the run takes 10 minutes anyway.  In other words, the amortized overhead
is 1 second / 10000 tests = 0.0001 seconds/test.  Practically nothing,
and certainly not worth bothering to optimize.  Also, as we add more and
more tests, the amortized cost goes down.

However, QA invokes Piglit differently: they created their own test
infrastructure which runs each test individually, limiting the amount
of time tests can run (in case of infinite loops) and checking for GPU
hangs after each one.  If the GPU hangs, their runner reboots the system
and continues the run where it left off.  This is critically important
to maintain a robust, fully automated lab.  QA also needs to support
multiple test suites, such as Intel's OGLconform and the Khronos ES2
conformance test suite.

Concretely, QA's runner launches piglit-run.py tests/all.tests
-t <test name> once for each of the 10,000 tests.  Each launch reads
all.tests again (which, as noted, takes about 1 second), so the startup
cost is paid 10,000 times.  That is 10,000 seconds, or roughly 2.8 hours,
of overhead on a Sandybridge CPU; it's much worse on Pineview.  And, as we
add more tests, this cost goes up.

So while most Piglit users get a nice 1/10000, QA gets a nasty 1*10000!
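
To make the arithmetic concrete, here is a rough model of the two
invocation styles.  It only plugs in the approximate figures quoted above
(1 second of startup, 10,000 tests); the function names are made up for
illustration and are not part of Piglit.

```python
# Back-of-the-envelope model of piglit-run.py startup overhead.
# Both constants are the approximate figures from this mail, not measurements.
STARTUP_SECONDS = 1.0   # time to process all.tests once (laptop figure)
NUM_TESTS = 10_000      # approximate size of the suite


def overhead_single_run(startup, num_tests):
    """One piglit-run.py invocation for the whole suite: startup paid once."""
    return startup


def overhead_per_test_runs(startup, num_tests):
    """One piglit-run.py invocation per test: startup paid for every test."""
    return startup * num_tests


single = overhead_single_run(STARTUP_SECONDS, NUM_TESTS)
per_test = overhead_per_test_runs(STARTUP_SECONDS, NUM_TESTS)

print(f"single run:    {single:.0f} s total, {single / NUM_TESTS:.4f} s/test")
print(f"per-test runs: {per_test:.0f} s total, {per_test / 3600:.1f} hours")
```

The per-test style scales linearly with the number of tests, which is why
adding tests makes QA's runs slower while barely affecting developers.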

Ultimately, I'd like to improve Piglit's runner to be more robust, and
hopefully allow QA to use it directly rather than requiring wrapper
infrastructure.

However, that may take some time and effort, so instead this series
implements a quick hack: a new piglit-print-commands.py script which
prints out a list of test names, each with the command line piglit-run.py
would have used to invoke that test.  This allows QA to invoke tests
directly and avoid piglit-run.py's startup cost.
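
As a rough sketch of how a wrapper might consume such a list: the code
below parses one test per line and runs each command directly with a time
limit.  The "name ::: command" line format, the separator, and the
60-second limit are assumptions made for the example, not a description of
the script's actual output or of QA's infrastructure.

```python
import shlex
import subprocess

SEPARATOR = " ::: "    # assumed line format: "<test name> ::: <command line>"
TIMEOUT_SECONDS = 60   # hypothetical per-test time limit


def parse_commands(text):
    """Yield (test_name, argv) pairs from the printed list of commands."""
    for line in text.splitlines():
        if SEPARATOR not in line:
            continue  # skip blank or unrecognized lines
        name, command = line.split(SEPARATOR, 1)
        yield name.strip(), shlex.split(command)


def run_test(argv, timeout=TIMEOUT_SECONDS):
    """Run one test directly; return its exit code, or None on timeout."""
    try:
        return subprocess.run(argv, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None  # the test ran too long (e.g. an infinite loop)
```

The list would be generated once and saved, so all.tests is processed a
single time per run rather than once per test.  Checking for GPU hangs
after each test would still be the wrapper's job and is not shown here.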

I'm not a huge fan of this, but it's useful for them, and simple enough
that I don't feel too bad about including it upstream.