Baselining EXA quality (r100)
Carl Worth
cworth at cworth.org
Wed May 16 11:03:38 PDT 2007
In concert with the effort I recently started to baseline EXA's
performance, I also want to baseline its quality. Again, I did this
with the hardware I had readily available, (still an r100---haven't
gotten a fancy new Intel GM965 yet).
The three things I decided to use for testing are the X test suite,
the rendercheck program, and cairo's test suite. The results I got for
each are detailed below.
To summarize the results:
X test suite: EXA fails fewer tests than XAA (82 compared to 96), but
I don't know how to interpret details of the failures.
Rendercheck: XAA passes all the tests I ran to completion, while EXA
fails two of them, (transformed source and transformed mask).
XAA was failing some of the tests I did not run to completion,
(tests I haven't run against EXA at all).
Cairo test suite: From a first look, it appears this suite found 1 bug
in XAA and 2 or 3 bugs in EXA. This suite provides
images showing the failures:
http://people.freedesktop.org/~cworth/cairo-exa-vs-xaa/quality/
Hopefully that's helpful, and hopefully the details below provide
enough information for anybody who wants to replicate this kind of
testing with other driver+hardware combinations.
-Carl
X test suite
============
Instructions for obtaining, building and running the suite can be
found here:
http://xorg.freedesktop.org/wiki/BuildingXtest
I followed those instructions and ran the test suite against an XAA X
server, and then an EXA X server, (adding only AccelMethod:exa and
AccelDFS:True options to the configuration file). When comparing the
results of vswrpt from each run, the following lines are different:
        CASES  TESTS  PASS  UNSUP  UNTST  NOTIU  WARN  FIP  FAIL  UNRES  UNIN  ABORT
XAA:
Xlib4      29    324   280     11     27      5     0    0     1      0     0      0
Xlib8      29    165   133     10     22      0     0    0     0      0     0      0
Xlib9      46   1472  1174     23     36    201     8    0    30      0     0      0
TOTAL     996   5552  4156     96    789    268    10    0    96    137     0      0
EXA:
Xlib4      29    324   275     11     27      5     0    0     6      0     0      0
Xlib8      29    165   132     10     22      0     0    0     0      1     0      0
Xlib9      46   1472  1192     23     36    201     9    0    11      0     0      0
TOTAL     996   5552  4168     96    789    268    11    0    82    138     0      0
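For reference, the only configuration change between the two runs was
in the Device section of xorg.conf. A sketch, (the Identifier and
Driver values here are illustrative; only the two Option lines come
from the runs described above):

```
Section "Device"
	Identifier "card0"              # illustrative
	Driver     "ati"                # illustrative
	Option     "AccelMethod" "EXA"  # the two options actually added
	Option     "AccelDFS"    "True"
EndSection
```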
Finding the differences in the above chart can be challenging, (wdiff
helps, but then the columns get messed up). Here's a summary of what
the above shows when changing from XAA to EXA:
Xlib4: 5 PASS become FAIL
Xlib8: 1 PASS become UNRES
Xlib9: 19 FAIL become PASS
Xlib9: 1 PASS become WARN
I haven't yet looked into chasing down the specific test cases that
have behavioral changes. Does anyone have more information about how
to go about that?
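In the meantime, the per-column deltas behind the summary above can be
recomputed mechanically. A sketch, (the numbers are transcribed by hand
from the vswrpt rows; this script is not part of the test suite). Note
that aggregate counts only show net changes, e.g. Xlib9 PASS going up
by 18 is consistent with 19 FAIL becoming PASS while 1 PASS became WARN:

```python
# Sketch: recompute the XAA -> EXA deltas from the vswrpt rows quoted
# above (numbers transcribed by hand; not part of the X test suite).
COLS = "CASES TESTS PASS UNSUP UNTST NOTIU WARN FIP FAIL UNRES UNIN ABORT".split()

XAA = {
    "Xlib4": [29, 324, 280, 11, 27, 5, 0, 0, 1, 0, 0, 0],
    "Xlib8": [29, 165, 133, 10, 22, 0, 0, 0, 0, 0, 0, 0],
    "Xlib9": [46, 1472, 1174, 23, 36, 201, 8, 0, 30, 0, 0, 0],
}
EXA = {
    "Xlib4": [29, 324, 275, 11, 27, 5, 0, 0, 6, 0, 0, 0],
    "Xlib8": [29, 165, 132, 10, 22, 0, 0, 0, 0, 1, 0, 0],
    "Xlib9": [46, 1472, 1192, 23, 36, 201, 9, 0, 11, 0, 0, 0],
}

def deltas(old, new):
    """Return {(row, column): (old_count, new_count)} for every changed cell."""
    return {(row, col): (a, b)
            for row in old
            for col, a, b in zip(COLS, old[row], new[row])
            if a != b}

for (row, col), (a, b) in deltas(XAA, EXA).items():
    print(f"{row} {col}: {a} -> {b} ({b - a:+d})")
```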
Rendercheck
===========
The rendercheck utility can be obtained via git as follows:
git clone git://anongit.freedesktop.org/git/xorg/app/rendercheck
I ran into some gotchas when naively running the freshly compiled
rendercheck program:
1. It takes forever to complete.
I computed that on my laptop the composite and cacomposite tests
would each take over 17 hours to complete. And since I wanted to
run this against multiple X servers I decided I just didn't have the
patience, so I dropped those tests.
2. It generates enormous amounts of data.
Once some tests start spewing errors, they spew a *lot*. The
gradients test had spewed many hundreds of megabytes of errors in a
rather short time before I interrupted it and dropped it from my
runs.
3. It doesn't save any data, nor warn the user to save it.
This is especially problematic in light of the above two
problems. After waiting forever for a test to complete, a user can
be in the sad situation of realizing that the output spewed to the
terminal and now lost was the only information generated by
rendercheck, (aside from a final count of tests passed and tests
run).
So it would be nice to see some fixes made to this tool to make it
more usable.
As is, here's the command-line I ended up using:
./rendercheck -t fill,dcoords,scoords,mcoords,tscoords,tmcoords,blend,repeat,triangles,bug7366 > precious-error-log.rendercheck
The explicit list of tests passed to the -t option differs from the
default by not including composite, cacomposite, and gradients (as
described above). This is somewhat unfortunate, as each of these tests
was definitely spewing some actual errors with an XAA server before I
got bored and killed it. Maybe someone with more patience (and hard
drive space) than I have can go back and run these tests to completion
against XAA and EXA, (or fix the tests to be more efficient first).
As for results, here is the final line of output from runs against
both XAA and EXA:
XAA: 3571749 tests passed of 3571749 total
EXA: 3571747 tests passed of 3571749 total
That is, rendercheck is only counting two small strikes against
EXA. Looking at the log file, the two failures are in
transformed src coords test 2
and transformed mask coords test 2
More details can be seen in the log files here:
http://people.freedesktop.org/~cworth/cairo-exa-vs-xaa/quality/xaa.rendercheck
http://people.freedesktop.org/~cworth/cairo-exa-vs-xaa/quality/exa.rendercheck
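For anyone scripting comparisons of those log files, the final summary
line is easy to pick apart. A sketch, (the line format is assumed from
the output quoted above; the function name is mine):

```python
# Sketch: pull pass/fail counts out of rendercheck's final summary line
# (format assumed from the "N tests passed of M total" output above).
import re

def parse_summary(line):
    """Return (passed, failed) from an 'N tests passed of M total' line."""
    m = re.search(r"(\d+) tests passed of (\d+) total", line)
    if m is None:
        raise ValueError(f"not a rendercheck summary line: {line!r}")
    passed, total = map(int, m.groups())
    return passed, total - passed

print(parse_summary("3571747 tests passed of 3571749 total"))  # (3571747, 2)
```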
Cairo test suite
================
The results of running the cairo test suite are most plain to see by
just looking at the resulting images:
http://people.freedesktop.org/~cworth/cairo-exa-vs-xaa/quality/
The NoAccel case has 0 failures, (we've basically been using that as a
baseline for cairo releases, so that's not too surprising).
With XAA, 3 different tests are flagged as failures, but two of them,
(radial-gradient and random-intersections), look fine to me by visual
inspection. The 3rd failure, (unantialiased-shapes with destination
alpha), is a definite bug.
With EXA, 10 different tests are flagged as failures. It looks to me
like there are perhaps only 2 or 3 bugs indicated by the failures:
1. On several of the tests, cairo draws a checkered
background. Whenever this is drawn with a 25x25 offset, the
resulting pattern is positioned incorrectly, (this may be the same
bug that rendercheck found). This appears to account for 6 of the
10 failures.
2. The pixman-rotate test demonstrates an old bug, (long since fixed
in pixman and apparently the X server software), in which rotated
sample points were taken with the wrong sub-pixel offset. So
perhaps a similar bug exists in EXA+ati.
3. The rotate-image-surface-paint test result is horribly wrong. It
doesn't look like any familiar bug to me.
The remaining two failures, (clip-operator and source-clip-scale),
look reasonable by visual inspection, but perhaps source-clip-scale
deserves a closer look, (the difference image has a couple of pixels
that stand out in odd ways that might indicate bugs).
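To make the sub-pixel-offset bug in item 2 concrete, here is a toy
nearest-neighbor rotation, (illustrative only; the names and code are
mine, not pixman's). Sampling at pixel centers, as pixman does after
the fix, reproduces a 90-degree rotation exactly; sampling at integer
corners, as the old bug did, shifts the result by a pixel:

```python
# Sketch of the sub-pixel sampling bug: nearest-neighbor sampling for a
# 90-degree rotation about the image center. With use_pixel_centers the
# destination sample points sit at (x + 0.5, y + 0.5), giving an exact
# rotation; without it the result is shifted (the bug's signature).
# All names here are illustrative, not taken from pixman.
import math

def rotate90(src, use_pixel_centers):
    n = len(src)
    c = n / 2.0                     # rotation center
    off = 0.5 if use_pixel_centers else 0.0
    dst = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # map the destination sample point through the inverse rotation
            u, v = x + off, y + off
            sx = c + (v - c)
            sy = c - (u - c)
            ix = min(max(math.floor(sx), 0), n - 1)  # clamp like a sampler
            iy = min(max(math.floor(sy), 0), n - 1)
            dst[y][x] = src[iy][ix]
    return dst

src = [[10 * y + x for x in range(4)] for y in range(4)]
good = rotate90(src, True)   # exact permutation of the source pixels
bad = rotate90(src, False)   # off by one pixel relative to 'good'
```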