[cairo] What does it take to get a make check to pass with the xcb target, using CAIRO_REF_DIR?

Adrian Johnson ajohnson at redneon.com
Sat Jul 9 12:11:22 UTC 2016


On 01/07/16 06:31, Bryce Harrington wrote:
> We do know there are certain tests with race conditions that will
> alternate between passing and failing in two identical, sequential test
> runs.  Those test cases you mention don't ring a bell, but if you see
> it's the same set doing that each time, then that could be the case.  I
> don't know what to do about those, and would not oppose disabling those
> test cases (although I assume they were meant to test something
> important...)

Are these the pthreads tests? I assume they are just testing for cairo
errors and crashes. If the output is never consistent, maybe the image
comparison should be skipped for these tests. There are some tests that
don't compare images, e.g. mime-surface-api.c.

> Perhaps a small subset of the test suite passed at one point, but
> general test runs have resulted in over half the test cases failing for
> as long as I've been involved in the project.

Most of the issues with the test suite are due to external dependencies.
The image backend has the fewest dependencies, so I would start by aiming
for zero failures with image before trying the other backends.

The image backend output has dependencies on pixman, freetype, and the
fonts used. There is some information in the test/README on the fonts. I
just ran a test on image and I get the following failures:

- coverage-rhombus
- radial-gradient
- radial-gradient-source
- radial-gradient-mask-source
- pthread-same-source
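For reference, an image-only run like the one above can be reproduced with
something along these lines. This is a sketch: the CAIRO_TEST_TARGET and
CAIRO_TESTS variable names are the ones I believe the harness in test/
understands (see test/README); adjust if your tree differs.

```shell
# Run only the image backend, from the test/ directory of a built tree.
cd test
CAIRO_TEST_TARGET=image make test

# Re-run just a subset, e.g. the radial gradient tests, while iterating:
CAIRO_TEST_TARGET=image CAIRO_TESTS="radial-gradient" make test
```

The generated index.html in test/ then shows the output, reference, and
diff images side by side for each failure.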

Checking the index.html, the radial tests look like they just need a ref
image refresh. I cannot see anything wrong with the new images. Maybe
something changed in pixman? I'll push a fix for this.

The coverage-rhombus test also looks like it may need a ref refresh.
However, I am unfamiliar with what this test is checking.

The cairo ref images have usually been generated on Debian. Other
systems may have more failures due to different fonts being selected.

I've considered adding a test that checks the md5 of the fonts used, but
it may be better to look at adding scripts for setting up and running
the tests in a VM or container that contains the exact dependencies
required to pass the tests.
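A font checksum check could be as small as the sketch below. The helper
name is hypothetical (nothing like this exists in the tree), and the list
of fonts to feed it would have to come from test/README:

```shell
#!/bin/sh
# Sketch of a font checksum test: print "md5  path" for every readable
# font passed as an argument, so a run can be diffed against a manifest
# recorded on the reference system. (Hypothetical helper name.)
fonts_manifest() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            md5sum "$f"
        else
            # A missing font is itself a likely cause of ref mismatches.
            echo "missing: $f" >&2
        fi
    done
}
```

A manifest recorded once on the reference system could then be compared
against the local fonts with something like
`fonts_manifest /path/to/font.ttf | diff manifest.md5 -`.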

The other backends are more problematic due to the additional
dependencies. I'm only familiar with running the PDF/PS tests. These are
very sensitive to the versions of poppler and ghostscript used.
Currently poppler git is required for PDF due to a recent fix, and
ghostscript 9.06 is required for PS.
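Given that sensitivity, it is worth checking the tool versions before
blaming the PDF/PS refs. A quick sanity check (9.06 is the PS requirement
stated above; poppler needs to be built from git, so a packaged version
printed here is a red flag):

```shell
# Report the versions of the external tools the PDF/PS tests depend on.
gs --version      # PS tests: expect 9.06
pdftoppm -v       # PDF tests: poppler version (printed to stderr)
```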

> I am still uncertain about
> whether it is appropriate to regenerate the reference images before
> running the test suite, or if that might create false positives.

For the non-image backends, updating the references whenever there are
failures is fine, as long as the output looks the same as the image
backend's. But we should define the required versions of the external
dependencies so you are not just making the tests pass on your own system.

The image backend refs may also require an occasional refresh due to
external changes, but more care should be taken to ensure that the new
image is correct, as it is the master image used when checking whether
failures in other backends are real failures or just noise from an
external change.
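On the CAIRO_REF_DIR question in the subject: my understanding (from the
harness sources, so treat this as an assumption) is that it points the
test suite at an alternate directory of reference images, checked before
the shipped test/reference. That makes it a low-risk way to trial a ref
refresh after verifying the new output by eye, without touching the tree:

```shell
# Sketch: stage candidate reference images in a scratch directory and
# point the harness at them. Paths and naming are assumptions; the
# candidate images must use the same *.ref.png names as test/reference.
mkdir -p /tmp/cairo-refs
# ...copy the verified-by-eye output images into /tmp/cairo-refs...
cd test
CAIRO_REF_DIR=/tmp/cairo-refs CAIRO_TEST_TARGET=image make test
```

If that run goes green, the staged images can be committed as the new
references with some confidence.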



