I recently looked a bit at the time that our tests take, and I think we
need to think about new designs for how we execute them. I don't think
it is a huge problem yet, but as the numbers will show we are becoming
victims of our own success.

So let's start with some numbers (all taken on gandalf, a TDF server with 8
cores, in a dbgutil build without ccache):

The results for the initial build without building or executing the tests:

real    70m17.990s
user    436m43.860s
sys    28m3.680s

After that, the results of a time make, which forces the build of
everything test related and executes the tests:

real    11m30.192s
user    58m43.384s
sys    1m36.876s

And finally a second time make to measure the time it takes to just execute
the tests:

real    6m37.479s
user    45m4.740s
sys    0m34.988s

So as can be seen, there is still a huge difference between the time it
takes to complete a build and the time to run the tests. However, I'm still
worried that the nearly 1 hour of CPU time to build and execute the tests
might hurt people with less powerful machines. And even the 45 minutes of
CPU time that it takes to just run the tests is a lot.
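
For anyone who wants to compare the three runs on equal terms, here is a
small sketch that converts the time output above into CPU minutes and into
the average number of cores kept busy. The helper names are made up for
this mail, and it assumes the usual bash "NmS.SSSs" format:

```python
import re

def to_seconds(stamp):
    """Convert a bash `time` value such as '70m17.990s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

def summarize(real, user, system):
    """Return (CPU minutes, average number of busy cores) for one run."""
    cpu = to_seconds(user) + to_seconds(system)
    return cpu / 60, cpu / to_seconds(real)

# Figures copied from the three runs above (real, user, sys).
RUNS = {
    "initial build, no tests": ("70m17.990s", "436m43.860s", "28m3.680s"),
    "build and run tests": ("11m30.192s", "58m43.384s", "1m36.876s"),
    "run tests only": ("6m37.479s", "45m4.740s", "0m34.988s"),
}

for name, times in RUNS.items():
    cpu_minutes, busy_cores = summarize(*times)
    print(f"{name}: {cpu_minutes:.1f} CPU minutes, {busy_cores:.1f} cores busy")
```

Counting user plus sys, the test-only run comes to roughly 45.7 CPU minutes
and keeps close to 7 of the 8 cores busy, so on a machine with fewer cores
the wall-clock time would grow accordingly.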

As mentioned, I think it is not yet so bad that we need an immediate
solution, but I also think that now is the time to think about how to solve
the problem. Based on some simple assumptions (5.5 years of LibO, 60 minutes
of CPU time, constant addition of tests) I guess that we add at least 10
minutes of CPU time to the build each year, but I would assume that it is
easily more in the range of 15 to 20 minutes.
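
The back-of-the-envelope estimate can be written down as a tiny sketch. It
assumes purely linear growth, and the 3 year horizon is only an
illustrative choice, not a prediction:

```python
def yearly_growth_minutes(total_cpu_minutes, project_age_years):
    """Average CPU minutes of tests added per year, assuming constant growth."""
    return total_cpu_minutes / project_age_years

def projected_minutes(current_minutes, rate_per_year, years_ahead):
    """Linear projection of test CPU time; an assumption, not a measurement."""
    return current_minutes + rate_per_year * years_ahead

average_rate = yearly_growth_minutes(60, 5.5)
print(f"lifetime average: {average_rate:.1f} minutes per year")  # 10.9

# Compare the lifetime average with the 15 and 20 minute guesses.
for rate in (average_rate, 15, 20):
    total = projected_minutes(60, rate, years_ahead=3)
    print(f"at {rate:.1f} min/year: {total:.0f} CPU minutes in 3 years")
```

The lifetime average is a lower bound only if fewer tests were added in the
early years than recently, which is the assumption behind the higher guess.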

Now when it comes to solutions it is not that easy. As we all know, the
tests have helped us a lot in the last years to improve the quality of the
product and decrease the number of regressions escaping into releases. And
as most of you know, I'm still not happy that we don't have a test for all,
or at least most, of the bug fixes. So I think we need a solution that still
encourages people to write as many tests as possible while also keeping the
build time under control. If the tests take too long to execute, we risk
that more and more people just skip executing the tests locally and rely
completely on gerrit.

The first solution that comes to my mind is to move a few of the tests into
a test target that is not executed as part of make. We can still make sure
that gerrit/CI execute these tests on all platforms, but we would save
quite some time by not executing them on developer machines. Developers who
have touched code related to a feature that is covered by such a test are
of course encouraged to execute the tests locally. We already did something
similar at some point with make vs make slowcheck. A good set of tests would
be all our export tests, as they are by far the tests that take the most
time. The big disadvantage of this approach is that our tests are no longer
executed on that many different configurations. At least we used to find a
few problems in the past when tests failed on some strange platforms.
However, this might have become less of an issue with all the improvements
around crash testing, fuzzing and static analyzers.

Another solution, or at least a way to treat the symptoms a bit, would be
to look at the existing slow tests and figure out why they are so slow. I
did that already for the chart2export test, which took about 2 minutes of
CPU time, and discovered that it needlessly imported all the files again
(avoiding that saved 30 seconds) and that we have some really inefficient
xls/xlsx export code (responsible for another 30 seconds). I believe that just
running VALGRIND=callgrind make and then analysing the results would help
quite a bit. Again the import and export tests are good targets for these
attempts and will most likely help with the general import and export

The last idea is to use more of the XPath assert stuff instead of the
import->export->import cycle and therefore making the export tests less
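
To illustrate the difference between the two styles, here is a rough
sketch of an XPath-style check on exported XML. It is written in Python
with a made-up XML fragment and hypothetical helper names, so it is only an
analogy and not our actual C++ test helpers:

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for one exported XML stream.
EXPORTED_XML = """
<chartSpace>
  <plotArea>
    <barChart>
      <ser><idx val="0"/></ser>
      <ser><idx val="1"/></ser>
    </barChart>
  </plotArea>
</chartSpace>
"""

def assert_xpath(root, path, attribute, expected):
    """Check that exactly one node matches `path` and has the attribute value."""
    nodes = root.findall(path)
    assert len(nodes) == 1, f"{path}: expected 1 node, found {len(nodes)}"
    actual = nodes[0].get(attribute)
    assert actual == expected, f"{path}@{attribute}: {actual!r} != {expected!r}"

def assert_xpath_count(root, path, expected_count):
    """Check how many nodes match `path`."""
    found = len(root.findall(path))
    assert found == expected_count, f"{path}: expected {expected_count}, found {found}"

# Inspect the exported XML directly; no second import of the document is needed.
root = ET.fromstring(EXPORTED_XML.strip())
assert_xpath_count(root, "./plotArea/barChart/ser", 2)
assert_xpath(root, "./plotArea/barChart/ser[2]/idx", "val", "1")
```

The point is that the check runs on the exported file itself, so the cost
of loading the document a second time goes away for that test.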

I hope this long text gives you something to think about.

P.S. If you are interested in how long a test takes you can look into
