[Intel-gfx] Making IGT runnable by CI and developers
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Jul 21 10:56:53 UTC 2017
On 20/07/2017 17:23, Martin Peres wrote:
> Hi everyone,
>
> As some of you may already know, we have made great strides in making
> our CI system usable, especially in the last 6 months when everything
> started clicking together.
>
> The CI team is no longer overwhelmed with fires and bug reports, so we
> started working on increasing the coverage from just fast-feedback, to a
> bigger set of IGT tests.
>
> As some of you may know, running IGT has been a challenge that few
> manage to overcome. Not only is the execution time counted in machine
> months, but it can also lead to disk corruption, which does not
> encourage developers to run it either. One test takes 21 days on its
> own, and it is a subset of another test which we have never run, for
> obvious reasons.
>
> I would thus like to get the CI team and developers to work together to
> sharply decrease the execution time of IGT, and get these tests run
> multiple times per day!
>
> There are three usages that the CI team envision (up for debate):
> - Basic acceptance testing: Meant for developers and CI to check
> quickly if a patch series is not completely breaking the world (< 10
> minutes, timeout per test of 30s)
> - Full run: Meant to be run overnight by developers and users (< 6 hours)
We could start by splitting this budget across logical components/teams.
So far we have been talking about GEM and KMS, but I was just thinking
that we may want separate units at this level for the likes of
power management, DRM (core), and external stuff like sw fences? TBD I guess.
Assuming a GEM/KMS split only, the fair thing seems to be to split the
time budget 50-50 and let the respective teams start working.
I assume this is x hours on the slowest machine?
Teams would also need easy access to up-to-date test run times.
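For that last point, even something as simple as the sketch below would
do for a first cut. It assumes CI can export per-test runtimes with a
team column attached; the "team,test,seconds" CSV layout is purely my
assumption, not something CI produces today.

#!/usr/bin/env python3
# Straw-man budget check: sum per-team runtimes and compare against a
# 50-50 split of the < 6 hour full-run budget. The input format
# ("team,test,seconds" per line) is an assumption, not a CI feature.
import csv
import sys

FULL_RUN_BUDGET_H = 6.0
SHARE = {"GEM": 0.5, "KMS": 0.5}   # straw-man split, up for debate

totals = {}
with open(sys.argv[1]) as f:
    for row in csv.reader(f):
        if not row:
            continue
        team, test, seconds = row
        totals[team] = totals.get(team, 0.0) + float(seconds)

for team, share in SHARE.items():
    used_h = totals.get(team, 0.0) / 3600.0
    budget_h = FULL_RUN_BUDGET_H * share
    status = "OVER" if used_h > budget_h else "ok"
    print("%s: %.1fh of %.1fh (%s)" % (team, used_h, budget_h, status))

unassigned_h = sum(v for k, v in totals.items() if k not in SHARE) / 3600.0
print("unassigned: %.1fh" % unassigned_h)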
> - Stress tests: They can be in the test suite as a way to catch rare
> issues, but they cannot be part of the default run mode. They likely
> should be run on a case-by-case basis, on demand from a developer. Each
> test could be allowed to take up to 1h.
>
> There are multiple ways of getting to this situation (up for debate):
>
> 1) All the tests exposed by default are fast and meant to be run:
> - Fast-feedback is provided by a testlist, for BAT
> - Stress tests are run using a special command, kept for on-demand testing
>
> 2) Tests are all tagged with information about their exec time:
> - igt@basic@.*: Meant for BAT
> - igt@complete@.*: Meant for FULL
> - igt@stress@.*: The stress tests
>
> 3) Testlists all the way:
> - fast-feedback: for BAT
> - all: the tests that people are expected to run (CI will run them)
> - Stress tests will not be part of any testlist.
I have a historical fondness for tagging and have just sent a v2 of my
tagging RFC. There would be some work involved in converting all tests
to support --list-subtests, but once there it sounds flexible and easy
to use to me.
How well this would fit with the CI systems I don't have good
visibility into. So ultimately I don't care that much what gets picked,
unless it ends up being very cumbersome or work-intensive for either side.
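Purely to illustrate how I imagine the CI side of a tag-based split
could look, here is a rough sketch. The igt@basic@/igt@complete@/
igt@stress@ naming is taken from the proposal above; the script and its
input format are made up by me.

#!/usr/bin/env python3
# Rough sketch of option 2 from the CI side, assuming the tier ends up
# encoded in the fully qualified subtest name (igt@<tier>@...), which
# is just how I read the proposal, not what IGT emits today.
import re
import sys

RUN_FILTERS = {
    "bat":  re.compile(r"^igt@basic@"),
    "full": re.compile(r"^igt@(basic|complete)@"),
    # no run picks up igt@stress@ by default
}

def make_testlist(subtests, run):
    """Reduce the full subtest enumeration to one run's testlist."""
    pattern = RUN_FILTERS[run]
    return [name for name in subtests if pattern.match(name)]

if __name__ == "__main__":
    # stdin: one fully qualified subtest name per line, as collected
    # from the binaries' subtest enumeration
    names = [line.strip() for line in sys.stdin if line.strip()]
    for name in make_testlist(names, sys.argv[1]):
        print(name)

Piping the enumerated subtest names through something like that would
let CI regenerate the BAT and full testlists automatically instead of
maintaining them by hand.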
To re-iterate, if we get:
* a clear time allocation for GEM, for example,
* a URL showing us dynamically where we stand relative to that, and
* a method for adding/removing tests to/from the default/full/extended
(whatever people want to call it) test run,
then I think that is enough for us to start working towards the common goal.
> Whatever decision is accepted, the CI team is mandating global
> timeouts for both BAT and FULL testing, in order to guarantee
> throughput. This will require the team as a whole to agree on time
> quotas per sub-system, and to enforce them.
Is the current CI capable of adding up total per-sub-system runtimes,
and if so, based on what does it do that? I am wondering here about
tests whose names are not prefixed with gem_ or kms_.
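To make that concrete, I would expect CI to need something like an
explicit ownership table on top of the obvious prefixes. The sketch
below is purely illustrative and the owner assignments in it are made
up by me, not a proposal for the real mapping.

# Hypothetical sketch: attribute test binaries to sub-systems so their
# runtimes can be summed against a budget. Obvious prefixes are handled
# by rule; everything else needs an explicit, maintained mapping. The
# owner assignments below are made up for illustration.
PREFIX_OWNER = {
    "gem_": "GEM",
    "kms_": "KMS",
}

EXPLICIT_OWNER = {
    "pm_rpm": "power management",
    "sw_sync": "DRM core / fences",
    "drv_module_reload": "core",
}

def owner_of(binary):
    for prefix, owner in PREFIX_OWNER.items():
        if binary.startswith(prefix):
            return owner
    return EXPLICIT_OWNER.get(binary, "unassigned")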
Regards,
Tvrtko