[Intel-gfx] [RFC i-g-t 0/4] Redundant test pruning

Daniel Vetter daniel at ffwll.ch
Tue Jun 27 13:10:54 UTC 2017


On Tue, Jun 27, 2017 at 12:46:28PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-06-27 10:14:40)
> > On Tue, Jun 27, 2017 at 09:02:02AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 26/06/2017 17:09, Daniel Vetter wrote:
> > > > On Fri, Jun 23, 2017 at 12:31:39PM +0100, Tvrtko Ursulin wrote:
> > > > > From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > > > 
> > > > > Small series which saves test execution time by removing the redundant tests.
> > > > > 
> > > > > Tvrtko Ursulin (4):
> > > > >    igt: Remove default from the engine list
> > > > >    gem_exec_basic: Exercise the default engine selection
> > > > >    gem_sync: Add all and store_all subtests
> > > > >    extended.testlist: Remove some test-subtest combinations
> > > > 
> > > > Ack on patches 1&2, but I'm not sold on patch 3. Atm gem_* takes a
> > > > ridiculous amount of machine time to run, you're adding more stuff. Are
> > > > those tests really drastically better at catching races if we run them 10x
> > > > longer? Is there no better way to exercise the races (lots more machines,
> > > > maybe slower ones, which is atm impossible since it just takes way, way
> > > > too long and we need an entire farm just for one machine).
> > > 
> > > New gem_sync subtests were suggested by Chris after I sent the first version
> > > of the series, with the goal of getting the same coverage in less time.
> > > 
> > > If you look at patch 4, it removes 18 * 150s of gem_sync subtests, and adds
> > > 4 * 150s. So in total we are 35 minutes better off in the best case, a bit
> > > less on smaller machines.
> > 
> > So why keep the other 18 tests when we have coverage by the new ones? Some
> > developer modes (like e.g. kms_frontbuffer_tracking has) for testing is
> > all nice, but piling ever higher amounts of redundant tests up isn't great
> > imo.
> 
> They are redundant? The subtle differences have dramatic impact on
> timings and bug discovery. I was suggesting that if we were going to
> run a cutdown test, it may as well be engineered for the task. I am very
> happy if we could replace all of the bulk stress tests with a fuzzy
> approach. We obviously have to keep a minimal set to check expected
> behaviour and to catch old regressions, but trying to capture all the
> ways the hw can fail and muck up the driver should be automated. I've
> been wondering if we can write a mock device powered by BPF (or
> something) and see if we can do fault injection for the more obscure
> code paths. Regular fuzzing over the abi to maximise code coverage is
> much easier than defining how the hw is supposed to react and fuzzing
> the hw through the driver.
> 
> I don't agree that cutting them out of CI helps me at all trying to find
> bugs with mtbf of over 24 hours. CI scales by adding more machines, not
> by reducing tests. We need more diversity in our tests, not less.

The reality is that you want more machines than we can get right now. That
means CI won't be able to find stuff with an mtbf bigger than a few hours,
at least not consistently. Adapt your test strategy pls (or someone with
much less clue about what makes sense like me will chaos-monkey igt until
it fits).

The alternative is that we simply don't run any of the gem_* tests (this is
where we are right now, and have been for way too long), and I don't see how that's
better.

In an ideal world I'd fully agree with you, but somehow we ended up in the
wrong multiverse ...

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch