[Intel-gfx] [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb

Lofstedt, Marta marta.lofstedt at intel.com
Thu Oct 26 11:19:25 UTC 2017



> -----Original Message-----
> From: Lofstedt, Marta
> Sent: Thursday, October 26, 2017 1:57 PM
> To: 'Martin Peres' <martin.peres at linux.intel.com>; Joonas Lahtinen
> <joonas.lahtinen at linux.intel.com>; Daniel Vetter <daniel.vetter at ffwll.ch>;
> Intel Graphics Development <intel-gfx at lists.freedesktop.org>; Latvala, Petri
> <petri.latvala at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Subject: RE: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
> 
> Since the discussion on this died and I believe that everyone are scared that
> the dodge suggestion would require someone to do some work.
> I could Ack the patch if:
> 
> +	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
> +		      "random incompletes in CI with this test\n");
> +
> 
> Was replaced with an igt_warn
> 
Forgot to write, it should be igt_warn paired with success on the test. This will produce the WARN result, note this is not the same as dmesg-warn. This is quite rare and it will be noticed.

> /Marta
> 
> > -----Original Message-----
> > From: Martin Peres [mailto:martin.peres at linux.intel.com]
> > Sent: Monday, October 23, 2017 12:29 PM
> > To: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>; Daniel Vetter:
> > <daniel.vetter at ffwll.ch>; Intel Graphics Development <intel-
> > gfx at lists.freedesktop.org>; Latvala, Petri <petri.latvala at intel.com>
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>; Lofstedt, Marta
> > <marta.lofstedt at intel.com>
> > Subject: Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on
> > snb
> >
> > On 20/10/17 12:26, Joonas Lahtinen wrote:
> > > + Petri
> > >
> > > On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
> > >> On 19/10/17 12:51, Daniel Vetter wrote:
> > >>> CI gets upset about it resulting in an incomplete, let's skip it
> > >>> until that's fixed to avoid havoc in the CI farm. Of course this
> > >>> should/will be reverted as soon as we have a fix (similar to how
> > >>> we dealt with the snb-dies-in-blt-hangs issue).
> > >>>
> > >>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > >>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > >>> Cc: "Lofstedt, Marta" <marta.lofstedt at intel.com>
> > >>> Cc: Martin Peres <martin.peres at linux.intel.com>
> > >>> References:
> > >>> https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-sus
> > >>> pe
> > >>> nd.html
> > >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> > >>> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > >
> > > <SNIP>
> > >
> > >> So, let's recap the problem here:
> > >>   - Any incomplete in sharded runs mean that the platform is unfit
> > >> for pre-merge (because any other test after will go from pass to notrun)
> > >>   - We can't fix issues immediately, especially for old platforms
> > >>
> > >> This patch is sweeping the test under the rug by using the skip
> > >> output, which is not only hard to track, it is also misleading.
> > >>
> > >> After discussing with Marta, Arek and Petri, we found some
> > >> consensus on the following proposal (terminology is up for debate):
> > >>
> > >> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail'
> > >> when the condition is true. Make sure this is over-ridable with
> > >> IGT_DODGE=0 so as we can easily run these tests without recompiling
> > them.
> > >
> > > Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
> > > Much like with simulation.
> >
> > replace skip with fail, and we agree. Skips are too easy to ignore!
> >
> > >
> > > Still, a BIOS update to one of the CI machines might mean (if it's
> > > not now the case, not very far fetched for the future) that we go
> > > churn in the IGT codebase to drop bunch of these. That's not the
> > > optimal workflow I can think of when we're discussing a separate
> > > mailing list for IGT discussion and patches to make it more
> > > self-contained. Then we bind that new mailing list to our CI farm
> > > contents, and bind making fixes to the CI farm operation directly to the
> IGT reviewing bandwidth?
> >
> > Isn't what we are proposing doing exactly this? By changing the source
> > code of IGT, we allow people to send patches to remove some
> > workarounds and see if they pass or fail in the same way they would
> > propose any change to IGT. Moreover, we make the running of IGT in our
> > farm as transparent as possible.
> >
> > >
> > > I'm still thinking best way would be that CI would mask the known
> > > problematic ones from the failure/pass criteria, and then somebody
> > > actually looks at the masked on after their testing coverage is
> > > prioritized. I think IGT should try to provide a wide range of tests
> > > that are supposed to work on any certain hardware. If they don't,
> > > it's not a reason to change the tests itself.
> >
> > This is true that some skips will be highly-machine specific, but
> > isn't our role as developers to know what and what doesn't work? By
> > pushing this to an external whitelist only for CI, we miss an
> > opportunity to improve this CI blacklist.
> >
> > Now, let me remind you that this blacklist is *only* for tests that
> > hang the machine or leave it in an inconsistant state which will lead to a
> crash later.
> >
> > >
> > > With the filter, we can grow the testing coverage for the new
> > > platforms, even if CI happens to have odd machines that may not pass
> > > those tests (and we may not have the resources to immediately fix
> > > those). All this without churning on the IGT codebase.
> >
> > You are describing what cibuglog is already doing. Failing tests cases
> > are suppressed in pre-merge, and associated bugs[1].
> >
> > See above for what this proposal is about.
> >
> > [1] https://intel-gfx-ci.01.org/cibuglog/
> >
> > >
> > > But if this is the only technically viable solution in short-term,
> > > then so be it. I just see better options too.
> >
> > Maybe we need a write up our workflow. This time, a public one!
> >
> > I hope


More information about the Intel-gfx mailing list