[igt-dev] [PATCH i-g-t 1/2 v2] runner: Add support for aborting on network failure

Petri Latvala petri.latvala at intel.com
Mon May 20 09:36:10 UTC 2019


On Mon, May 20, 2019 at 09:17:58AM +0300, Arkadiusz Hiler wrote:
> On Thu, May 16, 2019 at 02:27:54PM +0300, Petri Latvala wrote:
> > If the network goes down while testing, CI tends to interpret that as
> > the device being down, cutting its power after a while. This causes an
> > incomplete to an innocent test, increasing noise in the results.
> > 
> > A new flag to --abort-on-monitored-error, "ping", uses liboping to
> > ping a host configured in .igtrc with one ping after each test
> > execution and aborts the run if there is no reply in a hardcoded
> > amount of time.
> > 
> > v2:
> >  - Use a higher timeout
> >  - Allow hostname configuration from environment
> > 
> > Signed-off-by: Petri Latvala <petri.latvala at intel.com>
> > Cc: Arkadiusz Hiler <arkadiusz.hiler at intel.com>
> > Cc: Martin Peres <martin.peres at linux.intel.com>
> > Cc: Tomi Sarvela <tomi.p.sarvela at intel.com>
> > ---
> >  meson.build        |   1 +
> >  meson_options.txt  |   6 ++
> >  runner/executor.c  | 135 +++++++++++++++++++++++++++++++++++++++++++++
> >  runner/meson.build |  12 +++-
> >  runner/settings.c  |   4 ++
> >  runner/settings.h  |   5 +-
> >  6 files changed, 160 insertions(+), 3 deletions(-)
> > 
> > diff --git a/meson.build b/meson.build
> > index a05e912c..aac67f1a 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -100,6 +100,7 @@ build_tests = get_option('build_tests')
> >  with_libdrm = get_option('with_libdrm')
> >  with_libunwind = get_option('with_libunwind')
> >  build_runner = get_option('build_runner')
> > +with_oping = get_option('with_oping')
> 
> I find it a bit distracting that we have this bit here, whereas
> everything else is in runner/meson.build. Is there any reason for that?

Oh, I thought we had all option fetches in toplevel. Will move that to
runner/.


> 
> >  _build_overlay = build_overlay != 'false'
> >  _overlay_required = build_overlay == 'true'
> <SNIP>
> > diff --git a/runner/meson.build b/runner/meson.build
> > index b658f1d2..a54aaab4 100644
> > --- a/runner/meson.build
> > +++ b/runner/meson.build
> > @@ -1,4 +1,13 @@
> >  jsonc = dependency('json-c', required: _runner_required)
> > +runner_deps = [jsonc, glib]
> > +have_oping = []
> > +if with_oping != 'false'
> > +	oping = dependency('liboping', required: with_oping == 'true')
> > +	if oping.found()
> > +		runner_deps += oping
> > +		have_oping = '-DHAVE_OPING=1'
> > +	endif
> > +endif
> 
> I think it is about time to:
> https://mesonbuild.com/Release-notes-for-0-47-0.html#new-type-of-build-option-for-features
> 
> I'll check meson versions in various popular distros Today and see how
> the patch would look like.
>

Oh yes please!



> >  runnerlib_sources = [ 'settings.c',
> >  		      'job_list.c',
> > @@ -21,7 +30,8 @@ if _build_runner and _build_tests and jsonc.found()
> >  
> >  	runnerlib = static_library('igt_runner', runnerlib_sources,
> >  				   include_directories : inc,
> > -				   dependencies : [jsonc, glib])
> > +				   c_args : have_oping,
> 
> I know that we have just a single dep here, but can we name this
> runner_lib_c_args or something?


Sure.


> 
> > +				   dependencies : runner_deps)
> >  
> >  	runner = executable('igt_runner', runner_sources,
> >  			    link_with : runnerlib,
> 
> Other than that the C part looks ok. I share Daniel's sentiment on
> network and NIC possibly being be a bit too "best-efforty" for aborting
> on a single dropped ping.


Yes, I'll make it retry a few times in the next version.


> It also seems like the feature haven't been properly tested, as the HAX
> does not specify any IP address.


Why would it need a HAX? The .igtrc configuration from the time of v1
was still in place. See run.log, "abort on ping" doesn't appear on the
machines I tested, and build log (not publicly available) says
liboping was used.


-- 
Petri Latvala


More information about the igt-dev mailing list