[Piglit] [PATCH][RFC] Add aborting option when a monitored error is detected
Dylan Baker
baker.dylan.c at gmail.com
Fri Apr 1 21:07:55 UTC 2016
Quoting Olivier Berthier (2016-04-01 03:18:50)
> On Thursday 17 Mar 2016 à 12:13:33 (-0700), Dylan Baker wrote:
> > Quoting Olivier Berthier (2016-03-17 10:44:04)
> > > This adds a policy which advises the user when to reboot the system to
> > > avoid noisy test results caused by, for instance, the system becoming
> > > unstable, so that testing can afterwards continue successfully.
> > >
> > > To do this, a new module is proposed. A Monitoring class manages the
> > > monitoring rules. Two types of rules, MonitoringFile and
> > > MonitoringLinuxDmesg, derived from the abstract class BaseMonitoring,
> > > have been implemented. The first tracks a pattern in standard files or
> > > device files. The second, derived from dmesg.LinuxDmesg, tracks a
> > > pattern in the dmesg output.
> > >
> > > The monitoring rules must be defined in piglit.conf in the section
> > > monitored-errors. If one of the regexes matches, Piglit will store the
> > > cause in the TestResult object, raise a PiglitAbortException, stop the
> > > test execution (terminating the test thread pool) and exit with code 3.
> > >
> > > Test execution can then be resumed, after rebooting the system or not,
> > > as usual with the command line parameter "resume".
> > >
> > > To call it, use command line parameter: --abort-on-monitored-error
> > >
> > > This option implies --no-concurrent
> > > ---
> > > framework/exceptions.py | 13 +++
> > > framework/monitoring.py | 266 ++++++++++++++++++++++++++++++++++++++++++++++
> > > framework/options.py | 2 +
> > > framework/profile.py | 32 +++++-
> > > framework/programs/run.py | 18 +++-
> > > framework/results.py | 9 +-
> > > framework/test/base.py | 5 +-
> > > piglit.conf.example | 24 +++++
> > > 8 files changed, 362 insertions(+), 7 deletions(-)
> > > create mode 100644 framework/monitoring.py
> > >
> > > diff --git a/framework/exceptions.py b/framework/exceptions.py
> > > index 6e87bee..8e8e9a0 100644
> > > --- a/framework/exceptions.py
> > > +++ b/framework/exceptions.py
> > > @@ -30,6 +30,7 @@ __all__ = [
> > > 'PiglitInternalError',
> > > 'PiglitFatalError',
> > > 'PiglitException',
> > > + 'PiglitAbortException',
> > > 'handler',
> > > ]
> > >
> > > @@ -51,6 +52,9 @@ def handler(func):
> > > except PiglitFatalError as e:
> > > print('Fatal Error: {}'.format(str(e)), file=sys.stderr)
> > > sys.exit(1)
> > > + except PiglitAbortException as e:
> > > + print('Aborting Piglit execution: {}'.format(str(e)), file=sys.stderr)
> > > + sys.exit(3)
> > >
> > > return _inner
> > >
> > > @@ -87,3 +91,12 @@ class PiglitFatalError(Exception):
> > > to the top of the program where it exits.
> > >
> > > """
> > > +
> > > +
> > > +class PiglitAbortException(Exception):
> > > + """Class for non-errors that require piglit aborting.
> > > +
> > > + When this class (or a subclass) is raised it should be raised all the way
> > > + to the top of the program where it exits.
> > > +
> > > + """
> >
> > I think just calling this PiglitAbort would be better.
> >
> > > diff --git a/framework/monitoring.py b/framework/monitoring.py
> > > new file mode 100644
> > > index 0000000..65f1cc9
> > > --- /dev/null
> > > +++ b/framework/monitoring.py
> > > @@ -0,0 +1,266 @@
> > > +# Copyright (c) 2016 Intel Corporation
> > > +#
> > > +# Permission is hereby granted, free of charge, to any person obtaining a
> > > +# copy of this software and associated documentation files (the "Software"),
> > > +# to deal in the Software without restriction, including without limitation
> > > +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > +# and/or sell copies of the Software, and to permit persons to whom the
> > > +# Software is furnished to do so, subject to the following conditions:
> > > +#
> > > +# The above copyright notice and this permission notice (including the next
> > > +# paragraph) shall be included in all copies or substantial portions of the
> > > +# Software.
> > > +#
> > > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > > +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > > +# IN THE SOFTWARE.
> > > +
> > > +""" Module implementing classes for monitoring
> >
> > It's kinda nitpicky, but please format opening lines to docstrings like
> > this:
> > """Module implementing classes for monintoring.
> >
> >
> > > +
> > > +This module provides a set of classes for monitoring the system on multiple
> > > +sources. The sources can be standard files, device files or the dmesg.
> > > +The patterns of the errors are defined in piglit.conf, at the section
> > > +monitored-errors.
> > > +
> > > +When one of the regex is found in the corresponding source, the message is
> > > +added to TestResult and abort Piglit with code 3.
> > > +
> > > +"""
> > > +
> >
> > We need to run on python 2.7 and 3+, so please be sure to add the
> > __future__ imports like the other files.
> >
> > Also if you could sort this that would be great
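> >
> > Something along these lines (just a sketch of what the header might look
> > like, dropping the unused imports noted below):
> >
> >     from __future__ import (
> >         absolute_import, division, print_function, unicode_literals
> >     )
> >
> >     import abc
> >     import ast
> >     import errno
> >     import fcntl
> >     import os
> >     import re
> >
> >     import six
> >
> >     from framework import core
> >     from framework.dmesg import LinuxDmesg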
> >
> > > +import abc
> > > +import re
> >
> > sys is unused
> >
> > > +import sys
> > > +import ast
> > > +import os
> > > +import fcntl
> > > +import errno
> >
> >
> > > +
> > > +import six
> > > +
> >
> > exceptions is unused
> >
> > > +from framework import exceptions, core
> > > +from framework.dmesg import LinuxDmesg
> > > +
> >
> > Sort __all__ as well please.
> >
> > Also, the names here and the names of the classes don't match.
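> >
> > i.e. something like:
> >
> >     __all__ = [
> >         'BaseMonitoring',
> >         'Monitoring',
> >         'MonitoringFile',
> >         'MonitoringLinuxDmesg',
> >     ]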
> >
> > > +__all__ = [
> > > + 'LinuxMonitoringDmesg',
> > > + 'LinuxMonitoringFile',
> > > + 'BaseMonitoring',
> > > + 'Monitoring',
> > > +]
> > > +
> > > +
> >
> > Since we need python 2.7 (and because it's just idiomatic), please make
> > all classes without another parent descend from object.
> >
> > > +
> > > +class Monitoring():
> > > + """ Class that initialize and update monitoring classes
> > > +
> > > + This reads the rules from the section monitored-errors in piglit.conf
> > > + and initializes the monitoring objects. Their type must be derived from
> > > + BaseMonitoring and specialized according the field 'type'.
> > > +
> > > + """
> > > +
> > > + # starting time: user must know current machine/GPU state before
> > > + # starting piglit: we can resume test execution based on last stopping
> > > + # if we consider that our system is still in good shape (ie not blocking
> > > + # run forever)
> > > + _abort_error = None
> > > + _monitoring_rules = None
> > > +
> > > + def __init__(self, monitoring_enabled):
> > > + """ Create a LinuxMonitored instance """
> > > + # Get the monitoring rules from piglit.conf and store them into a dict.
> > > + self._monitoring_rules = {}
> > > + if monitoring_enabled:
> > > + for (key, value) in core.PIGLIT_CONFIG.items('monitored-errors'):
> >
> > We don't use parens around values when exploding containers.
> >
> > > + params = ast.literal_eval(value)
> >
> > I'm not sure how I feel about using ast.literal_eval, it is better than
> > using eval for sure though. I just don't know what would be better.
> > Embedding JSON seems hacky, and representing this in a single block of
> > ini with parameters seems equally hacky.
> >
> > > + rule = None
> > > + if params['type'] == 'file':
> > > + rule = MonitoringFile(params['parameters'],
> > > + params['regex'])
> > > + elif params['type'] == 'device':
> > > + rule = MonitoringFile(params['parameters'],
> > > + params['regex'], True)
> > > + elif params['type'] == 'dmesg':
> > > + rule = MonitoringLinuxDmesg(params['parameters'],
> > > + params['regex'])
> > > + self._monitoring_rules[key] = rule
> > > +
> > > + @property
> > > + def abort_needed(self):
> >
> > If the docstring is one line just tuck it all up into one line:
> > """Simply return if _abort_error variable is not empty."""
> >
> > > + """ simply return if _abort_error variable is not empty
> > > +
> > > + """
> > > + return self._abort_error is not None
> > > +
> > > + def update_monitoring(self):
> > > + """ Update the new messages for each monitoring object
> > > +
> > > + """
> > > + if self._monitoring_rules:
> >
> > You will need to use six functions for iterating items to ensure the
> > same behavior on python 2.x and 3.x. six.itervalues would probably be
> > best, since it eliminates the unused "key" value.
> >
> > There are a lot of instances of this.
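> >
> > e.g. (untested sketch):
> >
> >     for monitoring_rule in six.itervalues(self._monitoring_rules):
> >         monitoring_rule.update_monitoring()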
> >
> > > + for (key, monitoring_rule) in self._monitoring_rules.items():
> > > + monitoring_rule.update_monitoring()
> > > +
> > > + def update_result(self, result):
> > > + """ Check monitoring objects statue
> > > +
> > > + This method checks the state for each monitoring object.
> > > + If one of them found the pattern in the new messages,
> > > + set itself on abort_needed state and store the error message
> > > + in the result.
> > > +
> > > + Arguments:
> > > + result -- A TestResult instance
> > > +
> > > + """
> > > + # Get a new snapshot of the source
> > > + self.update_monitoring()
> > > +
> > > + if self._monitoring_rules:
> > > + for (key, monitoring_rule) in self._monitoring_rules.items():
> > > + # Get error message
> > > + self._abort_error = monitoring_rule.check_monitoring()
> > > + # if error message is not empty, abort is requested
> > > + if self.abort_needed:
> > > + # Save from which monitoring rule the error come
> > > + result.abort_error = "\n".join(
> > > + monitoring_rule._new_messages)
> >
> > You're publicly accessing a protected attribute. I would recommend
> > adding a public method that does the join for you too.
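> >
> > Something like this (sketch only, the name is just a suggestion):
> >
> >     def new_messages(self):
> >         """Return the new messages joined into a single string."""
> >         return "\n".join(self._new_messages)
> >
> > and then update_result() can call monitoring_rule.new_messages() instead
> > of reaching into the protected attribute.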
> >
> > > + result.abort_error_source = key
> > > + break
> > > +
> >
> > The spacing is odd here; move one of the newlines to after 'return result'.
> >
> > > +
> > > + return result
> > > +
> > > + at six.add_metaclass(abc.ABCMeta)
> > > +class BaseMonitoring(object):
> > > + """ Abstract base class for Monitoring derived objects
> > > +
> > > + This provides the bases of the constructor and most subclasses should call
> > > + super() in __init__(). It provides a conctret implementation of the
> > > + check_monitoring() method, which should be suitible for all subclasses.
> > > +
> > > + The update_monitoring() method need to be override for all subclasses.
> > > +
> > > + """
> > > + @abc.abstractmethod
> > > + def __init__(self, monitoring_source, regex):
> > > + """ Abstract constructor for BaseMonitoring subclasses
> > > +
> > > + Arguments;
> > > + monitoring_source -- The source to monitor
> > > + regex -- The errors pattern
> > > +
> > > + """
> > > + self._monitoring_source = monitoring_source
> > > + self._monitoring_regex = re.compile(regex)
> >
> > You should add self._new_messages to this initializer since
> > check_monitoring uses it.
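> >
> > e.g. adding
> >
> >     self._new_messages = []
> >
> > to __init__ so check_monitoring() always has the attribute.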
> >
> > > +
> > > + @abc.abstractmethod
> > > + def update_monitoring(self):
> > > + """ Update _new_messages list
> > > +
> > > + This method should read a monitoring source like a file or a program
> > > + output, determine new entries since the last read of the source,
> > > + and update self._new_messages with those entries
> > > +
> > > + """
> > > + pass
> > > +
> > > + def check_monitoring(self):
> > > + """ Check _new_messages
> > > +
> > > + This method checks if the regex is found in the new messages,
> > > + then return a string that contain the monitored location and
> > > + the matched line
> > > +
> > > + """
> > > + if self._new_messages and self._monitoring_regex:
> > > + for line in self._new_messages:
> > > + if self._monitoring_regex.search(line):
> > > + return "From {}:\n{}".format(self._monitoring_source,line)
> > > +
> > > +
> > > +class MonitoringFile(BaseMonitoring):
> > > + """ Monitoring from a file
> > > +
> > > + This class is for monitoring the system from a file that
> > > + can be a standard file or a device file
> > > +
> > > + Arguments:
> > > + isDevice -- True if the target is a device file
> >
> > We use PEP8 compliant names. This should be "is_device", the same for
> > the attribute name.
> >
> > > +
> > > + """
> > > + def __init__(self, monitoring_source, regex, isDevice=False):
> > > + """ Create a MonitoringFile instance """
> > > + self._last_message = None
> > > + self._isDevice = isDevice
> > > + super(MonitoringFile, self).__init__(monitoring_source, regex)
> > > +
> > > + def update_monitoring(self):
> > > + """ Open the file and get the differences
> > > +
> > > + Get the contents of the file, then calculate new messages.
> > > + This implements also a specific method for reading device files.
> > > +
> > > + """
> > > + try:
> > > + with open(self._monitoring_source, 'r') as f:
> > > + lines = []
> > > + if self._isDevice:
> > > + # Create a duplicated file descriptor, this avoid lock
> > > + fd = os.dup(f.fileno())
> >
> > Don't call close with the context manager, I would just dedent all of
> > the block after the else clause to drop out of the context manager.
> >
> > > + f.close()
> > > + # use I/O control for reading the lines
> > > + fcntl.fcntl(fd, fcntl.F_SETFL, os.O_NONBLOCK)
> > > + while True:
> > > + try:
> > > + line = os.read(fd, 1024)
> > > + except OSError as e:
> > > + if e.errno == errno.EAGAIN:
> > > + break
> > > + else:
> > > + raise e
> > > +
> > > + lines.append(line.decode("utf-8", "strict"))
> > > + os.close(fd)
> > > +
> > > + else:
> > > + lines = f.read().splitlines()
> > > +
> > > + # Find all new entries, do this by slicing the list of the lines to only
> > > + # returns elements after the last element stored. If there are not
> > > + # matches a value error is raised, that means all of the lines are new
> > > + l = 0
> > > + for index, item in enumerate(reversed(lines)):
> > > + if item == self._last_message:
> > > + l = len(lines) - index # don't include the matched element
> > > + break
> > > + self._new_messages = lines[l:]
> > > + # Attempt to store the last element of lines,
> > > + # unless there was no line
> > > + self._last_message = lines[-1] if lines else None
> > > + except Exception:
> > > + # if an error occured, we consider there are no new messages
> > > + self._new_messages = []
> > > + pass
> > > +
> > > +
> > > +class MonitoringLinuxDmesg(BaseMonitoring, LinuxDmesg):
> > > + """ Monitoring on dmesg
> > > +
> > > + This class is for monitoring on the system dmesg. It's inherited
> > > + from LinuxDmesg for the dmesg processing methods.
> > > +
> > > + Work only on Linux operating system.
> > > +
> > > + """
> > > + def __init__(self, monitoring_source, regex):
> > > + """ Create a MonitoringLinuxDmesg instance """
> > > + self.DMESG_COMMAND = monitoring_source.split()
> > > + BaseMonitoring.__init__(self, monitoring_source, regex)
> > > + LinuxDmesg.__init__(self)
> > > +
> > > + def update_monitoring(self):
> > > + """ Call update_dmesg """
> > > + self.update_dmesg()
> > > diff --git a/framework/options.py b/framework/options.py
> > > index cf49520..f6fb141 100644
> > > --- a/framework/options.py
> > > +++ b/framework/options.py
> > > @@ -178,6 +178,7 @@ class _Options(object): # pylint: disable=too-many-instance-attributes
> > > exclude_filter -- list of compiled regex which exclude tests that match
> > > valgrind -- True if valgrind is to be used
> > > dmesg -- True if dmesg checking is desired. This forces concurrency off
> > > + monitored -- True if monitoring is desired. This forces concurrency off
> > > env -- environment variables set for each test before run
> > >
> > > """
> > > @@ -193,6 +194,7 @@ class _Options(object): # pylint: disable=too-many-instance-attributes
> > > self.valgrind = False
> > > self.dmesg = False
> > > self.sync = False
> > > + self.monitored = False
> > >
> > > # env is used to set some base environment variables that are not going
> > > # to change across runs, without sending them to os.environ which is
> > > diff --git a/framework/profile.py b/framework/profile.py
> > > index fc38c56..d385db6 100644
> > > --- a/framework/profile.py
> > > +++ b/framework/profile.py
> > > @@ -35,6 +35,7 @@ import multiprocessing.dummy
> > > import importlib
> > > import contextlib
> > > import itertools
> > > +from functools import partial
> > >
> > > import six
> > >
> > > @@ -42,6 +43,7 @@ from framework import grouptools, exceptions, options
> > > from framework.dmesg import get_dmesg
> > > from framework.log import LogManager
> > > from framework.test.base import Test
> > > +from framework.monitoring import Monitoring
> > >
> > > __all__ = [
> > > 'TestProfile',
> > > @@ -162,6 +164,8 @@ class TestProfile(object):
> > > self._dmesg = None
> > > self.dmesg = False
> > > self.results_dir = None
> > > + self._monitoring = None
> > > + self.monitoring = False
> > >
> > > @property
> > > def dmesg(self):
> > > @@ -179,6 +183,22 @@ class TestProfile(object):
> > > """
> > > self._dmesg = get_dmesg(not_dummy)
> > >
> > > + @property
> > > + def monitoring(self):
> > > + """ Return monitoring """
> > > + return self._monitoring
> > > +
> > > + @monitoring.setter
> > > + def monitoring(self, monitored):
> > > + """ Set monitoring
> > > +
> > > + Arguments:
> > > + monitored -- if Truthy Monioring will enable monitoring according the
> >
> > spelling: Monitoring
> >
> > > + defined rules
> > > +
> > > + """
> > > + self._monitoring = Monitoring(monitored)
> > > +
> > > def _prepare_test_list(self):
> > > """ Prepare tests for running
> > >
> > > @@ -257,16 +277,19 @@ class TestProfile(object):
> > > self._prepare_test_list()
> > > log = LogManager(logger, len(self.test_list))
> > >
> > > - def test(pair):
> > > + def test(pair, this_pool=None):
> > > """Function to call test.execute from map"""
> > > name, test = pair
> > > with backend.write_test(name) as w:
> > > - test.execute(name, log.get(), self.dmesg)
> > > + test.execute(name, log.get(), self.dmesg, self.monitoring)
> > > w(test.result)
> > > + # in case current dmesg is supporting this property, check it
> > > + if self._monitoring.abort_needed:
> > > + this_pool.terminate()
> >
> > This feels a little hacky to me. I think it would be better to replace
> > pool.imap with pool.apply_async so we can raise exceptions out of the
> > worker threads back into the main thread. Then we can simply catch the
> > PiglitAbort exception, close the pool, and reraise, without having to
> > pass the pool into the test function. I think I have a patch to do that
> > on the list somewhere.
> >
>
> I patched and executed Piglit with the patch from
> https://lists.freedesktop.org/archives/piglit/2015-December/018372.html,
> but it seems that the execution of concurrent tests is a little slower even
> without the monitoring patch. Also, when Piglit raises the abort exception and
> terminates the pool, it starts to run the next test in the pool before aborting.
> Maybe I'm doing something wrong.
hmmm, I don't think so. I think it's a race to terminate the pool before
the next test starts. I guess that means this won't work and we do need
to pass the pool in to avoid starting the next test. I don't really like
it, but I also don't see another way to handle it.
>
> diff --git a/framework/profile.py b/framework/profile.py
> index 4831c16..902db82 100644
> --- a/framework/profile.py
> +++ b/framework/profile.py
> @@ -41,6 +41,7 @@ import six
> from framework import grouptools, exceptions, options
> from framework.dmesg import get_dmesg
> from framework.log import LogManager
> +from framework.monitoring import Monitoring
> from framework.test.base import Test
>
> __all__ = [
> @@ -162,6 +163,8 @@ class TestProfile(object):
> self._dmesg = None
> self.dmesg = False
> self.results_dir = None
> + self._monitoring = None
> + self.monitoring = False
>
> @property
> def dmesg(self):
> @@ -179,6 +182,22 @@ class TestProfile(object):
> """
> self._dmesg = get_dmesg(not_dummy)
>
> + @property
> + def monitoring(self):
> + """ Return monitoring """
> + return self._monitoring
> +
> + @monitoring.setter
> + def monitoring(self, monitored):
> + """ Set monitoring
> +
> + Arguments:
> + monitored -- if Truthy Monioring will enable monitoring according the
> + defined rules
> +
> + """
> + self._monitoring = Monitoring(monitored)
> +
> def _prepare_test_list(self):
> """ Prepare tests for running
>
> @@ -258,7 +277,7 @@ class TestProfile(object):
> def test(name, test):
> """Function to call test.execute from map"""
> with backend.write_test(name) as w:
> - test.execute(name, log.get(), self.dmesg)
> + test.execute(name, log.get(), self.dmesg, self.monitoring)
> w(test.result)
>
> def run_threads(pool, testlist):
> @@ -278,6 +297,9 @@ class TestProfile(object):
>
> for r in results:
> r.get()
> + if self._monitoring.abort_needed:
> + pool.terminate()
> + raise exceptions.PiglitAbort(self._monitoring._abort_error)
>
> pool.join()
>
> > >
> > > def run_threads(pool, testlist):
> > > """ Open a pool, close it, and join it """
> > > - pool.imap(test, testlist, chunksize)
> > > + pool.imap(partial(test, this_pool=pool), testlist, chunksize)
> >
> > Rather than using partial here I would use lambda. The distinction of
> > when to use one vs the other is kinda vague, but I generally think of it
> > as partial is for objects that are going to be assigned, lambda is for
> > passing into functions.
> >
> > lambda pair: test(pair, this_pool=pool)
> >
> >
> > > pool.close()
> > > pool.join()
> > >
> > > @@ -293,6 +316,9 @@ class TestProfile(object):
> > >
> > > self._post_run_hook()
> > >
> > > + if self._monitoring.abort_needed:
> > > + raise exceptions.PiglitAbortException(self._monitoring._abort_error)
> > > +
> > > def filter_tests(self, function):
> > > """Filter out tests that return false from the supplied function
> > >
> > > diff --git a/framework/programs/run.py b/framework/programs/run.py
> > > index 49f7b11..2920a7f 100644
> > > --- a/framework/programs/run.py
> > > +++ b/framework/programs/run.py
> > > @@ -136,6 +136,13 @@ def _run_parser(input_):
> > > action="store_true",
> > > help="Capture a difference in dmesg before and "
> > > "after each test. Implies -1/--no-concurrency")
> > > + parser.add_argument("--abort-on-monitored-error",
> > > + action="store_true",
> > > + dest="monitored",
> >
> > Please indent like the other help messages.
> >
> > > + help="Enable monitoring according the rules defined "
> > > + "in piglit.conf, and stop the execution when a"
> > > + "monitored error is detected. Exit code 3."
> > > + " Implies -1/--no-concurrency")
> > > parser.add_argument("-s", "--sync",
> > > action="store_true",
> > > help="Sync results to disk after every test")
> > > @@ -219,7 +226,7 @@ def run(input_):
> > >
> > > # If dmesg is requested we must have serial run, this is becasue dmesg
> > > # isn't reliable with threaded run
> > > - if args.dmesg:
> > > + if args.dmesg or args.monitored:
> > > args.concurrency = "none"
> > >
> > > # build up the include filter based on test_list
> > > @@ -236,6 +243,7 @@ def run(input_):
> > > options.OPTIONS.valgrind = args.valgrind
> > > options.OPTIONS.dmesg = args.dmesg
> > > options.OPTIONS.sync = args.sync
> > > + options.OPTIONS.monitored = args.monitored
> > >
> > > # Set the platform to pass to waffle
> > > options.OPTIONS.env['PIGLIT_PLATFORM'] = args.platform
> > > @@ -266,6 +274,10 @@ def run(input_):
> > > # Set the dmesg type
> > > if args.dmesg:
> > > profile.dmesg = args.dmesg
> > > +
> > > + if args.monitored:
> > > + profile.monitoring = args.monitored
> > > +
> > > profile.run(args.log_level, backend)
> > >
> > > results.time_elapsed.end = time.time()
> > > @@ -302,6 +314,7 @@ def resume(input_):
> > > options.OPTIONS.valgrind = results.options['valgrind']
> > > options.OPTIONS.dmesg = results.options['dmesg']
> > > options.OPTIONS.sync = results.options['sync']
> > > + options.OPTIONS.monitored = results.options['monitored']
> > >
> > > core.get_config(args.config_file)
> > >
> > > @@ -327,6 +340,9 @@ def resume(input_):
> > > if options.OPTIONS.dmesg:
> > > profile.dmesg = options.OPTIONS.dmesg
> > >
> > > + if options.OPTIONS.monitored:
> > > + profile.monitoring = options.OPTIONS.monitored
> > > +
> > > # This is resumed, don't bother with time since it wont be accurate anyway
> > > profile.run(results.options['log_level'], backend)
> > >
> > > diff --git a/framework/results.py b/framework/results.py
> > > index 3bc5363..f677f1e 100644
> > > --- a/framework/results.py
> > > +++ b/framework/results.py
> > > @@ -148,7 +148,7 @@ class TestResult(object):
> > > """An object represting the result of a single test."""
> > > __slots__ = ['returncode', '_err', '_out', 'time', 'command', 'traceback',
> > > 'environment', 'subtests', 'dmesg', '__result', 'images',
> > > - 'exception', 'pid']
> > > + 'exception', 'pid', 'abort_error_source', 'abort_error']
> > > err = StringDescriptor('_err')
> > > out = StringDescriptor('_out')
> > >
> > > @@ -163,6 +163,8 @@ class TestResult(object):
> > > self.traceback = None
> > > self.exception = None
> > > self.pid = None
> >
> > str will not work here, because of python 2. Either use '' (which will
> > work because of __future__.unicode_literals), or six.text_type().
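> >
> > i.e. (unicode_literals makes '' a text type on both versions):
> >
> >     self.abort_error_source = ''
> >     self.abort_error = ''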
> >
> > > + self.abort_error_source = str()
> > > + self.abort_error = str()
> > > if result:
> > > self.result = result
> > > else:
> > > @@ -203,6 +205,8 @@ class TestResult(object):
> > > 'traceback': self.traceback,
> > > 'dmesg': self.dmesg,
> > > 'pid': self.pid,
> > > + 'abort_error_source': self.abort_error_source,
> > > + 'abort_error': self.abort_error,
> > > }
> > > return obj
> > >
> > > @@ -222,7 +226,8 @@ class TestResult(object):
> > > inst = cls()
> > >
> > > for each in ['returncode', 'command', 'exception', 'environment',
> > > - 'time', 'traceback', 'result', 'dmesg', 'pid']:
> > > + 'time', 'traceback', 'result', 'dmesg', 'pid',
> > > + 'abort_error_source', 'abort_error']:
> > > if each in dict_:
> > > setattr(inst, each, dict_[each])
> > >
> > > diff --git a/framework/test/base.py b/framework/test/base.py
> > > index 0fe0f74..07cad12 100644
> > > --- a/framework/test/base.py
> > > +++ b/framework/test/base.py
> > > @@ -183,7 +183,7 @@ class Test(object):
> > > assert isinstance(timeout, int)
> > > self.timeout = timeout
> > >
> > > - def execute(self, path, log, dmesg):
> > > + def execute(self, path, log, dmesg, monitoring):
> > > """ Run a test
> > >
> > > Run a test, but with features. This times the test, uses dmesg checking
> > > @@ -193,6 +193,7 @@ class Test(object):
> > > path -- the name of the test
> > > log -- a log.Log instance
> > > dmesg -- a dmesg.BaseDmesg derived class
> > > + monitoring -- a monitoring.Monitoring instance
> > >
> > > """
> > > log.start(path)
> > > @@ -201,9 +202,11 @@ class Test(object):
> > > try:
> > > self.result.time.start = time.time()
> > > dmesg.update_dmesg()
> > > + monitoring.update_monitoring()
> > > self.run()
> > > self.result.time.end = time.time()
> > > self.result = dmesg.update_result(self.result)
> > > + self.result = monitoring.update_result(self.result)
> > > # This is a rare case where a bare exception is okay, since we're
> > > # using it to log exceptions
> > > except:
> > > diff --git a/piglit.conf.example b/piglit.conf.example
> > > index 0410a17..a1e6571 100644
> > > --- a/piglit.conf.example
> > > +++ b/piglit.conf.example
> > > @@ -138,3 +138,27 @@ run_test=./%(test_name)s
> > > [expected-crashes]
> > > ; Like expected-failures, but specifies that a test is expected to
> > > ; crash.
> > > +
> > > +[monitored-errors]
> > > +; Set the monitoring rules for checking when the system need to be rebooted.
> > > +; Piglit must be launched with --abort-on-monitored-error
> > > +;
> > > +; Each monitoring rule must define the type of monitoring (dmesg, file or device).
> > > +; Depending on the type, the parameter 'parameters' is a filename or a list of options.
> > > +; The regex is the pattern that causes Piglit aborting when it's found. Examples :
> > > +;
> > > +;i915_error_state={
> > > +; 'type':'file',
> >
> > Please format these like python code, with a space after ':', and please
> > use a comma on the last line.
> >
> > Regexes should also be raw strings (Use an "r" prefix directly before
> > the opening ' or ")
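> >
> > e.g. the first example reformatted that way would look roughly like:
> >
> >     ;i915_error_state={
> >     ;    'type': 'file',
> >     ;    'parameters': '/sys/kernel/debug/dri/0/i915_error_state',
> >     ;    'regex': r'^((?!no error state collected).)*$',
> >     ;    }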
> >
> > > +; 'parameters':'/sys/kernel/debug/dri/0/i915_error_state',
> > > +; 'regex':'^((?!no error state collected).)*$'
> > > +; }
> > > +;kmsg={
> > > +; 'type':'device',
> > > +; 'parameters':'/dev/kmsg',
> > > +; 'regex':'\*ERROR\* ring create req|\*ERROR\* Failed to reset chip'
> > > +; }
> > > +;dmesg_error={
> > > +; 'type':'dmesg',
> > > +; 'parameters':'dmesg --level emerg,alert,crit,err,warn,notice',
> >
> > I don't think there's any reason to use a ''' here, a single ' or "
> > should be fine.
> >
> > > +; 'regex':'''\*ERROR\* ring create req|\*ERROR\* Failed to reset chip|BUG:|Oops:|turning off the locking correctness validator'''
> > > +; }
> > > --
> > > 2.7.0
> >
> > Beyond the specific comments I left in the code, you've changed some
> > function signatures and broken the unit tests; please update the tests.
> >
> > While I'm not telling you you need them for my r-b, I think this is a
> > complex enough chunk of code that adding unit tests wouldn't be bad at
> > all.
> >
> > Finally, this patch introduces a lot of overhead into the hot path. I
> > note a 30 second increase in runtime for ~13000 tests. Our CI runs tests
> > numbering into the millions. It doesn't take very long for this to
> > become 10 minutes extra runtime at those numbers, (and I'm desperately
> > trying to get piglit runtime down right now). I'd really like to see a
> > mode that no-ops for those who don't want monitoring (I don't want it
> > normally), to see if we can negate some of the runtime overhead.
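> >
> > For instance, something modeled on the dummy dmesg object could work; a
> > rough sketch (the name is made up):
> >
> >     class DummyMonitoring(object):
> >         """No-op stand-in used when monitoring is disabled."""
> >         abort_needed = False
> >
> >         def update_monitoring(self):
> >             pass
> >
> >         def update_result(self, result):
> >             return result
> >
> > so the per-test hot path does nothing when --abort-on-monitored-error
> > isn't passed.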
> >
> > Dylan
>
> Olivier