[Piglit] [PATCH v3 4/4] framework/test/base.py: use subprocess32 for timeouts.

Tue Dec 1 14:26:46 PST 2015

On Tue, Dec 01, 2015 at 10:25:59AM +0000, Thomas Wood wrote:
> On 30 November 2015 at 18:34, Dylan Baker <baker.dylan.c at gmail.com> wrote:
> > On Fri, Nov 27, 2015 at 12:04:34PM +0000, Thomas Wood wrote:
> >> On 12 November 2015 at 22:52,  <baker.dylan.c at gmail.com> wrote:
> >> > From: Dylan Baker <baker.dylan.c at gmail.com>
> >> >
> >> > Subprocess32 provides a backport of (ironically) python 3.3's subprocess
> >> > module, which has a timeout parameter. When the timeout runs out then an
> >> > exception is raised, and when that exception is caught we can kill the
> >> > process.
> >> >
> >> > This is fairly similar to the way the current timeout mechanism works,
> >> > except that the current mechanism is not thread safe. Since one of the
> >> > major features of piglit is that it offer's processes isolated
> >> > concurrency of tests, it makes sense to make the effort to provide a
> >> > timeout mechanism that can be used outside of IGT, which is a
> >> > non-concurrent test suite.
> >> >
> >> > This patch look pretty substantial. It's not as big as it looks, since
> >> > it adds some fairly big tests.
> >> >
> >> > v3: - Wait before terminating process
> >> >     - use os.getgrpid instead of os.getsid
> >>
> >> os.getpgid?
> >>
> >>
> >> >     - use subprocess32 for accel profile in tox
> >> >     - Add warning when using the dummy version of TimeoutMixin
> >>
> >> TimeoutExpired?
> >>
> >
> > Both of these fixed locally
> >
> >>
> >> >     - mark more unittests as slow and add timeouts to them
> >> >     - Remove the use of a Mixin, timeouts are an important enough
> >> >       feature every test suite should have the option to use them, and
> >> >       by not using a mixin there is no need for the ugly interface.
> >> >
> >> > Signed-off-by: Dylan Baker <dylanx.c.baker at intel.com>
> >> > ---
> >> >  framework/results.py          |   2 +
> >> >  framework/test/base.py        | 130 ++++++++++++++++--------------------------
> >> >  framework/tests/base_tests.py | 117 ++++++++++++++++++++++++++++++++++++-
> >> >  tox.ini                       |  10 ++--
> >> >  4 files changed, 171 insertions(+), 88 deletions(-)
> >> >
> >> > diff --git a/framework/results.py b/framework/results.py
> >> > index eeffcb7..ee41126 100644
> >> > --- a/framework/results.py
> >> > +++ b/framework/results.py
> >> > @@ -96,6 +96,8 @@ class StringDescriptor(object):  # pylint: disable=too-few-public-methods
> >> >              setattr(instance, self.__name, value.decode('utf-8', 'replace'))
> >> >          elif isinstance(value, unicode):
> >> >              setattr(instance, self.__name, value)
> >> > +        elif value is None:
> >> > +            setattr(instance, self.__name, u'')
> >>
> >> Unrelated change?
> >>
> >
> > I'll have to look to be sure, but I think this actually was required in
> > the timeout case.
> >
> >>
> >> >          else:
> >> >              raise TypeError('{} attribute must be a str or unicode instance, '
> >> >                              'but was {}.'.format(self.__name, type(value)))
> >> > diff --git a/framework/test/base.py b/framework/test/base.py
> >> > index 7c7c410..d1b0c35 100644
> >> > --- a/framework/test/base.py
> >> > +++ b/framework/test/base.py
> >> > @@ -25,16 +25,36 @@
> >> >  from __future__ import print_function, absolute_import
> >> >  import errno
> >> >  import os
> >> > -import subprocess
> >> >  import time
> >> >  import sys
> >> >  import traceback
> >> > -from datetime import datetime
> >> > -import threading
> >> > -import signal
> >> >  import itertools
> >> >  import abc
> >> >  import copy
> >> > +import signal
> >> > +import warnings
> >> > +
> >> > +try:
> >> > +    import subprocess32 as subprocess
> >> > +    _EXTRA_POPEN_ARGS = {'start_new_session': True}
> >> > +except ImportError:
> >> > +    import subprocess
> >> > +
> >> > +    class TimeoutExpired(Exception):
> >> > +        pass
> >> > +
> >> > +    class Popen(subprocess.Popen):
> >> > +        """Sublcass of Popen that accepts and ignores a timeout argument."""
> >> > +        def communicate(self, *args, **kwargs):
> >> > +            if 'timeout' in kwargs:
> >> > +                del kwargs['timeout']
> >> > +            return super(Popen, self).communicate(*args, **kwargs)
> >> > +
> >> > +    subprocess.TimeoutExpired = TimeoutExpired
> >> > +    subprocess.Popen = Popen
> >> > +    _EXTRA_POPEN_ARGS = {}
> >> > +
> >> > +    warnings.warn('Timeouts are not available')
> >> >
> >> >  from framework import exceptions, options
> >> >  from framework.results import TestResult
> >> > @@ -61,56 +81,6 @@ class TestRunError(exceptions.PiglitException):
> >> >          self.status = status
> >> >
> >> >
> >> > -class ProcessTimeout(threading.Thread):
> >> > -    """ Timeout class for test processes
> >> > -
> >> > -    This class is for terminating tests that run for longer than a certain
> >> > -    amount of time. Create an instance by passing it a timeout in seconds
> >> > -    and the Popen object that is running your test. Then call the start
> >> > -    method (inherited from Thread) to start the timer. The Popen object is
> >> > -    killed if the timeout is reached and it has not completed. Wait for the
> >> > -    outcome by calling the join() method from the parent.
> >> > -
> >> > -    """
> >> > -
> >> > -    def __init__(self, timeout, proc):
> >> > -        threading.Thread.__init__(self)
> >> > -        self.proc = proc
> >> > -        self.timeout = timeout
> >> > -        self.status = 0
> >> > -
> >> > -    def run(self):
> >> > -        start_time = datetime.now()
> >> > -        delta = 0
> >> > -
> >> > -        # poll() returns the returncode attribute, which is either the return
> >> > -        # code of the child process (which could be zero), or None if the
> >> > -        # process has not yet terminated.
> >> > -
> >> > -        while delta < self.timeout and self.proc.poll() is None:
> >> > -            time.sleep(1)
> >> > -            delta = (datetime.now() - start_time).total_seconds()
> >> > -
> >> > -        # if the test is not finished after timeout, first try to terminate it
> >> > -        # and if that fails, send SIGKILL to all processes in the test's
> >> > -        # process group
> >> > -
> >> > -        if self.proc.poll() is None:
> >> > -            self.status = 1
> >> > -            self.proc.terminate()
> >> > -            time.sleep(5)
> >> > -
> >> > -        if self.proc.poll() is None:
> >> > -            self.status = 2
> >> > -            if hasattr(os, 'killpg'):
> >> > -                os.killpg(self.proc.pid, signal.SIGKILL)
> >> > -            self.proc.wait()
> >> > -
> >> > -    def join(self):
> >> > -        threading.Thread.join(self)
> >> > -        return self.status
> >> > -
> >> > -
> >> >  def _is_crash_returncode(returncode):
> >> >      """Determine whether the given process return code correspond to a
> >> >      crash.
> >> > @@ -146,10 +116,10 @@ class Test(object):
> >> >      """
> >> >      __metaclass__ = abc.ABCMeta
> >> >      __slots__ = ['run_concurrent', 'env', 'result', 'cwd', '_command',
> >> > -                 '__proc_timeout']
> >> > +                 'timeout']
> >> >      timeout = 0
> >> >
> >> > -    def __init__(self, command, run_concurrent=False):
> >> > +    def __init__(self, command, run_concurrent=False, timeout=None):
> >> >          assert isinstance(command, list), command
> >> >
> >> >          self.run_concurrent = run_concurrent
> >> > @@ -157,7 +127,9 @@ class Test(object):
> >> >          self.env = {}
> >> >          self.result = TestResult()
> >> >          self.cwd = None
> >> > -        self.__proc_timeout = None
> >> > +        if timeout is not None:
> >> > +            assert isinstance(timeout, int)
> >> > +            self.timeout = timeout
> >> >
> >> >      def execute(self, path, log, dmesg):
> >> >          """ Run a test
> >> > @@ -203,11 +175,7 @@ class Test(object):
> >> >          """Convert the raw output of the test into a form piglit understands.
> >> >          """
> >> >          if _is_crash_returncode(self.result.returncode):
> >> > -            # check if the process was terminated by the timeout
> >> > -            if self.timeout > 0 and self.__proc_timeout.join() > 0:
> >> > -                self.result.result = 'timeout'
> >> > -            else:
> >> > -                self.result.result = 'crash'
> >> > +            self.result.result = 'crash'
> >> >          elif self.result.returncode != 0 and self.result.result == 'pass':
> >> >              self.result.result = 'warn'
> >> >
> >> > @@ -253,10 +221,6 @@ class Test(object):
> >> >          """
> >> >          pass
> >> >
> >> > -    def __set_process_group(self):
> >> > -        if hasattr(os, 'setpgrp'):
> >> > -            os.setpgrp()
> >> > -
> >> >      def _run_command(self):
> >> >          """ Run the test command and get the result
> >> >
> >> > @@ -283,29 +247,17 @@ class Test(object):
> >> >                                            self.env.iteritems()):
> >> >              fullenv[key] = str(value)
> >> >
> >> > -        # preexec_fn is not supported on Windows platforms
> >> > -        if sys.platform == 'win32':
> >> > -            preexec_fn = None
> >> > -        else:
> >> > -            preexec_fn = self.__set_process_group
> >> >
> >> >          try:
> >> >              proc = subprocess.Popen(self.command,
> >> > +                                    env=fullenv,
> >> >                                      stdout=subprocess.PIPE,
> >> >                                      stderr=subprocess.PIPE,
> >> >                                      cwd=self.cwd,
> >> > -                                    env=fullenv,
> >>
> >> Unrelated change?
> >>
> >>
> >
> > Yup, rebasing artifact. Fixed locally.
> >
> >> >                                      universal_newlines=True,
> >> > -                                    preexec_fn=preexec_fn)
> >> > -            # create a ProcessTimeout object to watch out for test hang if the
> >> > -            # process is still going after the timeout, then it will be killed
> >> > -            # forcing the communicate function (which is a blocking call) to
> >> > -            # return
> >> > -            if self.timeout > 0:
> >> > -                self.__proc_timeout = ProcessTimeout(self.timeout, proc)
> >> > -                self.__proc_timeout.start()
> >> > -
> >> > -            out, err = proc.communicate()
> >> > +                                    **_EXTRA_POPEN_ARGS)
> >> > +
> >> > +            out, err = proc.communicate(timeout=self.timeout)
> >> >              returncode = proc.returncode
> >> >          except OSError as e:
> >> >              # Different sets of tests get built under different build
> >> > @@ -315,6 +267,20 @@ class Test(object):
> >> >                  raise TestRunError("Test executable not found.\n", 'skip')
> >> >              else:
> >> >                  raise e
> >> > +        except subprocess.TimeoutExpired:
> >> > +            if proc.poll() is None:
> >> > +                time.sleep(3)
> >>
> >> The terminate signal can be sent straight away if a timeout has occurred.
> >
> > fixed locally
> >
> >>
> >>
> >> > +                proc.terminate()
> >> > +
> >> > +            if proc.poll() is None:
> >> > +                time.sleep(3)
> >> > +                os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
> >> > +
> >> > +            self.result.out, self.result.err = proc.communicate()
> >> > +
> >> > +            raise TestRunError(
> >> > +                'Test run time exceeded timeout out {}\n'.format(self.timeout),
> >>
> >> "... timeout out {}\n" should perhaps be something like "... timeout
> >> value ({} seconds)\n"?
> >
> > Your way sounds better. Also changed locally.
> >
> >>
> >>
> >> > +                'timeout')
> >> >
> >> >          # The setter handles the bytes/unicode conversion
> >> >          self.result.out = out
> >> > diff --git a/framework/tests/base_tests.py b/framework/tests/base_tests.py
> >> > index 1b5ed9e..b1df9d4 100644
> >> > --- a/framework/tests/base_tests.py
> >> > +++ b/framework/tests/base_tests.py
> >> > @@ -21,11 +21,19 @@
> >> >  """ Tests for the exectest module """
> >> >
> >> >  from __future__ import print_function, absolute_import
> >> > +import tempfile
> >> > +import textwrap
> >> > +import os
> >> >
> >> >  import mock
> >> >  import nose.tools as nt
> >> >  from nose.plugins.attrib import attr
> >> >
> >> > +try:
> >> > +    import psutil
> >> > +except ImportError:
> >> > +    pass
> >> > +
> >> >  import framework.tests.utils as utils
> >> >  from framework.test.base import (
> >> >      Test, WindowResizeMixin, ValgrindMixin, TestRunError
> >> > @@ -68,8 +76,111 @@ def test_run_return_early():
> >> >
> >> >
> >> >  @attr('slow')
> >> > + at nt.timed(8)
> >> > +def test_timeout_kill_children():
> >> > +    """test.base.Test: kill children if terminate fails
> >> > +
> >> > +    This creates a process that forks multiple times, and then checks that the
> >> > +    children have been killed.
> >> > +
> >> > +    This test could leave processes running if it fails.
> >> > +
> >> > +    """
> >> > +    utils.module_check('subprocess32')
> >> > +    utils.module_check('psutil')
> >> > +
> >> > +    import subprocess32  # pylint: disable=import-error
> >> > +
> >> > +    class PopenProxy(object):
> >> > +        """An object that proxies Popen, and saves the Popen instance as an
> >> > +        attribute.
> >> > +
> >> > +        This is useful for testing the Popen instance.
> >> > +
> >> > +        """
> >> > +        def __init__(self):
> >> > +            self.popen = None
> >> > +
> >> > +        def __call__(self, *args, **kwargs):
> >> > +            self.popen = subprocess32.Popen(*args, **kwargs)
> >> > +
> >> > +            # if commuincate cis called successfully then the proc will be
> >> > +            # reset to None, whic will make the test fail.
> >> > +            self.popen.communicate = mock.Mock(return_value=('out', 'err'))
> >> > +
> >> > +            return self.popen
> >> > +
> >> > +    with tempfile.NamedTemporaryFile() as f:
> >> > +        # Create a file that will be executed as a python script
> >> > +        # Create a process with two subproccesses (not threads) that will run
> >> > +        # for a long time.
> >> > +        f.write(textwrap.dedent("""\
> >> > +            import time
> >> > +            from multiprocessing import Process
> >> > +
> >> > +            def p():
> >> > +                for _ in range(100):
> >> > +                    time.sleep(1)
> >> > +
> >> > +            a = Process(target=p)
> >> > +            b = Process(target=p)
> >> > +            a.start()
> >> > +            b.start()
> >> > +            a.join()
> >> > +            b.join()
> >> > +        """))
> >> > +        f.seek(0)  # we'll need to read the file back
> >> > +
> >> > +        # Create an object that will return a popen object, but also store it
> >> > +        # so we can access it later
> >> > +        proxy = PopenProxy()
> >> > +
> >> > +        test = TimeoutTest(['python', f.name])
> >> > +        test.timeout = 1
> >> > +
> >> > +        # mock out subprocess.Popen with our proxy object
> >> > +        with mock.patch('framework.test.base.subprocess') as mock_subp:
> >> > +            mock_subp.Popen = proxy
> >> > +            mock_subp.TimeoutExpired = subprocess32.TimeoutExpired
> >> > +            test.run()
> >> > +
> >> > +        # Check to see if the Popen has children, even after it should have
> >> > +        # recieved a TimeoutExpired.
> >> > +        proc = psutil.Process(os.getsid(proxy.popen.pid))
> >> > +        children = proc.children(recursive=True)
> >> > +
> >> > +        if children:
> >> > +            # If there are still running children attempt to clean them up,
> >> > +            # starting with the final generation and working back to the first
> >> > +            for child in reversed(children):
> >> > +                child.kill()
> >> > +
> >> > +            raise utils.TestFailure(
> >> > +                'Test process had children when it should not')
> >> > +
> >> > +
> >> > + at attr('slow')
> >> > + at nt.timed(8)
> >> >  def test_timeout():
> >> > -    """test.base.Test.run(): Sets status to 'timeout' when timeout exceeded"""
> >> > +    """test.base.TimeoutMixin: Stops running test after timeout expires
> >>
> >> Perhaps tests.base.TimeoutExpired rather than tests.base.TimeoutMixin
> >> here and in the following tests?
> >>
> >>
> >
> > I would have set them to tests.base.Test like the previous test, do you
> > prefer TimeoutExpired?
> 
> That's fine since it removes any mention of TimeoutMixin, which no
> longer exists.
> 
> 
> >
> >> > +
> >> > +    This is a little bit of extra time here, but without a sleep of 60 seconds
> >> > +    if the test runs 5 seconds it's run too long
> >> > +
> >> > +    """
> >> > +    utils.module_check('subprocess32')
> >> > +    utils.binary_check('sleep', 1)
> >> > +
> >> > +    test = TimeoutTest(['sleep', '60'])
> >> > +    test.timeout = 1
> >> > +    test.run()
> >> > +
> >> > +
> >> > + at attr('slow')
> >> > + at nt.timed(8)
> >> > +def test_timeout_timeout():
> >> > +    """test.base.TimeoutMixin: Sets status to 'timeout' when timeout exceeded"""
> >> > +    utils.module_check('subprocess32')
> >> >      utils.binary_check('sleep', 1)
> >> >
> >> >      test = TimeoutTest(['sleep', '60'])
> >> > @@ -78,9 +189,11 @@ def test_timeout():
> >> >      nt.eq_(test.result.result, 'timeout')
> >> >
> >> >
> >> > + at nt.timed(2)
> >> >  def test_timeout_pass():
> >> > -    """test.base.Test.run(): Doesn't change status when timeout not exceeded
> >> > +    """test.base.TimeoutMixin: Doesn't change status when timeout not exceeded
> >> >      """
> >> > +    utils.module_check('subprocess32')
> >> >      utils.binary_check('true')
> >> >
> >> >      test = TimeoutTest(['true'])
> >> > diff --git a/tox.ini b/tox.ini
> >> > index d1f1f16..3dd1851 100644
> >> > --- a/tox.ini
> >> > +++ b/tox.ini
> >> > @@ -13,20 +13,22 @@ commands = nosetests generated_tests/test_generators.py
> >> >  [testenv:py27-noaccel]
> >> >  passenv=HOME
> >> >  deps =
> >> > -    mako
> >> > -    nose
> >> >      coverage
> >> > +    mako
> >> >      mock
> >> > +    nose
> >> >  commands = nosetests framework/tests --attr="!privileged" --with-cover --cover-package=framework --cover-tests
> >> >
> >> >  [testenv:py27-accel]
> >> >  passenv=HOME
> >> >  deps =
> >> > -    mako
> >> > -    nose
> >> >      coverage
> >> > +    mako
> >> >      mock
> >> > +    nose
> >> > +    psutil
> >> >      simplejson
> >> >      lxml
> >> >      backports.lzma
> >> > +    subprocess32
> >> >  commands = nosetests framework/tests --attr="!privileged" --with-cover --cover-package=framework --cover-tests
> >> > --
> >> > 2.6.2
> >> >
> >> > _______________________________________________
> >> > Piglit mailing list
> >> > Piglit at lists.freedesktop.org
> >> > http://lists.freedesktop.org/mailman/listinfo/piglit
> >
> >
> > I'll look into whether that None case is necessary. Do you want to see a
> > v5?
> 
> I think it will be v4? The patch didn't apply cleanly to master so it
> would be good to have a new version to test.

Okay, I'll have a new version for you today.

> 
> 
> >
> > Dylan
> >
> > _______________________________________________
> > Piglit mailing list
> > Piglit at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/piglit
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/piglit/attachments/20151201/d4178019/attachment.sig>