[Piglit] [PATCH] framework: add a generic timeout mechanism

Thomas Wood thomas.wood at intel.com
Fri Sep 26 09:05:41 PDT 2014


On 25 September 2014 22:27, Dylan Baker <baker.dylan.c at gmail.com> wrote:
> Hi Thomas,
>
> I had a chance to run this (both this version and your original
> version), and the performance hit is massive on both GL and CL testing.
> I'm concerned with introducing a mechanism as generic when it made my
> test runs so long that I gave up running them.

I ran the "quick" and "cl" profiles both with and without the patch
and the time taken was about the same. If the timeout is not enabled
(i.e. set to zero) then no additional work is done, so there shouldn't
be any significant impact on the duration. I also tried enabling a 1
second timeout for one of the profiles and the time taken remained
about the same.

>
> We really do want a generic mechanism, but at this point I don't feel
> like this is ready to be generic. I think the best solution to get igt
> running again would be to change Test.__run_comamnd to Test._run_command
> (protected instead of private), which would allow you to modify it in a
> subclass.
>
> Dylan.
>
> On Wednesday, September 17, 2014 03:25:53 PM Thomas Wood wrote:
>> The timeout mechanism within igt.py can no longer be used since
>> get_command_result was renamed and made private in commit 5e9dd69
>> (exectest.py: rename get_command_result to __run_command). Therefore,
>> move the timeout mechanism into exectest.py, allowing all test profiles
>> to use it if needed and also avoiding code duplication between igt.py
>> and exectest.py.
>>
>> v2: Fix some small style issues. (Dylan Baker)
>>     Make timeout a class attribute of Test, rather than an instance
>>     attribute and remove it from the profile class. (Dylan Baker)
>>     Only check the timeout value if the process was abnormally
>>     terminated.
>>
>> Cc: Dylan Baker <baker.dylan.c at gmail.com>
>> Cc: Chad Versace <chad.versace at linux.intel.com>
>> Signed-off-by: Thomas Wood <thomas.wood at intel.com>
>> ---
>>  framework/exectest.py | 78 ++++++++++++++++++++++++++++++++++++++++--
>>  tests/igt.py          | 93 +--------------------------------------------------
>>  2 files changed, 76 insertions(+), 95 deletions(-)
>>
>> diff --git a/framework/exectest.py b/framework/exectest.py
>> index 3ed9b6f..6635f8e 100644
>> --- a/framework/exectest.py
>> +++ b/framework/exectest.py
>> @@ -29,6 +29,9 @@ import shlex
>>  import time
>>  import sys
>>  import traceback
>> +from datetime import datetime
>> +import threading
>> +import signal
>>  import itertools
>>  import abc
>>  try:
>> @@ -51,6 +54,56 @@ else:
>>                                                   '../bin'))
>>
>>
>> +class ProcessTimeout(threading.Thread):
>> +    """ Timeout class for test processes
>> +
>> +    This class is for terminating tests that run for longer than a certain
>> +    amount of time. Create an instance by passing it a timeout in seconds
>> +    and the Popen object that is running your test. Then call the start
>> +    method (inherited from Thread) to start the timer. The Popen object is
>> +    killed if the timeout is reached and it has not completed. Wait for the
>> +    outcome by calling the join() method from the parent.
>> +
>> +    """
>> +
>> +    def __init__(self, timeout, proc):
>> +        threading.Thread.__init__(self)
>> +        self.proc = proc
>> +        self.timeout = timeout
>> +        self.status = 0
>> +
>> +    def run(self):
>> +        start_time = datetime.now()
>> +        delta = 0
>> +
>> +        # poll() returns the returncode attribute, which is either the return
>> +        # code of the child process (which could be zero), or None if the
>> +        # process has not yet terminated.
>> +
>> +        while delta < self.timeout and self.proc.poll() is None:
>> +            time.sleep(1)
>> +            delta = (datetime.now() - start_time).total_seconds()
>> +
>> +        # if the test is not finished after timeout, first try to terminate it
>> +        # and if that fails, send SIGKILL to all processes in the test's
>> +        # process group
>> +
>> +        if self.proc.poll() is None:
>> +            self.status = 1
>> +            self.proc.terminate()
>> +            time.sleep(5)
>> +
>> +        if self.proc.poll() is None:
>> +            self.status = 2
>> +            if hasattr(os, 'killpg'):
>> +                os.killpg(self.proc.pid, signal.SIGKILL)
>> +            self.proc.wait()
>> +
>> +    def join(self):
>> +        threading.Thread.join(self)
>> +        return self.status
>> +
>> +
>>  class Test(object):
>>      """ Abstract base class for Test classes
>>
>> @@ -73,7 +126,8 @@ class Test(object):
>>      OPTS = Options()
>>      __metaclass__ = abc.ABCMeta
>>      __slots__ = ['run_concurrent', 'env', 'result', 'cwd', '_command',
>> -                 '_test_hook_execute_run']
>> +                 '_test_hook_execute_run', '__proc_timeout']
>> +    timeout = 0
>>
>>      def __init__(self, command, run_concurrent=False):
>>          self._command = None
>> @@ -82,6 +136,7 @@ class Test(object):
>>          self.env = {}
>>          self.result = TestResult({'result': 'fail'})
>>          self.cwd = None
>> +        self.__proc_timeout = None
>>
>>          # This is a hook for doing some testing on execute right before
>>          # self.run is called.
>> @@ -183,7 +238,11 @@ class Test(object):
>>          self.interpret_result()
>>
>>          if self.result['returncode'] < 0:
>> -            self.result['result'] = 'crash'
>> +            # check if the process was terminated by the timeout
>> +            if self.timeout > 0 and self.__proc_timeout.join() > 0:
>> +                self.result['result'] = 'timeout'
>> +            else:
>> +                self.result['result'] = 'crash'
>>          elif self.result['returncode'] != 0 and self.result['result'] == 'pass':
>>              self.result['result'] = 'warn'
>>
>> @@ -208,6 +267,10 @@ class Test(object):
>>          """
>>          return False
>>
>> +    def __set_process_group(self):
>> +        if hasattr(os, 'setpgrp'):
>> +            os.setpgrp()
>> +
>>      def __run_command(self):
>>          """ Run the test command and get the result
>>
>> @@ -240,7 +303,16 @@ class Test(object):
>>                                      stderr=subprocess.PIPE,
>>                                      cwd=self.cwd,
>>                                      env=fullenv,
>> -                                    universal_newlines=True)
>> +                                    universal_newlines=True,
>> +                                    preexec_fn=self.__set_process_group)
>> +            # create a ProcessTimeout object to watch out for test hang if the
>> +            # process is still going after the timeout, then it will be killed
>> +            # forcing the communicate function (which is a blocking call) to
>> +            # return
>> +            if self.timeout > 0:
>> +                self.__proc_timeout = ProcessTimeout(self.timeout, proc)
>> +                self.__proc_timeout.start()
>> +
>>              out, err = proc.communicate()
>>              returncode = proc.returncode
>>          except OSError as e:
>> diff --git a/tests/igt.py b/tests/igt.py
>> index 22250ce..13860a7 100644
>> --- a/tests/igt.py
>> +++ b/tests/igt.py
>> @@ -84,53 +84,13 @@ igtEnvironmentOk = checkEnvironment()
>>
>>  profile = TestProfile()
>>
>> -# This class is for timing out tests that hang. Create an instance by passing
>> -# it a timeout in seconds and the Popen object that is running your test. Then
>> -# call the start method (inherited from Thread) to start the timer. The Popen
>> -# object is killed if the timeout is reached and it has not completed. Wait for
>> -# the outcome by calling the join() method from the parent.
>> -
>> -class ProcessTimeout(threading.Thread):
>> -    def __init__(self, timeout, proc):
>> -        threading.Thread.__init__(self)
>> -        self.proc = proc
>> -        self.timeout = timeout
>> -        self.status = 0
>> -
>> -    def run(self):
>> -        start_time = datetime.now()
>> -        delta = 0
>> -        while (delta < self.timeout) and (self.proc.poll() is None):
>> -            time.sleep(1)
>> -            delta = (datetime.now() - start_time).total_seconds()
>> -
>> -        # if the test is not finished after timeout, first try to terminate it
>> -        # and if that fails, send SIGKILL to all processes in the test's
>> -        # process group
>> -
>> -        if self.proc.poll() is None:
>> -            self.status = 1
>> -            self.proc.terminate()
>> -            time.sleep(5)
>> -
>> -        if self.proc.poll() is None:
>> -            self.status = 2
>> -            os.killpg(self.proc.pid, signal.SIGKILL)
>> -            self.proc.wait()
>> -
>> -
>> -    def join(self):
>> -        threading.Thread.join(self)
>> -        return self.status
>> -
>> -
>> -
>>  class IGTTest(Test):
>>      def __init__(self, binary, arguments=None):
>>          if arguments is None:
>>              arguments = []
>>          super(IGTTest, self).__init__(
>>              [path.join(igtTestRoot, binary)] + arguments)
>> +        self.timeout = 600
>>
>>      def interpret_result(self):
>>          if not igtEnvironmentOk:
>> @@ -153,57 +113,6 @@ class IGTTest(Test):
>>
>>          super(IGTTest, self).run()
>>
>> -
>> -    def get_command_result(self):
>> -        out = ""
>> -        err = ""
>> -        returncode = 0
>> -        fullenv = os.environ.copy()
>> -        for key, value in self.env.iteritems():
>> -            fullenv[key] = str(value)
>> -
>> -        try:
>> -            proc = subprocess.Popen(self.command,
>> -                                    stdout=subprocess.PIPE,
>> -                                    stderr=subprocess.PIPE,
>> -                                    env=fullenv,
>> -                                    universal_newlines=True,
>> -                                    preexec_fn=os.setpgrp)
>> -
>> -            # create a ProcessTimeout object to watch out for test hang if the
>> -            # process is still going after the timeout, then it will be killed
>> -            # forcing the communicate function (which is a blocking call) to
>> -            # return
>> -            timeout = ProcessTimeout(600, proc)
>> -            timeout.start()
>> -            out, err = proc.communicate()
>> -
>> -            # check for pass or skip first
>> -            if proc.returncode in (0, 77):
>> -                returncode = proc.returncode
>> -            elif timeout.join() > 0:
>> -                returncode = 78
>> -            else:
>> -                returncode = proc.returncode
>> -
>> -        except OSError as e:
>> -            # Different sets of tests get built under
>> -            # different build configurations.  If
>> -            # a developer chooses to not build a test,
>> -            # Piglit should not report that test as having
>> -            # failed.
>> -            if e.errno == errno.ENOENT:
>> -                out = "PIGLIT: {'result': 'skip'}\n" \
>> -                    + "Test executable not found.\n"
>> -                err = ""
>> -                returncode = None
>> -            else:
>> -                raise e
>> -        self.result['out'] = out.decode('utf-8', 'replace')
>> -        self.result['err'] = err.decode('utf-8', 'replace')
>> -        self.result['returncode'] = returncode
>> -
>> -
>>  def listTests(listname):
>>      with open(path.join(igtTestRoot, listname + '.txt'), 'r') as f:
>>          lines = (line.rstrip() for line in f.readlines())
>> --
>> 1.9.3
>>


More information about the Piglit mailing list