[Piglit] [PATCH 0/6] Recursion tests v2 and fix OOM tests
Jose Fonseca
jfonseca at vmware.com
Fri Jul 29 13:09:28 PDT 2011
----- Original Message -----
> On 07/28/2011 04:22 AM, Jose Fonseca wrote:
> > ----- Original Message -----
> >> This patch series does a couple things. First, it adds the ability to
> >> limit the amount of memory that a test can use. There are several
> >> test cases (including ones added in this series) that can exhaust
> >> system memory on failing implementations. Setting the rlimit prevents
> >> this unfriendly behavior. The rlimit can be set either via the
> >> command line (-rlimit option) or in the .shader_test file (rlimit in
> >> the requirements section).
> >>
> >> The second thing it does is resubmit my GLSL recursion tests with an
> >> rlimit set in all.tests.
> >>
> >> Finally, it sets an rlimit for the glsl-*-explosion tests and removes
> >> them from the blacklist.
> >
> > Another place to do this is in the piglit python framework, via
> > subprocess's preexec_fn argument.
> >
> > I have code to do that for several OSes in a bunch of scripts I use
> > to automate many GL testsuites with Hudson/Jenkins. I'm working on
> > folding some of these into piglit python framework, but I haven't
> > been able to do this.
> >
> > FWIW, below are the relevant bits:
>
> That's really cool, and I think that will be useful for a bunch of other
> things. Chad had made a similar comment to me about running the tests
> on embedded / small devices. There we'll want to limit the memory usage
> even more. I especially like that it limits the memory usage to some
> fraction of physical memory rather than being hardcoded to 256MB. I
> should probably add something like that to the code that I have.
>
> The reason I made it an option on the test is so that a person can
> reproduce the test run exactly in a debugger. That's much harder to do
> if an important bit of the test environment is controlled by the python
> framework.
>
> > # http://www.velocityreviews.com/forums/t587425-python-system-information.html
> >
> > import re
> > import subprocess
> > import sys
> >
> > _meminfo_re = re.compile(r'^(?P<key>\S*):\s*(?P<value>\d*)\s*kB')
> >
> > def meminfo():
> >     """-> dict of data from meminfo (str:int).
> >     Values are in bytes (converted from the kB that /proc/meminfo reports).
> >     """
> >     result = {}
> >     with open('/proc/meminfo') as f:
> >         for line in f:
> >             match = _meminfo_re.match(line)
> >             if match:
> >                 key, value = match.group('key', 'value')
> >                 result[key] = int(value) * 1024
> >     return result
> >
> > def total_physical_memory():
> >     return meminfo()['MemTotal']
> >
> > def preexec_fn():
> >     # http://stackoverflow.com/questions/1689505/python-ulimit-and-nice-for-subprocess-call-subprocess-popen
> >     import resource
> >
> >     # Generate core files so that we can see back traces
> >     if sys.platform == 'darwin':
> >         lim = resource.RLIM_INFINITY
> >     else:
> >         lim = 128*1024*1024
> >     resource.setrlimit(resource.RLIMIT_CORE, (lim, lim))
> >
> >     # Don't let the test program use more than 3/4 of the
> >     # physical memory, to prevent excessive swapping, or
> >     # termination of the test harness programs via OOM
> >     maxmem = total_physical_memory() * 3 // 4
> >     resource.setrlimit(resource.RLIMIT_AS, (maxmem, maxmem))
> >
> > p = subprocess.Popen(
> >     args,
> >     preexec_fn = preexec_fn,
> > )
> >
> > I also have code for timeouts etc.
>
> Timeouts are tricky. We want to be able to have tests that legitimately
> take a long time, but we want to detect cases where the test isn't going
> to make progress.
>
> You and Ken should talk. He's been working on some piglit hooks to
> determine when a test causes a GPU hang, kernel oops, etc. Ideally we'd
> like to be able to detect these catastrophic cases, kill the test,
> reboot, and continue the run with the next test. Some of Chad's work on
> the JSON serialization will also help.
>
> > IMHO, the python harness is a better place for health monitoring /
> > limit enforcing, first because in certain OSes (such as Windows) it is
> > impossible to do from inside the program reliably; second because
> > it makes it easier to integrate w
> >
> >
> > I also think this should be enabled in all tests, not just these
> > ones.
>
> Part of the problem is that the rlimit used (RLIMIT_AS) applies to all
> mapped memory. I had tried RLIMIT_DATA, but malloc doesn't use sbrk
> these days. It uses mmap of anonymous files. As a result, if the
> driver maps a big chunk of GPU memory, it can easily exceed the set
> limit. I don't want my texture test that uses a 4096x4096 texture to
> fail because it exceeds the artificial rlimit.
>
> This is even worse on DRI1-like drivers that map all of GPU (or AGP)
> memory just for fun. Those architectures would either fail every test
> or wouldn't be able to prevent tests that OOM from disrupting the rest
> of the test run.
Yes, that's tricky.
A possible solution would be to parse /proc/<pid>/maps periodically then.
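Roughly what I have in mind, as an untested sketch and not code from any of my scripts. It assumes Linux and the usual /proc/<pid>/maps line layout (address range, perms, offset, dev, inode, optional path). The function names and the rule for what counts as "anonymous" (private mapping, inode 0, no path or [heap]) are made up for illustration. Note that it counts mapped address space, not resident memory.

```python
def parse_maps(text):
    """Parse the contents of /proc/<pid>/maps into a list of dicts."""
    mappings = []
    for line in text.splitlines():
        # address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split('-'))
        path = fields[5].strip() if len(fields) > 5 else ''
        mappings.append({
            'size': end - start,
            'perms': fields[1],
            'inode': int(fields[4]),
            'path': path,
        })
    return mappings

def anonymous_bytes(mappings):
    """Sum private mappings with no backing file (heap, anonymous mmap).

    Device mappings such as a driver's GPU aperture have a path and a
    non-zero inode, so they are not counted (assumed heuristic).
    """
    total = 0
    for m in mappings:
        if (m['inode'] == 0 and m['path'] in ('', '[heap]')
                and m['perms'].endswith('p')):
            total += m['size']
    return total

def read_maps(pid):
    """Read the raw maps file of a running process (Linux only)."""
    with open('/proc/%d/maps' % pid) as f:
        return f.read()
```

The harness would then poll anonymous_bytes(parse_maps(read_maps(p.pid))) every so often and kill the child once it crosses the chosen threshold.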
Jose