[Intel-gfx] xf86-video-intel memory leakage
chris at chris-wilson.co.uk
Thu Feb 12 02:23:35 PST 2009
On Thu, 2009-02-12 at 10:26 +0100, Stefano Avallone wrote:
> On Monday 09 February 2009 20:28:14 Johannes Engel wrote:
> > Jesse Barnes wrote:
> > > Interesting, thanks for trying to narrow it down. I don't see anything
> > > on re-review that would cause huge increases in the amount of memory
> > > used, though the additional alignment we apply in that patch will
> > > increase things somewhat, so might make the problem happen faster. Are
> > > you using UXA or EXA?
> > You are probably right here, Jesse: Letting Xorg run with UXA on my
> > GM945 turns out to show a similar problem after a couple of hours or
> > similar.
> > sudo lsof | grep "drm mm object" | wc -l
> > shows the incredible number of 2407...
Since cairo-perf will cause a "leak" of a couple of GiB in a few seconds
on my i915, I was able to track down the cause pretty quickly. It turns
out not to be limited to UXA at all; UXA simply exercises the bufmgr much
more than EXA does.
The issue appears to be the bufmgr cache handling, which seems to be
unbounded. Its use was introduced with:
Author: Carl Worth <cworth at cworth.org>
Date: Fri Jul 25 15:56:35 2008 -0700
Add call to intel_bufmgr_gem_enable_reuse
So you can try reverting that commit to confirm whether that clears the
issue for you.
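
For illustration only (this is not code from libdrm or from this thread), here is a minimal sketch of what bounding a buffer reuse cache could look like. The names `fake_bo`, `bo_cache`, `bo_alloc` and `bo_release`, and the `max_entries` limit, are all hypothetical; plain `malloc`/`free` stand in for real buffer object creation and destruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; not the libdrm structure. */
struct fake_bo {
    size_t size;
    struct fake_bo *next;
};

/* A single free list with a hard cap on the number of cached entries. */
struct bo_cache {
    struct fake_bo *head;
    int count;
    int max_entries; /* illustrative bound; the right limit is an open question */
};

static void bo_cache_init(struct bo_cache *c, int max_entries)
{
    assert(max_entries >= 0);
    c->head = NULL;
    c->count = 0;
    c->max_entries = max_entries;
}

/* Reuse the first cached buffer that is large enough, else allocate a new one. */
static struct fake_bo *bo_alloc(struct bo_cache *c, size_t size)
{
    struct fake_bo **p;
    struct fake_bo *bo;

    for (p = &c->head; *p != NULL; p = &(*p)->next) {
        if ((*p)->size >= size) {
            bo = *p;
            *p = bo->next; /* unlink from the cache */
            bo->next = NULL;
            c->count--;
            return bo;
        }
    }

    bo = malloc(sizeof(*bo));
    if (bo != NULL) {
        bo->size = size;
        bo->next = NULL;
    }
    return bo;
}

/* Keep the buffer for reuse only while the cache is below its cap. */
static void bo_release(struct bo_cache *c, struct fake_bo *bo)
{
    if (c->count >= c->max_entries) {
        free(bo); /* cache is full: really free instead of hoarding */
        return;
    }
    bo->next = c->head;
    c->head = bo;
    c->count++;
}
```

With a cap of two, releasing three buffers leaves two in the cache and frees the third; without the check in `bo_release` the list would grow for as long as buffers keep being released, which is the behaviour described above.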
Longer term, I think the buffer cache should be moved into the kernel,
as when using client-side rendering we may end up with a bo cache per
application. Moving it to the kernel should allow the cache to be shared
and for it to be reaped in low-memory conditions. AIUI, we need to cache
bos in order to reduce the cost associated with creating a new shmem
object, mapping the page lists, and to minimise clflush. The easiest approach
would seem to be to add a dev_priv->cache_list and a new create ioctl
that took interested domains and returned the current ones for the new
object. (One of the improvements I found in cache usage was in relaxing
the busy condition for buffers which I knew would only be used for GPU
writes and so would be serialised by the batch scheduling.) Suggestions?
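
To make the proposal concrete, here is a rough sketch of what the argument block for such a create ioctl and the relaxed busy check might look like. Everything in it is an assumption made for illustration: the `DOMAIN_*` bits are not the real GEM domain values, `create_cached_args` and `cached_obj` are invented names, and "only used for GPU writes" is modelled simply as the caller asking for the render domain alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical domain bits; not the real I915_GEM_DOMAIN_* values. */
#define DOMAIN_CPU    (1u << 0)
#define DOMAIN_RENDER (1u << 1)

/* Hypothetical argument block for the proposed create ioctl. */
struct create_cached_args {
    uint64_t size;               /* in:  requested size in bytes */
    uint32_t interested_domains; /* in:  domains the caller intends to use */
    uint32_t current_domains;    /* out: domains the returned object is in */
    uint32_t handle;             /* out: handle of the returned object */
};

/* Hypothetical entry on a dev_priv->cache_list style list. */
struct cached_obj {
    uint64_t size;
    uint32_t domains;
    bool busy;                   /* still referenced by an unretired batch */
    uint32_t handle;
};

/*
 * An idle object of sufficient size can always be reused. A busy one is
 * reused only when the caller will touch it through the GPU alone, since
 * batch scheduling then serialises the new writes behind the old work.
 */
static bool can_reuse(const struct cached_obj *obj,
                      const struct create_cached_args *args)
{
    if (obj->size < args->size)
        return false;
    if (!obj->busy)
        return true;
    return args->interested_domains == DOMAIN_RENDER;
}

/* Fill in the out fields from a cache entry that passed can_reuse(). */
static void take_from_cache(const struct cached_obj *obj,
                            struct create_cached_args *args)
{
    assert(can_reuse(obj, args));
    args->current_domains = obj->domains;
    args->handle = obj->handle;
}
```

Returning `current_domains` would let the caller skip a flush when the object is already in a domain it asked for, which is where the clflush saving would come from.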