[Intel-gfx] [PATCH] drm/i915: Evict everything if we detect we are buffer thrashing

Wed Dec 2 10:41:06 CET 2009

On Tue, 01 Dec 2009 16:12:30 -0800, Ian Romanick <idr at freedesktop.org> wrote:
> Chris Wilson wrote:
> > Scanning the inactive list for a single buffer to evict, instead of
> > just popping the first elements until we have sufficient room, may
> > give measurably higher throughput, but opens the possibility of
> > thrashing between two large objects. If we spot that this is happening,
> > then simply clear the entire aperture and start afresh. The alternative
> > is the page-fault-of-doom!
> 
> I'm a little bit dubious of this approach.  Has this been tested on any
> applications that, for example, use a lot of large textures?  I'm
> concerned that this will cause performance regressions on such apps.

Evicting everything is a bit callous, granted. The problem is that our
aperture becomes fragmented and we do not clear the inactive and flushing
lists. If we only clear the inactive list then we are never guaranteed to
free enough contiguous aperture space to fit the two objects [the only
reason we can be sure that there is enough space in the entire aperture to
map the two objects simultaneously is that user-space makes that promise]
and so we will still be subject to a relatively easy-to-trigger
page-fault-of-doom scenario. [Just open a few large, but not huge, images
in firefox.] So we need some way to trigger a flush when the CPU is copying
data between two large objects mapped into the GTT but the inactive
aperture space only contains enough room for one.
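
To make the fragmentation point concrete, here is a toy model in C. It is
only a sketch: the 16-page aperture, the toy_* names and the owner-id array
are all invented for illustration and have nothing to do with the real
i915 data structures. It shows that evicting the inactive objects can free
plenty of pages in total without opening a hole large enough for one
object, whereas evicting everything always leaves a single hole.

```c
#include <assert.h>
#include <stddef.h>

#define APERTURE_PAGES 16
#define FREE_PAGE 0	/* owner id 0 marks an unbound page (assumption) */

/* Toy aperture: each slot records the id of the object bound there. */
struct toy_aperture {
	int owner[APERTURE_PAGES];
};

/* Total number of unbound pages, contiguous or not. */
static size_t toy_total_free(const struct toy_aperture *ap)
{
	size_t i, n = 0;

	for (i = 0; i < APERTURE_PAGES; i++)
		if (ap->owner[i] == FREE_PAGE)
			n++;
	return n;
}

/* Length of the longest run of unbound pages. */
static size_t toy_largest_hole(const struct toy_aperture *ap)
{
	size_t i, run = 0, best = 0;

	for (i = 0; i < APERTURE_PAGES; i++) {
		run = (ap->owner[i] == FREE_PAGE) ? run + 1 : 0;
		if (run > best)
			best = run;
	}
	return best;
}

/* Unbind every page owned by one object. */
static void toy_evict(struct toy_aperture *ap, int id)
{
	size_t i;

	assert(id != FREE_PAGE);
	for (i = 0; i < APERTURE_PAGES; i++)
		if (ap->owner[i] == id)
			ap->owner[i] = FREE_PAGE;
}

/* Unbind everything, leaving one hole spanning the whole aperture. */
static void toy_evict_everything(struct toy_aperture *ap)
{
	size_t i;

	for (i = 0; i < APERTURE_PAGES; i++)
		ap->owner[i] = FREE_PAGE;
}
```

With four 4-page objects bound back to back and only the first and third
inactive, evicting those two frees 8 pages but the largest hole is still 4,
so a 6-page object cannot be bound until the rest is cleared as well.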

I agree that the critical part of this is evicting everything only if we
have to, and this looked like the simplest heuristic to use.
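
For the sake of discussion, one possible shape for such a heuristic is
sketched below. This is not the code from the patch: the toy_thrash_state
structure, the toy_note_fault() helper and the threshold of 2 are all
assumptions made up for this example. The idea is simply to notice when
successive faults keep rebinding the object that the previous fault had
just evicted, and to ask the caller to clear the aperture once that has
happened a couple of times in a row.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping; not taken from the patch. */
struct toy_thrash_state {
	int last_evicted;	/* object evicted by the previous fault, 0 if none */
	int ping_pong;		/* consecutive faults on the object just evicted */
};

#define TOY_THRASH_LIMIT 2	/* assumed threshold, chosen arbitrarily */

/*
 * Record a fault on object `faulting` that had to evict object `victim`
 * (0 if nothing was evicted). Returns true when the caller should give up
 * on single-buffer eviction and evict everything instead.
 */
static bool toy_note_fault(struct toy_thrash_state *st, int faulting, int victim)
{
	assert(faulting != 0);

	/* Ping-pong: we are rebinding what we just threw out, and evicting again. */
	if (victim != 0 && faulting == st->last_evicted)
		st->ping_pong++;
	else
		st->ping_pong = 0;

	st->last_evicted = victim;

	if (st->ping_pong >= TOY_THRASH_LIMIT) {
		/* The caller clears the aperture, so start counting afresh. */
		st->ping_pong = 0;
		st->last_evicted = 0;
		return true;
	}
	return false;
}
```

A fault that evicts nothing, or that touches an unrelated object, resets
the counter, so in this sketch only a sustained back-and-forth between the
same buffers triggers the full eviction.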

An alternative that I was contemplating was not promoting a CPU domain to
the GTT inside the fault handler itself. The normal, uncontested map_gtt()
path would move the object into the GTT domain prior to the initial fault,
so finding the buffer object inside the CPU domain during the fault should
only happen when we are thrashing. Swizzling makes this a nightmare
though. Another extremely complicated approach would be to only bind a
minimal segment of the buffer during the fault.

Ian, which performance tests spring to mind?
-ickle

-- 
Chris Wilson, Intel Open Source Technology Centre