[Intel-gfx] [PATCH 0/9] [RFC] fair-lru eviction

Daniel Vetter daniel at ffwll.ch
Wed May 19 18:57:52 CEST 2010


On Wed, May 19, 2010 at 09:06:52AM +0100, Chris Wilson wrote:
> On Tue, 18 May 2010 23:11:42 +0200, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> > Hi all,
> > 
> > This patch series implements the fair-lru eviction Chris Wilson already
> > posted with a twist. It's essentially the same idea & algorithm.
> > Differences versus his patch:
> > - Doesn't do any allocations while scanning.
> > - Implemented in drm_mm.c
> > 
> > In other words, this should also be usable by ttm. The idea is simple:
> > Scan through the lru, marking objects as evictable until there is a
> > large area of memory free/free-able. Then walk through all the scanned
> > objects in reverse, checking which ones fall into this hole. Finally,
> > evict them.
> > 
> > Comments, ideas highly welcome.
> 
> The next adaptation I did was to clean up evict_something to add objects
> from the inactive, active&&!pinned&&!write, flushing&&!pinned,
> active&&!pinned&&write lists. This reduces the logic in evict_something to
> a single scan over the available objects in LRU order.

Is this really worth it? I've worried about the rescanning in case there's
no suitable hole in the inactive list, too. But we're doing that also in
the current code. And the new code doesn't change the algorithmic
complexity (still O(number_of_inactive_objects)) so we're not gonna hit an
ugly corner case. Furthermore, some printk instrumentation showed that for
a full cairo-perf-traces run on my i855 only three times (over the whole
run, including rescans when new stuff arrived on the inactive list) there
was no suitable hole in the inactive list. So I've stopped worrying.
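
For anyone following along, the scan-then-walk-back idea from the cover
letter can be sketched roughly as below. This is an illustrative Python
model only, not the drm_mm.c code: Obj, find_hole and pick_victims are
hypothetical names, the address space is a plain integer range, and the
hole search is redone from scratch after every scanned object, which the
actual patches do not need to do.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    start: int  # offset of the object in the managed range
    size: int


def find_hole(total, blocked, size):
    """Return the start of the first gap of at least `size` units that is
    not covered by any object in `blocked`, or None if there is none."""
    cursor = 0
    for obj in sorted(blocked, key=lambda o: o.start):
        if obj.start - cursor >= size:
            return cursor
        cursor = max(cursor, obj.start + obj.size)
    return cursor if total - cursor >= size else None


def pick_victims(total, lru, size):
    """`lru` holds the bound objects, least recently used first.

    Scan in LRU order, treating each scanned object as evictable, until a
    hole of `size` units exists. Then walk the scanned objects in reverse
    and keep only those that overlap the hole."""
    scanned = []
    remaining = list(lru)
    hole = find_hole(total, remaining, size)
    while hole is None and remaining:
        scanned.append(remaining.pop(0))  # mark the next LRU object evictable
        hole = find_hole(total, remaining, size)
    if hole is None:
        return None  # request cannot fit even with everything evicted
    victims = [o for o in reversed(scanned)
               if o.start < hole + size and o.start + o.size > hole]
    return hole, victims
```

In this model an object that was scanned but lies outside the chosen hole
is simply left where it is, which is the point of the reverse walk.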

> We still need the move-to-inactive-tail upon access by the CPU, and I
> think it is acceptable to maintain our preference of the GPU over the CPU.
> Recovering memory from the CPU is comparatively cheap.

Argh, I've totally forgotten to put that list_move_tail into gem_fault
(and probably gtt_(write|read)_fast, too).
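
The effect of that missing list_move_tail can be modelled in a few lines.
This is only a toy Python sketch with made-up names (InactiveList, touch),
assuming the head of the list is the next eviction candidate; it says
nothing about how the i915 lists are actually laid out.

```python
from collections import OrderedDict


class InactiveList:
    """Toy LRU list: the first entry is the next eviction candidate."""

    def __init__(self):
        self._lru = OrderedDict()

    def add(self, name):
        # New objects go to the tail (most recently used end).
        self._lru[name] = True
        self._lru.move_to_end(name)

    def touch(self, name):
        # Model of a CPU access: move the object to the tail so that it
        # is evicted later than objects nobody has touched.
        if name in self._lru:
            self._lru.move_to_end(name)

    def eviction_order(self):
        return list(self._lru)
```

Without the touch() step an object the CPU keeps faulting on would stay
at its old position and be evicted ahead of objects that are idle.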

> Comparing 'while :; do yes > /tmp/yes; done & cairo-perf-trace', there is
> no significant delta between the fair LRU and current. I'll rebase my
> evict_something() on top of your drm_mm, and rerun the tests.

I'm still crunching the numbers, but preliminary benchmarks on my i855
show that there are _no_ regressions in cairo-traces (poppler is the only
one to take a hit, of 1%, which is well below its noise floor of 2% ;).
Speedups are comparable to what you've posted on the list for your patches.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


