TTM's role in score-based eviction

Thu Dec 5 08:45:03 PST 2013

On Thu, Dec 05, 2013 at 05:22:54PM +0100, Maarten Lankhorst wrote:
> op 05-12-13 16:49, Jerome Glisse schreef:
> > On Thu, Dec 05, 2013 at 11:26:46AM +0100, Thomas Hellstrom wrote:
> >> Hi!
> >>
> >> On 12/05/2013 10:36 AM, Lauri Kasanen wrote:
> >>> Hi list, Thomas,
> >>>
> >>> I will be investigating the use of a hotness score for each bo, to
> >>> replace the ping-pong causing LRU eviction in radeon*.
> >>>
> >>> The goal is to put all bos that fit in VRAM there, in order of hotness;
> >>> a new bo should only be placed there if its hotness score is greater
> >>> than the lowest VRAM bo's. Then the lowest-hotness-bos in
> >>> VRAM should be evicted until the new bo fits. This should result in a
> >>> more stable set with less ping-pong.
> >>>
> >>> Jerome advised that the bo placement should be done entirely outside
> >>> TTM. As I'm not (yet) too familiar with that side of the kernel, what is
> >>> the opinion of TTM folks?
> >> There are a couple of things to be considered:
> >> 1) You need to decide where a bo to be validated should be placed.
> >> The driver can give a list of possible placements to TTM and let
> >> TTM decide, trying each placement in turn. A driver that thinks this
> >> isn't sufficient can come up with its own strategy and give only a
> >> single placement to TTM. If TTM can't satisfy that, it will give you
> >> an error back, and the driver will need to validate with an
> >> alternative placement. I think Radeon already does this? vmwgfx does
> >> it to some extent.
> >>
> >> 2) As you say, TTM is evicting strictly on an lru basis, and is
> >> maintaining one LRU list per memory type, and also a global swap lru
> >> list for buffers that are backed by system pages (not VRAM). I guess
> >> what you would want to do is to replace the VRAM lru list with a
> >> priority queue where bos are continuously sorted based on hotness.
> >> As long as you obey the locking rules:
> >> *) Locking order is bo::reserve -> lru-lock
> >> *) When walking the queue with the lru-lock held, you must therefore
> >> tryreserve if you want to reserve an object on the queue
> >> *) bo:s need to be removed from the queue as soon as they are reserved
> >> *) Don't remove a bo from the queue unless it is reserved
> >> Nothing stops you from doing this in the driver, but OTOH if this
> >> ends up being useful for other drivers I'd prefer we put it into
> >> TTM.
> > It will be useful to others, the point i am making is that others might
> > not use ttm either and there is nothing about bo placement that needs
> > to be ttm specific.
> >
> > Avoiding bo eviction from the lru list is just a matter of the driver
> > never overcommitting bos on a pool of memory and doing eviction by
> > itself, i.e. deciding on a new placement for a bo and moving that bo
> > before moving in another bo, which can be done outside ttm.
> >
> > The only thing that will need modification in ttm is the work done to
> > control memory fragmentation, but this should not be enforced on
> > all ttm users and should be a runtime decision. GPUs with a virtual
> > address space can scatter a bo through vram by using vram pages, making
> > memory fragmentation pretty much a non-issue (some GPUs still need
> > contiguous memory for scanout buffers or other specific buffers).
> >
> You're correct it COULD be done like that, but that's a nasty workaround.
> Simply assign a priority to each buffer, then modify ttm_bo_add_to_lru,
> ttm_bo_swapout, ttm_mem_evict_first and be done with it.
> 
> Memory management is exactly the kind of thing that should be done in TTM,
> so why have something 'generic' for something that's little more than a renamed priority queue?

The end score and the use of the score for placement decisions can be done
in ttm, but the whole score computation and the heuristics related to it
should not.
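
As a rough illustration of that split, here is a minimal user-space sketch,
not kernel code. Everything in it is hypothetical: struct sketch_bo stands in
for a real buffer object, driver_compute_score() is a made-up heuristic that
would live in the driver, and place_in_vram() is the generic part that only
consumes the final score, following the rule from the original proposal (only
evict bos whose score is lower than the new bo's, lowest first). Locking is
left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not struct ttm_buffer_object. */
struct sketch_bo {
	size_t size;
	unsigned int score;	/* final hotness score, filled in by the driver */
	bool in_vram;
};

/* Driver side: made-up heuristic, the generic code never looks inside it. */
static unsigned int driver_compute_score(unsigned int reads, unsigned int writes)
{
	return reads + 2 * writes;
}

/* Generic side: return the resident bo with the lowest score, or NULL. */
static struct sketch_bo *lowest_resident(struct sketch_bo *bos, size_t n)
{
	struct sketch_bo *lowest = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!bos[i].in_vram)
			continue;
		if (!lowest || bos[i].score < lowest->score)
			lowest = &bos[i];
	}
	return lowest;
}

/*
 * Generic side: place new_bo in VRAM only if enough room can be made by
 * evicting resident bos with a strictly lower score. Nothing is evicted
 * when the placement cannot succeed.
 */
static bool place_in_vram(struct sketch_bo *bos, size_t n, size_t vram_size,
			  struct sketch_bo *new_bo)
{
	size_t used = 0, reclaimable = 0;

	for (size_t i = 0; i < n; i++) {
		if (!bos[i].in_vram)
			continue;
		used += bos[i].size;
		if (bos[i].score < new_bo->score)
			reclaimable += bos[i].size;
	}

	/* Assumes the pool is never overcommitted. */
	if (used > vram_size || new_bo->size > vram_size)
		return false;
	if (vram_size - used + reclaimable < new_bo->size)
		return false;

	/* Evict lowest-score bos first until the new bo fits. */
	while (vram_size - used < new_bo->size) {
		struct sketch_bo *victim = lowest_resident(bos, n);

		victim->in_vram = false;
		used -= victim->size;
	}

	new_bo->in_vram = true;
	return true;
}
```

A real implementation would keep the resident bos in a sorted structure
instead of scanning an array, and would have to follow the reserve/lru-lock
rules Thomas listed above.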

Cheers,
Jerome