Implement svm without BO concept in xe driver

Wed Aug 16 22:51:57 UTC 2023

On Thu, 17 Aug 2023 at 08:15, Felix Kuehling <felix.kuehling at amd.com> wrote:
>
> On 2023-08-16 13:30, Zeng, Oak wrote:
> > I spoke with Thomas. We discussed two approaches:
> >
> > 1) make ttm_resource a central place for vram management functions such as eviction, cgroup memory accounting. Both the BO-based driver and BO-less SVM codes call into ttm_resource_alloc/free functions for vram allocation/free.
> >      *This way the BO driver and the SVM driver share the eviction/cgroup logic, so there is no need to reimplement an LRU eviction list in the SVM driver. Cgroup logic should be in the ttm_resource layer. +Maarten.
> >      *ttm_resource is not a perfect match for SVM to allocate vram. It is still a big overhead. The *bo* member of ttm_resource is not needed for SVM - this might end up with invasive changes to ttm...need to look into more details
>
> Overhead is a problem. We'd want to be able to allocate, free and evict
> memory at a similar granularity as our preferred migration and page
> fault granularity, which defaults to 2MB in our SVM implementation.
>
>
> >
> > 2) svm code allocates memory directly from the drm-buddy allocator, and we expose memory eviction functions from both ttm and svm so they can evict memory from each other. For example, expose the ttm_mem_evict_first function from the ttm side so hmm/svm code can call it; expose a similar function from the svm side so ttm can evict hmm memory.
>
> I like this option. One thing that needs some thought with this is how
> to get some semblance of fairness between the two types of clients.
> Basically how to choose what to evict. And what share of the available
> memory does each side get to use on average. E.g. an idle client may get
> all its memory evicted while a busy client may get a bigger share of the
> available memory.

I'd also like to suggest we try to write any management/generic code
in a driver-agnostic way as much as possible here. I don't really see
why hw differences should influence it much.

I do worry about having effectively 2 LRUs here, you can't really have
two "leasts".

Like if we hit the shrinker paths, who goes first? Do we shrink one
object from each side in turn?
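
To make the "who goes first" question concrete, here is a minimal
userspace sketch of one possible answer: keep two lists but always
evict the globally oldest entry by comparing the two list heads. All
names here (lru_entry, lru_list, evict_one, the last_use timestamp)
are hypothetical and are not TTM or xe APIs; it also assumes each
list is already ordered oldest-first and that both sides stamp
entries from the same clock.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one entry on a per-client LRU list, oldest first. */
struct lru_entry {
    uint64_t last_use;        /* timestamp of last access, shared clock */
    struct lru_entry *next;   /* next (more recently used) entry */
};

/* Hypothetical: singly linked LRU list for one side (BO or SVM). */
struct lru_list {
    struct lru_entry *head;   /* least recently used entry, or NULL */
};

enum victim_side { VICTIM_NONE, VICTIM_BO, VICTIM_SVM };

/*
 * Pop the globally least recently used entry across both lists.
 * Ties go to the BO side; that choice is arbitrary in this sketch.
 */
static enum victim_side evict_one(struct lru_list *bo,
                                  struct lru_list *svm,
                                  struct lru_entry **out)
{
    struct lru_list *src;
    enum victim_side side;

    if (!bo->head && !svm->head) {
        *out = NULL;
        return VICTIM_NONE;
    }

    if (!svm->head ||
        (bo->head && bo->head->last_use <= svm->head->last_use)) {
        src = bo;
        side = VICTIM_BO;
    } else {
        src = svm;
        side = VICTIM_SVM;
    }

    /* Unlink the chosen head and hand it back to the caller. */
    *out = src->head;
    src->head = src->head->next;
    (*out)->next = NULL;
    return side;
}
```

This gives a single "least" without merging the lists, but it says
nothing about fairness between the two sides, which would need
separate accounting.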

Also, will we have systems where we can expose system SVM but userspace
may choose not to use the fine-grained SVM and use one of the older
modes instead? Will that path get emulated on top of SVM, or use the
BO paths?

Dave.