[Intel-xe] Implement svm without BO concept in xe driver

Fri Aug 18 16:54:36 UTC 2023

On 2023-08-18 12:10, Zeng, Oak wrote:
> Thanks Thomas. I will then look into more details of option 3:
>
>     * create a lean drm layer vram manager, a central control place for vram eviction and cgroup accounting. Single LRU for eviction fairness.
>     * pretty much move the current ttm_resource eviction/cgroups logic to drm layer
>     * the eviction/allocation granularity should be flexible so svm can do 2M while ttm can do arbitrary size

SVM will need smaller sizes too, for VMAs that are smaller or not 
aligned to 2MB size.
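
As a rough illustration: a minimal userspace sketch, not xe or amdgpu code,
of carving a page-aligned VMA into pieces that never cross a 2MB boundary.
The names split_range, struct chunk, CHUNK_SIZE and PAGE_SZ are hypothetical,
and a 4KB page size is assumed. An unaligned VMA produces head and tail
pieces smaller than 2MB, which the allocator would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE (2ULL << 20)	/* 2MB preferred granularity */
#define PAGE_SZ    4096ULL	/* assumed base page size */

/* Hypothetical descriptor for one allocation/migration unit. */
struct chunk {
	uint64_t start;
	uint64_t size;
};

/*
 * Split [start, end) into pieces that never cross a 2MB boundary.
 * Returns the number of pieces written to out (at most max).
 */
static size_t split_range(uint64_t start, uint64_t end,
			  struct chunk *out, size_t max)
{
	size_t n = 0;

	assert(start % PAGE_SZ == 0 && end % PAGE_SZ == 0 && start <= end);

	while (start < end && n < max) {
		/* Next 2MB boundary strictly above start. */
		uint64_t next = (start / CHUNK_SIZE + 1) * CHUNK_SIZE;

		if (next > end)
			next = end;
		out[n].start = start;
		out[n].size = next - start;
		n++;
		start = next;
	}
	return n;
}
```

For a VMA covering [0x1ff000, 0x601000) this gives a 4KB head, two full 2MB
pieces and a 4KB tail.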

Regards,
   Felix

>     * both ttm_resource and svm code should call the new drm_vram_manager for eviction/accounting
>
> I will come back with some RFC proof-of-concept code later.
>
> Cheers,
> Oak
>
>> -----Original Message-----
>> From: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> Sent: August 18, 2023 3:36 AM
>> To: Zeng, Oak <oak.zeng at intel.com>; Dave Airlie <airlied at gmail.com>; Felix
>> Kuehling <felix.kuehling at amd.com>
>> Cc: Christian König <christian.koenig at amd.com>; Brost, Matthew
>> <matthew.brost at intel.com>; maarten.lankhorst at linux.intel.com;
>> Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>; Welty,
>> Brian <brian.welty at intel.com>; Philip Yang <Philip.Yang at amd.com>; intel-
>> xe at lists.freedesktop.org; dri-devel at lists.freedesktop.org
>> Subject: Re: Implement svm without BO concept in xe driver
>>
>>
>> On 8/17/23 04:12, Zeng, Oak wrote:
>>>> -----Original Message-----
>>>> From: Dave Airlie <airlied at gmail.com>
>>>> Sent: August 16, 2023 6:52 PM
>>>> To: Felix Kuehling <felix.kuehling at amd.com>
>>>> Cc: Zeng, Oak <oak.zeng at intel.com>; Christian König
>>>> <christian.koenig at amd.com>; Thomas Hellström
>>>> <thomas.hellstrom at linux.intel.com>; Brost, Matthew
>>>> <matthew.brost at intel.com>; maarten.lankhorst at linux.intel.com;
>>>> Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>; Welty,
>>>> Brian <brian.welty at intel.com>; Philip Yang <Philip.Yang at amd.com>; intel-
>>>> xe at lists.freedesktop.org; dri-devel at lists.freedesktop.org
>>>> Subject: Re: Implement svm without BO concept in xe driver
>>>>
>>>> On Thu, 17 Aug 2023 at 08:15, Felix Kuehling <felix.kuehling at amd.com> wrote:
>>>>> On 2023-08-16 13:30, Zeng, Oak wrote:
>>>>>> I spoke with Thomas. We discussed two approaches:
>>>>>>
>>>>>> 1) make ttm_resource a central place for vram management functions such as
>>>>>> eviction and cgroup memory accounting. Both the BO-based driver and BO-less SVM
>>>>>> code call into ttm_resource_alloc/free functions for vram allocation/free.
>>>>>>        * This way the BO driver and SVM driver share the eviction/cgroup logic, with no
>>>>>> need to reimplement the LRU eviction list in the SVM driver. Cgroup logic should be in
>>>>>> the ttm_resource layer. +Maarten.
>>>>>>        * ttm_resource is not a perfect match for SVM to allocate vram. It is still a
>>>>>> big overhead. The *bo* member of ttm_resource is not needed for SVM - this
>>>>>> might end up with invasive changes to ttm...need to look into more details
>>>>> Overhead is a problem. We'd want to be able to allocate, free and evict
>>>>> memory at a similar granularity as our preferred migration and page
>>>>> fault granularity, which defaults to 2MB in our SVM implementation.
>>>>>
>>>>>
>>>>>> 2) have svm code allocate memory directly from the drm-buddy allocator, and expose
>>>>>> memory eviction functions from both ttm and svm so they can evict memory
>>>>>> from each other. For example, expose the ttm_mem_evict_first function from the
>>>>>> ttm side so hmm/svm code can call it; expose a similar function from the svm side so
>>>>>> ttm can evict hmm memory.
>>>>> I like this option. One thing that needs some thought with this is how
>>>>> to get some semblance of fairness between the two types of clients.
>>>>> Basically how to choose what to evict. And what share of the available
>>>>> memory does each side get to use on average. E.g. an idle client may get
>>>>> all its memory evicted while a busy client may get a bigger share of the
>>>>> available memory.
>>>> I'd also like to suggest we try to write any management/generic code
>>>> in a driver-agnostic way as much as possible here. I don't really see
>>>> much hw difference that should be influencing it.
>>>>
>>>> I do worry about having effectively 2 LRUs here, you can't really have
>>>> two "leasts".
>>>>
>>>> Like if we hit the shrinker paths, who goes first? Do we shrink one
>>>> object from each side in turn?
>>> One way to solve this fairness problem is to create a driver-agnostic
>>> drm_vram_mgr. Maintain a single LRU in drm_vram_mgr. Move the memory
>>> eviction/cgroups memory accounting logic from the ttm_resource manager to
>>> drm_vram_mgr. Both the BO-based driver and the SVM driver call into drm_vram_mgr to
>>> allocate/free memory.
>>> I am not sure whether this meets the 2M allocate/free/evict granularity
>>> requirement Felix mentioned above. SVM can allocate 2M size blocks. But the BO
>>> driver should be able to allocate arbitrarily sized blocks - so the eviction is also
>>> arbitrary size.
>>
>> This is not far from what a TTM resource manager does with TTM
>> resources, only made generic at the drm level, and making the "resource"
>> as lean as possible. With 2M granularity this seems plausible.
>>
>>>> Also, will we have systems where we can expose system SVM but userspace
>>>> may choose not to use the fine-grained SVM and use one of the older
>>>> modes? Will that path get emulated on top of SVM or use the BO paths?
>>> If by "older modes" you meant gem_bo_create (such as xe_gem_create or
>>> amdgpu_gem_create), then today both amd and intel implement those
>>> interfaces using the BO path. We don't have a plan to emulate that old mode on top
>>> of SVM, afaict.
>>
>> I think we might end up emulating "older modes" on top of SVM at some
>> point, not too far out, although what immediately comes to mind would be
>> eviction based on something looking like NUMA- and CGROUP-aware
>> shrinkers for integrated bo drivers, if that turns out to be sufficient
>> from a memory usage starvation POV. This is IMHO indeed something to
>> start thinking about, but for the current situation trying to solve a
>> mutual SVM-TTM fair eviction problem would be a reasonable scope.
>>
>> Thanks,
>>
>> Thomas
>>
>>
>>> Thanks,
>>> Oak
>>>
>>>> Dave.
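
For reference, the single-LRU idea from the quoted discussion could be
sketched roughly as below. This is a userspace toy under assumed names
(struct vram_mgr, vram_mgr_add, vram_mgr_touch and vram_mgr_evict_first are
all hypothetical, not an existing drm or ttm API). Blocks from both the BO
path and the SVM path sit on one list, so eviction picks the globally least
recently used block regardless of owner. Cgroup accounting is reduced to a
single usage counter here.

```c
#include <stddef.h>

/* Hypothetical owner tag: which side allocated the block. */
enum vram_owner { OWNER_TTM, OWNER_SVM };

struct vram_block {
	enum vram_owner owner;
	size_t size;
	struct vram_block *prev, *next;
};

struct vram_mgr {
	struct vram_block lru;	/* sentinel; lru.next is least recently used */
	size_t used;		/* bytes currently accounted */
};

static void vram_mgr_init(struct vram_mgr *m)
{
	m->lru.prev = m->lru.next = &m->lru;
	m->used = 0;
}

static void lru_unlink(struct vram_block *b)
{
	b->prev->next = b->next;
	b->next->prev = b->prev;
}

/* Append at the tail, i.e. the most recently used position. */
static void lru_append(struct vram_mgr *m, struct vram_block *b)
{
	b->prev = m->lru.prev;
	b->next = &m->lru;
	m->lru.prev->next = b;
	m->lru.prev = b;
}

/* Called by either side after allocating a block of any size. */
static void vram_mgr_add(struct vram_mgr *m, struct vram_block *b)
{
	lru_append(m, b);
	m->used += b->size;
}

/* Mark a block as recently used. */
static void vram_mgr_touch(struct vram_mgr *m, struct vram_block *b)
{
	lru_unlink(b);
	lru_append(m, b);
}

/* Remove and return the least recently used block, or NULL if empty. */
static struct vram_block *vram_mgr_evict_first(struct vram_mgr *m)
{
	struct vram_block *b = m->lru.next;

	if (b == &m->lru)
		return NULL;
	lru_unlink(b);
	m->used -= b->size;
	return b;
}
```

The caller would look at the owner tag of the returned block to decide
whether the TTM or the SVM side performs the actual eviction.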