[PATCH 0/5] prevent OOM triggered by TTM

He, Roger Hongbo.He at amd.com
Tue Feb 6 10:07:59 UTC 2018


	Move the new call out of ttm_mem_global_reserve() and into ttm_page_alloc.c or ttm_page_alloc_dma.c (but keep it in ttm_memory.c).  ttm_mem_global_reserve() is called for each page allocated and si_mem_available() is a bit too heavy for that.

Good idea! I agree completely. I had the same concern initially, but saw no better way at the time.
I am going to improve the patches. Thanks!

-----Original Message-----
From: Koenig, Christian 
Sent: Tuesday, February 06, 2018 5:52 PM
To: He, Roger <Hongbo.He at amd.com>; amd-gfx at lists.freedesktop.org; dri-devel at lists.freedesktop.org
Cc: thomas at shipmail.org
Subject: Re: [PATCH 0/5] prevent OOM triggered by TTM

Nice work, but a few comments.

First of all you need to reorder the patches. Adding the exceptions to the restrictions should come first, then the restriction itself. 
Otherwise we might break a setup in between the patches and that is bad for bisecting.

Then make all values configurable, e.g. take a closer look at ttm_memory.c. Just add attributes directly under the memory_accounting directory (see ttm_mem_global_init).

In addition, you can't put device-specific information (the no_retry flag) into ttm_mem_global; that structure is driver-independent, so it won't work like this.

Move the new call out of ttm_mem_global_reserve() and into ttm_page_alloc.c or ttm_page_alloc_dma.c (but keep it in ttm_memory.c).
ttm_mem_global_reserve() is called for each page allocated, and
si_mem_available() is a bit too heavy for that.
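
To illustrate the point, here is a minimal userspace sketch, not kernel code. All names in it (memory_available, reserve_one_page, alloc_pages_checked_once) are hypothetical stand-ins: memory_available() plays the role of a costly query such as si_mem_available(), and reserve_one_page() plays the role of the per-page accounting done in ttm_mem_global_reserve(). The expensive check runs once per allocation request rather than once per page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the (expensive) availability query runs. */
static unsigned long expensive_checks;

/* Hypothetical stand-in for a costly query such as si_mem_available(). */
static bool memory_available(void)
{
	expensive_checks++;
	return true;
}

/* Hypothetical stand-in for the cheap per-page accounting. */
static void reserve_one_page(size_t *accounted)
{
	(*accounted)++;
}

/*
 * Hypothetical allocation entry point: do the expensive check once,
 * then account for each page. Returns the number of pages reserved.
 */
static size_t alloc_pages_checked_once(size_t npages)
{
	size_t accounted = 0;

	if (!memory_available())
		return 0;

	for (size_t i = 0; i < npages; i++)
		reserve_one_page(&accounted);

	return accounted;
}
```

With this shape, a request for 512 pages performs the expensive query once instead of 512 times.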

Maybe name TTM_OPT_FLAG_ALLOW_ALLOC_ANYWAY something like _FORCE_ALLOCATION or _ALLOW_OOM.

And please also check whether a criterion like (si_mem_available() +
get_nr_swap_pages()) < limit works as well. This way we would have only a single new limit.
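
As a rough sketch of that combined criterion (a userspace model, not kernel code): the function name ttm_alloc_below_limit and its parameters are hypothetical. In the kernel the two inputs would come from si_mem_available() and get_nr_swap_pages(); here they are passed in as plain page counts.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when available memory plus free
 * swap (both in pages) is below the single limit, i.e. when a new
 * allocation should be rejected.
 */
static bool ttm_alloc_below_limit(long avail_pages, long free_swap_pages,
				  long limit_pages)
{
	return (avail_pages + free_swap_pages) < limit_pages;
}
```

The comparison is strict, so a sum exactly equal to the limit still allows the allocation.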

Regards,
Christian.

Am 06.02.2018 um 10:04 schrieb Roger He:
> Currently the TTM code has no allocation limit, so it allows unlimited
> page allocation until OOM. Once swap space is full of swapped pages,
> system memory fills up with TTM pages, and then any memory allocation
> request will trigger OOM.
>
>
> The following patches prevent OOM triggered by TTM.
> The basic idea is to check the free swap space first when allocating
> TTM pages. If it is less than the fixed limit, reject the allocation request.
> But there are two exceptions which should be allowed regardless of the zone
> memory accounting limit.
> a. page fault
>     ttm_mem_global_reserve() is allowed when serving the page fault routine,
>     because the page fault routine has already grabbed system memory, so
>     allowing this exception is harmless. Otherwise, it will trigger
>     the OOM killer.
> b. suspend
>     Suspend should always be allowed to succeed.
>
>
> Finally, if bdev.no_retry is false (the default), the original
> behavior is kept unchanged.
>
> Roger He (5):
>    drm/ttm: check if the free swap space is under limit 256MB
>    drm/ttm: keep original behavior except with flag no_retry
>    drm/ttm: use bit flag to replace allow_reserved_eviction in
>      ttm_operation_ctx
>    drm/ttm: add bit flag TTM_OPT_FLAG_ALLOW_ALLOC_ANYWAY
>    drm/ttm: add input parameter allow_allo_anyway for ttm_bo_evict_mm
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 10 +++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c     |  8 +++--
>   drivers/gpu/drm/nouveau/nouveau_drm.c       |  2 +-
>   drivers/gpu/drm/qxl/qxl_object.c            |  4 +--
>   drivers/gpu/drm/radeon/radeon_device.c      |  6 ++--
>   drivers/gpu/drm/radeon/radeon_object.c      |  6 ++--
>   drivers/gpu/drm/radeon/radeon_object.h      |  3 +-
>   drivers/gpu/drm/ttm/ttm_bo.c                | 19 +++++++----
>   drivers/gpu/drm/ttm/ttm_bo_vm.c             |  6 ++--
>   drivers/gpu/drm/ttm/ttm_memory.c            | 51 ++++++++++++++++++++++++++---
>   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c    |  1 -
>   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c         |  6 ++--
>   include/drm/ttm/ttm_bo_api.h                | 14 ++++++--
>   include/drm/ttm/ttm_memory.h                |  6 ++++
>   18 files changed, 111 insertions(+), 43 deletions(-)
>


