[PATCH] drm/amdgpu: only use kernel zone if need_dma32 is not required
Christian.Koenig at amd.com
Thu Jun 13 08:54:10 UTC 2019
On 12.06.19 at 23:13, Yang, Philip wrote:
> On 2019-06-12 3:28 p.m., Christian König wrote:
>> On 12.06.19 at 17:13, Yang, Philip wrote:
>>> TTM creates two zones for system memory, the kernel zone and the dma32
>>> zone. If an allocated system memory address is below 4GB, it is
>>> accounted to the dma32 zone, which can exhaust that zone and trigger
>>> unnecessary TTM eviction.
>>> The patch "drm/ttm: Account for kernel allocations in kernel zone only"
>>> only handles the allocation for acc_size; system memory pages are still
>>> allocated through ttm_mem_global_alloc_page, which accounts them to the
>>> dma32 zone if the page is below 4GB.
>> NAK. As the name says, mem_glob is global for all devices in the system.
>> So this will break if you mix DMA32 and non-DMA32 devices in the same
>> system, which is exactly the configuration my laptop here has :(
> I didn't find a path that uses the dma32 zone, but I may have missed something.
Well, the point is that there is none in our driver, but many drivers in
the system still need DMA32 memory.
> There is
> an issue found by KFDTest.BigBufStressTest: it allocates buffers of
> 128MB each, up to 3/8 of the total 256GB of system memory, then uses a
> queue to write to the buffers. If ttm_mem_global_alloc_page gets a page
> whose pfn is below 4GB, the page is accounted to the dma32 zone and
> eventually exhausts its 2GB limit; ttm_check_swapping then schedules
> ttm_shrink_work to start eviction. The restore takes minutes to finish
> (retrying many times if busy), and the test fails because of a queue
> timeout. This eviction is unnecessary because we still have enough free
> system memory.
No, that is definitely necessary. For example, on my laptop it's the sound
system which needs DMA32.
Without this, when an application uses a lot of memory, we run into the
OOM killer as soon as some audio starts playing.
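The accounting described in the quoted paragraph can be illustrated with a minimal user-space sketch. It is a simplified model, not the TTM code: `struct zone`, `struct mem_glob` and `account_page` are hypothetical names, and the only behaviour assumed is what the thread states, namely that a page whose pfn is below 4GB is also charged to the dma32 zone and that exceeding a zone limit is what triggers shrinking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL
/* First pfn at or above 4GB, assuming 4KB pages. */
#define DMA32_PFN_LIMIT ((4ULL << 30) / PAGE_SIZE_BYTES)

/* Hypothetical stand-in for a TTM memory zone; not the kernel struct. */
struct zone {
    const char *name;
    uint64_t used;   /* bytes accounted so far */
    uint64_t limit;  /* bytes above which shrinking would be scheduled */
};

/* Hypothetical global accounting state shared by every device. */
struct mem_glob {
    struct zone kernel;
    struct zone dma32;
};

/*
 * Account one page. Every page is charged to the kernel zone; a page
 * below 4GB is charged to the dma32 zone as well. Returns true if any
 * zone is now over its limit, i.e. eviction would be scheduled.
 */
static bool account_page(struct mem_glob *glob, uint64_t pfn)
{
    assert(glob != NULL);

    glob->kernel.used += PAGE_SIZE_BYTES;
    if (pfn < DMA32_PFN_LIMIT)
        glob->dma32.used += PAGE_SIZE_BYTES;

    return glob->kernel.used > glob->kernel.limit ||
           glob->dma32.used > glob->dma32.limit;
}
```

In this model the dma32 limit can be hit while the kernel zone is still far from full, which is the situation the test runs into. Because the state is global, dropping the dma32 zone for one device would also stop the accounting that other devices depend on.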
> It's a random case that happens about 1 time in 5. I can change the test
> to increase the timeout value to work around this, but this seems to be a
> TTM bug. It will slow application performance a lot if this random issue
> happens.
One thing we could try is to dig into why the kernel gives us DMA32
pages when there are still other pages available. Please take a look at
/proc/buddyinfo on that box for this.
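For reading /proc/buddyinfo, a small helper along these lines may be useful. It is a sketch that assumes the usual line layout, "Node N, zone NAME" followed by one free-block count per order, where column i counts free blocks of 2^i pages. The function name `buddyinfo_free_pages` is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/buddyinfo line such as
 *   "Node 0, zone    DMA32     12      7      3 ..."
 * and return the total number of free pages in that zone, or -1 if the
 * line does not have the expected prefix. The zone name is copied into
 * zone_out (at most zone_len - 1 characters).
 */
static long long buddyinfo_free_pages(const char *line, char *zone_out,
                                      size_t zone_len)
{
    int node, consumed = 0, used = 0;
    char zone[32];
    unsigned long count;
    long long total = 0;
    const char *p;

    assert(line != NULL && zone_out != NULL && zone_len > 0);

    if (sscanf(line, "Node %d, zone %31s%n", &node, zone, &consumed) != 2)
        return -1;

    /* Each following column is the number of free blocks of 2^order pages. */
    p = line + consumed;
    for (int order = 0; sscanf(p, "%lu%n", &count, &used) == 1; order++) {
        total += (long long)count << order;
        p += used;
    }

    snprintf(zone_out, zone_len, "%s", zone);
    return total;
}
```

Calling it on each line read with fgets() from /proc/buddyinfo gives the free page count per zone, which makes it easy to compare how much is left in DMA32 against Normal at the time the test fails.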
The next thing coming to mind is that we can most likely completely skip
this if IOMMU is active.
The last thing that comes to mind offhand is to improve TTM to actually
only evict BOs which use DMA32 memory, because currently we just evict
stuff randomly and don't check whether it's DMA32 or other memory.
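The third idea could look roughly like the following user-space sketch. Nothing here is existing TTM code: `struct bo`, `bo_uses_dma32` and `pick_dma32_victims` are hypothetical, and a BO is modelled as a plain array of pfns with 4KB pages assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First pfn at or above 4GB, assuming 4KB pages. */
#define DMA32_PFN_LIMIT 0x100000ULL

/* Hypothetical, simplified buffer object; not struct ttm_buffer_object. */
struct bo {
    const uint64_t *pfns;
    size_t num_pages;
};

/* True if at least one backing page of the BO lies below 4GB. */
static bool bo_uses_dma32(const struct bo *bo)
{
    assert(bo != NULL);

    for (size_t i = 0; i < bo->num_pages; i++)
        if (bo->pfns[i] < DMA32_PFN_LIMIT)
            return true;
    return false;
}

/*
 * Collect eviction candidates for a dma32 shortage: only BOs that hold
 * DMA32 pages are selected. Writes up to out_cap indices into out and
 * returns how many were written.
 */
static size_t pick_dma32_victims(const struct bo *bos, size_t n,
                                 size_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < out_cap; i++)
        if (bo_uses_dma32(&bos[i]))
            out[found++] = i;
    return found;
}
```

Evicting only these candidates would free dma32 accounting without touching BOs that sit entirely above 4GB. Scanning every page of every BO is linear in total pages, so a real implementation would more likely track a per-BO flag or counter at allocation time.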
>>> Change-Id: I289b85d891b8f64a1422c42b1eab398098ab7ef7
>>> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 2778ff63d97d..79bb9dfe617b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -1686,6 +1686,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>>  	adev->mman.initialized = true;
>>> +	/* Only kernel zone (no dma32 zone) if device does not require dma32 */
>>> +	if (!adev->need_dma32)
>>> +		adev->mman.bdev.glob->mem_glob->num_zones = 1;
>>>  	/* We opt to avoid OOM on system pages allocations */
>>>  	adev->mman.bdev.no_retry = true;