[PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
Liang, Prike
Prike.Liang at amd.com
Thu May 23 08:27:58 UTC 2019
Hi Christian,
Thanks for your patches. They fix the amdgpu BO pin failure seen in dm_plane_helper_prepare_fb,
and Abaqus performance seems improved.
However, some error messages can still be observed. Do we need to drop the amdgpu_vm_validate_pt_bos() error message
and the other warning/debug output?
[ 1910.674541] Call Trace:
[ 1910.676944] [<ffffffff8b361dc1>] dump_stack+0x19/0x1b
[ 1910.682236] [<ffffffff8ac97648>] __warn+0xd8/0x100
[ 1910.687195] [<ffffffff8ac9778d>] warn_slowpath_null+0x1d/0x20
[ 1910.693167] [<ffffffffc0603619>] amdgpu_bo_move+0x169/0x1c0 [amdgpu]
[ 1910.699719] [<ffffffffc05c82bb>] ttm_bo_handle_move_mem+0x26b/0x5d0 [amdttm]
[ 1910.706976] [<ffffffffc05c8767>] ttm_bo_evict+0x147/0x3b0 [amdttm]
[ 1910.713358] [<ffffffffc04e88d9>] ? drm_mm_insert_node_in_range+0x299/0x4d0 [drm]
[ 1910.720881] [<ffffffffc057652e>] ? _kcl_reservation_object_reserve_shared+0xfe/0x1a0 [amdkcl]
[ 1910.729710] [<ffffffffc05c8c6e>] ttm_mem_evict_first+0x29e/0x3a0 [amdttm]
[ 1910.736705] [<ffffffffc05c8f1e>] amdttm_bo_mem_space+0x1ae/0x300 [amdttm]
[ 1910.743696] [<ffffffffc05c9544>] amdttm_bo_validate+0xc4/0x140 [amdttm]
[ 1910.750529] [<ffffffffc060c035>] amdgpu_cs_bo_validate+0xa5/0x220 [amdgpu]
[ 1910.757625] [<ffffffffc060c1f7>] amdgpu_cs_validate+0x47/0x2e0 [amdgpu]
[ 1910.764463] [<ffffffffc060c1b0>] ? amdgpu_cs_bo_validate+0x220/0x220 [amdgpu]
[ 1910.771736] [<ffffffffc0620652>] amdgpu_vm_validate_pt_bos+0x92/0x140 [amdgpu]
[ 1910.779248] [<ffffffffc060e547>] amdgpu_cs_ioctl+0x18a7/0x1d50 [amdgpu]
[ 1910.785992] [<ffffffffc060cca0>] ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[ 1910.793486] [<ffffffffc04e3f2c>] drm_ioctl_kernel+0x6c/0xb0 [drm]
[ 1910.799777] [<ffffffffc04e4647>] drm_ioctl+0x1e7/0x420 [drm]
[ 1910.805643] [<ffffffffc060cca0>] ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[ 1910.813090] [<ffffffffc05ec04b>] amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
[ 1910.819639] [<ffffffff8ae56210>] do_vfs_ioctl+0x3a0/0x5a0
[ 1910.825217] [<ffffffff8b36744a>] ? __schedule+0x13a/0x890
[ 1910.830795] [<ffffffff8ae564b1>] SyS_ioctl+0xa1/0xc0
[ 1910.835943] [<ffffffff8b374ddb>] system_call_fastpath+0x22/0x27
[ 1910.842048] ---[ end trace a5c00b151c061d53 ]---
[ 1910.846814] [TTM] Buffer eviction failed
[ 1910.850838] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
[ 1910.858905] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -22!
.......
Thanks,
Prike
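For anyone reading the log above: the trailing -22 in the last error line is a negated Linux errno value, and 22 is EINVAL. A minimal sketch of decoding such a return code in userspace C follows. The helper name errno_name is hypothetical, it is not part of amdgpu or TTM, and it covers only a few codes for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a kernel or libc API): map a negative
 * kernel-style return code to its symbolic errno name.
 * Only a handful of codes are handled here.
 */
static const char *errno_name(int ret)
{
	switch (abs(ret)) {
	case EINVAL: return "EINVAL";	/* 22 on Linux */
	case ENOMEM: return "ENOMEM";	/* 12 on Linux */
	case EBUSY:  return "EBUSY";	/* 16 on Linux */
	default:     return "unknown";
	}
}
```

Calling errno_name(-22) yields "EINVAL", which matches the code printed by the "Failed to process the buffer list" message.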
-----Original Message-----
From: Christian König <ckoenig.leichtzumerken at gmail.com>
Sent: Wednesday, May 22, 2019 9:00 PM
To: Olsak, Marek <Marek.Olsak at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>; Liang, Prike <Prike.Liang at amd.com>; dri-devel at lists.freedesktop.org; amd-gfx at lists.freedesktop.org
Subject: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
This avoids OOM situations when we have lots of threads submitting at the same time.
v3: apply this to the whole driver, not just CS
Signed-off-by: Christian König <christian.koenig at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
- &duplicates, true);
+ &duplicates, false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
list_add(&csa_tv.head, &list);
amdgpu_vm_get_pd_bo(vm, &list, &pd);
- r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+ r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
if (r) {
DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
- r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+ r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates,
+ false);
if (r) {
dev_err(adev->dev, "leaking bo va because "
"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
- r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+ r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates,
+ false);
if (r)
goto error_unref;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c430e8259038..d60593cc436e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
- r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
+ r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
dev_err(adev->dev, "%p reserve failed\n", bo);
--
2.17.1
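The commit message says that leaving BOs on the LRU avoids OOM when many threads submit at once. One possible reading of that, sketched below as a toy userspace model, is that a reserved BO which stays on the LRU is still visible to another thread's eviction scan, so that thread can wait for it instead of concluding that nothing is left to evict. Everything here (struct toy_bo, toy_reserve, toy_evict_first, toy_scan_after_reserve_all) is hypothetical and is not the real TTM code; only the del_lru idea of flipping the last argument from true to false is taken from the diff.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer object; not the real struct ttm_buffer_object. */
struct toy_bo {
	bool reserved;	/* currently locked by a submitting thread */
	bool on_lru;	/* still linked on the (toy) LRU list */
};

/* Reserve a BO, optionally taking it off the LRU (the del_lru == true case). */
static void toy_reserve(struct toy_bo *bo, bool del_lru)
{
	bo->reserved = true;
	if (del_lru)
		bo->on_lru = false;
}

/*
 * Scan the toy LRU for something to evict.
 * Returns 0 if an idle BO could be evicted right away, -EBUSY if only
 * reserved BOs are visible (the caller could wait for one of them), and
 * -ENOMEM if the LRU looks empty, leaving the caller nothing to wait on.
 */
static int toy_evict_first(const struct toy_bo *bos, size_t n)
{
	bool saw_busy = false;

	for (size_t i = 0; i < n; i++) {
		if (!bos[i].on_lru)
			continue;
		if (!bos[i].reserved)
			return 0;
		saw_busy = true;
	}
	return saw_busy ? -EBUSY : -ENOMEM;
}

/* Reserve every BO in a small pool, then report what an eviction scan sees. */
static int toy_scan_after_reserve_all(bool del_lru)
{
	struct toy_bo bos[3] = {
		{ false, true }, { false, true }, { false, true },
	};

	for (size_t i = 0; i < 3; i++)
		toy_reserve(&bos[i], del_lru);
	return toy_evict_first(bos, 3);
}
```

In this model, toy_scan_after_reserve_all(true) returns -ENOMEM because every reserved BO has vanished from the list, while toy_scan_after_reserve_all(false) returns -EBUSY, a condition a caller can wait out. Whether the real driver behaves exactly this way depends on the rest of the series, which is not shown here.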