[PATCH] amd/amdgpu: Fix resv shared fence overflow

Pan, Xinhui Xinhui.Pan at amd.com
Tue Sep 29 05:39:31 UTC 2020



Please ignore this patch.

-----Original Message-----
From: Pan, Xinhui <Xinhui.Pan at amd.com>
Sent: September 29, 2020 13:17
To: amd-gfx at lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>
Subject: [PATCH] amd/amdgpu: Fix resv shared fence overflow

[  179.556745] kernel BUG at drivers/dma-buf/dma-resv.c:282!
[snip]
[  179.702910] Call Trace:
[  179.705696]  amdgpu_bo_fence+0x21/0x50 [amdgpu]
[  179.710707]  amdgpu_vm_sdma_commit+0x299/0x430 [amdgpu]
[  179.716497]  amdgpu_vm_bo_update_mapping.constprop.0+0x29f/0x390 [amdgpu]
[  179.723927]  ? find_held_lock+0x38/0x90
[  179.728183]  amdgpu_vm_handle_fault+0x1af/0x420 [amdgpu]
[  179.734063]  gmc_v9_0_process_interrupt+0x245/0x2e0 [amdgpu]
[  179.740347]  ? kgd2kfd_interrupt+0xb8/0x1e0 [amdgpu]
[  179.745808]  amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[  179.751380]  ? amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[  179.757159]  amdgpu_ih_process+0xbb/0x1a0 [amdgpu]
[  179.762466]  amdgpu_irq_handle_ih1+0x27/0x40 [amdgpu]
[  179.767997]  process_one_work+0x23c/0x580
[  179.772371]  worker_thread+0x50/0x3b0
[  179.776356]  ? process_one_work+0x580/0x580
[  179.780939]  kthread+0x128/0x160
[  179.784462]  ? kthread_park+0x90/0x90
[  179.788466]  ret_from_fork+0x1f/0x30

In the unlocked case, we add the last_unlocked fence to the root BO's resv if it has not yet signaled.
We then add another job fence to the root BO's resv in ->commit(). As a result, the shared fence count exceeds the number of slots that were reserved.
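
The slot accounting behind this can be illustrated with a small userspace sketch. This is a toy model only, not the kernel's struct dma_resv: the toy_* names are hypothetical, the reserve step is simplified to grow by exactly the requested amount, and the overflow check stands in for what is presumably the BUG_ON() reported at dma-resv.c:282.

```c
#include <stdbool.h>

/* Toy model only: NOT the kernel's struct dma_resv. */
struct toy_resv {
	unsigned int shared_count; /* fences currently stored */
	unsigned int shared_max;   /* slots currently available */
};

/* Make sure there is room for num_fences more fences (simplified). */
static void toy_reserve_shared(struct toy_resv *r, unsigned int num_fences)
{
	if (r->shared_count + num_fences > r->shared_max)
		r->shared_max = r->shared_count + num_fences;
}

/* Add one fence; returns false where the kernel would hit a BUG. */
static bool toy_add_shared(struct toy_resv *r)
{
	if (r->shared_count >= r->shared_max)
		return false;
	r->shared_count++;
	return true;
}

/* One slot reserved, two fences added: the second add overflows. */
static bool toy_two_adds_after_one_reserve(void)
{
	struct toy_resv r = { 0, 0 };
	bool first, second;

	toy_reserve_shared(&r, 1);
	first = toy_add_shared(&r);  /* stands in for the last_unlocked fence */
	second = toy_add_shared(&r); /* stands in for the ->commit() job fence */
	return first && second;
}

/* Two slots reserved up front: both adds fit. */
static bool toy_two_adds_after_two_reserves(void)
{
	struct toy_resv r = { 0, 0 };
	bool first, second;

	toy_reserve_shared(&r, 2);
	first = toy_add_shared(&r);
	second = toy_add_shared(&r);
	return first && second;
}
```

In this model, toy_two_adds_after_one_reserve() fails on the second add, while toy_two_adds_after_two_reserves() succeeds. The point is only that every shared fence added needs a slot reserved beforehand.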

Signed-off-by: xinhui pan <xinhui.pan at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..77689cecd189 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1615,6 +1615,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 		struct dma_fence *tmp = dma_fence_get_stub();
 
 		amdgpu_bo_fence(vm->root.base.bo, vm->last_unlocked, true);
+		dma_resv_reserve_shared(vm->root.base.bo->tbo.base.resv, 1);
 		swap(vm->last_unlocked, tmp);
 		dma_fence_put(tmp);
 	}
--
2.25.1


