Some render processes enter the D state, waiting for a DMA fence, without a GPU hang

孙南勇 497141801 at qq.com
Fri Jul 19 11:27:30 UTC 2019


Dear All,
I use an AMD WX 5100 GPU for rendering, and sometimes some processes enter the D state (uninterruptible sleep).
I checked the dmesg log, and the first error shows a task blocked while waiting for a fence:
2019-07-02T06:46:12.144409+08:00|err|kernel[-]|[1748923.714105] INFO: task Binder:2577_7:147603 blocked for more than 120 seconds.
2019-07-02T06:46:12.144434+08:00|err|kernel[-]|[1748923.714107] Tainted: P OE 4.19.36-1.2.159.aarch64 #1
2019-07-02T06:46:12.144460+08:00|err|kernel[-]|[1748923.714108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2019-07-02T06:46:12.144484+08:00|info|kernel[-]|[1748923.714110] Binder:2577_7 D 0 147603 240764 0x0040002d
2019-07-02T06:46:12.144508+08:00|warning|kernel[-]|[1748923.714114] Call trace:
2019-07-02T06:46:12.144531+08:00|warning|kernel[-]|[1748923.714115] __switch_to+0x94/0xe8
2019-07-02T06:46:12.144568+08:00|warning|kernel[-]|[1748923.714119] __schedule+0x28c/0x940
2019-07-02T06:46:12.144593+08:00|warning|kernel[-]|[1748923.714120] schedule+0x2c/0x88
2019-07-02T06:46:12.144616+08:00|warning|kernel[-]|[1748923.714122] schedule_timeout+0x22c/0x468
2019-07-02T06:46:12.144639+08:00|warning|kernel[-]|[1748923.714126] dma_fence_wait_any_timeout+0x234/0x2d0
2019-07-02T06:46:12.144663+08:00|warning|kernel[-]|[1748923.714201] amdgpu_sa_bo_new+0x3b0/0x548 [amdgpu]
2019-07-02T06:46:12.144687+08:00|warning|kernel[-]|[1748923.714263] amdgpu_ib_get+0x60/0xc8 [amdgpu]
2019-07-02T06:46:12.144711+08:00|warning|kernel[-]|[1748923.714329] amdgpu_job_alloc_with_ib+0x70/0xb0 [amdgpu]
2019-07-02T06:46:12.144735+08:00|warning|kernel[-]|[1748923.714390] amdgpu_vm_bo_update_mapping+0x2c0/0x3b0 [amdgpu]
2019-07-02T06:46:12.144763+08:00|warning|kernel[-]|[1748923.714453] amdgpu_vm_clear_freed+0xd8/0x1c8 [amdgpu]
2019-07-02T06:46:12.144787+08:00|warning|kernel[-]|[1748923.714513] amdgpu_gem_object_close+0x178/0x1d0 [amdgpu]
2019-07-02T06:46:12.144810+08:00|warning|kernel[-]|[1748923.714538] drm_gem_object_release_handle+0x3c/0x98 [drm]
2019-07-02T06:46:12.144834+08:00|warning|kernel[-]|[1748923.714542] idr_for_each+0x70/0x128
2019-07-02T06:46:12.144908+08:00|warning|kernel[-]|[1748923.714610] drm_release+0xb0/0x138 [drm]
2019-07-02T06:46:12.144931+08:00|warning|kernel[-]|[1748923.714612] __fput+0xac/0x218
2019-07-02T06:46:12.144955+08:00|warning|kernel[-]|[1748923.714614] ____fput+0x20/0x30
2019-07-02T06:46:12.144978+08:00|warning|kernel[-]|[1748923.714617] task_work_run+0xc0/0xf8
2019-07-02T06:46:12.145006+08:00|warning|kernel[-]|[1748923.714619] do_exit+0x300/0x5b0
2019-07-02T06:46:12.145030+08:00|warning|kernel[-]|[1748923.714621] do_group_exit+0x3c/0xe0
2019-07-02T06:46:12.145054+08:00|warning|kernel[-]|[1748923.714623] get_signal+0x12c/0x6e0
2019-07-02T06:46:12.145077+08:00|warning|kernel[-]|[1748923.714625] do_signal+0x180/0x288
2019-07-02T06:46:12.145101+08:00|warning|kernel[-]|[1748923.714628] do_notify_resume+0x100/0x188
2019-07-02T06:46:12.145124+08:00|warning|kernel[-]|[1748923.714630] work_pending+0x8/0x10



Before this log there are no other error messages, such as "gfx timeout", so maybe this is not a GPU hang?
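
For reference, the check described above can be sketched roughly as follows. This is only a minimal sketch, assuming the log lines look like the excerpt above. The names `scan_log`, `HUNG_TASK_RE` and `GPU_ERROR_PATTERNS` are made up for illustration, and "gfx timeout" is simply the string I searched for; other driver messages may be worth adding.

```python
import re

# Header line printed by the kernel hung-task detector, as in the excerpt above.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# Substrings treated as signs of a GPU hang (assumption: extend as needed).
GPU_ERROR_PATTERNS = ("gfx timeout",)


def scan_log(lines):
    """Return (GPU error lines seen before the first hung task, hung tasks)."""
    gpu_errors = []
    hung_tasks = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hung_tasks.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
            continue
        # Only errors logged before the first hung task are of interest here.
        if not hung_tasks and any(p in line for p in GPU_ERROR_PATTERNS):
            gpu_errors.append(line.rstrip("\n"))
    return gpu_errors, hung_tasks


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 scan_log.py
    errors, tasks = scan_log(sys.stdin)
    print("GPU errors before first hung task:", len(errors))
    for comm, pid, secs in tasks:
        print(f"hung task {comm} (pid {pid}) blocked for more than {secs} s")
```

If this reports hung tasks but no GPU errors before them, it matches what I see: tasks blocked on a fence with no preceding timeout message.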


My software versions:
Ubuntu: 18.04.1
Mesa: 18.3.4
LLVM: 7.0
Kernel: 4.19.36



Can you explain this, or tell me how to debug this problem?


Thanks!