[Bug 216200] New: AMDGPU hung after enabling HIP for gpu acceleration in Blender Cycles 3.2
bugzilla-daemon at kernel.org
Sun Jul 3 22:50:07 UTC 2022
https://bugzilla.kernel.org/show_bug.cgi?id=216200
Bug ID: 216200
Summary: AMDGPU hung after enabling HIP for gpu acceleration in
Blender Cycles 3.2
Product: Drivers
Version: 2.5
Kernel Version: 5.18.9
Hardware: AMD
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri at kernel-bugs.osdl.org
Reporter: toadron at yandex.ru
Regression: No
Created attachment 301326
--> https://bugzilla.kernel.org/attachment.cgi?id=301326&action=edit
Full journal from the moment the system was launched
Description:
Enabling HIP for GPU acceleration in the Cycles renderer of Blender 3.2 causes
the screen to freeze.
Video showing the problem on YouTube:
https://www.youtube.com/watch?v=tZzTuvRn3cw
Hardware:
CPU: AMD Ryzen™ 5 3600
MOTHERBOARD: MSI X470 GAMING PLUS MAX
GPU: SAPPHIRE Radeon RX 6600 8192 MB PULSE (11310-01-20G)
Software version:
Arch Linux x86-64
linux 5.18.9.arch1-1
xf86-video-amdgpu 22.0.0-1
mesa 22.1.3-1
rocm-llvm 5.2.0-1
hip-runtime-amd 5.2.0-3
blender 3.2.0-4
Partial log with the problem (see attachment for full log):
Jul 04 01:01:55 sanka kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]]
*ERROR* Waiting for fences timed out!
Jul 04 01:01:55 sanka kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=6213, emitted seq=6215
Jul 04 01:01:55 sanka kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process blender pid 2776 thread blender:cs0 pid 2798
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: amdgpu: GPU reset begin!
Jul 04 01:01:55 sanka kernel: amdgpu: Failed to suspend process 0x800c
Jul 04 01:01:55 sanka /usr/lib/gdm-x-session[1604]: [2022-07-04 01:01:55.072]
[1649] (device_info_linux.cc:45): NumberOfDevices
Jul 04 01:01:55 sanka /usr/lib/gdm-x-session[1604]: [2022-07-04 01:01:55.189]
[1649] (device_info_linux.cc:45): NumberOfDevices
Jul 04 01:01:55 sanka /usr/lib/gdm-x-session[1604]: [2022-07-04 01:01:55.189]
[1649] (device_info_linux.cc:78): GetDeviceName
Jul 04 01:01:55 sanka kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]]
*ERROR* Waiting for fences timed out!
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: [drm:amdgpu_ring_test_helper
[amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 04 01:01:55 sanka kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ
disable failed
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: [drm:amdgpu_ring_test_helper
[amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 04 01:01:55 sanka kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ
disable failed
Jul 04 01:01:55 sanka kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed
to halt cp gfx
Jul 04 01:01:55 sanka kernel: [drm] free PSP TMR buffer
Jul 04 01:01:55 sanka kernel: CPU: 5 PID: 158 Comm: kworker/u64:7 Tainted: G
OE 5.18.9-arch1-1 #1 137f0035b2ece06cb65382579db27e9de66af504
Jul 04 01:01:55 sanka kernel: Hardware name: Micro-Star International Co., Ltd.
MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.F1 05/24/2022
Jul 04 01:01:55 sanka kernel: Workqueue: amdgpu-reset-dev
drm_sched_job_timedout [gpu_sched]
Jul 04 01:01:55 sanka kernel: Call Trace:
Jul 04 01:01:55 sanka kernel: <TASK>
Jul 04 01:01:55 sanka kernel: dump_stack_lvl+0x48/0x5d
Jul 04 01:01:55 sanka kernel: amdgpu_do_asic_reset+0x2a/0x470 [amdgpu
c3399060640045ce33894f35f697ceceab8d3be0]
Jul 04 01:01:55 sanka kernel: amdgpu_device_gpu_recover_imp.cold+0x537/0x8cc
[amdgpu c3399060640045ce33894f35f697ceceab8d3be0]
Jul 04 01:01:55 sanka kernel: amdgpu_job_timedout+0x18c/0x1c0 [amdgpu
c3399060640045ce33894f35f697ceceab8d3be0]
Jul 04 01:01:55 sanka kernel: drm_sched_job_timedout+0x76/0x100 [gpu_sched
b54a976254cd79f6332eedc913d0037b3c33b883]
Jul 04 01:01:55 sanka kernel: process_one_work+0x1c7/0x380
Jul 04 01:01:55 sanka kernel: worker_thread+0x51/0x380
Jul 04 01:01:55 sanka kernel: ? rescuer_thread+0x3a0/0x3a0
Jul 04 01:01:55 sanka kernel: kthread+0xde/0x110
Jul 04 01:01:55 sanka kernel: ? kthread_complete_and_exit+0x20/0x20
Jul 04 01:01:55 sanka kernel: ret_from_fork+0x22/0x30
Jul 04 01:01:55 sanka kernel: </TASK>
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: amdgpu: MODE1 reset
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: amdgpu: GPU mode1 reset
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: amdgpu: GPU smu mode1 reset
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: GPU reset succeeded,
trying to resume
Jul 04 01:01:56 sanka kernel: [drm] PCIE GART of 512M enabled (table at
0x0000008000300000).
Jul 04 01:01:56 sanka kernel: [drm] VRAM is lost due to GPU reset!
Jul 04 01:01:56 sanka kernel: [drm] PSP is resuming...
Jul 04 01:01:56 sanka kernel: [drm] reserve 0xa00000 from 0x81fe000000 for PSP
TMR
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: RAS: optional ras ta
ucode is not available
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: SECUREDISPLAY:
securedisplay ta ucode is not available
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: SMU is resuming...
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: smu driver if
version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0,
version = 0x003b2900 (59.41.0)
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: SMU driver if
version not matched
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: SMU is resumed
successfully!
Jul 04 01:01:56 sanka kernel: [drm] DMUB hardware initialized:
version=0x0202000F
Jul 04 01:01:56 sanka kernel: [drm] kiq ring mec 2 pipe 1 q 0
Jul 04 01:01:56 sanka kernel: [drm] VCN decode and encode initialized
successfully(under DPG Mode).
Jul 04 01:01:56 sanka kernel: [drm] JPEG decode initialized successfully.
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring gfx_0.0.0 uses
VM inv eng 0 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.0.0 uses
VM inv eng 1 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.1.0 uses
VM inv eng 4 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.2.0 uses
VM inv eng 5 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.3.0 uses
VM inv eng 6 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.0.1 uses
VM inv eng 7 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.1.1 uses
VM inv eng 8 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.2.1 uses
VM inv eng 9 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring comp_1.3.1 uses
VM inv eng 10 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring kiq_2.1.0 uses
VM inv eng 11 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring sdma0 uses VM
inv eng 12 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring sdma1 uses VM
inv eng 13 on hub 0
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring vcn_dec_0 uses
VM inv eng 0 on hub 1
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring vcn_enc_0.0
uses VM inv eng 1 on hub 1
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring vcn_enc_0.1
uses VM inv eng 4 on hub 1
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: ring jpeg_dec uses
VM inv eng 5 on hub 1
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: recover vram bo from
shadow start
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: recover vram bo from
shadow done
Jul 04 01:01:56 sanka kernel: [drm] Skip scheduling IBs!
Jul 04 01:01:56 sanka kernel: [drm] Skip scheduling IBs!
Jul 04 01:01:56 sanka kernel: amdgpu 0000:29:00.0: amdgpu: GPU reset(2)
succeeded!
Jul 04 01:01:56 sanka kernel: [drm] Skip scheduling IBs!
Jul 04 01:01:56 sanka kernel: [drm] Skip scheduling IBs!
Jul 04 01:01:56 sanka kernel: [drm] Skip scheduling IBs!
...repeated lines omitted...
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[2747]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka /usr/lib/gdm-x-session[1000]: amdgpu: The CS has been
cancelled because the context is lost.
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
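For triage, the key numbers in the log above can be pulled out mechanically. The following is only a minimal sketch, not part of the original report: it assumes the journal lines have been rejoined onto single lines (as they appear in the attached full journal rather than in this wrapped email), and the helper names ring_timeouts and errno_names are hypothetical. The two codes it decodes are standard Linux errno values: 110 is ETIMEDOUT and 125 is ECANCELED. The difference between the emitted and signaled sequence numbers is the count of fences emitted on the ring but not yet signaled at the time of the timeout.

```python
import re

# Linux errno values seen in this log (asm-generic/errno.h).
# Hard-coded so the sketch does not depend on the host platform.
LINUX_ERRNO = {110: "ETIMEDOUT", 125: "ECANCELED"}

# Sample lines copied from the report, rejoined onto single lines.
SAMPLE = """\
Jul 04 01:01:55 sanka kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=6213, emitted seq=6215
Jul 04 01:01:55 sanka kernel: amdgpu 0000:29:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 04 01:01:56 sanka kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
"""

TIMEOUT_RE = re.compile(
    r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)"
)
# Matches only the two message shapes that appear in this report.
ERRNO_RE = re.compile(r"(?:test failed \(|initialize parser )-(\d+)")


def ring_timeouts(text):
    """Return (ring, signaled, emitted, pending) for each ring timeout line."""
    results = []
    for m in TIMEOUT_RE.finditer(text):
        signaled, emitted = int(m.group(2)), int(m.group(3))
        results.append((m.group(1), signaled, emitted, emitted - signaled))
    return results


def errno_names(text):
    """Return sorted, de-duplicated (code, name) pairs for errno values."""
    codes = sorted({int(m.group(1)) for m in ERRNO_RE.finditer(text)})
    return [(code, LINUX_ERRNO.get(code, "unknown")) for code in codes]


if __name__ == "__main__":
    for ring, signaled, emitted, pending in ring_timeouts(SAMPLE):
        print(f"{ring}: signaled={signaled} emitted={emitted} pending={pending}")
    for code, name in errno_names(SAMPLE):
        print(f"-{code} = {name}")
```

On the sample lines this reports two pending fences on gfx_0.0.0, which fits a graphics job that never completed, and it identifies the KIQ ring test failures as timeouts and the later parser failures as submissions cancelled after the reset.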