[Bug 112304] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout causes system freeze

bugzilla-daemon at freedesktop.org
Sun Nov 17 06:37:58 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=112304

--- Comment #10 from saadnaji89 at gmail.com ---
Comment on attachment 145981
  --> https://bugs.freedesktop.org/attachment.cgi?id=145981
additional-journalctl-logs-during-game-play


>Nov 17 01:02:31 archlinux audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
>Nov 17 01:02:31 archlinux kernel: audit: type=1131 audit(1573970551.160:173): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0: GPU fault detected: 146 0x066e480c
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00100DB3
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E04800C
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0: VM fault (0x0c, vmid 7) at page 1052083, read from '' (0x00000000) (72)
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0: GPU fault detected: 146 0x06ae880c
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0F008010
>Nov 17 01:02:36 archlinux kernel: amdgpu 0000:01:00.0: VM fault (0x10, vmid 7) at page 0, write from '' (0x00000000) (8)
>Nov 17 01:02:45 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=23, emitted seq=24
>Nov 17 01:02:45 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process hl2_linux pid 2252 thread hl2_linux:cs0 pid 2254
>Nov 17 01:02:45 archlinux kernel: amdgpu 0000:01:00.0: GPU reset begin!
>Nov 17 01:02:45 archlinux kernel: amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
>Nov 17 01:02:45 archlinux kernel: [drm] PCIE gen 3 link speeds already enabled
>Nov 17 01:02:45 archlinux kernel: amdgpu 0000:01:00.0: PCIE GART of 1024M enabled (table at 0x000000F400000000).
>Nov 17 01:02:46 archlinux kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
>Nov 17 01:02:46 archlinux kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v6_0> failed -110
>Nov 17 01:02:46 archlinux kernel: [drm:si_dpm_set_power_state [amdgpu]] *ERROR* si_restrict_performance_levels_before_switch failed
>Nov 17 01:02:46 archlinux kernel: amdgpu 0000:01:00.0: GPU reset(1) failed
>Nov 17 01:02:46 archlinux kernel: amdgpu 0000:01:00.0: GPU reset end with ret = -110
>Nov 17 01:02:49 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=24, emitted seq=24
>Nov 17 01:02:49 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process hl2_linux pid 2252 thread hl2_linux:cs0 pid 2254
>Nov 17 01:02:49 archlinux kernel: amdgpu 0000:01:00.0: GPU reset begin!
>Nov 17 01:03:49 archlinux kernel: [drm] schedsdma0 is not ready, skipping
>Nov 17 01:03:49 archlinux kernel: [drm] schedsdma1 is not ready, skipping
>Nov 17 01:03:49 archlinux kernel: amdgpu 0000:01:00.0: failed to clear page tables on GEM object close (-2)
>Nov 17 01:03:49 archlinux kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
>Nov 17 01:03:49 archlinux kernel: #PF: supervisor read access in kernel mode
>Nov 17 01:03:49 archlinux kernel: #PF: error_code(0x0000) - not-present page
>Nov 17 01:03:49 archlinux kernel: PGD 0 P4D 0 
>Nov 17 01:03:49 archlinux kernel: Oops: 0000 [#1] SMP PTI
>Nov 17 01:03:49 archlinux kernel: CPU: 1 PID: 2262 Comm: hl2_linu:shlo0 Not tainted 5.3.11-2-clear #1
>Nov 17 01:03:49 archlinux kernel: Hardware name: CLEVO                             P150EM/P150EM, BIOS 1.02.17PM v2 07/01/2013
>Nov 17 01:03:49 archlinux kernel: RIP: 0010:amdgpu_vm_sdma_commit+0x34/0x100 [amdgpu]
>Nov 17 01:03:49 archlinux kernel: Code: 49 89 f5 41 54 53 48 89 fb 48 83 ec 10 48 8b 47 08 48 8b 57 18 4c 8b b0 80 00 00 00 4c 8b a2 88 01 00 00 48 8b 80 c8 00 00 00 <4c> 8b 78 08 41 8b 44 24 08 4d 8d 47 88 85 c0 0f 84 49 ae 1e 00 49
>Nov 17 01:03:49 archlinux kernel: RSP: 0018:ffffb9290250b9a0 EFLAGS: 00010286
>Nov 17 01:03:49 archlinux kernel: RAX: 0000000000000000 RBX: ffffb9290250b9e8 RCX: 0000000000100400
>Nov 17 01:03:49 archlinux kernel: RDX: ffff9d2c19ba1c00 RSI: ffffb9290250ba60 RDI: ffffb9290250b9e8
>Nov 17 01:03:49 archlinux kernel: RBP: ffffb9290250b9d8 R08: 0000000000001000 R09: 0000000000200000
>Nov 17 01:03:49 archlinux kernel: R10: ffffb929004c5600 R11: 0000000000000012 R12: ffff9d2c19ba1df8
>Nov 17 01:03:49 archlinux kernel: R13: ffffb9290250ba60 R14: ffff9d2b3a472000 R15: ffff9d2bbc2f12a0
>Nov 17 01:03:49 archlinux kernel: FS:  0000000000000000(0000) GS:ffff9d2c1f040000(0000) knlGS:0000000000000000
>Nov 17 01:03:49 archlinux kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
>Nov 17 01:03:49 archlinux kernel: CR2: 0000000000000008 CR3: 0000000306a0a003 CR4: 00000000001606e0
>Nov 17 01:03:49 archlinux kernel: Call Trace:
>Nov 17 01:03:49 archlinux kernel:  amdgpu_vm_bo_update_mapping+0x9e/0xb0 [amdgpu]
>Nov 17 01:03:49 archlinux kernel:  amdgpu_vm_clear_freed+0xb5/0x170 [amdgpu]
>Nov 17 01:03:49 archlinux kernel:  amdgpu_gem_object_close+0x127/0x170 [amdgpu]
>Nov 17 01:03:49 archlinux kernel:  drm_gem_object_release_handle+0x81/0xc0
>Nov 17 01:03:49 archlinux kernel:  ? drm_gem_object_handle_put_unlocked+0xa0/0xa0
>Nov 17 01:03:49 archlinux kernel:  idr_for_each+0x51/0xc0
>Nov 17 01:03:49 archlinux kernel:  drm_gem_release+0x1c/0x30
>Nov 17 01:03:49 archlinux kernel:  drm_file_free.part.0+0x2b1/0x300
>Nov 17 01:03:49 archlinux kernel:  drm_close_helper.isra.0+0x6e/0x80
>Nov 17 01:03:49 archlinux kernel:  drm_release+0x4c/0x7e
>Nov 17 01:03:49 archlinux kernel:  __fput+0xbf/0x260
>Nov 17 01:03:49 archlinux kernel:  ____fput+0x9/0x10
>Nov 17 01:03:49 archlinux kernel:  task_work_run+0x8f/0xb0
>Nov 17 01:03:49 archlinux kernel:  do_exit+0x302/0x730
>Nov 17 01:03:49 archlinux kernel:  do_group_exit+0x36/0xa0
>Nov 17 01:03:49 archlinux kernel:  get_signal+0x15c/0x810
>Nov 17 01:03:49 archlinux kernel:  ? do_futex+0x121/0x540
>Nov 17 01:03:49 archlinux kernel:  do_signal+0x2f/0x260
>Nov 17 01:03:49 archlinux kernel:  ? __audit_syscall_entry+0xd6/0x120
>Nov 17 01:03:49 archlinux kernel:  exit_to_usermode_loop+0x98/0xc0
>Nov 17 01:03:49 archlinux kernel:  do_fast_syscall_32+0x29d/0x350
>Nov 17 01:03:49 archlinux kernel:  ? do_int80_syscall_32+0x195/0x1f0
>Nov 17 01:03:49 archlinux kernel:  entry_SYSENTER_compat+0x7c/0x8e
>Nov 17 01:03:49 archlinux kernel: RIP: 0023:0xf7f79949
>Nov 17 01:03:49 archlinux kernel: Code: Bad RIP value.
>Nov 17 01:03:49 archlinux kernel: RSP: 002b:00000000e84fb190 EFLAGS: 00000282 ORIG_RAX: 00000000000000f0
>Nov 17 01:03:49 archlinux kernel: RAX: fffffffffffffe00 RBX: 000000000ab8415c RCX: 0000000000000080
>Nov 17 01:03:49 archlinux kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000ab84130
>Nov 17 01:03:49 archlinux kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>Nov 17 01:03:49 archlinux kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>Nov 17 01:03:49 archlinux kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>Nov 17 01:03:49 archlinux kernel: Modules linked in: xt_nat veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter overlay ath9k ath9k_common ath9k_hw snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek snd_hda_codec_generic ath mei_hdcp ledtrig_audio wmi_bmof snd_hda_intel uvcvideo snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_hda_core videodev snd_hwdep cfg80211 joydev mc snd_pcm snd_timer rtsx_pci_ms r8169 i2c_i801 psmouse snd mei_me memstick rfkill soundcore realtek mei lpc_ich libphy thermal wmi battery ac ip_tables hid_logitech_hidpp atkbd libps2 serio_raw i8042 amdgpu amd_iommu_v2 hid_logitech_dj gpu_sched
>Nov 17 01:03:49 archlinux kernel: CR2: 0000000000000008
>Nov 17 01:03:49 archlinux kernel: ---[ end trace c63f21dbba2ef5cd ]---
>Nov 17 01:03:49 archlinux kernel: RIP: 0010:amdgpu_vm_sdma_commit+0x34/0x100 [amdgpu]
>Nov 17 01:03:49 archlinux kernel: Code: 49 89 f5 41 54 53 48 89 fb 48 83 ec 10 48 8b 47 08 48 8b 57 18 4c 8b b0 80 00 00 00 4c 8b a2 88 01 00 00 48 8b 80 c8 00 00 00 <4c> 8b 78 08 41 8b 44 24 08 4d 8d 47 88 85 c0 0f 84 49 ae 1e 00 49
>Nov 17 01:03:49 archlinux kernel: RSP: 0018:ffffb9290250b9a0 EFLAGS: 00010286
>Nov 17 01:03:49 archlinux kernel: RAX: 0000000000000000 RBX: ffffb9290250b9e8 RCX: 0000000000100400
>Nov 17 01:03:49 archlinux kernel: RDX: ffff9d2c19ba1c00 RSI: ffffb9290250ba60 RDI: ffffb9290250b9e8
>Nov 17 01:03:49 archlinux kernel: RBP: ffffb9290250b9d8 R08: 0000000000001000 R09: 0000000000200000
>Nov 17 01:03:49 archlinux kernel: R10: ffffb929004c5600 R11: 0000000000000012 R12: ffff9d2c19ba1df8
>Nov 17 01:03:49 archlinux kernel: R13: ffffb9290250ba60 R14: ffff9d2b3a472000 R15: ffff9d2bbc2f12a0
>Nov 17 01:03:49 archlinux kernel: FS:  0000000000000000(0000) GS:ffff9d2c1f040000(0000) knlGS:0000000000000000
>Nov 17 01:03:49 archlinux kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
>Nov 17 01:03:49 archlinux kernel: CR2: 00000000f7f7991f CR3: 0000000366c06002 CR4: 00000000001606e0
>Nov 17 01:03:49 archlinux kernel: Fixing recursive fault but reboot is needed!
