<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Amdgpu randomly hangs and only ssh works. Mouse cursor moves sometimes but does nothing. Keyboard stops working."
href="https://bugs.freedesktop.org/show_bug.cgi?id=105733#c29">Comment # 29</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Amdgpu randomly hangs and only ssh works. Mouse cursor moves sometimes but does nothing. Keyboard stops working."
href="https://bugs.freedesktop.org/show_bug.cgi?id=105733">bug 105733</a>
from <span class="vcard"><a class="email" href="mailto:andrey.grodzovsky@amd.com" title="Andrey Grodzovsky <andrey.grodzovsky@amd.com>"> <span class="fn">Andrey Grodzovsky</span></a>
</span></b>
<pre>(In reply to Jan Jurzitza from <a href="show_bug.cgi?id=105733#c28">comment #28</a>)
<span class="quote">> (In reply to Andrey Grodzovsky from <a href="show_bug.cgi?id=105733#c25">comment #25</a>)
>
> Still same issue happening here on both projects built from git. One issue
> here which doesn't seem completely related:
> Aug 23 20:41:20 archlinux kernel: ------------[ cut here ]------------
> Aug 23 20:41:20 archlinux kernel: CPU update of VM recommended only for
> large BAR system
> Aug 23 20:41:20 archlinux kernel: WARNING: CPU: 5 PID: 1092 at
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2606 amdgpu_vm_init+0x477/0x490
> [amdgpu]
> Aug 23 20:41:20 archlinux kernel: Modules linked in: bnep nct6775 hwmon_vid
> joydev btusb btrtl btbcm btintel bluetooth snd_usb_audio snd_usbmidi_lib
> snd_rawmidi input_leds snd_seq_device ecdh_generic mousedev nls_iso8859_1
> nls_cp437 vfat fat btrfs zstd_compress libcrc32c zstd_decompress xxhash xor
> arc4 amdkfd amd_iommu_v2 amdgpu iwlmvm mac80211 edac_mce_amd led_class
> kvm_amd iwlwifi snd_hda_codec_realtek chash gpu_sched kvm snd_hda_codec_hdmi
> snd_hda_codec_generic ttm snd_hda_intel drm_kms_helper irqbypass
> snd_hda_codec cfg80211 morus1280_avx2 drm morus1280_sse2 morus1280_glue
> morus640_sse2 morus640_glue snd_hda_core aegis256_aesni aegis128l_aesni
> aegis128_aesni igb snd_hwdep crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel snd_pcm pcbc snd_timer agpgart evdev ccp sp5100_tco
> aesni_intel snd syscopyarea i2c_algo_bit sysfillrect
> Aug 23 20:41:20 archlinux kernel: aes_x86_64 wmi_bmof mac_hid crypto_simd
> sysimgblt raid6_pq cryptd glue_helper fb_sys_fops soundcore k10temp
> i2c_piix4 dca rfkill rng_core wmi button acpi_cpufreq sch_fq_codel
> vboxnetflt(O) vboxnetadp(O) pci_stub vboxpci(O) vboxdrv(O) sg crypto_user
> ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto sr_mod
> cdrom sd_mod uas usb_storage hid_uclogic hid_generic usbhid hid ahci libahci
> xhci_pci libata crc32c_intel xhci_hcd usbcore scsi_mod usb_common
> Aug 23 20:41:20 archlinux kernel: CPU: 5 PID: 1092 Comm: Xorg.wrap Tainted:
> G O 4.18.0-rc1-5024f8dfe478 #1
> Aug 23 20:41:20 archlinux kernel: Hardware name: To Be Filled By O.E.M. To
> Be Filled By O.E.M./X370 Gaming-ITX/ac, BIOS P3.40 11/07/2017
> Aug 23 20:41:20 archlinux kernel: RIP: 0010:amdgpu_vm_init+0x477/0x490
> [amdgpu]
> Aug 23 20:41:20 archlinux kernel: Code: b8 08 d8 ff ff e8 79 89 7c e8 e9 ee
> fe ff ff 41 89 ef e9 e6 fe ff ff 48 c7 c7 08 65 f0 c0 c6 05 41 af 2b 00 01
> e8 a3 8f 37 e8 <0f> 0b 0f b6 8b 60 01 00 00 e9 b4 fc ff ff e8 26 8d 37 e8 66
> 0f 1f
> Aug 23 20:41:20 archlinux kernel: RSP: 0018:ffffacc2c8df7b60 EFLAGS: 00010286
> Aug 23 20:41:20 archlinux kernel: RAX: 0000000000000000 RBX:
> ffff9b10f7bf9000 RCX: 0000000000000006
> Aug 23 20:41:20 archlinux kernel: RDX: 0000000000000007 RSI:
> 0000000000000002 RDI: ffff9b10fe7564d0
> Aug 23 20:41:20 archlinux kernel: RBP: ffff9b10f5640000 R08:
> 0000001856da5330 R09: 0000000000000036
> Aug 23 20:41:20 archlinux kernel: R10: 0000000000000424 R11:
> 000000000006ad48 R12: ffff9b10f7bf90b8
> Aug 23 20:41:20 archlinux kernel: R13: 000000000000000a R14:
> 0000000000000000 R15: 0000000000000000
> Aug 23 20:41:20 archlinux kernel: FS: 00007fcf6cc95500(0000)
> GS:ffff9b10fe740000(0000) knlGS:0000000000000000
> Aug 23 20:41:20 archlinux kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> Aug 23 20:41:20 archlinux kernel: CR2: 00007fcf6cb1d960 CR3:
> 00000007e1190000 CR4: 00000000003406e0
> Aug 23 20:41:20 archlinux kernel: Call Trace:
> Aug 23 20:41:20 archlinux kernel: ? ida_simple_get+0x91/0xf0
> Aug 23 20:41:20 archlinux kernel: amdgpu_driver_open_kms+0x83/0x1d0 [amdgpu]
> Aug 23 20:41:20 archlinux kernel: drm_open+0x20b/0x440 [drm]
> Aug 23 20:41:20 archlinux kernel: drm_stub_open+0xaf/0xf0 [drm]
> Aug 23 20:41:20 archlinux kernel: chrdev_open+0xa3/0x1b0
> Aug 23 20:41:20 archlinux kernel: ? cdev_put.part.3+0x20/0x20
> Aug 23 20:41:20 archlinux kernel: do_dentry_open+0x1ab/0x2d0
> Aug 23 20:41:20 archlinux kernel: path_openat+0x31b/0x1440
> Aug 23 20:41:20 archlinux kernel: ? alloc_set_pte+0x1fd/0x4e0
> Aug 23 20:41:20 archlinux kernel: do_filp_open+0x93/0x100
> Aug 23 20:41:20 archlinux kernel: ? __check_object_size+0x9c/0x171
> Aug 23 20:41:20 archlinux kernel: do_sys_open+0x186/0x210
> Aug 23 20:41:20 archlinux kernel: do_syscall_64+0x4e/0x100
> Aug 23 20:41:20 archlinux kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Aug 23 20:41:20 archlinux kernel: RIP: 0033:0x7fcf6cbbc452
> Aug 23 20:41:20 archlinux kernel: Code: 25 00 00 41 00 3d 00 00 41 00 74 4c
> 48 8d 05 f5 70 0d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c
> ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33
> 0c 25
> Aug 23 20:41:20 archlinux kernel: RSP: 002b:00007ffe9a15b0a0 EFLAGS:
> 00000246 ORIG_RAX: 0000000000000101
> Aug 23 20:41:20 archlinux kernel: RAX: ffffffffffffffda RBX:
> 0000000000000000 RCX: 00007fcf6cbbc452
> Aug 23 20:41:20 archlinux kernel: RDX: 0000000000000002 RSI:
> 00007ffe9a15b180 RDI: 00000000ffffff9c
> Aug 23 20:41:20 archlinux kernel: RBP: 00007ffe9a15b130 R08:
> 0000000000000000 R09: 0000000000000000
> Aug 23 20:41:20 archlinux kernel: R10: 0000000000000000 R11:
> 0000000000000246 R12: 00007ffe9a15b180
> Aug 23 20:41:20 archlinux kernel: R13: 0000000000000000 R14:
> 0000000000000000 R15: 0000000000000000
> Aug 23 20:41:20 archlinux kernel: ---[ end trace eb5bc55fd8b7f883 ]---
>
> </span >
This is just a warning. It means the CPU is being used to update the GPU page
tables. Is there any reason for that? Try passing amdgpu.vm_update_mode=0 on
the kernel command line instead.
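To confirm which value is actually in effect after rebooting, something like
the sketch below may help. It assumes a POSIX shell and that the running
kernel exposes the parameter read-only under /sys/module/amdgpu/parameters/;
the helper name vm_update_mode_from_cmdline is made up for illustration.

```shell
# Hypothetical helper: print the amdgpu.vm_update_mode value found in a
# kernel command line string (prints nothing if the option is absent).
vm_update_mode_from_cmdline() {
    for arg in $1; do
        case "$arg" in
            amdgpu.vm_update_mode=*) echo "${arg#amdgpu.vm_update_mode=}" ;;
        esac
    done
}

# Value requested on the command line of the running kernel, if any.
vm_update_mode_from_cmdline "$(cat /proc/cmdline)"

# Value the loaded module is using (assumed sysfs location).
param=/sys/module/amdgpu/parameters/vm_update_mode
if [ -r "$param" ]; then
    cat "$param"
fi
```

If the first command prints nothing, the option was not passed at boot. How to
add it depends on your bootloader.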
<span class="quote">> and then the issue OP posted too:
>
>
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: GPU fault detected:
> 147 0x00a60401 for process payday2_release pid 6643 thread amdgpu_cs:0 pid
> 6644
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x06ABF814
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x2B004001
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: VM fault (0x01, vmid
> 5, pasid 32776) at page 111933460, write from 'TC1' (0x54433100) (4)
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: GPU fault detected:
> 147 0x00a60401 for process payday2_release pid 6643 thread amdgpu_cs:0 pid
> 6644
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x06ABF814
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x2B004001
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: VM fault (0x01, vmid
> 5, pasid 32776) at page 111933460, write from 'TC1' (0x54433100) (4)
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: GPU fault detected:
> 147 0x00a60401 for process payday2_release pid 6643 thread amdgpu_cs:0 pid
> 6644
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x06ABF814
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0:
> VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x23004001
> Aug 23 19:40:06 archlinux kernel: amdgpu 0000:0d:00.0: VM fault (0x01, vmid
> 1, pasid 32776) at page 111933460, write from 'TC1' (0x54433100) (4)
> Aug 23 19:42:06 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring gfx timeout, signaled seq=519868, emitted seq=519871
> Aug 23 19:42:06 archlinux kernel: [drm] GPU recovery disabled.
>
>
> Happens on pretty much any application using Vulkan after some time or Core
> OpenGL applications too. Doesn't happen on normal desktop usage with Chrome.</span >
So is it specific to Vulkan only?
<span class="quote">>
> Happens on 4.18.3 and these traces are from 4.18.0-rc1-5024f8dfe478
> X370 chipset (like OP)
> RX 480 (same as OP)
> Ryzen 7 1700x
> Mesa 18.1.6
> xorg 1.20.1
> i3wm</span ></pre>
</div>
</p>
</body>
</html>