Spontaneous reboots when using RX 560

Sylvain Munaut 246tnt at gmail.com
Fri Oct 18 14:35:17 UTC 2019


Hi Alex,


> Does disabling the IOMMU help?  E.g., append IOMMU=off or IOMMU=pt on
> the kernel command line in grub.

Good suggestion, I should have tried that earlier. Unfortunately it
doesn't change anything :/

I tried both independently, and also combined with pci=noats and the
amdgpu cg/pg masks set to 0. Same behavior.
The actual messages in dmesg vary slightly, but it's the same idea.
Here are the two most distinct ones:

[  122.525452] gmc_v8_0_process_interrupt: 14 callbacks suppressed
[  122.525456] amdgpu 0000:06:00.0: GPU fault detected: 146 0x0140440c for process gnome-shell pid 2069 thread gnome-shel:cs0 pid 2084
[  122.525459] amdgpu 0000:06:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000028
[  122.525460] amdgpu 0000:06:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E04400C
[  122.525462] amdgpu 0000:06:00.0: VM fault (0x0c, vmid 7, pasid 32770) at page 40, read from 'TC1' (0x54433100) (68)
[  127.745969] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[  129.675374] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  129.675377] clocksource:                       'hpet' wd_now: 6ef9849c wd_last: 6dd444eb mask: ffffffff
[  129.675377] clocksource:                       'tsc' cs_now: 10c4a3c5c88 cs_last: 10b779feadc mask: ffffffffffffffff
[  129.675378] tsc: Marking TSC unstable due to clocksource watchdog
[  130.480703] igb 0000:07:00.0 enp7s0: PCIe link lost

The above was with "iommu=off pci=noats amdgpu.cg_mask=0 amdgpu.pg_mask=0"
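
Since kernel parameter names are case-sensitive, it can be worth
confirming that the options really reached the running kernel. Below is
a minimal sketch in Python, assuming a Linux system that exposes the
boot command line in /proc/cmdline; the helper names parse_cmdline and
report are made up for illustration and are not part of any tool.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags such as "quiet" map to None.
    """
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def report(path: str = "/proc/cmdline") -> None:
    """Print the parameters discussed in this thread as the kernel saw them."""
    with open(path) as f:
        params = parse_cmdline(f.read())
    for name in ("iommu", "pci", "amdgpu.cg_mask", "amdgpu.pg_mask"):
        print(f"{name} = {params.get(name, '<not set>')}")
```

Calling report() after a reboot prints one line per parameter; a
parameter that shows up as "<not set>" never made it onto the command
line.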

I also saw this stack trace with iommu=pt:

[   89.211541] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[   89.463287] invalid opcode: 0000 [#1] SMP NOPTI
[   89.463292] CPU: 1 PID: 1647 Comm: InputThread Tainted: P OE     5.3.0-18-generic #19-Ubuntu
[   89.463294] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P1.70 09/10/2019
[   89.463383] RIP: 0010:amdgpu_dm_atomic_check+0x63c/0x6c0 [amdgpu]
[   89.463385] Code: 8d 78 f0 49 39 c5 0f 85 0a fb ff ff e9 5c fb ff
ff 41 89 c4 e9 ec fe ff ff 41 89 c4 e9 14 f0 9e 00 44 85 ff 0f 85 b8
fd ff ce <d4> b0 1c 10 02 24 84 00 ff ff ff 00 8b 90 c0 48 89 b0 e8 7d
eb 19
[   89.463387] RSP: 0018:ffffa4d001adf9c0 EFLAGS: 00010246
[   89.463389] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   89.463391] RDX: 00000000000009f6 RSI: ffff8e45de670140 RDI: 0000000000030140
[   89.463392] RBP: ffffa4d001adfa20 R08: ffff8e45b78adc00 R09: 0000000000000000
[   89.463393] R10: ffff8e45d8270000 R11: ffff8e45da49d000 R12: 0000000000000000
[   89.463394] R13: ffff8e45d8270000 R14: ffff8e45cb4f2480 R15: 0000000000000000
[   89.463396] FS:  00007fbd627fc700(0000) GS:ffff8e45de640000(0000) knlGS:0000000000000000
[   89.463397] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.463399] CR2: 000055f405646450 CR3: 0000000811268000 CR4: 0000000000340ee0
[   89.463400] Call Trace:
[   89.463420]  drm_atomic_check_only+0x2d6/0x3d0 [drm]
[   89.463433]  drm_atomic_commit+0x18/0x50 [drm]
[   89.463443]  drm_atomic_helper_update_plane+0xea/0x100 [drm_kms_helper]
[   89.463457]  __setplane_atomic+0xcb/0x110 [drm]
[   89.463470]  drm_mode_cursor_universal+0x140/0x260 [drm]
[   89.463484]  drm_mode_cursor_common+0xcc/0x220 [drm]
[   89.463496]  ? drm_mode_setplane+0x2b0/0x2b0 [drm]
[   89.463507]  drm_mode_cursor_ioctl+0x4a/0x60 [drm]
[   89.463519]  drm_ioctl_kernel+0xae/0xf0 [drm]
[   89.463531]  drm_ioctl+0x234/0x3d0 [drm]
[   89.463542]  ? drm_mode_setplane+0x2b0/0x2b0 [drm]
[   89.463548]  ? _copy_to_user+0x2c/0x30
[   89.463551]  ? input_event_to_user+0x42/0xa0
[   89.463604]  amdgpu_drm_ioctl+0x4e/0x80 [amdgpu]
[   89.463608]  do_vfs_ioctl+0x407/0x670
[   89.463611]  ? __vfs_read+0x1b/0x40
[   89.463613]  ? vfs_read+0xab/0x160
[   89.463616]  ksys_ioctl+0x67/0x90
[   89.463619]  __x64_sys_ioctl+0x1a/0x20
[   89.463622]  do_syscall_64+0x5a/0x130
[   89.463625]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   89.463627] RIP: 0033:0x7fbe0550667b
[   89.463629] Code: 0f 1e fa 48 8b 05 15 28 0d 00 64 c7 00 26 00 00
00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e5 27 0d 00 f7 d8 64 89
01 48
[   89.463631] RSP: 002b:00007fbd627fa2d8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[   89.463632] RAX: ffffffffffffffda RBX: 00007fbd627fa310 RCX: 00007fbe0550667b
[   89.463634] RDX: 00007fbd627fa310 RSI: 00000000c01c64a3 RDI: 000000000000000d
[   89.463635] RBP: 00000000c01c64a3 R08: 000000000000002a R09: 0000000000000001
[   89.463636] R10: 0000000000000000 R11: 0000000000003246 R12: 000055d1b53bd790
[   89.463637] R13: 000000000000000d R14: 000000000000002e R15: 000000000000057a
[   89.463641] Modules linked in: edac_mce_amd kvm_amd binfmt_misc
nls_iso8859_1 kvm irqbypass nvidia_uvm(OE) snd_hda_codec_generic
ledtrig_audio crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul
nvidia_drm(POE) amdgpu nvidia_modeset(POE) snd_hda_intel snd_seq_midi
ghash_clmulni_intel nvidia(POE) aesni_intel snd_hda_codec
snd_seq_midi_event snd_hda_core aes_x86_64 snd_rawmidi amd_iommu_v2
crypto_simd gpu_sched cryptd joydev input_leds wmi_bmof snd_hwdep
snd_seq glue_helper ttm snd_pcm ucsi_ccg drm_kms_helper typec_ucsi
snd_seq_device typec drm ccp ipmi_devintf snd_timer ipmi_msghandler
snd fb_sys_fops syscopyarea sysfillrect sysimgblt soundcore mac_hid
sch_fq_codel nct6775 hwmon_vid parport_pc ppdev lp parport ip_tables
x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid
hid ixgbe i2c_piix4 igb nvme ahci i2c_nvidia_gpu libahci xfrm_algo
i2c_algo_bit nvme_core dca mdio wmi
[   89.463704] ---[ end trace 455cf9a155c384cb ]---

The "To Be Filled By O.E.M. To Be Filled By O.E.M./" really inspires
confidence ...


Cheers,

    Sylvain Munaut

