<div dir="ltr"><div>Hi,<br></div><div><div> </div><div>I have an intermittent deadlock/hang in the amdgpu driver. It seems to happen when I open a new tab in qutebrowser (v1.1.1) while I am doing other things, such as watching YouTube through mpv or playing Dota 2. How often it happens seems pretty arbitrary: sometimes once a week, sometimes multiple times a day. I have a Vega 64.</div><div> </div><div>What happens is that the screen freezes, but I can still hear sound and can SSH into the box. If I reboot it remotely, I get dropped back to a TTY and it tries to reboot, but it gets stuck on blocking processes (mpv etc.), so I have to reset it manually.
</div><div>Repro steps:</div><div> </div><div>* Run qutebrowser</div><div>* Do a bunch of other stuff (videos, games, etc.)</div><div>* Switch back to qutebrowser, hit "Ctrl+t" and be "lucky"</div><div> </div><div>This seems to happen on all release candidates for 4.15 and on 4.15 itself:</div><div> </div><div>4.15:</div></div><div><div>[ 2211.463021] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.</div><div>[ 2211.463026]       Not tainted 4.15.0-ARCH+ #1</div><div>[ 2211.463028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</div><div>[ 2211.463030] amdgpu_cs:0     D    0  1053   1051 0x00000000</div><div>[ 2211.463032] Call Trace:</div><div>[ 2211.463040]  ? 
__schedule+0x297/0x8b0                                                                                                   </div><div>[ 2211.463043]  schedule+0x2f/0x90                                                                                                         </div><div>[ 2211.463045]  schedule_timeout+0x1fd/0x3a0                                                                                               </div><div>[ 2211.463085]  ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]                                                                                      </div><div>[ 2211.463088]  dma_fence_default_wait+0x1cc/0x270                                                                                         </div><div>[ 2211.463090]  ? dma_fence_release+0xa0/0xa0                                                                                              </div><div>[ 2211.463092]  dma_fence_wait_timeout+0x39/0x110                                                                                          </div><div>[ 2211.463119]  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]                                                                              </div><div>[ 2211.463145]  amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]                                                                                       </div><div>[ 2211.463149]  ? dequeue_entity+0xdc/0x460                                                                                                </div><div>[ 2211.463174]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[ 2211.463185]  drm_ioctl_kernel+0x5b/0xb0 [drm]                                                                                           </div><div>[ 2211.463194]  drm_ioctl+0x2ae/0x350 [drm]                                                                                                </div><div>[ 2211.463218]  ? 
amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[ 2211.463239]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]                                                                                        </div><div>[ 2211.463243]  do_vfs_ioctl+0xa4/0x630                                                                                                    </div><div>[ 2211.463246]  ? SyS_futex+0x12d/0x180                                                                                                    </div><div>[ 2211.463248]  SyS_ioctl+0x74/0x80                                                                                                        </div><div>[ 2211.463251]  entry_SYSCALL_64_fastpath+0x20/0x83                                                                                        </div><div>[ 2211.463254] RIP: 0033:0x7f21b27b6d87                                                                                                    </div><div>[ 2211.463255] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246                                                                                 </div><div>[ 2334.343027] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.                                                              </div><div>[ 2334.343032]       Not tainted 4.15.0-ARCH+ #1                                                                                           </div><div>[ 2334.343034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                   </div><div>[ 2334.343036] amdgpu_cs:0     D    0  1053   1051 0x00000000                                                                              </div><div>[ 2334.343039] Call Trace:                                                                                                                 </div><div>[ 2334.343046]  ? 
__schedule+0x297/0x8b0                                                                                                   </div><div>[ 2334.343049]  schedule+0x2f/0x90                                                                                                         </div><div>[ 2334.343051]  schedule_timeout+0x1fd/0x3a0                                                                                               </div><div>[ 2334.343091]  ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]                                                                                      </div><div>[ 2334.343095]  dma_fence_default_wait+0x1cc/0x270                                                                                         </div><div>[ 2334.343097]  ? dma_fence_release+0xa0/0xa0                                                                                              </div><div>[ 2334.343098]  dma_fence_wait_timeout+0x39/0x110                                                                                          </div><div>[ 2334.343125]  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]                                                                              </div><div>[ 2334.343151]  amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]                                                                                       </div><div>[ 2334.343155]  ? dequeue_entity+0xdc/0x460                                                                                                </div><div>[ 2334.343181]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[ 2334.343191]  drm_ioctl_kernel+0x5b/0xb0 [drm]                                                                                           </div><div>[ 2334.343200]  drm_ioctl+0x2ae/0x350 [drm]                                                                                                </div><div>[ 2334.343224]  ? 
amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[ 2334.343245]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]                                                                                        </div><div>[ 2334.343249]  do_vfs_ioctl+0xa4/0x630                                                                                                    </div><div>[ 2334.343252]  ? SyS_futex+0x12d/0x180                                                                                                    </div><div>[ 2334.343254]  SyS_ioctl+0x74/0x80</div><div>[ 2334.343257]  entry_SYSCALL_64_fastpath+0x20/0x83</div><div>[ 2334.343259] RIP: 0033:0x7f21b27b6d87                                                                                                    </div><div>[ 2334.343260] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246</div><div>[ 2457.222859] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.</div><div>[ 2457.222862]       Not tainted 4.15.0-ARCH+ #1</div><div>[ 2457.222863] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</div><div>[ 2457.222864] amdgpu_cs:0     D    0  1053   1051 0x00000000</div></div><div><div>[ 2457.222866] Call Trace:</div><div>[ 2457.222872]  ? __schedule+0x297/0x8b0</div><div>[ 2457.222873]  schedule+0x2f/0x90</div><div>[ 2457.222875]  schedule_timeout+0x1fd/0x3a0</div><div>[ 2457.222900]  ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]</div><div>[ 2457.222902]  dma_fence_default_wait+0x1cc/0x270</div><div>[ 2457.222903]  ? dma_fence_release+0xa0/0xa0</div><div>[ 2457.222904]  dma_fence_wait_timeout+0x39/0x110</div><div>[ 2457.222918]  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]</div><div>[ 2457.222932]  amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]</div><div>[ 2457.222935]  ? dequeue_entity+0xdc/0x460</div><div>[ 2457.222948]  ? 
amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]</div><div>[ 2457.222955]  drm_ioctl_kernel+0x5b/0xb0 [drm]</div><div>[ 2457.222960]  drm_ioctl+0x2ae/0x350 [drm]</div><div>[ 2457.222972]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]</div><div>[ 2457.222983]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]</div><div>[ 2457.222986]  do_vfs_ioctl+0xa4/0x630</div><div>[ 2457.222989]  ? SyS_futex+0x12d/0x180</div><div>[ 2457.222989]  SyS_ioctl+0x74/0x80</div><div>[ 2457.222991]  entry_SYSCALL_64_fastpath+0x20/0x83</div><div>[ 2457.222993] RIP: 0033:0x7f21b27b6d87</div><div>[ 2457.222993] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246</div><div>[ 2580.102828] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.</div><div>[ 2580.102831]       Not tainted 4.15.0-ARCH+ #1</div><div>[ 2580.102832] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</div><div>[ 2580.102833] amdgpu_cs:0     D    0  1053   1051 0x00000000</div><div>[ 2580.102835] Call Trace:</div><div>[ 2580.102841]  ? __schedule+0x297/0x8b0</div><div>[ 2580.102842]  schedule+0x2f/0x90</div><div>[ 2580.102843]  schedule_timeout+0x1fd/0x3a0</div><div>[ 2580.102868]  ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]</div><div>[ 2580.102871]  dma_fence_default_wait+0x1cc/0x270</div><div>[ 2580.102872]  ? dma_fence_release+0xa0/0xa0</div><div>[ 2580.102873]  dma_fence_wait_timeout+0x39/0x110</div><div>[ 2580.102887]  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]</div><div>[ 2580.102900]  amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]</div><div>[ 2580.102903]  ? dequeue_entity+0xdc/0x460</div><div>[ 2580.102916]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]</div><div>[ 2580.102923]  drm_ioctl_kernel+0x5b/0xb0 [drm]</div><div>[ 2580.102928]  drm_ioctl+0x2ae/0x350 [drm]</div><div>[ 2580.102940]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]</div><div>[ 2580.102951]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]</div><div>[ 2580.102953]  do_vfs_ioctl+0xa4/0x630</div><div>[ 2580.102956]  ? 
SyS_futex+0x12d/0x180</div><div>[ 2580.102957]  SyS_ioctl+0x74/0x80</div><div>[ 2580.102958]  entry_SYSCALL_64_fastpath+0x20/0x83</div><div>[ 2580.102960] RIP: 0033:0x7f21b27b6d87</div></div><div><br></div><div><div>4.15rc9:                                                                                                                                   </div><div>[11181.701121] INFO: task amdgpu_cs:0:828 blocked for more than 120 seconds.                                                               </div><div>[11181.701126]       Not tainted 4.15.0-rc9-ga8750ddca918+ #3                                                                              </div><div>[11181.701127] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                   </div><div>[11181.701129] amdgpu_cs:0     D    0   828    826 0x00000000                                                                              </div><div>[11181.701132] Call Trace:                                                                                                                 </div><div>[11181.701140]  ? __schedule+0x293/0x8a0                                                                                                   </div><div>[11181.701143]  schedule+0x2f/0x90                                                                                                         </div><div>[11181.701145]  schedule_timeout+0x1fa/0x3a0                                                                                               </div><div>[11181.701147]  ? _raw_spin_unlock+0xa/0x20                                                                                                </div><div>[11181.701180]  ? 
amdgpu_vm_update_directories+0x460/0x5e0 [amdgpu]                                                                        </div><div>[11181.701184]  dma_fence_default_wait+0x1cc/0x270                                                                                         </div><div>[11181.701187]  ? dma_fence_release+0xa0/0xa0                                                                                              </div><div>[11181.701189]  dma_fence_wait_timeout+0x33/0x100                                                                                          </div><div>[11181.701220]  amdgpu_ctx_wait_prev_fence+0x47/0x80 [amdgpu]                                                                              </div><div>[11181.701249]  amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]                                                                                       </div><div>[11181.701252]  ? dequeue_entity+0xd9/0x450                                                                                                </div><div>[11181.701282]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[11181.701293]  drm_ioctl_kernel+0x59/0xb0 [drm]                                                                                           </div><div>[11181.701302]  drm_ioctl+0x2d5/0x370 [drm]                                                                                                </div><div>[11181.701330]  ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]                                                                                </div><div>[11181.701355]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]                                                                                        </div><div>[11181.701358]  do_vfs_ioctl+0xa1/0x620                                                                                                    </div><div>[11181.701361]  ? 
SyS_futex+0x12d/0x180                                                                                                    </div><div>[11181.701363]  SyS_ioctl+0x74/0x80                                                                                                        </div><div>[11181.701365]  entry_SYSCALL_64_fastpath+0x20/0x83                                                                                        </div><div>[11181.701367] RIP: 0033:0x7feb78366d27</div></div><div><br></div><div><br></div><div>I also get this error when I boot:<br></div><div><br></div><div>amdgpu 0000:43:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff<br></div><div><br></div><div>Am I "supposed" to have that?</div><div><br></div><div>Regards,</div><div>Daniel</div></div>