rcu_sched detected expedited stalls in amdgpu after suspend

Alex Xu (Hello71) alex_y_xu at yahoo.ca
Mon Jun 27 19:22:24 UTC 2022


Hi,

Since around Linux 5.19, I consistently get errors like the following 
when resuming from S3 (suspend-to-RAM):

[15652.909157] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 11-... } 7 jiffies s: 9981 root: 0x800/.
[15652.909162] rcu: blocking rcu_node structures (internal RCU debug):
[15652.909163] Task dump for CPU 11:
[15652.909164] task:kworker/u24:65  state:R  running task     stack:    0 pid:210218 ppid:     2 flags:0x00004008
[15652.909167] Workqueue: events_unbound async_run_entry_fn
[15652.909172] Call Trace:
[15652.909173]  <TASK>
[15652.909174]  ? atom_get_src_int+0x38e/0x680
[15652.909179]  ? atom_op_test+0x67/0x190
[15652.909181]  ? amdgpu_atom_execute_table_locked+0x19a/0x300
[15652.909184]  ? atom_op_calltable+0xb1/0x110
[15652.909186]  ? amdgpu_atom_execute_table_locked+0x19a/0x300
[15652.909189]  ? atom_op_calltable+0xb1/0x110
[15652.909191]  ? amdgpu_atom_execute_table_locked+0x19a/0x300
[15652.909193]  ? __switch_to+0x137/0x440
[15652.909195]  ? amdgpu_atom_asic_init+0xe0/0x100
[15652.909198]  ? pci_bus_read_config_dword+0x36/0x50
[15652.909201]  ? amdgpu_device_resume+0x10b/0x3e0
[15652.909203]  ? amdgpu_pmops_resume+0x32/0x60
[15652.909204]  ? pci_pm_suspend+0x2b0/0x2b0
[15652.909206]  ? dpm_run_callback+0x35/0x1f0
[15652.909209]  ? device_resume+0x1ca/0x220
[15652.909211]  ? async_resume+0x19/0xe0
[15652.909213]  ? async_run_entry_fn+0x33/0x120
[15652.909215]  ? process_one_work+0x1d6/0x350
[15652.909218]  ? worker_thread+0x24d/0x480
[15652.909220]  ? kthread+0x137/0x150
[15652.909221]  ? worker_clr_flags+0x40/0x40
[15652.909224]  ? kthread_blkcg+0x30/0x30
[15652.909226]  ? ret_from_fork+0x22/0x30
[15652.909227]  </TASK>
[15653.015808] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 11-... } 7 jiffies s: 9985 root: 0x800/.
[15653.015812] rcu: blocking rcu_node structures (internal RCU debug):
[15653.015813] Task dump for CPU 11:
[15653.015813] task:kworker/u24:65  state:R  running task     stack:    0 pid:210218 ppid:     2 flags:0x00004008
[15653.015816] Workqueue: events_unbound async_run_entry_fn
[15653.015820] Call Trace:
[15653.015820]  <TASK>
[15653.015821]  ? amdgpu_cgs_read_register+0x10/0x10
[15653.015825]  ? smu7_copy_bytes_to_smc+0xd4/0x200
[15653.015828]  ? polaris10_program_memory_timing_parameters+0x195/0x1b0
[15653.015831]  ? sysvec_apic_timer_interrupt+0xa/0x80
[15653.015834]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[15653.015836]  ? amdgpu_cgs_destroy_device+0x10/0x10
[15653.015839]  ? sysvec_apic_timer_interrupt+0xa/0x80
[15653.015841]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[15653.015843]  ? amdgpu_cgs_destroy_device+0x10/0x10
[15653.015846]  ? amdgpu_device_rreg+0x8f/0xd0
[15653.015847]  ? phm_wait_for_register_unequal+0x99/0xd0
[15653.015850]  ? smu7_send_msg_to_smc+0x95/0x130
[15653.015853]  ? smum_send_msg_to_smc+0x5d/0xa0
[15653.015854]  ? amdgpu_cgs_read_ind_register+0xa0/0xa0
[15653.015857]  ? smu7_enable_dpm_tasks+0x241f/0x28c0
[15653.015859]  ? hwmgr_resume+0x31/0x70
[15653.015861]  ? amdgpu_device_resume+0x1fa/0x3e0
[15653.015863]  ? amdgpu_pmops_resume+0x32/0x60
[15653.015864]  ? pci_pm_suspend+0x2b0/0x2b0
[15653.015866]  ? dpm_run_callback+0x35/0x1f0
[15653.015868]  ? device_resume+0x1ca/0x220
[15653.015870]  ? async_resume+0x19/0xe0
[15653.015872]  ? async_run_entry_fn+0x33/0x120
[15653.015874]  ? process_one_work+0x1d6/0x350
[15653.015877]  ? worker_thread+0x24d/0x480
[15653.015878]  ? kthread+0x137/0x150
[15653.015880]  ? worker_clr_flags+0x40/0x40
[15653.015882]  ? kthread_blkcg+0x30/0x30
[15653.015884]  ? ret_from_fork+0x22/0x30
[15653.015886]  </TASK>

I have not noticed any resulting problems. I am reporting this in the 
hope that the issue is easy to fix, since the error messages could 
obscure some future problem.

Thanks,
Alex.

