[PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU

Fri Aug 11 22:00:34 UTC 2023

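One thing to check: I saw they use a serial port for the console, per the
kernel parameter console=ttyS0,115200n8. The kernel's RCU stall documentation
lists this as a possible cause of stall warnings: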

  * Booting Linux using a console connection that is too slow to keep up
    with the boot-time console-message rate. For example, a 115Kbaud
    serial console can be *way* too slow to keep up with boot-time message
    rates, and will frequently result in RCU CPU stall warning messages.
    Especially if you have added debug printk()s.


On 8/11/2023 4:31 PM, Felix Kuehling wrote:
> If you have a complete kernel log, it may be worth looking at 
> backtraces from other threads, to better understand the interactions. 
> I'd expect that there is a thread there that's in an RCU read critical 
> section. It may not be in our driver, though. If it's a customer 
> system, it may also help to see the kernel config. Maybe the kernel 
> was configured without preemption:
>
> -       For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
>         without invoking schedule().  If the looping in the kernel is
>         really expected and desirable behavior, you might need to add
>         some calls to cond_resched().
>
> But then I would expect cond_resched() to fix the problem, according 
> to this document.
>
> Regards,
>   Felix
>
>
> On 2023-08-11 17:27, Chen, Xiaogang wrote:
>>
>> On 8/11/2023 4:22 PM, Felix Kuehling wrote:
>>> On 2023-08-11 17:12, Chen, Xiaogang wrote:
>>>>
>>>> I know the original Jira ticket. The system hit an RCU CPU stall, 
>>>> then the kernel panicked, and after that it stopped responding, even 
>>>> over ssh. This patch lets the prange list update task yield the CPU 
>>>> after each range update. That can prevent the task from holding the 
>>>> mm lock for too long.
>>>
>>> Calling schedule does not drop the lock. If anything, it causes the 
>>> lock to be held longer, because the function takes longer to complete.
>>>
>>> Regards,
>>>   Felix
>>>
>> Right. I do not see how this patch targets the root cause either. It 
>> is on a customer system that can have many RCU operations (not 
>> necessarily from our code). Any read-side critical section can cause 
>> a writer to stall.
>>
>> I think we can try some RCU parameters first to see if things 
>> change: for example CONFIG_RCU_CPU_STALL_TIMEOUT to increase the 
>> stall-warning timeout, or rcupdate.rcu_cpu_stall_suppress to suppress 
>> the RCU stall warnings.
>>
>> Regards
>>
>> Xiaogang
>>
>>>> The mm lock is an rw_semaphore, not an RCU mechanism. Can you 
>>>> explain how that can prevent the RCU CPU stall in this case?
>>>>
>>>> Regards
>>>>
>>>> Xiaogang
>>>>
>>>> On 8/11/2023 2:11 PM, James Zhu wrote:
>>>>>
>>>>> update_list can be big in list_for_each_entry(prange, 
>>>>> &update_list, update_list), and mmap_read_lock(mm) is held the 
>>>>> whole time. Adding schedule() can remove the RCU stall on the CPU 
>>>>> in this case.
>>>>>
>>>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 
>>>>> 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 
>>>>> <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>>>> migrate_vma_setup+0x6c7/0x8f0
>>>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>>>> __x64_sys_ioctl+0x92/0xd0
>>>>>
>>>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>>>   1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>>>> struct mm_struct *mm,
>>>>>                  r = svm_range_trigger_migration(mm, prange, 
>>>>> &migrated);
>>>>>                  if (r)
>>>>>                          goto out_unlock_range;
>>>>> +               schedule();
>>>>>
>>>>>                  if (migrated && (!p->xnack_enabled ||
>>>>>                      (prange->flags & 
>>>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>>>> -- 
>>>>> 2.34.1
>>>>>