Regression on linux-next (next-20240722)
Borah, Chaitanya Kumar
chaitanya.kumar.borah at intel.com
Tue Jul 23 19:08:52 UTC 2024
Hello Anna-Maria,
Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
Since next-20240722 [2], we are seeing the following warning during boot:
`````````````````````````````````````````````````````````````````````````````````
<6>[ 0.787321] Timer migration: 2 hierarchy levels; 8 children per group; 2 crossnode level
<4>[ 0.787330] ------------[ cut here ]------------
<4>[ 0.787335] WARNING: CPU: 0 PID: 1 at kernel/time/timer_migration.c:1714 tmigr_cpu_prepare+0x5f2/0x680
<4>[ 0.787340] Modules linked in:
<4>[ 0.787341] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-next-20240722-next-20240722-gdee7f101b642+ #1
<4>[ 0.787342] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
<4>[ 0.787343] RIP: 0010:tmigr_cpu_prepare+0x5f2/0x680
<4>[ 0.787344] Code: fc ff ff 80 3d dc d5 6c 01 00 0f 85 56 fc ff ff 48 c7 c7 f8 ba 48 82 c6 05 c8 d5 6c 01 01 e8 95 1b f0 ff 0f 0b e9 3c fc ff ff <0f> 0b e9 41 fa ff ff 4c 89 e7 48 89 2c 24 e8 7b cd 11 00 48 c7 c7
<4>[ 0.787345] RSP: 0000:ffffc90000067d18 EFLAGS: 00010246
<4>[ 0.787346] RAX: 0000000000000000 RBX: ffff88885f0214e0 RCX: 0000000000000000
<4>[ 0.787347] RDX: 0000000000000001 RSI: ffffffff8243cfef RDI: 0000000000000000
<4>[ 0.787347] RBP: 000000000002e74c R08: 0000000000000000 R09: 0000000000000000
<4>[ 0.787347] R10: ffffc90000067e08 R11: ffff888100ce8040 R12: 0000000000000000
<4>[ 0.787348] R13: 0000000000000040 R14: ffffffff81198620 R15: ffffffff8264b880
<4>[ 0.787348] FS: 0000000000000000(0000) GS:ffff88885f000000(0000) knlGS:0000000000000000
<4>[ 0.787349] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 0.787350] CR2: ffff88887f7ff000 CR3: 000000000663a000 CR4: 0000000000f50ef0
<4>[ 0.787350] PKRU: 55555554
<4>[ 0.787351] Call Trace:
<4>[ 0.787351] <TASK>
<4>[ 0.787352] ? __warn+0x91/0x1a0
<4>[ 0.787354] ? tmigr_cpu_prepare+0x5f2/0x680
<4>[ 0.787355] ? report_bug+0x1f8/0x200
<4>[ 0.787359] ? handle_bug+0x3c/0x70
<4>[ 0.787361] ? exc_invalid_op+0x18/0x70
<4>[ 0.787362] ? asm_exc_invalid_op+0x1a/0x20
<4>[ 0.787364] ? __pfx_tmigr_cpu_prepare+0x10/0x10
<4>[ 0.787367] ? tmigr_cpu_prepare+0x5f2/0x680
<4>[ 0.787369] ? __pfx_tmigr_cpu_prepare+0x10/0x10
<4>[ 0.787370] cpuhp_invoke_callback+0x17b/0x6b0
<4>[ 0.787372] cpuhp_issue_call+0x9a/0x1d0
<4>[ 0.787374] __cpuhp_setup_state_cpuslocked+0x1cc/0x2c0
<4>[ 0.787376] ? __pfx_tmigr_cpu_prepare+0x10/0x10
<4>[ 0.787377] __cpuhp_setup_state+0xb8/0x220
<4>[ 0.787379] ? __pfx_tmigr_init+0x10/0x10
<4>[ 0.787381] tmigr_init+0xd8/0x140
<4>[ 0.787383] do_one_initcall+0x5c/0x2b0
<4>[ 0.787385] ? call_rcu_tasks_generic.constprop.0+0x182/0x3c0
<4>[ 0.787388] kernel_init_freeable+0xae/0x340
<4>[ 0.787390] ? __pfx_kernel_init+0x10/0x10
<4>[ 0.787392] kernel_init+0x15/0x130
<4>[ 0.787393] ret_from_fork+0x2c/0x50
<4>[ 0.787395] ? __pfx_kernel_init+0x10/0x10
<4>[ 0.787396] ret_from_fork_asm+0x1a/0x30
<4>[ 0.787399] </TASK>
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
After bisecting the tree, the following patch [4] seems to be the first "bad" commit:
`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit 7a5ee4aa61afa9f1570c80ffba92987bc73ce3ab
Author: Anna-Maria Behnsen <anna-maria at linutronix.de>
Date: Wed Jul 17 11:49:40 2024 +0200
timers/migration: Move hierarchy setup into cpuhotplug prepare callback
`````````````````````````````````````````````````````````````````````````````````````````````````````````
We could not revert the patch because of merge conflicts, but resetting to the parent of the commit seems to fix the issue.
Could you please check why the patch causes this regression and provide a fix if necessary?
Thank you.
Regards,
Chaitanya
[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240722
[3] http://gfx-ci.igk.intel.com/tree/linux-next/next-20240722/bat-rpls-4/boot0.txt
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240722&id=7a5ee4aa61afa9f1570c80ffba92987bc73ce3ab