Regression on linux-next (next-20250321)
Borah, Chaitanya Kumar
chaitanya.kumar.borah at intel.com
Wed Mar 26 08:31:15 UTC 2025
> -----Original Message-----
> From: Nicolin Chen <nicolinc at nvidia.com>
> Sent: Tuesday, March 25, 2025 1:10 PM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah at intel.com>
> Cc: iommu at lists.linux.dev; intel-gfx at lists.freedesktop.org;
> intel-xe at lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi at intel.com>; Saarinen, Jani <jani.saarinen at intel.com>;
> jgg at nvidia.com
> Subject: Re: Regression on linux-next (next-20250321)
>
> (CC += Jason)
>
> Hi Chaitanya,
>
> On Tue, Mar 25, 2025 at 05:39:39AM +0000, Borah, Chaitanya Kumar wrote:
> > Hello Nicolin,
> >
> > Hope you are doing well. I am Chaitanya from the linux graphics team in Intel.
> >
> > This mail is regarding a regression we are seeing in our CI runs[1] on linux-next repository.
> >
> > Since the version next-20250321 [2], we are seeing the following regression
> >
> > `````````````````````````````````````````````````````````````````````````````````
> > <4>[ 0.226495] Unpatched return thunk in use. This should not happen!
> > <4>[ 0.226502] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/bugs.c:3107 __warn_thunk+0x62/0x70
>
> Hmm....I wonder why x86 can be affected...
>
> The only four callers of iommu_dma_prepare_msi() are ARM platforms.
>
> > <4>[ 0.226513] Modules linked in:
> > <4>[ 0.226521] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc7-next-20250321-next-20250321-g9388ec571cb1+ #1 PREEMPT(voluntary)
> > <4>[ 0.226532] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
> > <4>[ 0.226539] RIP: 0010:__warn_thunk+0x62/0x70
> > <4>[ 0.226544] Code: 34 4c 5d 02 01 e8 fe f6 a7 00 84 c0 75 d9 48 c7 c7 f8 bf 0d 83 e8 7e c6 08 00 48 c7 c7 a0 a2 a0 82 e8 e2 f6 a7 00 84 c0 75 bd <0f> 0b eb b9 cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90
> > <4>[ 0.226559] RSP: 0000:ffffc90000067d78 EFLAGS: 00010246
> > <4>[ 0.226565] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > <4>[ 0.226571] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > <4>[ 0.226577] RBP: ffffc90000067d80 R08: 0000000000000000 R09: 0000000000000000
> > <4>[ 0.226583] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > <4>[ 0.226589] R13: ffffffff83c9417c R14: ffff88887f344bc0 R15: ffff888102370100
> > <4>[ 0.226595] FS: 0000000000000000(0000) GS:ffff8888dacfd000(0000) knlGS:0000000000000000
> > <4>[ 0.226602] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[ 0.226608] CR2: ffff88887f7ff000 CR3: 000000000344a000 CR4: 0000000000f50ef0
> > <4>[ 0.226614] PKRU: 55555554
> > <4>[ 0.226617] Call Trace:
> > <4>[ 0.226620] <TASK>
> > <4>[ 0.226624] ? show_regs+0x6c/0x80
> > <4>[ 0.226630] ? __warn+0x94/0x210
> > <4>[ 0.226635] ? __warn_thunk+0x62/0x70
> > <4>[ 0.226640] ? __report_bug+0x110/0x280
> > <4>[ 0.227000] ? __lock_acquire+0x447/0x2c70
> > <4>[ 0.227011] ? _prb_read_valid+0x25a/0x310
> > <4>[ 0.227018] ? __lock_acquire+0x447/0x2c70
> > <4>[ 0.227024] ? prb_read_valid+0x1c/0x30
> > <4>[ 0.227037] ? lock_acquire+0xc4/0x330
> > <4>[ 0.227055] ? _prb_read_valid+0x25a/0x310
> > <4>[ 0.227073] ? __warn_thunk+0x62/0x70
> > <4>[ 0.227081] ? report_bug+0x24/0x80
> > <4>[ 0.227089] ? handle_bug+0x16a/0x2a0
> > <4>[ 0.227098] ? exc_invalid_op+0x18/0x80
> > <4>[ 0.227106] ? asm_exc_invalid_op+0x1b/0x20
> > <4>[ 0.227122] ? __warn_thunk+0x62/0x70
> > <4>[ 0.227130] ? __warn_thunk+0x5e/0x70
> > <4>[ 0.227135] ? iommu_dma_ranges_sort+0x40/0x40
> > <4>[ 0.227144] warn_thunk_thunk+0x16/0x30
> > <4>[ 0.227157] do_one_initcall+0x5d/0x460
> > <4>[ 0.227171] kernel_init_freeable+0x3ac/0x530
> > <4>[ 0.227187] ? __pfx_kernel_init+0x10/0x10
> > <4>[ 0.227196] kernel_init+0x1b/0x200
> > <4>[ 0.227203] ret_from_fork+0x44/0x70
> > <4>[ 0.227210] ? __pfx_kernel_init+0x10/0x10
> > <4>[ 0.227217] ret_from_fork_asm+0x1a/0x30
> > <4>[ 0.227236] </TASK>
> > `````````````````````````````````````````````````````````````````````````````````
> > Details log can be found in [3].
>
> And I can't see something obvious from the log..
>
> Would you please give the git-diff a try (drivers/iommu/iommu.c)?
> https://lore.kernel.org/linux-iommu/Z+Itnw4ys6dmDsc+@nvidia.com/
>
> If this doesn't help, would you please give this a try?
> https://lore.kernel.org/linux-iommu/20250324170743.GA1339275@ax162/
>
Thank you, Nicolin, for your reply. Unfortunately, these changes do not solve the issue (applied individually and together).
Regards
Chaitanya
> Thanks!
> Nicolin