Second kexec_file_load (but not kexec_load) fails on i915 if CONFIG_INTEL_IOMMU_DEFAULT_ON=n

Fri Jul 4 08:29:01 UTC 2025

On Thu, 03 Jul 2025, Askar Safin <safinaskar at zohomail.com> wrote:
> TL;DR: I found a bug in a strange interaction between kexec_file_load (but not kexec_load) and i915
> TL;DR#2: The second (sometimes third or fourth) kexec (using kexec_file_load) fails on my particular hardware
> TL;DR#3: I did 55 experiments, each of which required a lot of boots; in total I did 1908 boots

Thanks for the detailed debug info. I'm afraid all I can say at this
point is, please file all of this in a bug report as described in
[1]. Please add the drm.debug related options, and attach the dmesgs and
configs in the bug instead of pointing at external sites.

BR,
Jani.

[1] https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html

>
> Okay, so I found a bug. Steps to reproduce:
> - I have a Dell Precision 7780
> - I have a recent Debian x86_64 sid installed (the bug is reproducible with both Debian kernels and mainline ones)
> - The bug is reproducible on many kernels, including very recent ones, for example 6.15.4
> - Boot the system, then kexec into the same system using kexec_file_load, i.e. pass --kexec-file-syscall to the "kexec" command
> - Then kexec from this kexec'ed system again (i.e. you should do two kexecs in a row)
> - Then do a 3rd kexec, etc.
> - Repeat until you have done 100 kexecs or your system starts to misbehave
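
One cycle of the steps above could be scripted roughly as follows. This is a minimal sketch, assuming kexec-tools is installed; the image paths /boot/vmlinuz and /boot/initrd.img are hypothetical and not taken from the report, and the report does not say how the loaded kernel was actually executed. The code only builds the command lines and runs nothing.

```python
import shlex

# Hypothetical paths; the report does not say which images were used.
KERNEL = "/boot/vmlinuz"
INITRD = "/boot/initrd.img"


def kexec_commands(use_file_syscall=True, kernel=KERNEL, initrd=INITRD):
    """Return the (load, execute) kexec-tools invocations for one kexec cycle.

    --kexec-file-syscall selects kexec_file_load (the failing path in the
    report); --kexec-syscall selects the older kexec_load.
    """
    syscall_flag = "--kexec-file-syscall" if use_file_syscall else "--kexec-syscall"
    load = ["kexec", syscall_flag, "--load", kernel,
            f"--initrd={initrd}", "--reuse-cmdline"]
    # --exec jumps into the loaded kernel without a clean shutdown.
    execute = ["kexec", "--exec"]
    return load, execute


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in kexec_commands():
        print(shlex.join(cmd))
```

Passing use_file_syscall=False yields the kexec_load variant, which is the comparison the report relies on.
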
>
> On my computer the system starts to misbehave after some number of kexecs, but never before the 2nd kexec attempt.
> I.e. the first kexec is always successful, but the second sometimes is not.
> I was never able to perform 100 kexecs in a row.
> After some kexec attempt the system starts to misbehave: oopses, panics, a locked-up system, etc.
>
> Notes:
>
> - I tried to bisect the "kexec-tools" package, but the bisect merely gave me the commit that switched to kexec_file_load as the default.
> The bug is reproducible if we use kexec_file_load, but doesn't reproduce if we use kexec_load
>
> - The bug is reproducible even if we boot via init=/bin/bash (note: this means that the initramfs is still part of the boot process). (If we boot to the normal GUI, the bug is reproducible, too)
>
> - When I reproduce I use this command line: "root=UUID=... rootflags=subvol=... ro init=..."
>
> - The Debian package "plymouth" is required for reproducing. (The bug reproduces with plymouth, but doesn't reproduce without it.) But note that I never see the actual plymouth screen! I.e. the presence of
> "plymouth" on the system somehow affects bug reproducibility even though the plymouth animation is never actually shown. I don't know why this happens. I suspect that I don't see the plymouth screen because I don't pass "splash" on the kernel command line, but that plymouth is still included in the initramfs and from there somehow affects the boot process
>
> - The bug reproduces in Debian, but doesn't reproduce in Ubuntu. After a lot of experimenting I finally understood why: the Ubuntu kernel has CONFIG_INTEL_IOMMU_DEFAULT_ON=y, and the Debian kernel does not. Additional experiments confirmed that this is the culprit. I.e. the bug is reproducible with CONFIG_INTEL_IOMMU_DEFAULT_ON=n and not reproducible with CONFIG_INTEL_IOMMU_DEFAULT_ON=y. (So, advice for distributions: do what Ubuntu does, i.e. set CONFIG_INTEL_IOMMU_DEFAULT_ON=y to hide this bug)
>
> - The bug is not reproducible in old enough kernels, so I bisected Linux. The bisect gave me this range: d4a2393049..4a75f32fc7. I.e. the bug is reproducible in 4a75f32fc7, but doesn't reproduce in d4a2393049. Between them there is a middle commit, 52407c220c44c8dcc6a, which is not testable. Here are these commits:
>
> commit 4a75f32fc783128d0c42ef73fa62a20379a66828
> Author: Anusha Srivatsa <anusha.srivatsa at intel.com>
>
>    drm/i915/rpl-s: Add PCH Support for Raptor Lake S
>
> commit 52407c220c44c8dcc6aa8aa35ffc8a2db3c849a9
> Author: Anusha Srivatsa <anusha.srivatsa at intel.com>
>
>    drm/i915/rpl-s: Add PCI IDS for Raptor Lake S
>
> It seems these commits merely added support for my Intel GPU model. So this is a fake regression. I'm not sure whether this should be treated as a proper regression and whether regzbot should be notified. (What do you think?)
>
> Still, formally this is a regression: I did experiments and they show that the bug is present in 4a75f32fc783128d0c42 and not present before. (Side note: in the latest kernels both wayland and x11 work; in d4a2393049 x11 works and wayland doesn't.)
>
> I tried to reproduce the bug in Qemu, but I was unable to do so. It seems Intel GPU is required, maybe even my particular model.
>
> Here is "lspci -vnn -d :*:0300" for my GPU:
>
> 00:02.0 VGA compatible controller [0300]: Intel Corporation Raptor Lake-S UHD Graphics [8086:a788] (rev 04) (prog-if 00 [VGA controller])
>         Subsystem: Dell Raptor Lake-S UHD Graphics [1028:0c42]
>         Flags: bus master, fast devsel, latency 0, IRQ 202, IOMMU group 0
>         Memory at 604b000000 (64-bit, non-prefetchable) [size=16M]
>         Memory at 4000000000 (64-bit, prefetchable) [size=256M]
>         I/O ports at 3000 [size=64]
>         Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
>         Capabilities: [40] Vendor Specific Information: Len=0c <?>
>         Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
>         Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
>         Capabilities: [d0] Power Management version 2
>         Capabilities: [100] Process Address Space ID (PASID)
>         Capabilities: [200] Address Translation Service (ATS)
>         Capabilities: [300] Page Request Interface (PRI)
>         Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
>         Kernel driver in use: i915
>         Kernel modules: i915
>
> dmidecode:
> https://zerobin.net/?aebea072b93d8122#z4W9URnV+k9ZZErhP4etQkxlfpyRKf++uKMNoO5PGjs=
>
> - I use "root=UUID=... rootflags=subvol=... ro init=..." as the command line for reproducing. If I add "recovery nomodeset dis_ucode_ldr" (these are the options Ubuntu uses in recovery mode), the bug stops reproducing
>
> Again, in short, the full list of things required to reproduce the bug:
> - Intel GPU, possibly my particular model
> - Kernel with support for my model (4a75f32fc783128d0c42 and later up to 6.15.4)
> - Kexec at least two times. (One kexec never fails, 100 kexec's in a row never succeed)
> - kexec_file_load as opposed to kexec_load
> - Initramfs
> - Lack of parameters "recovery nomodeset dis_ucode_ldr" (i.e. at least one of them stops the bug from reproducing)
> - plymouth
> - CONFIG_INTEL_IOMMU_DEFAULT_ON=n
>
> Removing ANY of them stops the bug, and I proved this with lots of experiments.
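
Two of the conditions in this list can be checked from text alone: the kernel config option and the kernel command line. A minimal sketch, assuming the config text and command line are passed in as strings (on Debian they could be read from /boot/config-<version> and /proc/cmdline); the function names are illustrative and not from the report.

```python
import re

# Parameters that, per the report, stop the bug when added to the command line.
# The report only tested them together, so which one matters is unknown.
SUPPRESSING_PARAMS = {"recovery", "nomodeset", "dis_ucode_ldr"}


def iommu_default_on(config_text):
    """True if the kernel config text sets CONFIG_INTEL_IOMMU_DEFAULT_ON=y.

    A disabled option appears as "# CONFIG_... is not set", which does not match.
    """
    pattern = r"^CONFIG_INTEL_IOMMU_DEFAULT_ON=y$"
    return re.search(pattern, config_text, re.MULTILINE) is not None


def may_reproduce(config_text, cmdline):
    """Check only the config and command-line conditions from the list above."""
    params = set(cmdline.split())
    return not iommu_default_on(config_text) and not (params & SUPPRESSING_PARAMS)
```

This says nothing about the remaining conditions (GPU model, plymouth, initramfs, the kexec syscall used), which have to be verified separately.
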
>
> In total I did 55+ experiments, each of which required up to 100 boots. In total I did 1908 (!!!!!!) boots on my physical laptop (I mean kexec boots here). No, I'm not faking this number; here are my actual directories with results:
>
> user@subvolume:~$ ls /rbt/kx-results/
> @rec-2025-06-29T201723Z-bad-4    @rec-2025-06-29T214650Z-good-60  @rec-2025-07-03T050626Z-bad-41    @rec-2025-07-03T104125Z-bad-28    @rec-2025-07-03T133705Z-bad-3
> @rec-2025-06-29T203429Z-good-60  @rec-2025-06-29T215558Z-bad-8    @rec-2025-07-03T060107Z-good-100  @rec-2025-07-03T111727Z-bad-13    @rec-2025-07-03T141647Z-good-100
> @rec-2025-06-29T205626Z-good-60  @rec-2025-07-01T042949Z-bad-12   @rec-2025-07-03T074810Z-good-100  @rec-2025-07-03T122242Z-good-100  @rec-2025-07-03T145705Z-good-100
> @rec-2025-06-29T211612Z-bad-6    @rec-2025-07-02T120101Z-good-60  @rec-2025-07-03T082914Z-good-100  @rec-2025-07-03T123958Z-bad-12    @rec-2025-07-03T152406Z-bad-50
> @rec-2025-06-29T212932Z-good-60  @rec-2025-07-03T031038Z-good-60  @rec-2025-07-03T100615Z-good-100  @rec-2025-07-03T132116Z-good-100  @rec-2025-07-03T154204Z-bad-15
> user@subvolume:~$ ls /rbt/kx-manual-testing/
> 2025-07-01-03-19-good-6  2025-07-01-03-56-good-4  2025-07-01-05-28-bad-3  2025-07-01-06-35-bad-2  2025-07-01-09-46-good-8
> 2025-07-01-03-44-good-3  2025-07-01-04-47-good-3  2025-07-01-06-19-bad-2  2025-07-01-09-21-bad-2  2025-07-02-13-09-good
> user@subvolume:~$ ls /rbt/kx-vanilla-results/
> 2025-06-30T005219Z_5.16.0-kx-df0cc57e057f18e4-3e17eec5ff024b63_1626_good_60      2025-06-30T023542Z_5.16.0-rc2-kx-87bb2a410dcfb617-9f30253daecd39e5_1663_bad_4
> 2025-06-30T012313Z_5.17.0-kx-f443e374ae131c16-91b07dce12a83fab_1674_bad_1        2025-06-30T032312Z_5.16.0-rc2-kx-c9ee950a2ca55ea0-854a1f40ce042801_1662_bad_6
> 2025-06-30T013555Z_5.16.0-kx-22ef12195e13c5ec-9aaf880b25942f2a_1668_bad_7        2025-06-30T033528Z_5.16.0-rc2-kx-ba884a411700dc56-854a1f40ce042801_1662_good_60
> 2025-06-30T014106Z_5.16.0-kx-9bcbf894b6872216-b828905f3cf12050_1664_bad_2        2025-06-30T034645Z_5.16.0-rc2-kx-d4a23930490df39f-854a1f40ce042801_1662_good_60
> 2025-06-30T014634Z_5.16.0-rc5-kx-cb6846fbb83b574c-83e7c6cf2ede57b4_1663_bad_6    2025-06-30T035232Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_5
> 2025-06-30T015713Z_5.16.0-rc2-kx-15bb79910fe734ad-9f30253daecd39e5_1663_good_60  2025-06-30T042058Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_1
> 2025-06-30T020235Z_5.16.0-rc5-kx-b06103b5325364e0-26176b9b704a5c24_1664_bad_6    2025-06-30T050000Z_6.15.4-kx-e60eb441596d1c70-2378f4efc5e956e5_2366_bad_2
> 2025-06-30T020717Z_5.16.0-rc5-kx-eacef9fd61dcf5ea-26176b9b704a5c24_1664_bad_1    2025-06-30T053011Z_6.15.4-kx-e60eb441596d1c70-2378f4efc5e956e5_2366_good_60
> 2025-06-30T021738Z_5.16.0-rc2-kx-67b858dd89932086-8d2f1d17f1e1933c_1662_good_60  2025-06-30T060619Z_5.16.0-rc2-kx-d4a23930490df39f-854a1f40ce042801_1662_good_60
> 2025-06-30T022759Z_5.16.0-rc2-kx-17815f624a90579a-854a1f40ce042801_1662_good_60  2025-06-30T061448Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_1
>
> The number at the end of each file/directory name is the number of boots. In total we have 1908 boots. Testing was mostly automated, using my script.
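
The naming scheme in the listings (a trailing "good" or "bad" verdict, optionally followed by "-" or "_" and a boot count) can be tallied mechanically. A minimal sketch, assuming that a name without a count stands for a single boot; that default and the function names are assumptions, not something the report states.

```python
import re

# Trailing verdict with an optional boot count, separated by "-" or "_".
_RESULT = re.compile(r"(good|bad)(?:[-_](\d+))?$")


def parse_result(name):
    """Return (verdict, boots) for a result name; boots is None if absent."""
    match = _RESULT.search(name)
    if match is None:
        raise ValueError(f"unrecognized result name: {name!r}")
    verdict, boots = match.groups()
    return verdict, (int(boots) if boots is not None else None)


def total_boots(names, default=1):
    """Sum boot counts; names without a count contribute `default` (assumed)."""
    total = 0
    for name in names:
        _, boots = parse_result(name)
        total += default if boots is None else boots
    return total


if __name__ == "__main__":
    # Three names copied from the listings above, one per directory.
    sample = [
        "@rec-2025-06-29T201723Z-bad-4",
        "2025-06-30T005219Z_5.16.0-kx-df0cc57e057f18e4-3e17eec5ff024b63_1626_good_60",
        "2025-07-02-13-09-good",
    ]
    print(total_boots(sample))
```

Feeding it the output of os.listdir() for each of the three result directories would give the overall total.
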
>
> Here is one example dmesg from mainline commit e60eb441596d1c70 (somewhere around 6.15.4):
>
> https://zerobin.net/?119ff118fd47b363#BpziYs6dNz5PaT7H8w2hlveoEYa4DDtITGkyd9o57LE=
>
> This was the dmesg from the 2nd (and at the same time last) boot. The next boot (i.e. kexec) was unsuccessful. Corresponding config:
>
> https://zerobin.net/?009c807e1df41af8#gnmrswlbaFbdPTuzNq6NFkQd/Jhb3Ds0ZlLiwNanXnc=
>
> If you want the results from all experiments, here is a link: https://filebin.net/45g2757b2iwaeen7 (1 Mb, expires after 7 days). Usually the experiments come with a full reproducer script.
>
> But what I described above is already enough, I think this link is not needed.
>
> I will be available for testing in the coming days; then I will switch to other things and will no longer be available for testing.
> If you want more time, then please ask for it, i.e. tell me something like "Please be available for testing for 10 more days".
>
> --
> Askar Safin
> https://types.pl/@safinaskar
>

-- 
Jani Nikula, Intel