[REGRESSION] soft lockup on boot starting with kernel 6.10 / commit 5186ba33234c9a90833f7c93ce7de80e25fac6f5

Hugues Bruant hugues.bruant at gmail.com
Tue Sep 10 19:53:46 UTC 2024


On Mon, Sep 9, 2024 at 1:02 AM Borislav Petkov <bp at alien8.de> wrote:
>
> On Sun, Sep 08, 2024 at 11:53:56PM -0700, Hugues Bruant wrote:
> > Hi,
> >
> > I have discovered a 100% reliable soft lockup on boot on my laptop:
> > Purism Librem 14, Intel Core i7-10710U, 48 GB RAM, Samsung Evo Plus 970
> > SSD, CoreBoot BIOS, grub bootloader, Arch Linux.
> >
> > The last working release is kernel 6.9.10; every release from 6.10
> > onwards reliably exhibits the issue, which, based on journalctl logs,
> > seems to be triggered somewhere in systemd-udev:
> > https://gitlab.archlinux.org/-/project/42594/uploads/04583baf22189a0a8bb2f8773096e013/lockup.log
> >
> > Bisect points to commit 5186ba33234c9a90833f7c93ce7de80e25fac6f5
>
> That's a merge commit. Meaning, the bisection likely went into the wrong
> direction.
I double-checked and the bisection results seem quite consistent.
While merge commits are unlikely to be correct bisection results,
they're entirely possible if the bug is triggered by an unexpected
interaction between multiple unrelated commits.
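
To make that reasoning concrete, here is a toy model of how a merge can be
the first bad commit while both of its parents test good. Everything in it
is hypothetical (the change names and the is_bad rule are made up for
illustration and say nothing about the actual kernel commits):

```python
# Toy model: a tree is represented by the set of changes it contains.
# Hypothetical rule: the bug only shows up when BOTH changes are present.
def is_bad(changes):
    return {"change_a", "change_b"} <= changes

base = frozenset()
branch_a = base | {"change_a"}   # tip of one branch, contains only change_a
branch_b = base | {"change_b"}   # tip of the other branch, contains only change_b
merge = branch_a | branch_b      # the merge commit contains both

for name, tree in [("base", base), ("branch_a", branch_a),
                   ("branch_b", branch_b), ("merge", merge)]:
    print(name, "bad" if is_bad(tree) else "good")
```

In this model every commit on either branch tests good, so a bisection has
nowhere to land except the merge itself.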

> However, you have out-of-tree modules. Try reproducing it without them.
That was the first suggestion on the Arch bug tracker. The whole
bisection was done without out-of-tree modules.

Now, for the fun part: the kind soul on the Arch bug tracker who
provided me with the kernel images for bisection built a patched
6.10.9 at my request, reverting just Tony's RDT changes that were
flagged by the bisection: bd4955d4bc2182ccb660c9c30a4dd7f36feaf943 and
e3ca96e479c91d6ee657d3caa5092a6a3a620f9f.

That patch brings the boot success rate on my machine from 0/10 up to
4/10. Even though this code is not supposed to be used, its presence
is clearly impactful!
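
As a rough sanity check on how much weight ten boots per kernel can carry,
the 0/10 and 4/10 figures can be run through a one-sided Fisher exact test.
This is only a sketch: it assumes each boot is an independent trial, and the
function name fisher_one_sided is made up for this example.

```python
from math import comb

def fisher_one_sided(success_a, total_a, success_b, total_b):
    """Probability that group B gets at least success_b of the pooled
    successes if both groups actually share the same success rate
    (hypergeometric tail)."""
    successes = success_a + success_b
    total = total_a + total_b
    upper = min(successes, total_b)
    tail = sum(comb(total_b, k) * comb(total_a, successes - k)
               for k in range(success_b, upper + 1))
    return tail / comb(total, successes)

# Unpatched 6.10.9: 0 good boots out of 10; patched build: 4 out of 10.
p = fisher_one_sided(0, 10, 4, 10)
print(f"one-sided p-value: {p:.4f}")
```

With these counts the result comes out at about 0.043, so the improvement
is unlikely to be pure chance, though more boots would tighten that up.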

The framebuffer fix seems to also have a positive (though smaller,
closer to 20%) impact on boot success rate, so I'm planning to test
the combination of both as a next step.

See the attached logs for some extra boot traces.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lockup-patch.log
Type: text/x-log
Size: 107430 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0007.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: no-lockup-patch.log
Type: text/x-log
Size: 116550 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0008.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: no-lockup-patch-2.log
Type: text/x-log
Size: 115849 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0009.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: no-lockup-patch-1.log
Type: text/x-log
Size: 115997 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0010.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lockup-patch-1.log
Type: text/x-log
Size: 99947 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0011.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lockup-patch-4.log
Type: text/x-log
Size: 70148 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0012.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lockup-patch-3.log
Type: text/x-log
Size: 70878 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20240910/158bc7de/attachment-0013.bin>

