Regression on linux-next (next-20241120) and drm-tip

Thomas Weißschuh linux at weissschuh.net
Tue Dec 3 12:04:06 UTC 2024


On 2024-12-03 12:54:54+0100, Rafael J. Wysocki wrote:
> On Tue, Dec 3, 2024 at 7:51 AM Thomas Weißschuh <linux at weissschuh.net> wrote:
> >
> > (+Cc Sebastian)
> >
> > Hi Chaitanya,
> >
> > On 2024-12-03 05:07:47+0000, Borah, Chaitanya Kumar wrote:
> > > Hope you are doing well. I am Chaitanya from the linux graphics team in Intel.
> > >
> > > This mail is regarding a regression we are seeing in our CI runs[1] on linux-next repository.
> >
> > Thanks for the report.
> >
> > > Since the version next-20241120 [2], we are seeing the following regression
> > >
> > > `````````````````````````````````````````````````````````````````````````````````
> > > <4>[   19.990743] Oops: general protection fault, probably for non-canonical address 0xb11675ef8d1ccbce: 0000 [#1] PREEMPT SMP NOPTI
> > > <4>[   19.990760] CPU: 21 UID: 110 PID: 867 Comm: prometheus-node Not tainted 6.12.0-next-20241120-next-20241120-gac24e26aa08f+ #1
> > > <4>[   19.990771] Hardware name: Intel Corporation Arrow Lake Client Platform/MTL-S UDIMM 2DPC EVCRB, BIOS MTLSFWI1.R00.4400.D85.2410100007 10/10/2024
> > > <4>[   19.990782] RIP: 0010:power_supply_get_property+0x3e/0xe0
> > > `````````````````````````````````````````````````````````````````````````````````
> > > Details log can be found in [3].
> > >
> > > After bisecting the tree, the following patch [4] seems to be the first "bad"
> > > commit
> > >
> > > `````````````````````````````````````````````````````````````````````````````````````````````````````````
> > > Commit 49000fee9e639f62ba1f965ed2ae4c5ad18d19e2
> > > Author:     Thomas Weißschuh <mailto:linux at weissschuh.net>
> > > AuthorDate: Sat Oct 5 12:05:03 2024 +0200
> > > Commit:     Sebastian Reichel <mailto:sebastian.reichel at collabora.com>
> > > CommitDate: Tue Oct 15 22:22:20 2024 +0200
> > >     power: supply: core: add wakeup source inhibit by power_supply_config
> > > `````````````````````````````````````````````````````````````````````````````````````````````````````````
> > >
> > > This is now seen in our drm-tip runs as well. [5]
> > >
> > > Could you please check why the patch causes this regression and provide a fix if necessary?
> >
> > I don't see how this patch can lead to this error.
> 
> It looks like the cfg->no_wakeup_source access reaches beyond the
> struct boundary for some reason.

But this field is only accessed in __power_supply_register().
The error reports, however, don't show this function at all.
They come from power_supply_uevent() and power_supply_get_property(),
by which time the call to __power_supply_register() has long since returned.

FWIW there is an uninitialized 'struct power_supply_config' in
drivers/hid/hid-corsair-void.c, but I highly doubt the test machines are
using that driver. (I'll send a patch for it later.)
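
To illustrate why an uninitialized 'struct power_supply_config' matters once
a new field is added, here is a minimal userspace sketch (not kernel code).
Everything prefixed demo_ is hypothetical and made up for illustration; only
the field name no_wakeup_source comes from the patch. It assumes the flag is
read once at registration time, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, cut-down stand-in for struct power_supply_config.
 * Only the no_wakeup_source field name is taken from the patch.
 */
struct demo_supply_config {
	void *drv_data;
	bool no_wakeup_source;
};

/* Stand-in for the one place that reads the flag during registration. */
bool demo_wakeup_enabled(const struct demo_supply_config *cfg)
{
	assert(cfg != NULL);
	return !cfg->no_wakeup_source;
}

/* Zero-initialized config: fields the driver never sets read as 0/false. */
bool demo_probe_zero_init(void)
{
	struct demo_supply_config cfg = { 0 };

	return demo_wakeup_enabled(&cfg);
}

/*
 * Without an initializer ("struct demo_supply_config cfg;") the flag would
 * be indeterminate. Reading it is undefined behaviour, so this sketch sets
 * it explicitly to model leftover stack contents that happen to be non-zero.
 */
bool demo_probe_stale_flag(void)
{
	struct demo_supply_config cfg = { 0 };

	cfg.no_wakeup_source = true; /* models stack garbage */
	return demo_wakeup_enabled(&cfg);
}
```

In this sketch demo_probe_zero_init() reports the wakeup source as enabled,
while demo_probe_stale_flag() reports it as disabled. As far as I can see, a
stale flag would only change what happens at registration time, so it would
not by itself explain a fault in power_supply_get_property().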

> > Could you doublecheck the bisect?
> >
> > Note: Having line numbers in the trace would be very useful.
> >
> > > Thank you.
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Thanks,
> > Thomas
> >
> >
> > >
> > > P.S. We could not revert the patch cleanly and therefore we are yet to verify the bisect but we are currently working on it.
> > >
> > >
> > > [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> > > [2]https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241120
> > > [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20241120/bat-arls-6/boot0.txt
> > > [4] https://cgit.freedesktop.org/drm-tip/commit/?id=49000fee9e639f62ba1f965ed2ae4c5ad18d19e2
> > > [5] https://intel-gfx-ci.01.org/tree/drm-tip/index.html?
> >

