Regression on linux-next (next-20241106)

Borah, Chaitanya Kumar chaitanya.kumar.borah at intel.com
Tue Nov 12 17:14:30 UTC 2024



> -----Original Message-----
> From: Rafael J. Wysocki <rafael at kernel.org>
> Sent: Monday, November 11, 2024 6:58 PM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah at intel.com>
> Cc: Wysocki, Rafael J <rafael.j.wysocki at intel.com>; intel-
> gfx at lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi at intel.com>; Saarinen, Jani <jani.saarinen at intel.com>;
> Nikula, Jani <jani.nikula at intel.com>; linux-pm at vger.kernel.org;
> srinivas.pandruvada at linux.intel.com; ricardo.neri-calderon at linux.intel.com
> Subject: Re: Regression on linux-next (next-20241106)
> 
> Hi Chaitanya,
> 
> On Mon, Nov 11, 2024 at 6:41 AM Borah, Chaitanya Kumar
> <chaitanya.kumar.borah at intel.com> wrote:
> >
> > Hello Rafael,
> >
> > Hope you are doing well. I am Chaitanya from the Linux graphics team at
> > Intel.
> >
> > This mail is regarding a regression we are seeing in our CI runs [1] on the
> > linux-next repository.
> >
> > Since version next-20241106 [2], we have been seeing the following
> > regression:
> >
> > `````````````````````````````````````````````````````````````````````````````````
> > <4>[    7.246473] WARNING: possible circular locking dependency detected
> > <4>[    7.246476] 6.12.0-rc6-next-20241106-next-20241106-g5b913f5d7d7f+ #1 Not tainted
> > <4>[    7.246479] ------------------------------------------------------
> > <4>[    7.246481] swapper/0/1 is trying to acquire lock:
> > <4>[    7.246483] ffffffff8264aef0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xd/0x20
> > <4>[    7.246493]
> >                   but task is already holding lock:
> > <4>[    7.246495] ffffffff82832068 (hybrid_capacity_lock){+.+.}-{4:4}, at: intel_pstate_register_driver+0xd3/0x1c0
> > `````````````````````````````````````````````````````````````````````````````````
> > The detailed log can be found in [3].
> 
> Thanks for the report!
> 
> > After bisecting the tree, the following patch [4] seems to be the first "bad"
> > commit
> >
> > `````````````````````````````````````````````````````````````````````````````````
> > commit 92447aa5f6e7fbad9427a3fd1bb9e0679c403206
> > Author: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
> > Date:   Mon Nov 4 19:53:53 2024 +0100
> >
> >     cpufreq: intel_pstate: Update asym capacity for CPUs that were offline initially
> > `````````````````````````````````````````````````````````````````````````````````
> >
> > We also verified that if we revert the patch the issue is not seen.
> >
> > Could you please check why the patch causes this regression and provide a
> > fix if necessary?
> >
> > [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241106
> > [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20241106/bat-arls-1/boot0.txt
> > [4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241106&id=92447aa5f6e7fbad9427a3fd1bb9e0679c403206
> 
> The problem is that cpus_read_lock() should not be called under
> hybrid_capacity_lock, because the latter is acquired in CPU online/offline
> paths. The above commit exposes this, but if I'm not mistaken, the issue is
> there regardless of it.
> 
> The good news is that it should be addressed by a patch that has already been
> posted:
> 
> https://lore.kernel.org/linux-pm/12554508.O9o76ZdvQC@rjwysocki.net/
> 
> so please let me know if it makes the splat go away.
> 
> Even if its changelog says that it has no functional impact, this is not really the
> case.
> 
> Thanks!

Thank you, Rafael, for the patch. We can confirm that it helps.

Regards

Chaitanya
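
For readers following the thread, the lock-ordering problem described above can
be sketched as a toy model. This is not the kernel's lockdep and not kernel
code: it is a small Python illustration, and the class name LockOrderGraph and
the function demo() are hypothetical. It assumes, based on Rafael's
description, that the CPU online/offline path takes hybrid_capacity_lock while
the hotplug lock is held, and it uses the splat for the opposite order
(cpu_hotplug_lock requested via static_key_enable() while hybrid_capacity_lock
is held).

```python
class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; NOT the kernel's lockdep)."""

    def __init__(self):
        # edges[a] holds every lock b that was acquired while a was held.
        self.edges = {}

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding the locks in `held`.

        Returns True if this ordering closes a cycle, i.e. some held lock
        is already known to be taken after `new` on another path.
        """
        cycle = any(self._reachable(new, h) for h in held)
        for h in held:
            self.edges.setdefault(h, set()).add(new)
        return cycle


def demo():
    graph = LockOrderGraph()
    # Assumed CPU online/offline path: hotplug lock held, then
    # hybrid_capacity_lock is taken.
    hotplug_path = graph.acquire(["cpu_hotplug_lock"], "hybrid_capacity_lock")
    # Path from the splat: hybrid_capacity_lock held, then cpu_hotplug_lock
    # is requested through static_key_enable().
    register_path = graph.acquire(["hybrid_capacity_lock"], "cpu_hotplug_lock")
    return hotplug_path, register_path


print(demo())
```

The first ordering is recorded without complaint; the second one closes the
cycle, which is the kind of situation the "possible circular locking
dependency" warning reports. Neither path has to deadlock in practice for the
inverted order to be flagged.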

