[Intel-gfx] Regression in linux-next

Wysocki, Rafael J rafael.j.wysocki at intel.com
Mon Oct 9 19:23:36 UTC 2023


Hi,

On 10/9/2023 7:10 AM, Borah, Chaitanya Kumar wrote:
> Hello Rafael
>
>> Thanks for the report, I think that this is a lockdep assertion failing.
>> If that is correct, it should be straightforward to fix.
>> I'll take care of this early next week.
>> Thanks!
> Thank you for your response.  Please let us know when a fix is available.

It should be fixed in linux-next from today, by this commit:

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=b44444027ce7714f309e96b804b7fb088a40d708

Thanks!


> From: Wysocki, Rafael J <rafael.j.wysocki at intel.com>
> Sent: Saturday, October 7, 2023 2:01 AM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah at intel.com>
> Cc: intel-gfx at lists.freedesktop.org; Kurmi, Suresh Kumar <suresh.kumar.kurmi at intel.com>; Saarinen, Jani <jani.saarinen at intel.com>
> Subject: Re: Regression in linux-next
>
> Hi,
> On 10/5/2023 5:58 PM, Borah, Chaitanya Kumar wrote:
> Hello Rafael,
>   
> Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
> This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
>   
> Thanks for the report, I think that this is a lockdep assertion failing.
> If that is correct, it should be straightforward to fix.
> I'll take care of this early next week.
> Thanks!
>
> On next-20231003 [2], we are seeing the following error:
>   
> ```````````````````````````````````````````````````````````````````````````````
> <4>[   14.093075] ------------[ cut here ]------------
> <4>[   14.097664] WARNING: CPU: 0 PID: 1 at drivers/thermal/thermal_trip.c:18 for_each_thermal_trip+0x83/0x90
> <4>[   14.106977] Modules linked in:
> <4>[   14.110017] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.6.0-rc4-next-20231003-next-20231003-gc9f2baaa18b5+ #1
> <4>[   14.121305] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3323.D89.2309110529 09/11/2023
> <4>[   14.134478] RIP: 0010:for_each_thermal_trip+0x83/0x90
> <4>[   14.139496] Code: 5c 41 5d c3 cc cc cc cc 5b 31 c0 5d 41 5c 41 5d c3 cc cc cc cc 48 8d bf f0 05 00 00 be ff ff ff ff e8 21 a2 2d 00 85 c0 75 9a <0f> 0b eb 96 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
>   
> A detailed log can be found in [3].
>   
> After bisecting the tree, the following patch [4] seems to be causing the regression.
>   
> commit d5ea889246b112e228433a5f27f57af90ca0c1fb
> Author: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
> Date:   Thu Sep 21 20:02:59 2023 +0200
>   
>      ACPI: thermal: Do not use trip indices for cooling device binding
>   
>      Rearrange the ACPI thermal driver's callback functions used for cooling
>      device binding and unbinding, acpi_thermal_bind_cooling_device() and
>      acpi_thermal_unbind_cooling_device(), respectively, so that they use trip
>      pointers instead of trip indices which is more straightforward and allows
>      the driver to become independent of the ordering of trips in the thermal
>      zone structure.
>   
>      The general functionality is not expected to be changed.
>   
>      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
>      Reviewed-by: Daniel Lezcano <daniel.lezcano at linaro.org>
>   
> We also verified this by moving the head of the tree to the previous commit.
>   
> Could you please check why this patch causes the regression and if we can find a solution for it soon?
>   
> [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20231003
> [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20231003/bat-mtlp-6/boot0.txt
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20231003&id=d5ea889246b112e228433a5f27f57af90ca0c1fb

