Regression on linux-next (next-20240228)
Borah, Chaitanya Kumar
chaitanya.kumar.borah at intel.com
Mon Mar 4 04:49:47 UTC 2024
Hello Matthew,
Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
Since version next-20240228 [2], we have been seeing the following regression:
`````````````````````````````````````````````````````````````````````````````````
<6> [388.274691] i915: Running intel_migrate_live_selftests/live_migrate_copy
<7> [388.274790] i915 0000:00:02.0: [drm:gsc_work [i915]] GT1: GSC Proxy initialized
<4> [388.540070] page:ffffea0004666880 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1199a2
<4> [388.540111] flags: 0x8000000000000000(zone=2)
<4> [388.540117] page_type: 0xffffffff()
<4> [388.540123] raw: 8000000000000000 ffffea0004524008 ffffea0005f68e08 0000000000000000
<4> [388.540127] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
<4> [388.540130] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
<4> [388.540140] ------------[ cut here ]------------
<2> [388.540143] kernel BUG at include/linux/mm.h:1134!
<4> [388.544999] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4> [388.550187] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G U 6.8.0-rc6-next-20240228-next-20240228-g20af1ca418d2+ #1
<4> [388.561471] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [388.574636] RIP: 0010:put_pages_list+0x92/0xe0
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
After bisecting the tree, the following patch [4] appears to be the first "bad"
commit:
`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit ac7130117e8860081be88149061b5abb654d5759
Author: Matthew Wilcox (Oracle) <willy at infradead.org>
Date: Tue Feb 27 17:42:41 2024 +0000
mm: use free_unref_folios() in put_pages_list()
Break up the list of folios into batches here so that the folios are more
likely to be cache hot when doing the rest of the processing.
Link: https://lkml.kernel.org/r/20240227174254.710559-8-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy at infradead.org>
`````````````````````````````````````````````````````````````````````````````````````````````````````````
We could not revert the patch because of build errors, but resetting to the parent of the commit seems to fix the issue.
Could you please check why the patch causes this regression and provide a fix if necessary?
Thank you.
Regards,
Chaitanya
[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240228
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20240228/bat-mtlp-8/igt@i915_selftest@live@migrate.html
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240228&id=ac7130117e8860081be88149061b5abb654d5759