[Intel-gfx] Regression on linux-next (next-20231016)

Borah, Chaitanya Kumar chaitanya.kumar.borah at intel.com
Fri Oct 20 05:52:21 UTC 2023


Hello Lorenzo,

Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.

This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.

Since version next-20231016 [2], we are seeing the following error:
```````````````````````````````````````````````````````````````````````````````
<6>[    4.550196] e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
<1>[    4.581173] BUG: kernel NULL pointer dereference, address: 00000000000001b8
<1>[    4.581178] #PF: supervisor read access in kernel mode
<1>[    4.581180] #PF: error_code(0x0000) - not-present page
<6>[    4.581182] PGD 0 P4D 0 
<4>[    4.581184] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[    4.581186] CPU: 6 PID: 460 Comm: apache2 Not tainted 6.6.0-rc6-next-20231016-next-20231016-g4d0515b235de+ #1
<4>[    4.581189] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3157.A00.2204200131 04/20/2022
<4>[    4.581193] RIP: 0010:mmap_region+0x803/0xa50
`````````````````````````````````````````````````````````````````````````````````

The detailed log can be found in [3].

After bisecting the tree, the following patch [4] seems to be causing the regression.

`````````````````````````````````````````````````````````````````````````````````````````````````````````
1db41d29b79ad271674081c752961edd064bbbac is the first bad commit
commit 1db41d29b79ad271674081c752961edd064bbbac
Author: Lorenzo Stoakes lstoakes at gmail.com
Date:   Thu Oct 12 18:04:30 2023 +0100

    mm: perform the mapping_map_writable() check after call_mmap()

    In order for a F_SEAL_WRITE sealed memfd mapping to have an opportunity to
    clear VM_MAYWRITE, we must be able to invoke the appropriate
    vm_ops->mmap() handler to do so.  We would otherwise fail the
    mapping_map_writable() check before we had the opportunity to avoid it.

    This patch moves this check after the call_mmap() invocation.  Only memfd
    actively denies write access causing a potential failure here (in
    memfd_add_seals()), so there should be no impact on non-memfd cases.

    This patch makes the userland-visible change that MAP_SHARED, PROT_READ
    mappings of an F_SEAL_WRITE sealed memfd mapping will now succeed.

    There is a delicate situation with cleanup paths assuming that a writable
    mapping must have occurred in circumstances where it may now not have.  In
    order to ensure we do not accidentally mark a writable file unwritable by
    mistake, we explicitly track whether we have a writable mapping and unmap
    only if we do.
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that reverting the patch fixes the issue.

We didn't see the issue on next-20231018. Is there a fix already available for this? If not, could you please check why this patch causes the regression and if we can find a solution for it soon?

[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20231016
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20231016/bat-rpls-1/boot0.txt 
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20231016&id=1db41d29b79ad271674081c752961edd064bbbac

