[syzbot] [net?] possible deadlock in vm_insert_page
Suren Baghdasaryan
surenb at google.com
Mon Dec 30 18:49:44 UTC 2024
On Mon, Dec 30, 2024 at 10:24 AM Boqun Feng <boqun.feng at gmail.com> wrote:
>
> On Sat, Dec 28, 2024 at 01:52:28AM -0800, Boqun Feng wrote:
> > On Fri, Dec 27, 2024 at 06:03:45PM -0800, Suren Baghdasaryan wrote:
> > > On Fri, Dec 27, 2024 at 4:19 PM Hillf Danton <hdanton at sina.com> wrote:
> > > >
> > > > On Fri, 27 Dec 2024 04:59:22 -0800
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > > >
> > > > > HEAD commit: 573067a5a685 Merge branch 'for-next/core' into for-kernelci
> > > > > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=149fdfe8580000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=cd7202b56d469648
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=11701838dd42428ab7b3
> > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > userspace arch: arm64
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > > >
> > > > > Downloadable assets:
> > > > > disk image: https://storage.googleapis.com/syzbot-assets/9d3b5c855aa0/disk-573067a5.raw.xz
> > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/0c06fc1ead83/vmlinux-573067a5.xz
> > > > > kernel image: https://storage.googleapis.com/syzbot-assets/3390e59b9e4b/Image-573067a5.gz.xz
> > > > >
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+11701838dd42428ab7b3 at syzkaller.appspotmail.com
> > > > >
> > > > > ======================================================
> > > > > WARNING: possible circular locking dependency detected
> > > > > 6.13.0-rc3-syzkaller-g573067a5a685 #0 Not tainted
> > > > > ------------------------------------------------------
> > > > > syz.8.396/8273 is trying to acquire lock:
> > > > > ffff0000d0caa9b8 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_write include/linux/mm.h:769 [inline]
> > > > > ffff0000d0caa9b8 (&vma->vm_lock->lock){++++}-{4:4}, at: vm_flags_set include/linux/mm.h:899 [inline]
> > > > > ffff0000d0caa9b8 (&vma->vm_lock->lock){++++}-{4:4}, at: vm_insert_page+0x2a0/0xab0 mm/memory.c:2241
> > > > >
> > > > > but task is already holding lock:
> > > > > ffff0000d4aa2868 (&po->pg_vec_lock){+.+.}-{4:4}, at: packet_mmap+0x9c/0x4c8 net/packet/af_packet.c:4650
> > > > >
> > > > > which lock already depends on the new lock.
> > > > >
> > > > >
> > > > > the existing dependency chain (in reverse order) is:
> > > > >
> > > > > -> #10 (&po->pg_vec_lock){+.+.}-{4:4}:
> > > > > __mutex_lock_common+0x218/0x28f4 kernel/locking/mutex.c:585
> > > > > __mutex_lock kernel/locking/mutex.c:735 [inline]
> > > > > mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:787
> > > > > packet_mmap+0x9c/0x4c8 net/packet/af_packet.c:4650
> > > > > sock_mmap+0x90/0xa8 net/socket.c:1403
> > > > > call_mmap include/linux/fs.h:2183 [inline]
> > > > > mmap_file mm/internal.h:124 [inline]
> > > > > __mmap_new_file_vma mm/vma.c:2291 [inline]
> > > > > __mmap_new_vma mm/vma.c:2355 [inline]
> > > > > __mmap_region+0x1854/0x2180 mm/vma.c:2456
> > > > > mmap_region+0x1f4/0x370 mm/mmap.c:1348
> > > > > do_mmap+0x8b0/0xfd0 mm/mmap.c:496
> > > > > vm_mmap_pgoff+0x1a0/0x38c mm/util.c:580
> > > > > ksys_mmap_pgoff+0x3a4/0x5c8 mm/mmap.c:542
> > > > > __do_sys_mmap arch/arm64/kernel/sys.c:28 [inline]
> > > > > __se_sys_mmap arch/arm64/kernel/sys.c:21 [inline]
> > > > > __arm64_sys_mmap+0xf8/0x110 arch/arm64/kernel/sys.c:21
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #9 (&mm->mmap_lock){++++}-{4:4}:
> > > > > __might_fault+0xc4/0x124 mm/memory.c:6751
> > > > > drm_mode_object_get_properties+0x208/0x540 drivers/gpu/drm/drm_mode_object.c:407
> > > > > drm_mode_obj_get_properties_ioctl+0x2bc/0x4fc drivers/gpu/drm/drm_mode_object.c:459
> > > > > drm_ioctl_kernel+0x26c/0x368 drivers/gpu/drm/drm_ioctl.c:796
> > > > > drm_ioctl+0x624/0xb14 drivers/gpu/drm/drm_ioctl.c:893
> > > > > vfs_ioctl fs/ioctl.c:51 [inline]
> > > > > __do_sys_ioctl fs/ioctl.c:906 [inline]
> > > > > __se_sys_ioctl fs/ioctl.c:892 [inline]
> > > > > __arm64_sys_ioctl+0x14c/0x1cc fs/ioctl.c:892
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #8 (crtc_ww_class_mutex){+.+.}-{4:4}:
> > > > > ww_acquire_init include/linux/ww_mutex.h:162 [inline]
> > > > > drm_modeset_acquire_init+0x1e4/0x384 drivers/gpu/drm/drm_modeset_lock.c:250
> > > > > drmm_mode_config_init+0xb98/0x130c drivers/gpu/drm/drm_mode_config.c:453
> > > > > vkms_modeset_init drivers/gpu/drm/vkms/vkms_drv.c:158 [inline]
> > > > > vkms_create drivers/gpu/drm/vkms/vkms_drv.c:219 [inline]
> > > > > vkms_init+0x2fc/0x600 drivers/gpu/drm/vkms/vkms_drv.c:256
> > > > > do_one_initcall+0x254/0x9f8 init/main.c:1266
> > > > > do_initcall_level+0x154/0x214 init/main.c:1328
> > > > > do_initcalls+0x58/0xac init/main.c:1344
> > > > > do_basic_setup+0x8c/0xa0 init/main.c:1363
> > > > > kernel_init_freeable+0x324/0x478 init/main.c:1577
> > > > > kernel_init+0x24/0x2a0 init/main.c:1466
> > > > > ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > > >
> > > > > -> #7 (crtc_ww_class_acquire){+.+.}-{0:0}:
> > > > > ww_acquire_init include/linux/ww_mutex.h:161 [inline]
> > > > > drm_modeset_acquire_init+0x1c4/0x384 drivers/gpu/drm/drm_modeset_lock.c:250
> > > > > drm_client_modeset_commit_atomic+0xd8/0x724 drivers/gpu/drm/drm_client_modeset.c:1009
> > > > > drm_client_modeset_commit_locked+0xd0/0x4a8 drivers/gpu/drm/drm_client_modeset.c:1173
> > > > > drm_client_modeset_commit+0x50/0x7c drivers/gpu/drm/drm_client_modeset.c:1199
> > > > > __drm_fb_helper_restore_fbdev_mode_unlocked+0xd4/0x178 drivers/gpu/drm/drm_fb_helper.c:237
> > > > > drm_fb_helper_set_par+0xc4/0x110 drivers/gpu/drm/drm_fb_helper.c:1351
> > > > > fbcon_init+0xf34/0x1eb8 drivers/video/fbdev/core/fbcon.c:1113
> > > > > visual_init+0x27c/0x548 drivers/tty/vt/vt.c:1011
> > > > > do_bind_con_driver+0x7dc/0xe04 drivers/tty/vt/vt.c:3833
> > > > > do_take_over_console+0x4ac/0x5f0 drivers/tty/vt/vt.c:4399
> > > > > do_fbcon_takeover+0x158/0x260 drivers/video/fbdev/core/fbcon.c:549
> > > > > do_fb_registered drivers/video/fbdev/core/fbcon.c:2988 [inline]
> > > > > fbcon_fb_registered+0x370/0x4ec drivers/video/fbdev/core/fbcon.c:3008
> > > > > do_register_framebuffer drivers/video/fbdev/core/fbmem.c:449 [inline]
> > > > > register_framebuffer+0x470/0x610 drivers/video/fbdev/core/fbmem.c:515
> > > > > __drm_fb_helper_initial_config_and_unlock+0x137c/0x1910 drivers/gpu/drm/drm_fb_helper.c:1841
> > > > > drm_fb_helper_initial_config+0x48/0x64 drivers/gpu/drm/drm_fb_helper.c:1906
> > > > > drm_fbdev_client_hotplug+0x158/0x22c drivers/gpu/drm/drm_fbdev_client.c:51
> > > > > drm_client_register+0x144/0x1e0 drivers/gpu/drm/drm_client.c:140
> > > > > drm_fbdev_client_setup+0x1a4/0x39c drivers/gpu/drm/drm_fbdev_client.c:158
> > > > > drm_client_setup+0x28/0x9c drivers/gpu/drm/drm_client_setup.c:29
> > > > > vkms_create drivers/gpu/drm/vkms/vkms_drv.c:230 [inline]
> > > > > vkms_init+0x4f0/0x600 drivers/gpu/drm/vkms/vkms_drv.c:256
> > > > > do_one_initcall+0x254/0x9f8 init/main.c:1266
> > > > > do_initcall_level+0x154/0x214 init/main.c:1328
> > > > > do_initcalls+0x58/0xac init/main.c:1344
> > > > > do_basic_setup+0x8c/0xa0 init/main.c:1363
> > > > > kernel_init_freeable+0x324/0x478 init/main.c:1577
> > > > > kernel_init+0x24/0x2a0 init/main.c:1466
> > > > > ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > > >
> > > > > -> #6 (&client->modeset_mutex){+.+.}-{4:4}:
> > > > > __mutex_lock_common+0x218/0x28f4 kernel/locking/mutex.c:585
> > > > > __mutex_lock kernel/locking/mutex.c:735 [inline]
> > > > > mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:787
> > > > > drm_client_modeset_probe+0x304/0x3f64 drivers/gpu/drm/drm_client_modeset.c:834
> > > > > __drm_fb_helper_initial_config_and_unlock+0x104/0x1910 drivers/gpu/drm/drm_fb_helper.c:1818
> > > > > drm_fb_helper_initial_config+0x48/0x64 drivers/gpu/drm/drm_fb_helper.c:1906
> > > > > drm_fbdev_client_hotplug+0x158/0x22c drivers/gpu/drm/drm_fbdev_client.c:51
> > > > > drm_client_register+0x144/0x1e0 drivers/gpu/drm/drm_client.c:140
> > > > > drm_fbdev_client_setup+0x1a4/0x39c drivers/gpu/drm/drm_fbdev_client.c:158
> > > > > drm_client_setup+0x28/0x9c drivers/gpu/drm/drm_client_setup.c:29
> > > > > vkms_create drivers/gpu/drm/vkms/vkms_drv.c:230 [inline]
> > > > > vkms_init+0x4f0/0x600 drivers/gpu/drm/vkms/vkms_drv.c:256
> > > > > do_one_initcall+0x254/0x9f8 init/main.c:1266
> > > > > do_initcall_level+0x154/0x214 init/main.c:1328
> > > > > do_initcalls+0x58/0xac init/main.c:1344
> > > > > do_basic_setup+0x8c/0xa0 init/main.c:1363
> > > > > kernel_init_freeable+0x324/0x478 init/main.c:1577
> > > > > kernel_init+0x24/0x2a0 init/main.c:1466
> > > > > ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > > >
> > > > > -> #5 (&helper->lock){+.+.}-{4:4}:
> > > > > __mutex_lock_common+0x218/0x28f4 kernel/locking/mutex.c:585
> > > > > __mutex_lock kernel/locking/mutex.c:735 [inline]
> > > > > mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:787
> > > > > __drm_fb_helper_restore_fbdev_mode_unlocked+0xb4/0x178 drivers/gpu/drm/drm_fb_helper.c:228
> > > > > drm_fb_helper_set_par+0xc4/0x110 drivers/gpu/drm/drm_fb_helper.c:1351
> > > > > fbcon_init+0xf34/0x1eb8 drivers/video/fbdev/core/fbcon.c:1113
> > > > > visual_init+0x27c/0x548 drivers/tty/vt/vt.c:1011
> > > > > do_bind_con_driver+0x7dc/0xe04 drivers/tty/vt/vt.c:3833
> > > > > do_take_over_console+0x4ac/0x5f0 drivers/tty/vt/vt.c:4399
> > > > > do_fbcon_takeover+0x158/0x260 drivers/video/fbdev/core/fbcon.c:549
> > > > > do_fb_registered drivers/video/fbdev/core/fbcon.c:2988 [inline]
> > > > > fbcon_fb_registered+0x370/0x4ec drivers/video/fbdev/core/fbcon.c:3008
> > > > > do_register_framebuffer drivers/video/fbdev/core/fbmem.c:449 [inline]
> > > > > register_framebuffer+0x470/0x610 drivers/video/fbdev/core/fbmem.c:515
> > > > > __drm_fb_helper_initial_config_and_unlock+0x137c/0x1910 drivers/gpu/drm/drm_fb_helper.c:1841
> > > > > drm_fb_helper_initial_config+0x48/0x64 drivers/gpu/drm/drm_fb_helper.c:1906
> > > > > drm_fbdev_client_hotplug+0x158/0x22c drivers/gpu/drm/drm_fbdev_client.c:51
> > > > > drm_client_register+0x144/0x1e0 drivers/gpu/drm/drm_client.c:140
> > > > > drm_fbdev_client_setup+0x1a4/0x39c drivers/gpu/drm/drm_fbdev_client.c:158
> > > > > drm_client_setup+0x28/0x9c drivers/gpu/drm/drm_client_setup.c:29
> > > > > vkms_create drivers/gpu/drm/vkms/vkms_drv.c:230 [inline]
> > > > > vkms_init+0x4f0/0x600 drivers/gpu/drm/vkms/vkms_drv.c:256
> > > > > do_one_initcall+0x254/0x9f8 init/main.c:1266
> > > > > do_initcall_level+0x154/0x214 init/main.c:1328
> > > > > do_initcalls+0x58/0xac init/main.c:1344
> > > > > do_basic_setup+0x8c/0xa0 init/main.c:1363
> > > > > kernel_init_freeable+0x324/0x478 init/main.c:1577
> > > > > kernel_init+0x24/0x2a0 init/main.c:1466
> > > > > ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > > >
> > > > > -> #4 (console_lock){+.+.}-{0:0}:
> > > > > console_lock+0x19c/0x1f4 kernel/printk/printk.c:2833
> > > > > __bch2_print_string_as_lines fs/bcachefs/util.c:267 [inline]
> > > > > bch2_print_string_as_lines+0x2c/0xd4 fs/bcachefs/util.c:286
> > > > > __bch2_fsck_err+0x1864/0x2544 fs/bcachefs/error.c:411
> > > > > bch2_check_fix_ptr fs/bcachefs/buckets.c:112 [inline]
> > > > > bch2_check_fix_ptrs+0x15b8/0x515c fs/bcachefs/buckets.c:266
> > > > > bch2_trigger_extent+0x71c/0x814 fs/bcachefs/buckets.c:856
> > > > > bch2_key_trigger fs/bcachefs/bkey_methods.h:87 [inline]
> > > > > bch2_gc_mark_key+0x4b4/0xb70 fs/bcachefs/btree_gc.c:634
> > > > > bch2_gc_btree fs/bcachefs/btree_gc.c:670 [inline]
> > > > > bch2_gc_btrees fs/bcachefs/btree_gc.c:729 [inline]
> > > > > bch2_check_allocations+0x1018/0x48f4 fs/bcachefs/btree_gc.c:1133
> > > > > bch2_run_recovery_pass+0xe4/0x1d4 fs/bcachefs/recovery_passes.c:191
> > > > > bch2_run_recovery_passes+0x30c/0x73c fs/bcachefs/recovery_passes.c:244
> > > > > bch2_fs_recovery+0x32d8/0x55dc fs/bcachefs/recovery.c:861
> > > > > bch2_fs_start+0x30c/0x53c fs/bcachefs/super.c:1037
> > > > > bch2_fs_get_tree+0x938/0x1030 fs/bcachefs/fs.c:2170
> > > > > vfs_get_tree+0x90/0x28c fs/super.c:1814
> > > > > do_new_mount+0x278/0x900 fs/namespace.c:3507
> > > > > path_mount+0x590/0xe04 fs/namespace.c:3834
> > > > > do_mount fs/namespace.c:3847 [inline]
> > > > > __do_sys_mount fs/namespace.c:4057 [inline]
> > > > > __se_sys_mount fs/namespace.c:4034 [inline]
> > > > > __arm64_sys_mount+0x4d4/0x5ac fs/namespace.c:4034
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #3 (&c->fsck_error_msgs_lock){+.+.}-{4:4}:
> > > > > __mutex_lock_common+0x218/0x28f4 kernel/locking/mutex.c:585
> > > > > __mutex_lock kernel/locking/mutex.c:735 [inline]
> > > > > mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:787
> > > > > __bch2_fsck_err+0x344/0x2544 fs/bcachefs/error.c:282
> > > > > bch2_check_fix_ptr fs/bcachefs/buckets.c:112 [inline]
> > > > > bch2_check_fix_ptrs+0x15b8/0x515c fs/bcachefs/buckets.c:266
> > > > > bch2_trigger_extent+0x71c/0x814 fs/bcachefs/buckets.c:856
> > > > > bch2_key_trigger fs/bcachefs/bkey_methods.h:87 [inline]
> > > > > bch2_gc_mark_key+0x4b4/0xb70 fs/bcachefs/btree_gc.c:634
> > > > > bch2_gc_btree fs/bcachefs/btree_gc.c:670 [inline]
> > > > > bch2_gc_btrees fs/bcachefs/btree_gc.c:729 [inline]
> > > > > bch2_check_allocations+0x1018/0x48f4 fs/bcachefs/btree_gc.c:1133
> > > > > bch2_run_recovery_pass+0xe4/0x1d4 fs/bcachefs/recovery_passes.c:191
> > > > > bch2_run_recovery_passes+0x30c/0x73c fs/bcachefs/recovery_passes.c:244
> > > > > bch2_fs_recovery+0x32d8/0x55dc fs/bcachefs/recovery.c:861
> > > > > bch2_fs_start+0x30c/0x53c fs/bcachefs/super.c:1037
> > > > > bch2_fs_get_tree+0x938/0x1030 fs/bcachefs/fs.c:2170
> > > > > vfs_get_tree+0x90/0x28c fs/super.c:1814
> > > > > do_new_mount+0x278/0x900 fs/namespace.c:3507
> > > > > path_mount+0x590/0xe04 fs/namespace.c:3834
> > > > > do_mount fs/namespace.c:3847 [inline]
> > > > > __do_sys_mount fs/namespace.c:4057 [inline]
> > > > > __se_sys_mount fs/namespace.c:4034 [inline]
> > > > > __arm64_sys_mount+0x4d4/0x5ac fs/namespace.c:4034
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #2 (&c->mark_lock){++++}-{0:0}:
> > > > > percpu_down_read+0x5c/0x2e8 include/linux/percpu-rwsem.h:51
> > > > > __bch2_disk_reservation_add+0xc4/0x9f4 fs/bcachefs/buckets.c:1170
> > > > > bch2_disk_reservation_add+0x29c/0x4f4 fs/bcachefs/buckets.h:367
> > > > > __bch2_folio_reservation_get+0x2dc/0x798 fs/bcachefs/fs-io-pagecache.c:428
> > > > > bch2_folio_reservation_get fs/bcachefs/fs-io-pagecache.c:477 [inline]
> > > > > bch2_page_mkwrite+0xa70/0xe44 fs/bcachefs/fs-io-pagecache.c:637
> > > > > do_page_mkwrite+0x140/0x2dc mm/memory.c:3176
> > > > > wp_page_shared mm/memory.c:3577 [inline]
> > > > > do_wp_page+0x1f50/0x38a0 mm/memory.c:3727
> > > > > handle_pte_fault+0xe44/0x5890 mm/memory.c:5817
> > > > > __handle_mm_fault mm/memory.c:5944 [inline]
> > > > > handle_mm_fault+0xf0c/0x17b0 mm/memory.c:6112
> > > > > do_page_fault+0x404/0x10a8 arch/arm64/mm/fault.c:647
> > > > > do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > el0_da+0x60/0x178 arch/arm64/kernel/entry-common.c:604
> > > > > el0t_64_sync_handler+0xcc/0x108 arch/arm64/kernel/entry-common.c:765
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #1 (sb_pagefaults#4){.+.+}-{0:0}:
> > > > > percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
> > > > > __sb_start_write include/linux/fs.h:1725 [inline]
> > > > > sb_start_pagefault include/linux/fs.h:1890 [inline]
> > > > > bch2_page_mkwrite+0x280/0xe44 fs/bcachefs/fs-io-pagecache.c:614
> > > > > do_page_mkwrite+0x140/0x2dc mm/memory.c:3176
> > > > > wp_page_shared mm/memory.c:3577 [inline]
> > > > > do_wp_page+0x1f50/0x38a0 mm/memory.c:3727
> > > > > handle_pte_fault+0xe44/0x5890 mm/memory.c:5817
> > > > > __handle_mm_fault mm/memory.c:5944 [inline]
> > > > > handle_mm_fault+0xf0c/0x17b0 mm/memory.c:6112
> > > > > do_page_fault+0x404/0x10a8 arch/arm64/mm/fault.c:647
> > > > > do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > el0_da+0x60/0x178 arch/arm64/kernel/entry-common.c:604
> > > > > el0t_64_sync_handler+0xcc/0x108 arch/arm64/kernel/entry-common.c:765
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > -> #0 (&vma->vm_lock->lock){++++}-{4:4}:
> > > > > check_prev_add kernel/locking/lockdep.c:3161 [inline]
> > > > > check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> > > > > validate_chain kernel/locking/lockdep.c:3904 [inline]
> > > > > __lock_acquire+0x34f0/0x7904 kernel/locking/lockdep.c:5226
> > > > > lock_acquire+0x23c/0x724 kernel/locking/lockdep.c:5849
> > > > > down_write+0x50/0xc0 kernel/locking/rwsem.c:1577
> > > > > vma_start_write include/linux/mm.h:769 [inline]
> > > > > vm_flags_set include/linux/mm.h:899 [inline]
> > > > > vm_insert_page+0x2a0/0xab0 mm/memory.c:2241
> > > > > packet_mmap+0x2f8/0x4c8 net/packet/af_packet.c:4680
> > > > > sock_mmap+0x90/0xa8 net/socket.c:1403
> > > > > call_mmap include/linux/fs.h:2183 [inline]
> > > > > mmap_file mm/internal.h:124 [inline]
> > > > > __mmap_new_file_vma mm/vma.c:2291 [inline]
> > > > > __mmap_new_vma mm/vma.c:2355 [inline]
> > > > > __mmap_region+0x1854/0x2180 mm/vma.c:2456
> > > > > mmap_region+0x1f4/0x370 mm/mmap.c:1348
> > > > > do_mmap+0x8b0/0xfd0 mm/mmap.c:496
> > > > > vm_mmap_pgoff+0x1a0/0x38c mm/util.c:580
> > > > > ksys_mmap_pgoff+0x3a4/0x5c8 mm/mmap.c:542
> > > > > __do_sys_mmap arch/arm64/kernel/sys.c:28 [inline]
> > > > > __se_sys_mmap arch/arm64/kernel/sys.c:21 [inline]
> > > > > __arm64_sys_mmap+0xf8/0x110 arch/arm64/kernel/sys.c:21
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > > other info that might help us debug this:
> > > > >
> > > > > Chain exists of:
> > > > > &vma->vm_lock->lock --> &mm->mmap_lock --> &po->pg_vec_lock
> > > > >
> > > > > Possible unsafe locking scenario:
> > > > >
> > > > > CPU0 CPU1
> > > > > ---- ----
> > > > > lock(&po->pg_vec_lock);
> > > > > lock(&mm->mmap_lock);
> > > > > lock(&po->pg_vec_lock);
> > > > > lock(&vma->vm_lock->lock);
> > > > >
> > > > > *** DEADLOCK ***
> > > > >
> > > > > 2 locks held by syz.8.396/8273:
> > > > > #0: ffff0000d6a2cc10 (&mm->mmap_lock){++++}-{4:4}, at: mmap_write_lock_killable include/linux/mmap_lock.h:122 [inline]
> > > > > #0: ffff0000d6a2cc10 (&mm->mmap_lock){++++}-{4:4}, at: vm_mmap_pgoff+0x154/0x38c mm/util.c:578
> > > > > #1: ffff0000d4aa2868 (&po->pg_vec_lock){+.+.}-{4:4}, at: packet_mmap+0x9c/0x4c8 net/packet/af_packet.c:4650
> > > > >
> > > > Given &mm->mmap_lock and &po->pg_vec_lock in same locking order on both sides,
> > > > this deadlock report is bogus. Due to lockdep glitch?
> >
> > What do you mean by "both sides"? Note that, here is the report saying
> > the locks that are already held by the current task, and that current
> > task is going to acquire &vma->vm_lock->lock, so lockdep finds new
> > dependency:
> >
> > &po->pg_vec_lock --> &vma->vm_lock->lock
> >
> > and there will be a circular dependency because (see above) lockdep
> > recorded a dependency chain that:
> >
> > &vma->vm_lock->lock --> ... --> &po->pg_vec_lock
> >
> > >
> > > Yeah, this looks fishy. Note that to write-lock vma->vm_lock (which is
> > > what's done here) a task needs to also hold the mmap_write_lock, so
> > > the above race between CPU0 and CPU1 should not be possible because
> >
> > Note the the dependency chain has 11 locks in it, so the real deadlock
> > scenario may have 11 CPUs involved, and due to the limitation of how we
> > can do pretty-print in kernel log, it's always show two CPUs cases. The
> > real case may be:
> >
> > CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10
> > ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> > lock(&po->pg_vec_lock);
> > lock(&vma->vm_lock->lock);
> > lock(sb_pagefaults#4);
> > lock(&c->mark_lock);
> > lock(&c->fsck_error_msgs_lock);
> > lock(console_lock);
> > lock(&helper->lock);
> > lock(&client->modeset_mutex);
> > lock(crtc_ww_class_acquire);
> > lock(crtc_ww_class_mutex);
> > lock(&mm->mmap_lock);
> > lock(&po->pg_vec_lock);
> > lock(&mm->mmap_lock);
> > lock(crtc_ww_class_mutex);
> > lock(crtc_ww_class_acquire);
> > lock(&client->modeset_mutex);
> > lock(&helper->lock);
> > lock(console_lock);
> > lock(&c->fsck_error_msgs_lock);
> > lock(&c->mark_lock);
> > lock(sb_pagefaults#4);
> > lock(&vma->vm_lock->lock);
> >
>
> OK, in this case CPU0 and CPU10 should be the same (both of the
> dependencies are introduced by the current task), so the scenario should
> be:
>
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9
> ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> lock(&mm->mmap_lock);
> lock(&po->pg_vec_lock);
> lock(&vma->vm_lock->lock);
> lock(sb_pagefaults#4);
> lock(&c->mark_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(console_lock);
> lock(&helper->lock);
> lock(&client->modeset_mutex);
> lock(crtc_ww_class_acquire);
> lock(crtc_ww_class_mutex);
> lock(&mm->mmap_lock);
> lock(crtc_ww_class_mutex);
> lock(crtc_ww_class_acquire);
> lock(&client->modeset_mutex);
> lock(&helper->lock);
> lock(console_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(&c->mark_lock);
> lock(sb_pagefaults#4);
> lock(&vma->vm_lock->lock);
>
> and CPU0 is running the current task.
>
> And CPU1, CPU2 can be the same CPU because the lock dependencies
> (&vma->vm_lock->lock --> sb_pagefaults#4 and sb_pagefaults#4 -->
> &c->mark_lock) are both introduced in a do_page_fault() according to the
> existing dependency chain info #1 and #2 above, therefore, it's
> simpified into:
>
> CPU0 CPU1 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9
> ---- ---- ---- ---- ---- ---- ---- ---- ----
> (current task)
> lock(&mm->mmap_lock);
> lock(&po->pg_vec_lock);
> lock(&vma->vm_lock->lock);
> lock(sb_pagefaults#4);
> lock(&c->mark_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(console_lock);
> lock(&helper->lock);
> lock(&client->modeset_mutex);
> lock(crtc_ww_class_acquire);
> lock(crtc_ww_class_mutex);
> lock(&mm->mmap_lock);
> lock(crtc_ww_class_mutex);
> lock(crtc_ww_class_acquire);
> lock(&client->modeset_mutex);
> lock(&helper->lock);
> lock(console_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(&c->mark_lock);
> lock(&vma->vm_lock->lock);
>
> and CPU1 is running do_page_fault().
>
> Similarly CPU3 and CPU4 can be the same CPU due to depenedency chain
> info #3 and #4 above, so:
>
> CPU0 CPU1 CPU3 CPU5 CPU6 CPU7 CPU8 CPU9
> ---- ---- ---- ---- ---- ---- ---- ----
> [current task]
> lock(&mm->mmap_lock);
> lock(&po->pg_vec_lock);
> [in do_page_fault()]
> lock(&vma->vm_lock->lock);
> lock(sb_pagefaults#4);
> lock(&c->mark_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(console_lock);
> lock(&helper->lock);
> lock(&client->modeset_mutex);
> lock(crtc_ww_class_acquire);
> lock(crtc_ww_class_mutex);
> lock(&mm->mmap_lock);
> lock(crtc_ww_class_mutex);
> lock(crtc_ww_class_acquire);
> lock(&client->modeset_mutex);
> lock(&helper->lock);
> lock(console_lock);
> lock(&c->mark_lock);
> lock(&vma->vm_lock->lock);
>
> and CPU3 is doing bch2_check_fix_ptrs().
>
> CPU5, CPU6, CPU7 and CPU8 can be the same CPU due to depenedency chain
> info #5, #6, #7 and #8:
>
> CPU0 CPU1 CPU3 CPU5 CPU9
> ---- ---- ---- ---- ----
> [current task]
> lock(&mm->mmap_lock);
> lock(&po->pg_vec_lock);
> [in do_page_fault()]
> lock(&vma->vm_lock->lock);
> lock(sb_pagefaults#4);
> [in bch2_check_fix_ptrs()]
> lock(&c->mark_lock);
> lock(&c->fsck_error_msgs_lock);
> lock(console_lock);
> lock(&helper->lock);
> lock(&client->modeset_mutex);
> lock(crtc_ww_class_acquire);
> lock(crtc_ww_class_mutex);
> lock(&mm->mmap_lock);
> lock(crtc_ww_class_mutex);
> lock(console_lock);
> lock(&c->mark_lock);
> lock(&vma->vm_lock->lock);
>
> and CPU5 is doing vkms_init(). May need some help to find where the
> console_lock() was acquired in this path, because I wasn't able to find
> one.
>
> And based on depenedency chain info #9, CPU9 is doing
> drm_mode_obj_get_properties_ioctl():
>
> CPU0 CPU1 CPU3 CPU5 CPU9
> ---- ---- ---- ---- ----
> [current task]
>
> lock(&mm->mmap_lock);
> lock(&po->pg_vec_lock);
> [in do_page_fault()]
>
> lock(&vma->vm_lock->lock);
> lock(sb_pagefaults#4);
> [in bch2_check_fix_ptrs()]
>
> lock(&c->mark_lock);
> lock(&c->fsck_error_msgs_lock);
> [in vkms_init()]
>
> lock(console_lock);
> lock(&helper->lock);
> lock(&client->modeset_mutex);
> lock(crtc_ww_class_acquire);
> [in drm_mode_obj_get_properties_ioctl()]
>
> lock(crtc_ww_class_mutex);
> lock(&mm->mmap_lock);
> lock(crtc_ww_class_mutex);
> lock(console_lock);
> lock(&c->mark_lock);
> lock(&vma->vm_lock->lock);
Thanks for breaking down this more. Overall this sequence of:
lock(&mm->mmap_lock);
lock(&po->pg_vec_lock);
lock(&vma->vm_lock->lock);
looks scary to me because as you said pagefault path takes
vma->vm_lock and po->pg_vec_lock in the reverse order. So I tool a
closer look into the path that does that. I believe it's this one:
-> #10 (&po->pg_vec_lock){+.+.}-{4:4}:
__mutex_lock_common+0x218/0x28f4 kernel/locking/mutex.c:585
__mutex_lock kernel/locking/mutex.c:735 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:787
packet_mmap+0x9c/0x4c8 net/packet/af_packet.c:4650
sock_mmap+0x90/0xa8 net/socket.c:1403
call_mmap include/linux/fs.h:2183 [inline]
mmap_file mm/internal.h:124 [inline]
__mmap_new_file_vma mm/vma.c:2291 [inline]
__mmap_new_vma mm/vma.c:2355 [inline]
__mmap_region+0x1854/0x2180 mm/vma.c:2456
mmap_region+0x1f4/0x370 mm/mmap.c:1348
do_mmap+0x8b0/0xfd0 mm/mmap.c:496
And indeed __mmap_new_vma() calls __mmap_new_file_vma() (which
eventually locks pg_vec_lock) before it write-locks the vma here:
https://elixir.bootlin.com/linux/v6.13-rc3/source/mm/vma.c#L2370.
Normally we would want to lock the vma first before making vma
changes, however note that this is a brand new vma not yet added into
the vma tree. We write-lock the vma right before we add it into the
tree here: https://elixir.bootlin.com/linux/v6.13-rc3/source/mm/vma.c#L2371
So, pagefault path would not really care about this VMA because by the
time it becomes visible for pagefaults to use it, it's already
write-locked, so vma_start_read() will fail to read-lock it.
If this was a vma visible to pagefaults then the locking sequence
would have to be enforced so that vma is write-locked before any
changes are done to it (and any other locks are taken).
Thanks,
Suren.
> Cc bcachefs, drm for the awareness.
>
> Regards,
> Boqun
>
> > (of course, it could happen with less CPUs and it could also be a false
> > positive, but the depenedency chain is real)
> >
> > Also a quick look seems to suggest that the lock dependency on CPU 1:
> >
> > lock(&vma->vm_lock->lock);
> > lock(sb_pagefaults#4);
> >
> > can happen in a page fault with a reader of &vma->vm_lock->lock.
> >
> > do_page_fault():
> > lock_vma_under_rcu():
> > vma_start_read():
> > down_read_trylock(); // read lock &vma->vm_lock_lock here.
> > ...
> > handle_mm_fault():
> > sb_start_pagefault(); // lock(sb_pagefaults#4);
> >
> > if so, an existing reader can block the other writer, so I don't think
> > the mmap_lock write protection can help here.
> >
> >
> > It's bit late for me to take a deep look, will continue tomorrow. So far
> > the story seems to be:
> >
> > * Page fault can connect &vma->vm_lock->lock with &c->mark_lock.
> >
> > * Some bcachefs internal can connect &c->mark_lock with console_lock.
> >
> > * Some drm internal can connect console_lock with drm internal
> > locks (e.g. crtc_ww_class_mutex) because of fbcon.
> >
> > * (not sure) drm may trigger a page fault (because of put_user())
> > with some internal locks held. This will connect
> > crtc_ww_class_mutex with &mm->mmap_lock.
> >
> > * And eventually normal mm operations will connect &mm->mmap_lock
> > with &vma->vm_lock->lock.
> >
> > Regards,
> > Boqun
> >
> >
> > > they will synchronize on the mmap_lock before locking vm_lock or
> > > pg_vec_lock.
> > >
> > > >
> > > > > stack backtrace:
> > > > > CPU: 0 UID: 0 PID: 8273 Comm: syz.8.396 Not tainted 6.13.0-rc3-syzkaller-g573067a5a685 #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> > > > > Call trace:
> > > > > show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:466 (C)
> > > > > __dump_stack lib/dump_stack.c:94 [inline]
> > > > > dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:120
> > > > > dump_stack+0x1c/0x28 lib/dump_stack.c:129
> > > > > print_circular_bug+0x154/0x1c0 kernel/locking/lockdep.c:2074
> > > > > check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2206
> > > > > check_prev_add kernel/locking/lockdep.c:3161 [inline]
> > > > > check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> > > > > validate_chain kernel/locking/lockdep.c:3904 [inline]
> > > > > __lock_acquire+0x34f0/0x7904 kernel/locking/lockdep.c:5226
> > > > > lock_acquire+0x23c/0x724 kernel/locking/lockdep.c:5849
> > > > > down_write+0x50/0xc0 kernel/locking/rwsem.c:1577
> > > > > vma_start_write include/linux/mm.h:769 [inline]
> > > > > vm_flags_set include/linux/mm.h:899 [inline]
> > > > > vm_insert_page+0x2a0/0xab0 mm/memory.c:2241
> > > > > packet_mmap+0x2f8/0x4c8 net/packet/af_packet.c:4680
> > > > > sock_mmap+0x90/0xa8 net/socket.c:1403
> > > > > call_mmap include/linux/fs.h:2183 [inline]
> > > > > mmap_file mm/internal.h:124 [inline]
> > > > > __mmap_new_file_vma mm/vma.c:2291 [inline]
> > > > > __mmap_new_vma mm/vma.c:2355 [inline]
> > > > > __mmap_region+0x1854/0x2180 mm/vma.c:2456
> > > > > mmap_region+0x1f4/0x370 mm/mmap.c:1348
> > > > > do_mmap+0x8b0/0xfd0 mm/mmap.c:496
> > > > > vm_mmap_pgoff+0x1a0/0x38c mm/util.c:580
> > > > > ksys_mmap_pgoff+0x3a4/0x5c8 mm/mmap.c:542
> > > > > __do_sys_mmap arch/arm64/kernel/sys.c:28 [inline]
> > > > > __se_sys_mmap arch/arm64/kernel/sys.c:21 [inline]
> > > > > __arm64_sys_mmap+0xf8/0x110 arch/arm64/kernel/sys.c:21
> > > > > __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> > > > > invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> > > > > el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> > > > > do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> > > > > el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
> > > > > el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
> > > > > el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
> > > > >
> > > > >
> > > > > ---
> > > > > This report is generated by a bot. It may contain errors.
> > > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > > syzbot engineers can be reached at syzkaller at googlegroups.com.
> > > > >
> > > > > syzbot will keep track of this issue. See:
> > > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > > >
> > > > > If the report is already addressed, let syzbot know by replying with:
> > > > > #syz fix: exact-commit-title
> > > > >
> > > > > If you want to overwrite report's subsystems, reply with:
> > > > > #syz set subsystems: new-subsystem
> > > > > (See the list of subsystem names on the web dashboard)
> > > > >
> > > > > If the report is a duplicate of another one, reply with:
> > > > > #syz dup: exact-subject-of-another-report
> > > > >
> > > > > If you want to undo deduplication, reply with:
> > > > > #syz undup
> > > > >
> > > >
More information about the dri-devel
mailing list