[Intel-gfx] NPE in i915_gemfs_init
Heiner Kallweit
hkallweit1 at gmail.com
Sat Jul 13 11:12:56 UTC 2019
On 13.07.2019 12:38, Heiner Kallweit wrote:
> For the last few days I've been getting the following on an N3450-based headless system with linux-next.
> linux-next from Jul 4th was still ok.
> Is this a known issue?
>
> [ 4.818139] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 4.818165] #PF: supervisor instruction fetch in kernel mode
> [ 4.818178] #PF: error_code(0x0010) - not-present page
> [ 4.818192] PGD 0 P4D 0
> [ 4.818203] Oops: 0010 [#1] SMP
> [ 4.818214] CPU: 2 PID: 2008 Comm: systemd-udevd Not tainted 5.2.0-next-20190712 #1
> [ 4.818232] Hardware name: NA ZBOX-CI327NANO-GS-01/ZBOX-CI327NANO-GS-01, BIOS 5.12 04/26/2018
> [ 4.818253] RIP: 0010:0x0
> [ 4.818265] Code: Bad RIP value.
> [ 4.818275] RSP: 0018:ffffacd84023f918 EFLAGS: 00010287
> [ 4.818288] RAX: 0000000000000000 RBX: ffff8b11f7f60000 RCX: 00000000000000ae
> [ 4.818304] RDX: ffffacd84023f92d RSI: ffffacd84023f928 RDI: ffff8b11f884a000
> [ 4.818320] RBP: ffffacd84023f950 R08: 0000000000000001 R09: 0000000000000000
> [ 4.818336] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11f882c7a0
> [ 4.818352] R13: 0000000000000000 R14: ffff8b11f7f60000 R15: ffffffffc06b71e0
> [ 4.818369] FS: 00007fb45c549840(0000) GS:ffff8b11fbb00000(0000) knlGS:0000000000000000
> [ 4.818387] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4.818404] CR2: ffffffffffffffd6 CR3: 0000000177d4a000 CR4: 00000000003406e0
> [ 4.818419] Call Trace:
> [ 4.818508] i915_gemfs_init+0x80/0xc0 [i915]
> [ 4.818582] i915_gem_init_early+0x126/0x140 [i915]
> [ 4.818647] i915_driver_load+0x362/0x1740 [i915]
> [ 4.818663] ? find_held_lock+0x37/0x90
> [ 4.818675] ? _raw_spin_unlock_irqrestore+0x45/0x50
> [ 4.818689] ? __pm_runtime_resume+0x5e/0x90
> [ 4.818701] ? lockdep_hardirqs_on+0xf2/0x180
> [ 4.818713] ? _raw_spin_unlock_irqrestore+0x45/0x50
> [ 4.818778] i915_pci_probe+0x45/0x120 [i915]
> [ 4.818792] pci_device_probe+0xab/0x120
> [ 4.818804] really_probe+0xf4/0x290
> [ 4.818815] driver_probe_device+0x53/0xa0
> [ 4.818826] device_driver_attach+0x59/0x60
> [ 4.818838] __driver_attach+0x53/0xc0
> [ 4.818849] ? device_driver_attach+0x60/0x60
> [ 4.818862] bus_for_each_dev+0x82/0xd0
> [ 4.818873] driver_attach+0x1f/0x30
> [ 4.818883] bus_add_driver+0x174/0x1c0
> [ 4.818895] driver_register+0x71/0xc0
> [ 4.818906] __pci_register_driver+0x73/0x80
> [ 4.818971] i915_init+0x5c/0x67 [i915]
> [ 4.818982] ? 0xffffffffc06fb000
> [ 4.818993] do_one_initcall+0x5f/0x2e5
> [ 4.819005] ? do_init_module+0x23/0x220
> [ 4.819018] ? rcu_read_lock_sched_held+0x76/0x80
> [ 4.819032] ? kmem_cache_alloc_trace+0x234/0x260
> [ 4.819045] do_init_module+0x5d/0x220
> [ 4.819056] load_module+0x2220/0x2560
> [ 4.819068] ? kernel_read+0x52/0x80
> [ 4.819079] __do_sys_finit_module+0xda/0x100
> [ 4.819091] ? __do_sys_finit_module+0xda/0x100
> [ 4.819104] __x64_sys_finit_module+0x19/0x20
> [ 4.819116] do_syscall_64+0x50/0x1a0
> [ 4.819127] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 4.819140] RIP: 0033:0x7fb45dd2ce3d
> [ 4.819152] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 50 0c 00 f7 d8 64 89 01 48
> [ 4.819190] RSP: 002b:00007fff3b90ee48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> [ 4.819208] RAX: ffffffffffffffda RBX: 00005586f27eccb0 RCX: 00007fb45dd2ce3d
> [ 4.819224] RDX: 0000000000000000 RSI: 00007fb45d9a284d RDI: 0000000000000015
> [ 4.819240] RBP: 00007fb45d9a284d R08: 0000000000000000 R09: 0000000000000001
> [ 4.819256] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000000000
> [ 4.819272] R13: 00005586f27e23b0 R14: 0000000000020000 R15: 00005586f27eccb0
> [ 4.819289] Modules linked in: aes_x86_64(+) glue_helper crypto_simd i915(+) cryptd snd_hda_intel intel_gtt i2c_algo_bit drm_kms_helper syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core fb_sys_fops r8169 i2c_i801 snd_pcm realtek snd_timer libphy drm snd mei_me mei sch_fq_codel crypto_user efivarfs ipv6 serio_raw atkbd libps2 i8042 serio ums_realtek ext4 crc32c_intel mbcache jbd2 ahci libahci libata
> [ 4.819387] CR2: 0000000000000000
> [ 4.819399] ---[ end trace 531c4d73e2bf857e ]---
> [ 4.819412] RIP: 0010:0x0
> [ 4.819424] Code: Bad RIP value.
> [ 4.819434] RSP: 0018:ffffacd84023f918 EFLAGS: 00010287
> [ 4.819447] RAX: 0000000000000000 RBX: ffff8b11f7f60000 RCX: 00000000000000ae
> [ 4.819463] RDX: ffffacd84023f92d RSI: ffffacd84023f928 RDI: ffff8b11f884a000
> [ 4.819479] RBP: ffffacd84023f950 R08: 0000000000000001 R09: 0000000000000000
> [ 4.819495] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11f882c7a0
> [ 4.819511] R13: 0000000000000000 R14: ffff8b11f7f60000 R15: ffffffffc06b71e0
> [ 4.819527] FS: 00007fb45c549840(0000) GS:ffff8b11fbb00000(0000) knlGS:0000000000000000
> [ 4.819545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4.819559] CR2: ffffffffffffffd6 CR3: 0000000177d4a000 CR4: 00000000003406e0
> [ 4.819576] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:38
> [ 4.819596] in_atomic(): 0, irqs_disabled(): 1, pid: 2008, name: systemd-udevd
> [ 4.819613] INFO: lockdep is turned off.
> [ 4.819623] irq event stamp: 31896
> [ 4.819637] hardirqs last enabled at (31895): [<ffffffff9f40e875>] kfree+0xc5/0x2a0
> [ 4.819656] hardirqs last disabled at (31896): [<ffffffff9f201c3d>] trace_hardirqs_off_thunk+0x1a/0x1c
> [ 4.819679] softirqs last enabled at (31866): [<ffffffff9fc00327>] __do_softirq+0x327/0x424
> [ 4.819700] softirqs last disabled at (31859): [<ffffffff9f26f8b3>] irq_exit+0xb3/0xc0
> [ 4.819719] CPU: 2 PID: 2008 Comm: systemd-udevd Tainted: G D 5.2.0-next-20190712 #1
> [ 4.819739] Hardware name: NA ZBOX-CI327NANO-GS-01/ZBOX-CI327NANO-GS-01, BIOS 5.12 04/26/2018
> [ 4.819758] Call Trace:
> [ 4.819770] dump_stack+0x70/0xa0
> [ 4.819782] ___might_sleep.cold+0x9f/0xb0
> [ 4.819794] __might_sleep+0x46/0x80
> [ 4.819805] exit_signals+0x2f/0x330
> [ 4.819816] do_exit+0xb3/0xb60
> [ 4.819827] rewind_stack_do_exit+0x17/0x20
> [ 4.819838] RIP: 0033:0x7fb45dd2ce3d
> [ 4.819849] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 50 0c 00 f7 d8 64 89 01 48
> [ 4.819887] RSP: 002b:00007fff3b90ee48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> [ 4.819905] RAX: ffffffffffffffda RBX: 00005586f27eccb0 RCX: 00007fb45dd2ce3d
> [ 4.819921] RDX: 0000000000000000 RSI: 00007fb45d9a284d RDI: 0000000000000015
> [ 4.819937] RBP: 00007fb45d9a284d R08: 0000000000000000 R09: 0000000000000001
> [ 4.819953] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000000000
> [ 4.819969] R13: 00005586f27e23b0 R14: 0000000000020000 R15: 00005586f27eccb0
>
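For reference, the "#PF" lines in the trace above can be decoded by hand. Below is a minimal sketch in plain userspace C, assuming only the architectural x86 page-fault error code layout. The macro and function names (PF_*, decode_pf, describe_pf) are made up for illustration and are not kernel identifiers. With error code 0x0010 and RIP 0x0 it reads as a kernel-mode instruction fetch from a not-present page, which is what a call through a NULL function pointer looks like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Architectural x86 page-fault error code bits (names are illustrative). */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access, 1: write access */
#define PF_USER  (1u << 2) /* 0: supervisor mode, 1: user mode */
#define PF_RSVD  (1u << 3) /* reserved bit set in a paging structure */
#define PF_INSTR (1u << 4) /* fault happened on an instruction fetch */

struct pf_info {
	bool present;
	bool write;
	bool user;
	bool rsvd;
	bool instr_fetch;
};

/* Split the raw error code into its individual flags. */
static struct pf_info decode_pf(unsigned int error_code)
{
	struct pf_info info = {
		.present     = (error_code & PF_PROT) != 0,
		.write       = (error_code & PF_WRITE) != 0,
		.user        = (error_code & PF_USER) != 0,
		.rsvd        = (error_code & PF_RSVD) != 0,
		.instr_fetch = (error_code & PF_INSTR) != 0,
	};
	return info;
}

/* Format a one-line summary similar in spirit to the oops output. */
static void describe_pf(unsigned int error_code, char *buf, size_t len)
{
	struct pf_info info = decode_pf(error_code);
	const char *mode = info.user ? "user" : "supervisor";
	const char *access = info.instr_fetch ? "instruction fetch" :
			     info.write ? "write access" : "read access";
	const char *cause = info.present ? "protection violation" :
					   "not-present page";

	snprintf(buf, len, "%s %s, %s", mode, access, cause);
}
```

Calling describe_pf(0x0010, buf, sizeof(buf)) fills buf with a summary that matches the two "#PF" lines of the report.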
I debugged this a little: remount_fs isn't set in sb->s_op, so i915_gemfs_init() ends up calling a NULL function pointer.
The following at least avoids the NULL pointer dereference; I'm not sure whether it's the correct fix.
diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index 099f3397a..a80903d01 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -14,6 +14,7 @@
 int i915_gemfs_init(struct drm_i915_private *i915)
 {
 	struct file_system_type *type;
+	struct super_block *sb;
 	struct vfsmount *gemfs;
 
 	type = get_fs_type("tmpfs");
@@ -32,8 +33,9 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 	 * shrunk.
 	 */
 
-	if (has_transparent_hugepage()) {
-		struct super_block *sb = gemfs->mnt_sb;
+	sb = gemfs->mnt_sb;
+
+	if (has_transparent_hugepage() && sb->s_op->remount_fs) {
 		/* FIXME: Disabled until we get W/A for read BW issue. */
 		char options[] = "huge=never";
 		int flags = 0;
--
2.22.0
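
To illustrate the idea behind the check in the patch, here is a small standalone sketch in userspace C, not kernel code. The types sb_ops and sb are hypothetical stand-ins for the kernel's super_operations and super_block, and try_remount and fake_remount are names invented for this example. The point is only that an optional callback in an ops table has to be tested before it is called.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an ops table with an optional callback. */
struct sb_ops {
	int (*remount_fs)(void *sb, int *flags, char *data); /* may be NULL */
};

/* Hypothetical stand-in for an object that carries such an ops table. */
struct sb {
	const struct sb_ops *s_op;
};

/* Counts invocations of the example callback below. */
static int remount_calls;

/* Example callback: does nothing except record that it was called. */
static int fake_remount(void *sb, int *flags, char *data)
{
	(void)sb;
	(void)flags;
	(void)data;
	remount_calls++;
	return 0;
}

/*
 * Call the optional callback only when it is present.
 * Returns -EOPNOTSUPP instead of jumping to address 0 when it is missing.
 */
static int try_remount(struct sb *sb, int *flags, char *data)
{
	if (!sb || !sb->s_op || !sb->s_op->remount_fs)
		return -EOPNOTSUPP;

	return sb->s_op->remount_fs(sb, flags, data);
}
```

With an ops table whose remount_fs is NULL, try_remount() returns -EOPNOTSUPP and no call is made. With remount_fs set to fake_remount, the callback runs once and its return value is passed through. The patch above takes the simpler route of skipping the whole block when the callback is missing.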