[Intel-gfx] NPE in i915_gemfs_init

Heiner Kallweit hkallweit1 at gmail.com
Sun Jul 14 12:39:15 UTC 2019


On 14.07.2019 14:34, Chris Wilson wrote:
> Quoting Heiner Kallweit (2019-07-13 12:12:56)
>> On 13.07.2019 12:38, Heiner Kallweit wrote:
>>> For the past few days I've been getting the following on an N3450-based headless system with linux-next.
>>> linux-next from Jul 4th was still ok.
>>> Is this a known issue?
>>>
>>> [    4.818139] BUG: kernel NULL pointer dereference, address: 0000000000000000
>>> [    4.818165] #PF: supervisor instruction fetch in kernel mode
>>> [    4.818178] #PF: error_code(0x0010) - not-present page
>>> [    4.818192] PGD 0 P4D 0
>>> [    4.818203] Oops: 0010 [#1] SMP
>>> [    4.818214] CPU: 2 PID: 2008 Comm: systemd-udevd Not tainted 5.2.0-next-20190712 #1
>>> [    4.818232] Hardware name: NA ZBOX-CI327NANO-GS-01/ZBOX-CI327NANO-GS-01, BIOS 5.12 04/26/2018
>>> [    4.818253] RIP: 0010:0x0
>>> [    4.818265] Code: Bad RIP value.
>>> [    4.818275] RSP: 0018:ffffacd84023f918 EFLAGS: 00010287
>>> [    4.818288] RAX: 0000000000000000 RBX: ffff8b11f7f60000 RCX: 00000000000000ae
>>> [    4.818304] RDX: ffffacd84023f92d RSI: ffffacd84023f928 RDI: ffff8b11f884a000
>>> [    4.818320] RBP: ffffacd84023f950 R08: 0000000000000001 R09: 0000000000000000
>>> [    4.818336] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11f882c7a0
>>> [    4.818352] R13: 0000000000000000 R14: ffff8b11f7f60000 R15: ffffffffc06b71e0
>>> [    4.818369] FS:  00007fb45c549840(0000) GS:ffff8b11fbb00000(0000) knlGS:0000000000000000
>>> [    4.818387] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    4.818404] CR2: ffffffffffffffd6 CR3: 0000000177d4a000 CR4: 00000000003406e0
>>> [    4.818419] Call Trace:
>>> [    4.818508]  i915_gemfs_init+0x80/0xc0 [i915]
>>> [    4.818582]  i915_gem_init_early+0x126/0x140 [i915]
>>> [    4.818647]  i915_driver_load+0x362/0x1740 [i915]
>>> [    4.818663]  ? find_held_lock+0x37/0x90
>>> [    4.818675]  ? _raw_spin_unlock_irqrestore+0x45/0x50
>>> [    4.818689]  ? __pm_runtime_resume+0x5e/0x90
>>> [    4.818701]  ? lockdep_hardirqs_on+0xf2/0x180
>>> [    4.818713]  ? _raw_spin_unlock_irqrestore+0x45/0x50
>>> [    4.818778]  i915_pci_probe+0x45/0x120 [i915]
>>> [    4.818792]  pci_device_probe+0xab/0x120
>>> [    4.818804]  really_probe+0xf4/0x290
>>> [    4.818815]  driver_probe_device+0x53/0xa0
>>> [    4.818826]  device_driver_attach+0x59/0x60
>>> [    4.818838]  __driver_attach+0x53/0xc0
>>> [    4.818849]  ? device_driver_attach+0x60/0x60
>>> [    4.818862]  bus_for_each_dev+0x82/0xd0
>>> [    4.818873]  driver_attach+0x1f/0x30
>>> [    4.818883]  bus_add_driver+0x174/0x1c0
>>> [    4.818895]  driver_register+0x71/0xc0
>>> [    4.818906]  __pci_register_driver+0x73/0x80
>>> [    4.818971]  i915_init+0x5c/0x67 [i915]
>>> [    4.818982]  ? 0xffffffffc06fb000
>>> [    4.818993]  do_one_initcall+0x5f/0x2e5
>>> [    4.819005]  ? do_init_module+0x23/0x220
>>> [    4.819018]  ? rcu_read_lock_sched_held+0x76/0x80
>>> [    4.819032]  ? kmem_cache_alloc_trace+0x234/0x260
>>> [    4.819045]  do_init_module+0x5d/0x220
>>> [    4.819056]  load_module+0x2220/0x2560
>>> [    4.819068]  ? kernel_read+0x52/0x80
>>> [    4.819079]  __do_sys_finit_module+0xda/0x100
>>> [    4.819091]  ? __do_sys_finit_module+0xda/0x100
>>> [    4.819104]  __x64_sys_finit_module+0x19/0x20
>>> [    4.819116]  do_syscall_64+0x50/0x1a0
>>> [    4.819127]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> [    4.819140] RIP: 0033:0x7fb45dd2ce3d
>>> [    4.819152] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 50 0c 00 f7 d8 64 89 01 48
>>> [    4.819190] RSP: 002b:00007fff3b90ee48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> [    4.819208] RAX: ffffffffffffffda RBX: 00005586f27eccb0 RCX: 00007fb45dd2ce3d
>>> [    4.819224] RDX: 0000000000000000 RSI: 00007fb45d9a284d RDI: 0000000000000015
>>> [    4.819240] RBP: 00007fb45d9a284d R08: 0000000000000000 R09: 0000000000000001
>>> [    4.819256] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000000000
>>> [    4.819272] R13: 00005586f27e23b0 R14: 0000000000020000 R15: 00005586f27eccb0
>>> [    4.819289] Modules linked in: aes_x86_64(+) glue_helper crypto_simd i915(+) cryptd snd_hda_intel intel_gtt i2c_algo_bit drm_kms_helper syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core fb_sys_fops r8169 i2c_i801 snd_pcm realtek snd_timer libphy drm snd mei_me mei sch_fq_codel crypto_user efivarfs ipv6 serio_raw atkbd libps2 i8042 serio ums_realtek ext4 crc32c_intel mbcache jbd2 ahci libahci libata
>>> [    4.819387] CR2: 0000000000000000
>>> [    4.819399] ---[ end trace 531c4d73e2bf857e ]---
>>> [    4.819412] RIP: 0010:0x0
>>> [    4.819424] Code: Bad RIP value.
>>> [    4.819434] RSP: 0018:ffffacd84023f918 EFLAGS: 00010287
>>> [    4.819447] RAX: 0000000000000000 RBX: ffff8b11f7f60000 RCX: 00000000000000ae
>>> [    4.819463] RDX: ffffacd84023f92d RSI: ffffacd84023f928 RDI: ffff8b11f884a000
>>> [    4.819479] RBP: ffffacd84023f950 R08: 0000000000000001 R09: 0000000000000000
>>> [    4.819495] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11f882c7a0
>>> [    4.819511] R13: 0000000000000000 R14: ffff8b11f7f60000 R15: ffffffffc06b71e0
>>> [    4.819527] FS:  00007fb45c549840(0000) GS:ffff8b11fbb00000(0000) knlGS:0000000000000000
>>> [    4.819545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    4.819559] CR2: ffffffffffffffd6 CR3: 0000000177d4a000 CR4: 00000000003406e0
>>> [    4.819576] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:38
>>> [    4.819596] in_atomic(): 0, irqs_disabled(): 1, pid: 2008, name: systemd-udevd
>>> [    4.819613] INFO: lockdep is turned off.
>>> [    4.819623] irq event stamp: 31896
>>> [    4.819637] hardirqs last  enabled at (31895): [<ffffffff9f40e875>] kfree+0xc5/0x2a0
>>> [    4.819656] hardirqs last disabled at (31896): [<ffffffff9f201c3d>] trace_hardirqs_off_thunk+0x1a/0x1c
>>> [    4.819679] softirqs last  enabled at (31866): [<ffffffff9fc00327>] __do_softirq+0x327/0x424
>>> [    4.819700] softirqs last disabled at (31859): [<ffffffff9f26f8b3>] irq_exit+0xb3/0xc0
>>> [    4.819719] CPU: 2 PID: 2008 Comm: systemd-udevd Tainted: G      D           5.2.0-next-20190712 #1
>>> [    4.819739] Hardware name: NA ZBOX-CI327NANO-GS-01/ZBOX-CI327NANO-GS-01, BIOS 5.12 04/26/2018
>>> [    4.819758] Call Trace:
>>> [    4.819770]  dump_stack+0x70/0xa0
>>> [    4.819782]  ___might_sleep.cold+0x9f/0xb0
>>> [    4.819794]  __might_sleep+0x46/0x80
>>> [    4.819805]  exit_signals+0x2f/0x330
>>> [    4.819816]  do_exit+0xb3/0xb60
>>> [    4.819827]  rewind_stack_do_exit+0x17/0x20
>>> [    4.819838] RIP: 0033:0x7fb45dd2ce3d
>>> [    4.819849] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 50 0c 00 f7 d8 64 89 01 48
>>> [    4.819887] RSP: 002b:00007fff3b90ee48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> [    4.819905] RAX: ffffffffffffffda RBX: 00005586f27eccb0 RCX: 00007fb45dd2ce3d
>>> [    4.819921] RDX: 0000000000000000 RSI: 00007fb45d9a284d RDI: 0000000000000015
>>> [    4.819937] RBP: 00007fb45d9a284d R08: 0000000000000000 R09: 0000000000000001
>>> [    4.819953] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000000000
>>> [    4.819969] R13: 00005586f27e23b0 R14: 0000000000020000 R15: 00005586f27eccb0
>>>
>>
>> I debugged a little bit and remount_fs isn't set in sb->s_op.
>> The following at least avoids the NPE, not sure whether it's the correct fix.
> 
> I take it you don't have CONFIG_TMPFS set?
> 
This option is set:

[root@zotac linux-next]# grep TMPFS .config
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y

> In which case we can't use the remount_fs trick and we can't pass
> options to kern_mount(). We could however just set the option directly
> in our superblock -- sadly, the defines are private so we will just have
> to hope they don't change :)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
> index 099f3397aada..5910315f2069 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
> @@ -7,6 +7,7 @@
>  #include <linux/fs.h>
>  #include <linux/mount.h>
>  #include <linux/pagemap.h>
> +#include <linux/shmem_fs.h>
> 
>  #include "i915_drv.h"
>  #include "i915_gemfs.h"
> @@ -33,17 +34,10 @@ int i915_gemfs_init(struct drm_i915_private *i915)
>          */
> 
>         if (has_transparent_hugepage()) {
> -               struct super_block *sb = gemfs->mnt_sb;
> +               struct shmem_sb_info *sb_info = gemfs->mnt_sb->s_fs_info;
> +
>                 /* FIXME: Disabled until we get W/A for read BW issue. */
> -               char options[] = "huge=never";
> -               int flags = 0;
> -               int err;
> -
> -               err = sb->s_op->remount_fs(sb, &flags, options);
> -               if (err) {
> -                       kern_unmount(gemfs);
> -                       return err;
> -               }
> +               sb_info->huge = 0; /* SHMEM_HUGE_NEVER */
>         }
> 
>         i915->mm.gemfs = gemfs;
> 
> -Chris
> 
Heiner
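
For illustration, the failure mode in the trace (RIP 0x0 on an instruction
fetch) is what a call through an unset function pointer looks like. Below is
a minimal user-space sketch of guarding such an optional callback and falling
back to setting the field directly, as the patch above does. All names here
(model_sb, model_sb_ops, model_remount, model_disable_huge) are hypothetical
stand-ins, not kernel API, and the -ENOSYS return value is an assumption made
for the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct super_block and its ops table. */
struct model_sb;

struct model_sb_ops {
	/* Optional callback: may be NULL, as remount_fs was in the report. */
	int (*remount)(struct model_sb *sb, const char *options);
};

struct model_sb {
	const struct model_sb_ops *ops;
	int huge; /* 0 models SHMEM_HUGE_NEVER, per the comment in the patch */
};

/* Invoke the optional callback only when it is actually populated. */
static int model_remount(struct model_sb *sb, const char *options)
{
	if (!sb->ops || !sb->ops->remount)
		return -ENOSYS; /* assumed error code for "not supported" */
	return sb->ops->remount(sb, options);
}

/* Try the callback first; if that fails, write the field directly. */
static void model_disable_huge(struct model_sb *sb)
{
	if (model_remount(sb, "huge=never") != 0)
		sb->huge = 0;
}
```

This only models the control flow. Whether a guard like this or the direct
assignment is the right fix for i915_gemfs_init is for the maintainers to
decide.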

