[Intel-gfx] NULL ptr dereference in current i915 driver
Michel Thierry
michel.thierry at intel.com
Wed Apr 22 09:11:37 PDT 2015
On 4/22/2015 12:36 AM, Linus Torvalds wrote:
> So I just got the appended NULL pointer de-reference when trying to
> look at a video from my GoPro.
>
> The code disassembles to
>
> 0: 81 fb 00 04 00 00 cmp $0x400,%ebx
> 6: 41 89 07 mov %eax,(%r15)
> 9: 74 78 je 0x83
> b: 48 8d 7c 24 18 lea 0x18(%rsp),%rdi
> 10: e8 6e b3 1b c1 callq 0xffffffffc11bb383
> 15: 84 c0 test %al,%al
> 17: 74 4a je 0x63
> 19: 48 85 ed test %rbp,%rbp
> 1c: 75 b5 jne 0xffffffffffffffd3
> 1e: 48 8b 04 24 mov (%rsp),%rax
> 22: 49 8b 84 c4 98 01 00 mov 0x198(%r12,%rax,8),%rax
> 29: 00
> 2a:* 48 8b 28 mov (%rax),%rbp <-- trapping instruction
> 2d: 65 ff 05 1f e8 ef 3f incl %gs:0x3fefe81f(%rip) # 0x3fefe853
> 34: 48 b8 00 00 00 00 00 movabs $0x160000000000,%rax
> 3b: 16 00 00
>
> which matches up with the asm code
>
> cmpl $1024, %ebx #, act_pte
> movl %eax, (%r15) # D.49217, *_26
> je .L118 #,
> .L110:
> leaq 24(%rsp), %rdi #, tmp156
> call __sg_page_iter_next #
> testb %al, %al # D.49219
> je .L119 #,
> testq %rbp, %rbp # pt_vaddr
> jne .L109 #,
> movq (%rsp), %rax # %sfp, act_pt
> movq 408(%r12,%rax,8), %rax # MEM[(struct i915_hw_ppgtt *)vm_8(D)].D.36998.pd.page
> movq (%rax), %rbp # _21->page, D.49221
> #APP
> # 72 "./arch/x86/include/asm/preempt.h" 1
> incl %gs:__preempt_count(%rip) # __preempt_count
> # 0 "" 2
> #NO_APP
> movabsq $24189255811072, %rax #, tmp150
>
> which in turn seems to come from the C code
>
> pt_vaddr =
> kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
>
> (that "testq %rbp,%rbp; jne" just before the oopsing instruction group
> is that "if (pt_vaddr == NULL)" test).
>
> IOW, it looks like
>
> ppgtt->pd.page_table[act_pt]
>
> is NULL, and then trying to dereference ->page off of it is what
> oopses (the preempt-count increment that comes after is the
> "pagefault_disable()" in kmap_atomic, and the big constant we're
> loading into %rax is part of "page_address(page)").
>
> I have no idea why "ppgtt->pd.page_table[act_pt]" would be NULL, but
> clearly it can be. Can somebody who knows this code look into it? I've
> added a few people who have worked in this area recently, in addition
> to the usual maintainer list.
>
> Thanks,
>
> Linus
>
> ---
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffffc010c137>] gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: rfcomm fuse cmac ip6t_rpfilter ip6t_REJECT
> nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4
> nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat ebtable_broute
> bridge stp llc ebtable_filter ebtables ip6table_mangle
> ip6table_security ip6table_raw ip6table_filter ip6_tables
> iptable_mangle iptable_security iptable_raw bnep arc4 vfat fat
> x86_pkg_temp_thermal pn544_mei mei_phy coretemp pn544 hci nfc
> kvm_intel iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek
> snd_hda_codec_hdmi kvm snd_hda_codec_generic uvcvideo
> videobuf2_vmalloc videobuf2_memops microcode videobuf2_core
> snd_hda_intel v4l2_common hid_multitouch snd_hda_controller videodev
> btusb snd_hda_codec iwlmvm media snd_hwdep mac80211 btbcm snd_seq
> btintel bluetooth snd_seq_device joydev snd_pcm serio_raw
> i2c_i801 iwlwifi cfg80211 snd_hda_core sony_laptop snd_timer snd
> rfkill mei_me soundcore lpc_ich shpchp mei mfd_core dm_crypt
> crct10dif_pclmul i915 crc32_pclmul crc32c_intel i2c_algo_bit
> drm_kms_helper ghash_clmulni_intel drm i2c_core video
> CPU: 1 PID: 2697 Comm: chrome Not tainted 4.0.0-09362-g1fc149933fd4 #8
> Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
> task: ffff88010dc51b30 ti: ffff88003f328000 task.ti: ffff88003f328000
> RIP: 0010:[<ffffffffc010c137>] [<ffffffffc010c137>]
> gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> RSP: 0018:ffff88003f32b9a8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000075b1b
> RDX: ffff88007d848990 RSI: 0000000000000001 RDI: ffff88003f32b9c0
> RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88003f6f7e58
> R10: 000000000d836000 R11: 0000000000000000 R12: ffff8800d4164000
> R13: 0000000000000000 R14: 0000000000000001 R15: ffff88003f7bbffc
> FS: 00007f7f0ee94a00(0000) GS:ffff88011fa80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000005f607000 CR4: 00000000001407e0
> Stack:
> 0000000000000201 000002010dc51b30 0000000000000000 ffff88007d848990
> 0000004000075b1b ffff880100000001 0000000000000fe0 ffff880011215900
> 0000000000000000 ffff88006cc4c380 ffff88003f6f0000 0000000000000001
> Call Trace:
> ggtt_bind_vma+0x97/0x110 [i915]
> i915_vma_bind+0x40/0x410 [i915]
> swiotlb_map_sg_attrs+0x74/0x140
> i915_gem_object_do_pin+0x864/0x9f0 [i915]
> mutex_lock+0x9/0x30
> i915_gem_execbuffer_reserve_vma.isra.20+0x66/0x130 [i915]
> i915_gem_execbuffer_reserve+0x2ec/0x320 [i915]
> i915_gem_do_execbuffer.isra.27+0x5ee/0xf80 [i915]
> mutex_optimistic_spin+0x16e/0x1f0
> __mutex_lock_interruptible_slowpath+0x21/0x130
> shmem_fault+0x57/0x1c0
> drm_gem_object_lookup+0x14/0xa0 [drm]
> i915_gem_execbuffer2+0xb2/0x2a0 [i915]
> drm_ioctl+0x15a/0x580 [drm]
> current_fs_time+0x9/0x50
> do_vfs_ioctl+0x2e8/0x4f0
> file_has_perm+0x77/0x80
> syscall_trace_enter_phase1+0x116/0x140
> SyS_ioctl+0x79/0x90
> system_call_fastpath+0x12/0x6a
> Code: 00 81 fb 00 04 00 00 41 89 07 74 78 48 8d 7c 24 18 e8 6e b3 1b
> c1 84 c0 74 4a 48 85 ed 75 b5 48 8b 04 24 49 8b 84 c4 98 01 00 00 <48>
> 8b 28 65 ff 05 1f e8 ef 3f 48 b8 00 00 00 00 00 16 00 00 48
> RIP gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> RSP <ffff88003f32b9a8>
> CR2: 0000000000000000
>
Hi,
I see a possible VA re-allocation that could be the culprit, but the
change was committed just 2 days ago
(http://cgit.freedesktop.org/drm-intel/commit/?id=5c5f645773b6d147bf68c350674dc3ef4f8de83d).
-Michel
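
For reference, the failure pattern described in the quoted analysis can be
sketched in plain user-space C. This is only an illustration, not the i915
code and not a proposed fix: the struct layouts, the SKETCH_PDES size, the
0x198 padding (chosen to mirror the displacement seen in the disassembly) and
the sketch_* helper names are all assumptions made up for the example. It
shows the two loads from the oops (fetch the page-table pointer from the
directory slot, then read ->page through it) and where a NULL slot would trap.

```c
#include <stddef.h>

/* Hypothetical directory size, chosen only for the sketch. */
#define SKETCH_PDES 512

/* Stand-ins for the real kernel types; layouts are assumptions. */
struct sketch_page { int dummy; };
struct sketch_page_table { struct sketch_page *page; };

struct sketch_ppgtt {
	/* Padding so page_table[] sits at 0x198, as in the disassembly. */
	char pad[0x198];
	struct sketch_page_table *page_table[SKETCH_PDES];
};

/*
 * Byte offset of directory slot act_pt from the start of the ppgtt.
 * With 8-byte pointers this is the 0x198(%r12,%rax,8) address
 * computation from the oops.
 */
static size_t sketch_slot_offset(size_t act_pt)
{
	return offsetof(struct sketch_ppgtt, page_table) +
	       act_pt * sizeof(struct sketch_page_table *);
}

/*
 * Equivalent of ppgtt->pd.page_table[act_pt]->page, but returning NULL
 * instead of faulting when the slot is out of range or was never
 * populated.
 */
static struct sketch_page *
sketch_lookup_page(const struct sketch_ppgtt *ppgtt, size_t act_pt)
{
	const struct sketch_page_table *pt;

	if (act_pt >= SKETCH_PDES)
		return NULL;

	pt = ppgtt->page_table[act_pt];	/* first load: the slot itself */
	if (pt == NULL)
		return NULL;		/* the oops dereferenced this */

	return pt->page;		/* second load: the trapping one */
}
```

A check like this would only hide the symptom; the open question from the
report is still why the slot is unpopulated at insert time.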