[PATCH] drm/amdgpu: fix to add buffer funcs check
Huang Rui
ray.huang at amd.com
Tue Apr 11 09:45:12 UTC 2017
On Tue, Apr 11, 2017 at 02:53:27PM +0800, Christian König wrote:
> On 11.04.2017 at 04:58, Huang Rui wrote:
> > This patch fixes the case when buffer funcs is empty and bo evict is
> > executing. It must double check buffer funcs, otherwise, a NULL
> > pointer dereference kernel panic will be encountered.
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4
> > IP: [<ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
> > PGD 0
> >
> > Oops: 0000 [#1] SMP
> > Modules linked in: amdgpu(OE) ttm drm_kms_helper drm i2c_algo_bit
> > fb_sys_fops syscopyarea sysfillrect sysimgblt fmem(OE) physmem_drv(OE)
> > rpcsec_gss_krb5 nfsv4 nfs fscache intel_rapl x86_pkg_temp_thermal
> > intel_powerclamp snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic
> > kvm_intel snd_hda_intel snd_hda_codec kvm snd_hda_core joydev eeepc_wmi
> > asus_wmi sparse_keymap snd_hwdep snd_pcm irqbypass crct10dif_pclmul
> > snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq crc32_pclmul snd_seq_device
> > ghash_clmulni_intel aesni_intel aes_x86_64 snd_timer lrw gf128mul mei_me snd
> > glue_helper ablk_helper cryptd tpm_infineon mei lpc_ich serio_raw soundcore
> > shpchp mac_hid nfsd auth_rpcgss nfs_acl lockd grace coretemp sunrpc parport_pc
> > ppdev lp parport autofs4 hid_generic mxm_wmi r8169 usbhid ahci
> > psmouse libahci nvme mii hid nvme_core wmi video
> > CPU: 3 PID: 1627 Comm: kworker/u8:17 Tainted: G OE 4.9.0-custom #1
> > Hardware name: ASUS All Series/Z87-A, BIOS 1802 01/28/2014
> > Workqueue: events_unbound async_run_entry_fn
> > task: ffff88021e7057c0 task.stack: ffffc9000262c000
> > RIP: 0010:[<ffffffffa067b6cd>] [<ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
> > RSP: 0018:ffffc9000262fb30 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: ffff88021e8a5858 RCX: 0000000000000000
> > RDX: 0000000000000001 RSI: ffffc9000262fb58 RDI: ffff88021e8a5800
> > RBP: ffffc9000262fb48 R08: 0000000000000000 R09: ffff88021e8a5814
> > R10: 000000001def8f01 R11: ffff88021def8c80 R12: ffffc9000262fb58
> > R13: ffff88021d2b1990 R14: 0000000000000000 R15: ffff88021e8a5858
> > FS: 0000000000000000(0000) GS:ffff88022ed80000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000000001a4 CR3: 0000000001c07000 CR4: 00000000001406e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>
> Can we have the full stack trace please?
>
> Eviction should never occur before the buffer funcs are initialized, so
> that patch just papers over some kind of race condition on startup as
> far as I can see.
>
If we set ip_block_mask=0xff, the SDMA IP block is not enabled, so
funcs_ring is NULL at that time. Although it is a corner case, we still
don't expect it to hang with a kernel panic. I hit it while debugging S3.
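
To illustrate the kind of guard being discussed, here is a minimal user-space
sketch. It is not the patch itself: every mock_* name is a hypothetical
stand-in for the driver structures, and falling back to a CPU placement when
the ring is missing is an assumption made only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real amdgpu definitions. */
struct mock_ring {
	bool ready;
};

struct mock_mman {
	const void *buffer_funcs;            /* NULL if no copy engine was set up */
	struct mock_ring *buffer_funcs_ring; /* NULL if the SDMA block is disabled */
};

enum mock_domain {
	MOCK_DOMAIN_CPU, /* assumed safe fallback that needs no copy engine */
	MOCK_DOMAIN_GTT, /* placement that relies on the copy ring */
};

/*
 * Pick an eviction placement. Every pointer is checked before ring->ready
 * is read, so a disabled SDMA block leads to the fallback instead of a
 * NULL pointer dereference.
 */
static enum mock_domain mock_evict_domain(const struct mock_mman *mman)
{
	if (!mman->buffer_funcs ||
	    !mman->buffer_funcs_ring ||
	    !mman->buffer_funcs_ring->ready)
		return MOCK_DOMAIN_CPU;

	return MOCK_DOMAIN_GTT;
}
```

With both pointers NULL, as in the ip_block_mask case above, the function
returns the fallback without touching the ring.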
Thanks,
Rui