[PATCH] drm/amdgpu: fix NULL pointer panic of emit_gds_switch

Christian König deathsimple at vodafone.de
Thu May 11 12:37:15 UTC 2017


On 11.05.2017 at 12:27, Chunming Zhou wrote:
> [  338.384770] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [  338.384817] IP: [<          (null)>]           (null)
> [  338.384843] PGD 0
>
> [  338.384865] Oops: 0010 [#1] SMP
> [  338.384881] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_hda_codec(E) eeepc_wmi(E) joydev(E) asus_wmi(E) sparse_keymap(E) video(E) snd_hda_core(E) snd_hwdep(E) snd_pcm(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_rawmidi(E) snd_seq(E) snd_seq_device(E) snd_timer(E) snd(E) soundcore(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) lrw(E) shpchp(E) gf128mul(E) glue_helper(E) 8250_dw(E) ablk_helper(E) i2c_piix4(E) cryptd(E) serio_raw(E) i2c_designware_platform(E) mac_hid(E) i2c_designware_core(E) nfsd(E) auth_rpcgss(E)
> [  338.385293]  nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
> [  338.385395] CPU: 10 PID: 1477 Comm: sdma0 Tainted: G           OE   4.9.0-custom #4
> [  338.385432] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
> [  338.385477] task: ffff880209240000 task.stack: ffffc90001bd4000
> [  338.385505] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
> [  338.385543] RSP: 0018:ffffc90001bd7d40  EFLAGS: 00010202
> [  338.385568] RAX: ffffffffa072c280 RBX: ffff8801d420f400 RCX: 0000000000004000
> [  338.385602] RDX: 0000000000000000 RSI: 0000000000000005 RDI: ffff880212376018
> [  338.385635] RBP: ffffc90001bd7da8 R08: 0000000000000000 R09: 0000000000004000
> [  338.385669] R10: 0000000000000000 R11: 0000000000000002 R12: ffff880212370000
> [  338.385702] R13: ffff880212370e90 R14: ffff880212376018 R15: 0000000000000001
> [  338.385738] FS:  0000000000000000(0000) GS:ffff88021ee80000(0000) knlGS:0000000000000000
> [  338.385776] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  338.385803] CR2: 0000000000000000 CR3: 0000000212352000 CR4: 00000000003406e0
> [  338.385834] Stack:
> [  338.385843]  ffffffffa05d2313 ffffffff00000000 ffffc90000004000 ffffffff811818d3
> [  338.385879]  0000000000000018 ffff880212370a18 01ffc90000000000 ffff8801bc5f2e00
> [  338.385915]  ffff8801d420f400 ffff880212376018 0000000000000000 0000000000000001
> [  338.385950] Call Trace:
> [  338.385993]  [<ffffffffa05d2313>] ? amdgpu_vm_flush+0x283/0x400 [amdgpu]
> [  338.386025]  [<ffffffff811818d3>] ? printk+0x4d/0x4f
> [  338.386074]  [<ffffffffa05d4906>] amdgpu_ib_schedule+0x4a6/0x4d0 [amdgpu]
> [  338.386140]  [<ffffffffa0673e54>] amdgpu_job_run+0x64/0x180 [amdgpu]
> [  338.386203]  [<ffffffffa0672e09>] amd_sched_main+0x2e9/0x4a0 [amdgpu]
> [  338.386232]  [<ffffffff810bfce0>] ? prepare_to_wait_event+0x110/0x110
> [  338.386295]  [<ffffffffa0672b20>] ? amd_sched_select_entity+0xe0/0xe0 [amdgpu]
> [  338.386327]  [<ffffffff8109b423>] kthread+0xd3/0xf0
> [  338.386349]  [<ffffffff8109b350>] ? kthread_park+0x60/0x60
> [  338.386376]  [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
> [  338.386401] Code:  Bad RIP value.
> [  338.386420] RIP  [<          (null)>]           (null)
> [  338.386443]  RSP <ffffc90001bd7d40>
> [  338.386458] CR2: 0000000000000000
> [  338.398508] ---[ end trace 4c66fcdc74b9a0a2 ]---
>
> Change-Id: I0867463a9ec13d0f16b7f95bcca218cd42c3e867
> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>

Please shorten the commit message a bit. With that fixed, the patch is
Reviewed-by: Christian König <christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 84aba1a..bca1fb5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -783,7 +783,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job)
>   		mutex_unlock(&id_mgr->lock);
>   	}
>   
> -	if (gds_switch_needed) {
> +	if (ring->funcs->emit_gds_switch && gds_switch_needed) {
>   		id->gds_base = job->gds_base;
>   		id->gds_size = job->gds_size;
>   		id->gws_base = job->gws_base;
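
For readers following the one-line change above: the oops shows a jump to address 0 from the sdma0 thread, which is what calling an unset function pointer looks like. The sketch below illustrates the guard pattern the patch applies, presumably because the SDMA ring does not provide emit_gds_switch. All structures and names here (struct ring, struct ring_funcs, gfx_funcs, sdma_funcs, flush) are simplified, hypothetical stand-ins and not the real amdgpu definitions.

```c
#include <stdbool.h>
#include <stddef.h>

struct ring;

/* Hypothetical, simplified stand-in for a per-ring callback table. */
struct ring_funcs {
	/* Optional callback: rings without GDS support leave this NULL. */
	void (*emit_gds_switch)(struct ring *ring);
};

/* Hypothetical, simplified stand-in for a ring. */
struct ring {
	const struct ring_funcs *funcs;
	int gds_switches_emitted;
};

/* Example callback for a ring that does support GDS switching. */
static void gfx_emit_gds_switch(struct ring *ring)
{
	ring->gds_switches_emitted++;
}

/* Assumed setup: one ring type sets the callback, the other does not. */
static const struct ring_funcs gfx_funcs = {
	.emit_gds_switch = gfx_emit_gds_switch,
};
static const struct ring_funcs sdma_funcs = {
	.emit_gds_switch = NULL,
};

/*
 * Mirrors the patched condition: the callback pointer is checked
 * before the "needed" flag, so a ring without the callback never
 * reaches the indirect call. Returns true if a switch was emitted.
 */
static bool flush(struct ring *ring, bool gds_switch_needed)
{
	if (ring->funcs->emit_gds_switch && gds_switch_needed) {
		ring->funcs->emit_gds_switch(ring);
		return true;
	}
	return false;
}
```

With a ring using sdma_funcs, flush() returns false even when gds_switch_needed is true. Without the pointer check, the same call would jump through a NULL pointer, matching the "IP: (null)" in the trace.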



