[BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister()

Petr Cvek petrcvekcz at gmail.com
Sun Sep 1 09:38:10 UTC 2019


Hi,

kernel: 5.3.0-rc6-next

After starting Xorg and running xrandr, Xorg crashes with the following oops (not especially useful, since it is a MIPS dump):

[   28.842553] CPU 0 Unable to handle kernel paging request at virtual address 0000001c, epc == 808de6d4, ra == 804d32ec
[   28.853387] Oops[#1]:
[   28.855699] CPU: 0 PID: 692 Comm: Xorg Not tainted 5.3.0-rc6-next-20190826+ #59
[   28.863104] $ 0   : 00000000 80b60000 00000011 87f1af00
[   28.868407] $ 4   : 0000001c 00000002 00000002 ffff00fe
[   28.873705] $ 8   : 865e9fe0 0000fc00 00000004 00000000
[   28.879003] $12   : 87f1baf0 00000000 0000da9a 00000040
[   28.884301] $16   : 86434450 86434400 00000000 0000001c
[   28.889600] $20   : 865e9dbc 00000000 80912ee4 865e9dbc
[   28.894898] $24   : 80add220 27cfd6fd                  
[   28.900198] $28   : 865e8000 865e9cb8 00000009 804d32ec
[   28.905499] Hi    : 000091bb
[   28.908414] Lo    : ffff6e44
[   28.911350] epc   : 808de6d4 mutex_lock+0x8/0x44
[   28.916045] ra    : 804d32ec radeon_mn_unregister+0x3c/0xb0
[   28.921687] Status: 1100fc03 KERNEL EXL IE 
[   28.925929] Cause : 00800008 (ExcCode 02)
[   28.929987] BadVA : 0000001c
[   28.932903] PrId  : 00019655 (MIPS 24KEc)
[   28.936961] Modules linked in: usbhid hid_generic hid evdev
[   28.942635] Process Xorg (pid: 692, threadinfo=68a84c48, task=84477b53, tls=77e03da0)
[   28.950566] Stack : 00000000 804d32e4 00000001 00000000 84d7b400 84d7b400 8784a078 86434450
[   28.959043]         86632600 8663268c 803a4ed4 8041583c 00000000 803b6d94 865e9dbc 86434450
[   28.967519]         86632600 86434400 86632600 803a451c 87912980 879129ac 80ae0000 00000007
[   28.975996]         00000007 86632620 86632600 803a45d0 87ffc718 71a8f000 71a8f000 87ffc71c
[   28.984472]         71a8efff 800d3c08 865eac00 86632600 00000000 803a5bf4 71a8f000 00000000
[   28.992948]         ...
[   28.995425] Call Trace:
[   28.997905] [<808de6d4>] mutex_lock+0x8/0x44
[   29.002239] [<804d32ec>] radeon_mn_unregister+0x3c/0xb0
[   29.007550] [<8041583c>] radeon_gem_object_free+0x18/0x2c
[   29.013031] [<803a451c>] drm_gem_object_release_handle+0x74/0xac
[   29.019122] [<803a45d0>] drm_gem_handle_delete+0x7c/0x128
[   29.024599] [<803a5bf4>] drm_ioctl_kernel+0xb0/0x108
[   29.029633] [<803a5e74>] drm_ioctl+0x200/0x3a8
[   29.034154] [<803e07b4>] radeon_drm_ioctl+0x54/0xc0
[   29.039110] [<801214dc>] do_vfs_ioctl+0x4e8/0x81c
[   29.043880] [<80121864>] ksys_ioctl+0x54/0xb0
[   29.048305] [<8001100c>] syscall_common+0x34/0x58
[   29.053074] Code: 24050002  27bdfff8  8f830000 <c0850000> 14a00005  00000000  00600825  e0810000  1020fffa 

but it seems a NULL pointer is dereferenced at this line:

	https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/radeon/radeon_mn.c?h=next-20190830#n237

The code is:

	struct radeon_mn *rmn = bo->mn;
	...
	mutex_lock(&rmn->lock);		//<-crash
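
The dump appears consistent with this: BadVA and $4 (the first argument register in the MIPS o32 calling convention) are both 0000001c, which is what &rmn->lock would evaluate to if rmn were NULL and the lock member sat 0x1c bytes into the structure. A minimal userspace sketch of that arithmetic follows. The struct mock_mn layout and the mock_lock_address() helper are hypothetical, padded only so the offset matches the dump; the real struct radeon_mn layout is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real struct radeon_mn: seven 32-bit
 * words of padding put the stand-in "lock" member at offset 0x1c. */
struct mock_mn {
	uint32_t other_fields[7];	/* 28 bytes of unrelated members */
	uint32_t lock;			/* stands in for struct mutex lock */
};

/* Address that &rmn->lock resolves to for a given base address.
 * offsetof() is used instead of dereferencing a NULL pointer. */
uintptr_t mock_lock_address(uintptr_t rmn_base)
{
	assert(offsetof(struct mock_mn, lock) == 0x1c);
	return rmn_base + offsetof(struct mock_mn, lock);
}
```

With a base of 0 the helper yields 0x1c, so a small non-zero fault address like this one usually points at a member access through a NULL structure pointer rather than at a NULL pointer passed directly.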

A quick assert confirms that bo->mn is NULL. The code worked in 4.19-rc, and the problematic patch seems to be

	drm/radeon: use mmu_notifier_get/put for struct radeon_mn

as it removes the NULL check.

Forcing -ENODEV in the register function (and returning immediately in unregister, as is done without CONFIG_MMU_NOTIFIER) works around the crash.
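
The workaround can be modelled in userspace roughly as below. This is only a sketch of the idea, not the driver code or a proposed patch: struct mock_mn, struct mock_bo, mock_mn_register() and mock_mn_unregister() are hypothetical stand-ins, and the counter replaces the real mutex and notifier teardown.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct mock_mn {
	int users;		/* number of objects attached to the notifier */
};

struct mock_bo {
	struct mock_mn *mn;	/* NULL when no notifier was registered */
};

/* Models the forced failure: registration never attaches a notifier,
 * so bo->mn stays NULL, as in a build without CONFIG_MMU_NOTIFIER. */
int mock_mn_register(struct mock_bo *bo)
{
	bo->mn = NULL;
	return -ENODEV;
}

/* Returns 1 if a notifier was detached, 0 if there was nothing to do. */
int mock_mn_unregister(struct mock_bo *bo)
{
	struct mock_mn *rmn = bo->mn;

	if (!rmn)		/* guard: nothing was registered */
		return 0;

	rmn->users--;		/* stands in for the locked teardown */
	bo->mn = NULL;
	return 1;
}
```

With the guard in place, calling mock_mn_unregister() on an object that never had a notifier attached simply returns instead of touching rmn.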

Petr

