[REGRESSION] drm/etnaviv: command buffer outside valid memory window

Russell King - ARM Linux admin linux at armlinux.org.uk
Thu Jun 27 14:32:47 UTC 2019


On Thu, Jun 27, 2019 at 11:04:17AM +0100, Russell King - ARM Linux admin wrote:
> On Thu, Jun 27, 2019 at 11:20:15AM +0200, Lucas Stach wrote:
> > On Saturday, 22.06.2019, at 17:16 +0100, Russell King - ARM Linux admin wrote:
> > > While updating my various systems for the TCP SACK issue, I notice
> > > that while most platforms are happy, the Cubox-i4 is not.  During
> > > boot, we get:
> > > 
> > > [    0.000000] cma: Reserved 256 MiB at 0x30000000
> > > ...
> > > [    0.000000] Kernel command line: console=ttymxc0,115200n8 console=tty1 video=mxcfb0:dev=hdmi root=/dev/nfs rw cma=256M ahci_imx.hotplug=1 splash resume=/dev/sda1
> > > [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> > > [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> > > [    0.000000] Memory: 1790972K/2097152K available (8471K kernel code, 693K rwdata, 2844K rodata, 500K init, 8062K bss, 44036K reserved, 262144K cma-reserved, 1310720K highmem)
> > > ...
> > > [   13.101098] etnaviv-gpu 130000.gpu: command buffer outside valid memory window
> > > [   13.171963] etnaviv-gpu 134000.gpu: command buffer outside valid memory window
> > 
> > Yes, that's a regression due to different default CMA area placement
> > and etnaviv not being smart enough to move the linear window to the
> > right offset.
> 
> As it's a user visible regression, it needs fixing, either by reverting
> the changes that caused it or by some other means.  In the kernel, the
> policy is "if a bug fix causes a regression, the bug fix was itself
> wrong".  We don't fix one person's bug if it causes a regression for
> someone else.
> 
> Please resolve the acknowledged regression.
> 
> > > and shortly after the login prompt appears, the entire SoC appears to
> > > lock up - it becomes unresponsive on the network, or via serial console
> > > to sysrq requests.
> > > 
> > > I suspect the GPU ends up scribbling over the CPU's vector page/kernel
> > > as a result of the above two etnaviv errors when Xorg attempts to start
> > > using the GPU.
> > 
> > This should not be possible. The driver notices that the command buffer
> > isn't accessible to the GPU, which aborts the GPU init. While the
> > etnaviv DRM device is still accessible, it will not expose any
> > enumerable GPU cores to userspace. So there is no way for userspace to
> > actually submit GPU commands.
> 
> Yep, I came to that conclusion.  Nevertheless, if I allow Xorg to start
> with 5.1, the system totally hangs shortly thereafter.  I need to try
> without etnaviv loaded at all.
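
For readers following the quoted "command buffer outside valid memory
window" messages, here is a minimal sketch of the kind of range check
being discussed.  It is not the etnaviv code.  It assumes the GPU's
linear window is 2 GiB wide and starts at a base address chosen by the
driver; the function name, the window size constant and the base values
in the usage note are all hypothetical.  Only the CMA address 0x30000000
comes from the boot log above.

```c
#include <stdint.h>

/* Assumption: the linear window spans 2 GiB from its base address. */
#define LINEAR_WINDOW_SIZE 0x80000000u

/*
 * Hypothetical helper: returns 1 if the buffer [paddr, paddr + size)
 * lies entirely inside the linear window starting at window_base,
 * and 0 otherwise.
 */
static int in_linear_window(uint32_t window_base, uint32_t paddr,
                            uint32_t size)
{
    uint64_t end_offset;

    /* A buffer below the window base can never be reached. */
    if (paddr < window_base)
        return 0;

    /* 64-bit arithmetic so that offset + size cannot wrap around. */
    end_offset = (uint64_t)(paddr - window_base) + size;
    return end_offset <= LINEAR_WINDOW_SIZE;
}
```

With a hypothetical window base of 0x80000000, a buffer in the CMA area
at 0x30000000 fails the check; with a hypothetical base of 0x10000000
the same buffer passes.  That is the sense in which the window has to be
moved "to the right offset".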

Well, it seems to get worse.  I just tried to unload etnaviv, and was
greeted by this oops.  It's another regression; etnaviv used to unload
perfectly fine.  Please can you add module unload testing to your
workflow?

Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = da59c000
[00000008] *pgd=8fc0f831
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in: ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_owner xt_multiport iptable_filter ip_tables x_tables bnep rfcomm bluetooth ecdh_generic nfsd rc_cec snd_soc_fsl_spdif nvmem_imx_ocotp imx_pcm_dma imx_sdma virt_dma coda v4l2_mem2mem imx_vdoa dw_hdmi_ahb_audio dw_hdmi_cec videobuf2_dma_contig etnaviv(-) gpu_sched imx_thermal snd_soc_imx_spdif imx6q_cpufreq caamrng caam_jr caam error
CPU: 1 PID: 2898 Comm: rmmod Not tainted 5.1.0+ #319
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
PC is at etnaviv_iommu_put_suballoc_va+0x10/0x68 [etnaviv]
LR is at etnaviv_cmdbuf_suballoc_destroy+0x20/0x48 [etnaviv]
pc : [<bf0521e0>]    lr : [<bf04a664>]    psr: a00f0013
sp : d9f2be40  ip : 000001b0  fp : 00000000
r10: 00000081  r9 : d9f2a000  r8 : c00091c4
r7 : dc993800  r6 : 00000000  r5 : dd4c6810  r4 : 00000000
r3 : b00c0000  r2 : 00040000  r1 : dd4c6810  r0 : dc991840
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 2a59c04a  DAC: 00000051
Process rmmod (pid: 2898, stack limit = 0xd9f2a218)
Stack: (0xd9f2be40 to 0xd9f2c000)
be40: 00000000 00000000 dd4c6800 dd5e9b40 00000000 bf04a664 dd5e9b40 00000000
be60: dc991840 bf04e4d0 bf04e458 dd5e93c0 dd5e9b40 c04aa2e0 00000018 dc993800
be80: c00091c4 dd5e9b40 00000001 c04aa3b4 00000000 dc993800 dd0f9410 dd5a4000
bea0: 00000000 bf04a97c dd5e9b40 dd0f9410 bf05295c c04aa9bc dd5e9b40 c04aaf6c
bec0: dd0f9410 00000000 bf055260 bf04a950 bf04a93c c04b1f00 c04b1edc dd0f9410
bee0: 00000000 c04b0798 c0c493a8 de8af44c dd0f9410 c0c493a8 c0c49408 c04af450
bf00: dd0f9444 dd0f9410 000120a8 c04ac02c c0bf5f44 bec80600 d9f2bf30 c142e46c
bf20: dd0f9400 dd0f9400 000120a8 00000081 c00091c4 c04b2718 bf058390 dd0f9400
bf40: bec80600 c04b2790 bf056140 bf0528c4 bf0528b4 c00d6710 d9f2bf80 616e7465
bf60: 00766976 ddf7b4d8 b6ef5000 00000000 00000001 c0196490 00000001 00000000
bf80: d9f2bf80 d9f2bf80 0095d008 00000000 00000000 0000005b bec805f4 00000880
bfa0: bec80600 c0009000 00000880 bec80600 bec80600 00000880 00009778 bec805f4
bfc0: 00000880 bec80600 000120a8 00000081 00000001 000120bc 00000001 00000000
bfe0: b6e70130 bec805fc 00008f75 b6e7013c 800b0010 bec80600 00000000 00000000
[<bf0521e0>] (etnaviv_iommu_put_suballoc_va [etnaviv]) from [<bf04a664>] (etnaviv_cmdbuf_suballoc_destroy+0x20/0x48 [etnaviv])
[<bf04a664>] (etnaviv_cmdbuf_suballoc_destroy [etnaviv]) from [<bf04e4d0>] (etnaviv_gpu_unbind+0x78/0xc0 [etnaviv])
[<bf04e4d0>] (etnaviv_gpu_unbind [etnaviv]) from [<c04aa2e0>] (component_unbind+0x30/0x68)
[<c04aa2e0>] (component_unbind) from [<c04aa3b4>] (component_unbind_all+0x9c/0xcc)
[<c04aa3b4>] (component_unbind_all) from [<bf04a97c>] (etnaviv_unbind+0x24/0x44 [etnaviv])
[<bf04a97c>] (etnaviv_unbind [etnaviv]) from [<c04aa9bc>] (take_down_master.part.0+0x18/0x30)
[<c04aa9bc>] (take_down_master.part.0) from [<c04aaf6c>] (component_master_del+0x78/0x90)
[<c04aaf6c>] (component_master_del) from [<bf04a950>] (etnaviv_pdev_remove+0x14/0x1c [etnaviv])
[<bf04a950>] (etnaviv_pdev_remove [etnaviv]) from [<c04b1f00>] (platform_drv_remove+0x24/0x3c)
[<c04b1f00>] (platform_drv_remove) from [<c04b0798>] (device_release_driver_internal+0xdc/0x190)
[<c04b0798>] (device_release_driver_internal) from [<c04af450>] (bus_remove_device+0xcc/0xec)
[<c04af450>] (bus_remove_device) from [<c04ac02c>] (device_del+0x124/0x2dc)
[<c04ac02c>] (device_del) from [<c04b2718>] (platform_device_del+0x1c/0x88)
[<c04b2718>] (platform_device_del) from [<c04b2790>] (platform_device_unregister+0xc/0x18)
[<c04b2790>] (platform_device_unregister) from [<bf0528c4>] (etnaviv_exit+0x10/0x30 [etnaviv])
[<bf0528c4>] (etnaviv_exit [etnaviv]) from [<c00d6710>] (sys_delete_module+0x168/0x1b8)
[<c00d6710>] (sys_delete_module) from [<c0009000>] (ret_fast_syscall+0x0/0x28)
Exception stack(0xd9f2bfa8 to 0xd9f2bff0)
bfa0:                   00000880 bec80600 bec80600 00000880 00009778 bec805f4
bfc0: 00000880 bec80600 000120a8 00000081 00000001 000120bc 00000001 00000000
bfe0: b6e70130 bec805fc 00008f75 b6e7013c
Code: e92d4070 e1a05001 e5904588 e24dd008 (e594c008)
---[ end trace 3a2617468df8e3a2 ]---
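
The "Code:" line above can be read directly: the word in parentheses is
the faulting instruction.  The following is a small sketch, assuming
standard ARM (A32) encoding for single-register loads and stores with a
12-bit immediate offset; the struct and function names are made up for
illustration.

```c
#include <stdint.h>

/* Decoded fields of an A32 LDR/STR with a 12-bit immediate offset. */
struct ldst_imm {
    int is_load;     /* L bit: 1 = load, 0 = store */
    int add_offset;  /* U bit: 1 = base + imm, 0 = base - imm */
    unsigned rn;     /* base register number */
    unsigned rd;     /* register being loaded or stored */
    unsigned imm12;  /* unsigned 12-bit offset */
};

/*
 * Hypothetical helper: returns 1 and fills *out if insn is an
 * immediate-offset LDR/STR, and 0 for anything else.
 */
static int decode_ldst_imm(uint32_t insn, struct ldst_imm *out)
{
    /* Condition field 0xF is the unconditional instruction space. */
    if ((insn >> 28) == 0xF)
        return 0;
    /* Bits 27:25 == 0b010 select load/store, immediate offset. */
    if (((insn >> 25) & 0x7) != 0x2)
        return 0;

    out->is_load    = (insn >> 20) & 1;
    out->add_offset = (insn >> 23) & 1;
    out->rn         = (insn >> 16) & 0xF;
    out->rd         = (insn >> 12) & 0xF;
    out->imm12      = insn & 0xFFF;
    return 1;
}
```

Fed the words from the oops, 0xe5904588 decodes as a load of r4 from
[r0, #0x588], and the faulting 0xe594c008 as a load of ip (r12) from
[r4, #8].  With r4 = 00000000 in the register dump, that gives the
faulting address 00000008: a pointer fetched from the object in r0 was
NULL when etnaviv_iommu_put_suballoc_va dereferenced it.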

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

