[PATCH] Revert "drm/nouveau/device/pci: set as non-CPU-coherent on ARM64"
Alexandre Courbot
acourbot at nvidia.com
Mon May 9 10:28:47 UTC 2016
On 04/29/2016 08:18 PM, Robin Murphy wrote:
> This reverts commit 1733a2ad36741b1812cf8b3f3037c28d0af53f50.
>
> There is apparently something amiss with the way the TTM code handles
> DMA buffers, which the above commit was attempting to work around for
> arm64 systems with non-coherent PCI. Unfortunately, this completely
> breaks systems *with* coherent PCI (which appear to be the majority).
>
> Booting a plain arm64 defconfig + CONFIG_DRM + CONFIG_DRM_NOUVEAU on
> a machine with a PCI GPU having coherent dma_map_ops (in this case a
> 7600GT card plugged into an ARM Juno board) results in a fatal crash:
>
> [ 2.803438] nouveau 0000:06:00.0: DRM: allocated 1024x768 fb: 0x9000, bo ffffffc976141c00
> [ 2.897662] Unable to handle kernel NULL pointer dereference at virtual address 000001ac
> [ 2.897666] pgd = ffffff8008e00000
> [ 2.897675] [000001ac] *pgd=00000009ffffe003, *pud=00000009ffffe003, *pmd=0000000000000000
> [ 2.897680] Internal error: Oops: 96000045 [#1] PREEMPT SMP
> [ 2.897685] Modules linked in:
> [ 2.897692] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc5+ #543
> [ 2.897694] Hardware name: ARM Juno development board (r1) (DT)
> [ 2.897699] task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
> [ 2.897711] PC is at __memcpy+0x7c/0x180
> [ 2.897719] LR is at OUT_RINGp+0x34/0x70
> [ 2.897724] pc : [<ffffff80083465fc>] lr : [<ffffff800854248c>] pstate: 80000045
> [ 2.897726] sp : ffffffc9768ab360
> [ 2.897732] x29: ffffffc9768ab360 x28: 0000000000000001
> [ 2.897738] x27: ffffffc97624c000 x26: 0000000000000000
> [ 2.897744] x25: 0000000000000080 x24: 0000000000006c00
> [ 2.897749] x23: 0000000000000005 x22: ffffffc97624c010
> [ 2.897755] x21: 0000000000000004 x20: 0000000000000004
> [ 2.897761] x19: ffffffc9763da000 x18: ffffffc976b2491c
> [ 2.897766] x17: 0000000000000007 x16: 0000000000000006
> [ 2.897771] x15: 0000000000000001 x14: 0000000000000001
> [ 2.897777] x13: 0000000000e31b70 x12: ffffffc9768a0080
> [ 2.897783] x11: 0000000000000000 x10: fffffffffffffb00
> [ 2.897788] x9 : 0000000000000000 x8 : 0000000000000000
> [ 2.897793] x7 : 0000000000000000 x6 : 00000000000001ac
> [ 2.897799] x5 : 00000000ffffffff x4 : 0000000000000000
> [ 2.897804] x3 : 0000000000000010 x2 : 0000000000000010
> [ 2.897810] x1 : ffffffc97624c010 x0 : 00000000000001ac
> ...
> [ 2.898494] Call trace:
> [ 2.898499] Exception stack(0xffffffc9768ab1a0 to 0xffffffc9768ab2c0)
> [ 2.898506] b1a0: ffffffc9763da000 0000000000000004 ffffffc9768ab360 ffffff80083465fc
> [ 2.898513] b1c0: ffffffc976801e00 ffffffc9762b8000 ffffffc9768ab1f0 ffffff80080ec158
> [ 2.898520] b1e0: ffffffc9768ab230 ffffff8008496d04 ffffffc975ce6d80 ffffffc9768ab36e
> [ 2.898527] b200: ffffffc9768ab36f ffffffc9768ab29d ffffffc9768ab29e ffffffc9768a0000
> [ 2.898533] b220: ffffffc9768ab250 ffffff80080e70c0 ffffffc9768ab270 ffffff8008496e44
> [ 2.898540] b240: 00000000000001ac ffffffc97624c010 0000000000000010 0000000000000010
> [ 2.898546] b260: 0000000000000000 00000000ffffffff 00000000000001ac 0000000000000000
> [ 2.898552] b280: 0000000000000000 0000000000000000 fffffffffffffb00 0000000000000000
> [ 2.898558] b2a0: ffffffc9768a0080 0000000000e31b70 0000000000000001 0000000000000001
> [ 2.898566] [<ffffff80083465fc>] __memcpy+0x7c/0x180
> [ 2.898574] [<ffffff800853e164>] nv04_fbcon_imageblit+0x1d4/0x2e8
> [ 2.898582] [<ffffff800853d6d0>] nouveau_fbcon_imageblit+0xd8/0xe0
> [ 2.898591] [<ffffff80083c4db4>] soft_cursor+0x154/0x1d8
> [ 2.898598] [<ffffff80083c47b4>] bit_cursor+0x4fc/0x538
> [ 2.898605] [<ffffff80083c0cfc>] fbcon_cursor+0x134/0x1a8
> [ 2.898613] [<ffffff800841c280>] hide_cursor+0x38/0xa0
> [ 2.898620] [<ffffff800841d420>] redraw_screen+0x120/0x228
> [ 2.898628] [<ffffff80083bf268>] fbcon_prepare_logo+0x370/0x3f8
> [ 2.898635] [<ffffff80083bf640>] fbcon_init+0x350/0x560
> [ 2.898641] [<ffffff800841c634>] visual_init+0xac/0x108
> [ 2.898648] [<ffffff800841df14>] do_bind_con_driver+0x1c4/0x3a8
> [ 2.898655] [<ffffff800841e4f4>] do_take_over_console+0x174/0x1e8
> [ 2.898662] [<ffffff80083bf8c4>] do_fbcon_takeover+0x74/0x100
> [ 2.898669] [<ffffff80083c3e44>] fbcon_event_notify+0x8cc/0x920
> [ 2.898680] [<ffffff80080d7e38>] notifier_call_chain+0x50/0x90
> [ 2.898685] [<ffffff80080d8214>] __blocking_notifier_call_chain+0x4c/0x90
> [ 2.898691] [<ffffff80080d826c>] blocking_notifier_call_chain+0x14/0x20
> [ 2.898696] [<ffffff80083c5e1c>] fb_notifier_call_chain+0x1c/0x28
> [ 2.898703] [<ffffff80083c81ac>] register_framebuffer+0x1cc/0x2e0
> [ 2.898712] [<ffffff800845da80>] drm_fb_helper_initial_config+0x288/0x3e8
> [ 2.898719] [<ffffff800853da20>] nouveau_fbcon_init+0xe0/0x118
> [ 2.898727] [<ffffff800852d2f8>] nouveau_drm_load+0x268/0x890
> [ 2.898734] [<ffffff8008466e24>] drm_dev_register+0xbc/0xc8
> [ 2.898740] [<ffffff8008468a88>] drm_get_pci_dev+0xa0/0x180
> [ 2.898747] [<ffffff800852cb28>] nouveau_drm_probe+0x1a0/0x1e0
> [ 2.898755] [<ffffff80083a32e0>] pci_device_probe+0x98/0x110
> [ 2.898763] [<ffffff800858e434>] driver_probe_device+0x204/0x2b0
> [ 2.898770] [<ffffff800858e58c>] __driver_attach+0xac/0xb0
> [ 2.898777] [<ffffff800858c3e0>] bus_for_each_dev+0x60/0xa0
> [ 2.898783] [<ffffff800858dbc0>] driver_attach+0x20/0x28
> [ 2.898789] [<ffffff800858d7b0>] bus_add_driver+0x1d0/0x238
> [ 2.898796] [<ffffff800858ed50>] driver_register+0x60/0xf8
> [ 2.898802] [<ffffff80083a20dc>] __pci_register_driver+0x3c/0x48
> [ 2.898809] [<ffffff8008468eb4>] drm_pci_init+0xf4/0x120
> [ 2.898818] [<ffffff8008c56fc0>] nouveau_drm_init+0x21c/0x230
> [ 2.898825] [<ffffff80080829d4>] do_one_initcall+0x8c/0x190
> [ 2.898832] [<ffffff8008c31af4>] kernel_init_freeable+0x14c/0x1f0
> [ 2.898839] [<ffffff80088a0c20>] kernel_init+0x10/0x100
> [ 2.898845] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
> [ 2.898853] Code: a88120c7 a8c12027 a88120c7 a8c12027 (a88120c7)
> [ 2.898871] ---[ end trace d5713dcad023ee04 ]---
> [ 2.898888] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> In a toss-up between the GPU seeing stale data artefacts on some systems
> vs. catastrophic kernel crashes on other systems, the latter would seem
> to take precedence, so revert this change until the real underlying
> problem can be fixed.
>
> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
> ---
>
> Alex, Ben, Dave,
>
> I know Alex was looking into this, but since we're nearly at -rc6 already
> it looks like the only thing to do for 4.6 is pick the lesser of two evils...
Hi Robin,

Sorry for the delayed reply - I was offline last week.

You are right, so let's pick this patch for now.

Reviewed-by: Alexandre Courbot <acourbot at nvidia.com>