[Nouveau] [PATCH v2] Revert "drm/nouveau/device/pci: set as non-CPU-coherent on ARM64"

Robin Murphy robin.murphy at arm.com
Mon Jun 6 09:25:01 UTC 2016


On 06/06/16 08:11, Alexandre Courbot wrote:
> From: Robin Murphy <robin.murphy at arm.com>
>
> This reverts commit 1733a2ad36741b1812cf8b3f3037c28d0af53f50.
>
> There is apparently something amiss with the way the TTM code handles
> DMA buffers, which the above commit was attempting to work around for
> arm64 systems with non-coherent PCI. Unfortunately, this completely
> breaks systems *with* coherent PCI (which appear to be the majority).
>
> Booting a plain arm64 defconfig + CONFIG_DRM + CONFIG_DRM_NOUVEAU on
> a machine with a PCI GPU having coherent dma_map_ops (in this case a
> 7600GT card plugged into an ARM Juno board) results in a fatal crash:
>
> [    2.803438] nouveau 0000:06:00.0: DRM: allocated 1024x768 fb: 0x9000, bo ffffffc976141c00
> [    2.897662] Unable to handle kernel NULL pointer dereference at virtual address 000001ac
> [    2.897666] pgd = ffffff8008e00000
> [    2.897675] [000001ac] *pgd=00000009ffffe003, *pud=00000009ffffe003, *pmd=0000000000000000
> [    2.897680] Internal error: Oops: 96000045 [#1] PREEMPT SMP
> [    2.897685] Modules linked in:
> [    2.897692] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc5+ #543
> [    2.897694] Hardware name: ARM Juno development board (r1) (DT)
> [    2.897699] task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
> [    2.897711] PC is at __memcpy+0x7c/0x180
> [    2.897719] LR is at OUT_RINGp+0x34/0x70
> [    2.897724] pc : [<ffffff80083465fc>] lr : [<ffffff800854248c>] pstate: 80000045
> [    2.897726] sp : ffffffc9768ab360
> [    2.897732] x29: ffffffc9768ab360 x28: 0000000000000001
> [    2.897738] x27: ffffffc97624c000 x26: 0000000000000000
> [    2.897744] x25: 0000000000000080 x24: 0000000000006c00
> [    2.897749] x23: 0000000000000005 x22: ffffffc97624c010
> [    2.897755] x21: 0000000000000004 x20: 0000000000000004
> [    2.897761] x19: ffffffc9763da000 x18: ffffffc976b2491c
> [    2.897766] x17: 0000000000000007 x16: 0000000000000006
> [    2.897771] x15: 0000000000000001 x14: 0000000000000001
> [    2.897777] x13: 0000000000e31b70 x12: ffffffc9768a0080
> [    2.897783] x11: 0000000000000000 x10: fffffffffffffb00
> [    2.897788] x9 : 0000000000000000 x8 : 0000000000000000
> [    2.897793] x7 : 0000000000000000 x6 : 00000000000001ac
> [    2.897799] x5 : 00000000ffffffff x4 : 0000000000000000
> [    2.897804] x3 : 0000000000000010 x2 : 0000000000000010
> [    2.897810] x1 : ffffffc97624c010 x0 : 00000000000001ac
> ...
> [    2.898494] Call trace:
> [    2.898499] Exception stack(0xffffffc9768ab1a0 to 0xffffffc9768ab2c0)
> [    2.898506] b1a0: ffffffc9763da000 0000000000000004 ffffffc9768ab360 ffffff80083465fc
> [    2.898513] b1c0: ffffffc976801e00 ffffffc9762b8000 ffffffc9768ab1f0 ffffff80080ec158
> [    2.898520] b1e0: ffffffc9768ab230 ffffff8008496d04 ffffffc975ce6d80 ffffffc9768ab36e
> [    2.898527] b200: ffffffc9768ab36f ffffffc9768ab29d ffffffc9768ab29e ffffffc9768a0000
> [    2.898533] b220: ffffffc9768ab250 ffffff80080e70c0 ffffffc9768ab270 ffffff8008496e44
> [    2.898540] b240: 00000000000001ac ffffffc97624c010 0000000000000010 0000000000000010
> [    2.898546] b260: 0000000000000000 00000000ffffffff 00000000000001ac 0000000000000000
> [    2.898552] b280: 0000000000000000 0000000000000000 fffffffffffffb00 0000000000000000
> [    2.898558] b2a0: ffffffc9768a0080 0000000000e31b70 0000000000000001 0000000000000001
> [    2.898566] [<ffffff80083465fc>] __memcpy+0x7c/0x180
> [    2.898574] [<ffffff800853e164>] nv04_fbcon_imageblit+0x1d4/0x2e8
> [    2.898582] [<ffffff800853d6d0>] nouveau_fbcon_imageblit+0xd8/0xe0
> [    2.898591] [<ffffff80083c4db4>] soft_cursor+0x154/0x1d8
> [    2.898598] [<ffffff80083c47b4>] bit_cursor+0x4fc/0x538
> [    2.898605] [<ffffff80083c0cfc>] fbcon_cursor+0x134/0x1a8
> [    2.898613] [<ffffff800841c280>] hide_cursor+0x38/0xa0
> [    2.898620] [<ffffff800841d420>] redraw_screen+0x120/0x228
> [    2.898628] [<ffffff80083bf268>] fbcon_prepare_logo+0x370/0x3f8
> [    2.898635] [<ffffff80083bf640>] fbcon_init+0x350/0x560
> [    2.898641] [<ffffff800841c634>] visual_init+0xac/0x108
> [    2.898648] [<ffffff800841df14>] do_bind_con_driver+0x1c4/0x3a8
> [    2.898655] [<ffffff800841e4f4>] do_take_over_console+0x174/0x1e8
> [    2.898662] [<ffffff80083bf8c4>] do_fbcon_takeover+0x74/0x100
> [    2.898669] [<ffffff80083c3e44>] fbcon_event_notify+0x8cc/0x920
> [    2.898680] [<ffffff80080d7e38>] notifier_call_chain+0x50/0x90
> [    2.898685] [<ffffff80080d8214>] __blocking_notifier_call_chain+0x4c/0x90
> [    2.898691] [<ffffff80080d826c>] blocking_notifier_call_chain+0x14/0x20
> [    2.898696] [<ffffff80083c5e1c>] fb_notifier_call_chain+0x1c/0x28
> [    2.898703] [<ffffff80083c81ac>] register_framebuffer+0x1cc/0x2e0
> [    2.898712] [<ffffff800845da80>] drm_fb_helper_initial_config+0x288/0x3e8
> [    2.898719] [<ffffff800853da20>] nouveau_fbcon_init+0xe0/0x118
> [    2.898727] [<ffffff800852d2f8>] nouveau_drm_load+0x268/0x890
> [    2.898734] [<ffffff8008466e24>] drm_dev_register+0xbc/0xc8
> [    2.898740] [<ffffff8008468a88>] drm_get_pci_dev+0xa0/0x180
> [    2.898747] [<ffffff800852cb28>] nouveau_drm_probe+0x1a0/0x1e0
> [    2.898755] [<ffffff80083a32e0>] pci_device_probe+0x98/0x110
> [    2.898763] [<ffffff800858e434>] driver_probe_device+0x204/0x2b0
> [    2.898770] [<ffffff800858e58c>] __driver_attach+0xac/0xb0
> [    2.898777] [<ffffff800858c3e0>] bus_for_each_dev+0x60/0xa0
> [    2.898783] [<ffffff800858dbc0>] driver_attach+0x20/0x28
> [    2.898789] [<ffffff800858d7b0>] bus_add_driver+0x1d0/0x238
> [    2.898796] [<ffffff800858ed50>] driver_register+0x60/0xf8
> [    2.898802] [<ffffff80083a20dc>] __pci_register_driver+0x3c/0x48
> [    2.898809] [<ffffff8008468eb4>] drm_pci_init+0xf4/0x120
> [    2.898818] [<ffffff8008c56fc0>] nouveau_drm_init+0x21c/0x230
> [    2.898825] [<ffffff80080829d4>] do_one_initcall+0x8c/0x190
> [    2.898832] [<ffffff8008c31af4>] kernel_init_freeable+0x14c/0x1f0
> [    2.898839] [<ffffff80088a0c20>] kernel_init+0x10/0x100
> [    2.898845] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
> [    2.898853] Code: a88120c7 a8c12027 a88120c7 a8c12027 (a88120c7)
> [    2.898871] ---[ end trace d5713dcad023ee04 ]---
> [    2.898888] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> In a toss-up between the GPU seeing stale data artefacts on some systems
> vs. catastrophic kernel crashes on other systems, the latter would seem
> to take precedence, so revert this change until the real underlying
> problem can be fixed.
>
> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
> Acked-by: Alexandre Courbot <acourbot at nvidia.com>
> [acourbot at nvidia.com: port to Nouveau tree, remove bits in lib/]
> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
> ---
> Hi Ben,
>
> I have ported this patch to your tree - could you take it for 4.7? We definitely want
> to avoid these crashes. I am working on a final solution for this that will allow us
> to remove that cpu_coherent flag altogether.

Cheers, Alex! Should this also go to stable for 4.6?

Robin.

>   drm/nouveau/nvkm/engine/device/pci.c | 2 +-
>   lib/include/nvif/os.h                | 6 ------
>   2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drm/nouveau/nvkm/engine/device/pci.c b/drm/nouveau/nvkm/engine/device/pci.c
> index 18fab3973ce5..62ad0300cfa5 100644
> --- a/drm/nouveau/nvkm/engine/device/pci.c
> +++ b/drm/nouveau/nvkm/engine/device/pci.c
> @@ -1614,7 +1614,7 @@ nvkm_device_pci_func = {
>   	.fini = nvkm_device_pci_fini,
>   	.resource_addr = nvkm_device_pci_resource_addr,
>   	.resource_size = nvkm_device_pci_resource_size,
> -	.cpu_coherent = !IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_ARM64),
> +	.cpu_coherent = !IS_ENABLED(CONFIG_ARM),
>   };
>
>   int
> diff --git a/lib/include/nvif/os.h b/lib/include/nvif/os.h
> index 831110904fee..1eda53aa8f45 100644
> --- a/lib/include/nvif/os.h
> +++ b/lib/include/nvif/os.h
> @@ -130,12 +130,6 @@ typedef dma_addr_t resource_size_t;
>   #define IS_ENABLED_CONFIG_ARM 0
>   #endif
>
> -#if defined(CONFIG_ARM64)
> -#define IS_ENABLED_CONFIG_ARM64 1
> -#else
> -#define IS_ENABLED_CONFIG_ARM64 0
> -#endif
> -
>   #if defined(CONFIG_IOMMU_API)
>   #define IS_ENABLED_CONFIG_IOMMU_API 1
>   #else
>
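For anyone following the diff above, here is a minimal sketch of what the one-line change to `.cpu_coherent` means on an arm64 build. It assumes that `IS_ENABLED()` in the userspace shim is a simple token-pasting macro over the `IS_ENABLED_CONFIG_*` fallbacks; the real definition is not part of this patch, and in the kernel proper `IS_ENABLED()` comes from `<linux/kconfig.h>`. The two function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simulate an arm64 configuration: CONFIG_ARM64 set, CONFIG_ARM unset. */
#define CONFIG_ARM64 1

/* Fallback pattern of the kind shown in lib/include/nvif/os.h. */
#if defined(CONFIG_ARM)
#define IS_ENABLED_CONFIG_ARM 1
#else
#define IS_ENABLED_CONFIG_ARM 0
#endif

#if defined(CONFIG_ARM64)
#define IS_ENABLED_CONFIG_ARM64 1
#else
#define IS_ENABLED_CONFIG_ARM64 0
#endif

/*
 * ASSUMPTION: a token-pasting stand-in for IS_ENABLED(). The argument is
 * pasted unexpanded, so IS_ENABLED(CONFIG_ARM64) becomes
 * IS_ENABLED_CONFIG_ARM64.
 */
#define IS_ENABLED(option) IS_ENABLED_##option

/* Hypothetical helper: the flag value as set by the commit being reverted. */
bool cpu_coherent_before_revert(void)
{
	return !IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_ARM64);
}

/* Hypothetical helper: the flag value once the revert is applied. */
bool cpu_coherent_after_revert(void)
{
	return !IS_ENABLED(CONFIG_ARM);
}

/* Print both values for the simulated configuration. */
void print_cpu_coherent(void)
{
	printf("before revert: cpu_coherent = %d\n", cpu_coherent_before_revert());
	printf("after revert:  cpu_coherent = %d\n", cpu_coherent_after_revert());
}
```

Under this simulated configuration the flag is false before the revert and true after it, so with the revert applied PCI devices on arm64 are again treated as CPU-coherent, as they were before the reverted commit.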


