[RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86

Ard Biesheuvel ard.biesheuvel at linaro.org
Mon Jan 21 10:06:17 UTC 2019


Currently, the DRM code assumes that PCI devices are always cache
coherent for DMA, and that this can be selectively overridden for
some buffers using non-cached mappings on the CPU side and PCIe
NoSnoop transactions on the bus side.

Whether the NoSnoop part is implemented correctly is highly platform
specific. Whether it /matters/ if NoSnoop is implemented correctly or
not is architecture specific: on x86, such transactions are coherent
with the CPU whether the NoSnoop attribute is honored or not. On other
architectures, it depends on whether such transactions may allocate in
caches that are non-coherent with the CPU's uncached mappings.

The bottom line is that we should not rely on this optimization
working correctly for cache coherent devices in the general case. On
the other hand, disabling it for non-coherent devices is likely to
cause breakage as well, since the driver will assume cache coherent
PCIe if this optimization is turned off.

So rename drm_arch_can_wc_memory() to drm_device_can_wc_memory(), and
pass the drm_device pointer into it so we can base the return value
on whether the device is cache coherent or not if not running on
x86.
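
For illustration only (this is not part of the patch), the resulting
policy can be modelled as a standalone userspace C sketch. The names
enum model_arch, struct model_device, model_device_can_wc_memory() and
model_wc_selftest() are hypothetical stand-ins for the kernel's
IS_ENABLED() checks, struct drm_device and dev_is_dma_coherent(); only
the decision table itself is taken from the drm_cache.h hunk below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the compile-time architecture checks. */
enum model_arch {
	MODEL_ARCH_X86,
	MODEL_ARCH_PPC_COHERENT,    /* CONFIG_PPC, CONFIG_NOT_COHERENT_CACHE unset */
	MODEL_ARCH_PPC_NONCOHERENT, /* CONFIG_PPC, CONFIG_NOT_COHERENT_CACHE set */
	MODEL_ARCH_MIPS_LOONGSON3,  /* CONFIG_MIPS, CONFIG_CPU_LOONGSON3 set */
	MODEL_ARCH_MIPS_OTHER,      /* CONFIG_MIPS, CONFIG_CPU_LOONGSON3 unset */
	MODEL_ARCH_OTHER,           /* anything else: decided per device */
};

/* Hypothetical stand-in for a device and its dev_is_dma_coherent() result. */
struct model_device {
	bool dma_coherent;
};

/* Mirrors the decision order of the proposed drm_device_can_wc_memory(). */
static bool model_device_can_wc_memory(enum model_arch arch,
				       const struct model_device *dev)
{
	switch (arch) {
	case MODEL_ARCH_PPC_COHERENT:
	case MODEL_ARCH_MIPS_LOONGSON3:
		return false;	/* WC was already disabled here before the patch */
	case MODEL_ARCH_PPC_NONCOHERENT:
	case MODEL_ARCH_MIPS_OTHER:
	case MODEL_ARCH_X86:
		return true;	/* behaviour unchanged by the patch */
	case MODEL_ARCH_OTHER:
		break;
	}

	/* New behaviour: WC only for devices that are not DMA coherent. */
	return !dev->dma_coherent;
}

/* Small self-check covering the two cases whose behaviour changes. */
void model_wc_selftest(void)
{
	struct model_device coherent = { .dma_coherent = true };
	struct model_device noncoherent = { .dma_coherent = false };

	assert(!model_device_can_wc_memory(MODEL_ARCH_OTHER, &coherent));
	assert(model_device_can_wc_memory(MODEL_ARCH_OTHER, &noncoherent));
}
```

In this model, the only inputs whose result differs from the old
drm_arch_can_wc_memory() are cache coherent devices on architectures
other than x86, PPC and MIPS, which is the case the patch targets.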

Cc: Christian Koenig <christian.koenig at amd.com>
Cc: Alex Deucher <alexander.deucher at amd.com>
Cc: David Zhou <David1.Zhou at amd.com>
Cc: Huang Rui <ray.huang at amd.com>
Cc: Junwei Zhang <Jerry.Zhang at amd.com>
Cc: Michel Daenzer <michel.daenzer at amd.com>
Cc: David Airlie <airlied at linux.ie>
Cc: Daniel Vetter <daniel at ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
Cc: Maxime Ripard <maxime.ripard at bootlin.com>
Cc: Sean Paul <sean at poorly.run>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Cc: Will Deacon <will.deacon at arm.com>
Reported-by: Carsten Haitzler <Carsten.Haitzler at arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
---
This is a follow-up to '[RFC PATCH] drm/ttm: force cached mappings for system
RAM on ARM'

https://lore.kernel.org/linux-arm-kernel/20190110072841.3283-1-ard.biesheuvel@linaro.org/

Without t
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 +-
 drivers/gpu/drm/radeon/radeon_object.c     |  2 +-
 include/drm/drm_cache.h                    | 19 +++++++++++--------
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 728e15e5d68a..777fa251838f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -480,7 +480,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	/* For architectures that don't support WC memory,
 	 * mask out the WC flag from the BO
 	 */
-	if (!drm_arch_can_wc_memory())
+	if (!drm_device_can_wc_memory(adev->ddev))
 		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
 #endif
 
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 833e909706a9..610889bf6ab5 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -249,7 +249,7 @@ int radeon_bo_create(struct radeon_device *rdev,
 	/* For architectures that don't support WC memory,
 	 * mask out the WC flag from the BO
 	 */
-	if (!drm_arch_can_wc_memory())
+	if (!drm_device_can_wc_memory(rdev->ddev))
 		bo->flags &= ~RADEON_GEM_GTT_WC;
 #endif
 
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index bfe1639df02d..ced63b1207a3 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -33,6 +33,8 @@
 #ifndef _DRM_CACHE_H_
 #define _DRM_CACHE_H_
 
+#include <drm/drm_device.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/scatterlist.h>
 
 void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
@@ -41,15 +43,16 @@ void drm_clflush_virt_range(void *addr, unsigned long length);
 u64 drm_get_max_iomem(void);
 
 
-static inline bool drm_arch_can_wc_memory(void)
+static inline bool drm_device_can_wc_memory(struct drm_device *ddev)
 {
-#if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
-	return false;
-#elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3)
-	return false;
-#else
-	return true;
-#endif
+	if (IS_ENABLED(CONFIG_PPC))
+		return IS_ENABLED(CONFIG_NOT_COHERENT_CACHE);
+	else if (IS_ENABLED(CONFIG_MIPS))
+		return !IS_ENABLED(CONFIG_CPU_LOONGSON3);
+	else if (IS_ENABLED(CONFIG_X86))
+		return true;
+
+	return !dev_is_dma_coherent(ddev->dev);
 }
 
 #endif
-- 
2.17.1


