[Intel-gfx] [PATCH] drm/i915: Avoid snooping with userptr where not supported

Chris Wilson chris at chris-wilson.co.uk
Tue Mar 1 17:36:31 UTC 2016


On Tue, Mar 01, 2016 at 05:21:49PM +0000, Tvrtko Ursulin wrote:
> 
> On 01/03/16 17:14, Chris Wilson wrote:
> >On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >>    commit e5756c10d841ddb448293c849392f3d6b809561f
> >>    Author: Imre Deak <imre.deak at intel.com>
> >>    Date:   Fri Aug 14 18:43:30 2015 +0300
> >>
> >>        drm/i915/bxt: don't allow cached GEM mappings on A stepping
> >>
> >>Added an exception of disallowing snooping for Broxton A
> >>stepping hardware but userptr was still enabling it regardless.
> >>
> >>Move the check to HAS_SNOOP now that it is used from multiple
> >>call sites and use it.
> >>
> >>Idea is that userspace which relies on userptr snooping, for
> >>example for fine grained buffered SVM, can query the caching
> >>mode on a created userptr object to determine whether coherency
> >>is supported or not.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>Cc: Imre Deak <imre.deak at intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h         | 1 +
> >>  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
> >>  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
> >>  3 files changed, 3 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 671295523317..73f0db17b990 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table {
> >>  #define HAS_LLC(dev)		(INTEL_INFO(dev)->has_llc)
> >>  #define HAS_WT(dev)		((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \
> >>  				 __I915__(dev)->ellc_size)
> >>+#define HAS_SNOOP(dev)		(!IS_BXT_REVID(dev, 0, BXT_REVID_A1))
> >>  #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
> >>
> >>  #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 3d31d3ac589e..1c6e8fb9d392 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >>  		 * cacheline, whereas normally such cachelines would get
> >>  		 * invalidated.
> >>  		 */
> >>-		if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
> >>+		if (!HAS_SNOOP(dev))
> >
> >if (!HAS_LLC && !HAS_SNOOP)
> >
> >and HAS_SNOOP should be true for everything that is not llc, with the
> >exception above. So make a dev_info->has_snoop feature bit (or
> >uses_snoop?) and then clear the copy inside the struct for the special
> >case.
> 
> Sure I can do that, but isn't it the same as this patch? Unless
> something with LLC and no snooping appears?

I am generalising into something that makes sense here.

if (!HAS_SNOOP(dev)) does not make sense by itself since LLC is not
snooping and so this would appear to also reject all of LLC.

> 
> >>  			return -ENODEV;
> >>
> >>  		level = I915_CACHE_LLC;
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>index 4b09c840d493..48aaff019b42 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>@@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
> >>
> >>  	drm_gem_private_object_init(dev, &obj->base, args->user_size);
> >>  	i915_gem_object_init(obj, &i915_gem_userptr_ops);
> >>-	obj->cache_level = I915_CACHE_LLC;
> >>+	obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
> >
> >This should be
> >if (!HAS_SNOOP && !HAS_LLC)
> >	return -ENODEV;
> >
> >as we can not support a coherent pointer shared between memory and the GPU
> >in this case.
> 
> Hm interesting, why not? I mean, what is the difference compared to
> normal BOs which will be uncached on !LLC?

They are allocated as device memory. userptr is for importing normal system
pages into the device with the expectation that such is coherent. To
change that we should have additional flags that tell us that the user
is expecting non-coherent bo which requires the user to manage sync.
That is also typically (thinking about the GL spec) also explicitly
requested at the client API level.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list