[Intel-gfx] [RFC][PATCH] drm/i915: Try to avoid most cache flushes on LLC platforms (WIP)
Chris Wilson
chris at chris-wilson.co.uk
Mon Aug 5 23:25:30 CEST 2013
On Mon, Aug 05, 2013 at 11:13:24PM +0300, ville.syrjala at linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala at linux.intel.com>
>
> A bit more food for the cache discussions. My idea here is to track
> whether an object is part of any fb, and in case it is we assume it
> can be used for scanout, and thus may need some extra clflushes.
> But otherwise we'd try to avoid clflushes on LLC plarforms.
>
> I also included some GFDT stuff in there just to think a bit about how
> it would work, but I left out the GFDT flushes themselves.
>
> Also I didn't even boot-test this, so it may explode quite
> spectacularly.
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++
> drivers/gpu/drm/i915/i915_gem.c | 63 +++++++++++++++++++++++-------
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++-
> drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++-
> drivers/gpu/drm/i915/intel_display.c | 2 +
> drivers/gpu/drm/i915/intel_fb.c | 1 +
> 6 files changed, 72 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8b14e22..ec275ed 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -451,6 +451,8 @@ enum i915_cache_level {
> I915_CACHE_NONE = 0,
> I915_CACHE_LLC,
> I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
> + I915_CACHE_WT, /* hsw gt3e */
> + I915_CACHE_GFDT = 0x4, /* SNB/IVB: flag ORed w/ the cache mode */
> };
>
> typedef uint32_t gen6_gtt_pte_t;
> @@ -1388,6 +1390,8 @@ struct drm_i915_gem_object {
>
> /** for phy allocated objects */
> struct drm_i915_gem_phys_object *phys_obj;
> +
> + atomic_t fb_count;
> };
> #define to_gem_object(obj) (&((struct drm_i915_gem_object *)(obj))->base)
>
> @@ -1494,6 +1498,7 @@ struct drm_i915_file_private {
> #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring)
> #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc)
> #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
> +#define HAS_GFDT(dev) (0)
>
> #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6)
> #define HAS_ALIASING_PPGTT(dev) (INTEL_INFO(dev)->gen >=6 && !IS_VALLEYVIEW(dev))
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bd6eb64..38d3241 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -394,6 +394,20 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
> return ret ? - EFAULT : 0;
> }
>
> +static bool cpu_caches_coherent(struct drm_i915_gem_object *obj)
> +{
> + return HAS_LLC(obj->base.dev) || obj->cache_level != I915_CACHE_NONE;
> +}
Yes, that much better expresses what we mean.
> +static bool
> +scanout_needs_clflush(struct drm_i915_gem_object *obj,
> + enum i915_cache_level cache_level)
> +{
> + return cache_level != I915_CACHE_LLC &&
> + cache_level != I915_CACHE_LLC_MLC &&
> + atomic_read(&obj->fb_count) > 0;
> +}
(Aside, this also a good time to change LLC_MLC to L3!)
GFDT, similarly to WT, I think is purely a bit set on the GPU side. My
experience was that I could not dirty with the CPU then issue a GPU GFDT
flush without seeing cache dirt on the scanout. Actually, I think I just
found my issue in the SNB bspec... Apparently, I must not dirty the
surface before the previous flush is complete - which is solvable by using
the active render tracking. Just another little hoop to jump through for
GFDT.
I think we really want to see the llc/scanout patch first so that we can
give it a thorough test and debunking. Then WT as it is fairly trivial
and then seeing if we can finally get GFDT right.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list