[Intel-gfx] [PATCH] drm/i915: Use MLC (l3$) for context objects
Daniel Vetter
daniel at ffwll.ch
Mon Apr 8 17:12:27 CEST 2013
On Mon, Apr 08, 2013 at 07:57:40AM -0700, Kenneth Graunke wrote:
> On 04/08/2013 06:28 AM, Chris Wilson wrote:
> >Enabling context support increases SwapBuffers latency by about 20%
> >(measured on an i7-3720qm). We can offset that loss slightly by enabling
> >faster caching for the contexts. As they are not backed by any
> >particular cache (such as the sampler or render caches) our only option
> >is to select the generic mid-level cache. This reduces the latency of
> >the swap by about 5%.
> >
> >Oddly this effect can be observed running smokin-guns on IVB at
> >1280x1024:
> >Using BLT copies for swaps: 151.67 fps
> >Using Render copies for swaps (unpatched): 141.70 fps
> >With contexts disabled: 150.23 fps
> >With contexts in L3$: 150.77 fps
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Ben Widawsky <ben at bwidawsk.net>
> >Cc: Kenneth Graunke <kenneth at whitecape.org>
> >---
> > drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >index 94d873a..a1e8ecb 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_context.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >@@ -152,6 +152,13 @@ create_hw_context(struct drm_device *dev,
> > return ERR_PTR(-ENOMEM);
> > }
> >
> >+ if (INTEL_INFO(dev)->gen >= 7) {
> >+ ret = i915_gem_object_set_cache_level(ctx->obj,
> >+ I915_CACHE_LLC_MLC);
> >+ if (ret)
> >+ goto err_out;
> >+ }
> >+
> > /* The ring associated with the context object is handled by the normal
> > * object tracking code. We give an initial ring value simple to pass an
> > * assertion in the context switch code.
>
> Sounds good to me, Chris. Is this also useful on Sandybridge? I
> don't see why we couldn't use both L3 & LLC there as well.
Snb doesn't have gpu l3 afaik.
>
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>
> On a tangent, it would be nice to clean up the I915_CACHE_* enums.
> The term "MLC" isn't really used in the modern documentation, and
> there's even text that says "Ivybridge doesn't have MLC". But what
> it really means here is LLC + L3. It's probably not worth reworking
> until we can send out the new Haswell bits, though.
Yeah, my idea is that iternally we split this into cache_level + flags,
where cache_level controls how coherency works (i.e. where we have to
clflush and do similar magic) and the flags do the fine-tuning (like
enabling l3 caching). But like you've said, can wait for more hsw magic
(or gfdt) landing.
Queued for -next, thanks for the patch.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list