[Intel-gfx] [PATCH 4/5] drm/i915: Only set gem object L3 cache level for IVB devices

Mon Dec 7 14:26:15 PST 2015

On 12/7/15, 11:56 AM, "Deak, Imre" <imre.deak at intel.com> wrote:

>On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote:
>> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote:
>> > Do some further clean up based on the initial review of
>> > drm/i915: Separate cherryview from valleyview.
>> > 
>> > In this case, in i915_gem_alloc_context_obj() only call
>> > i915_gem_object_set_cache_level() for Ivy Bridge devices
>> > since later platforms don't have L3 control bits in the PTE.
>> > 
>> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>> > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> > Signed-off-by: Wayne Boyer <wayne.boyer at intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
>> >  1 file changed, 3 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
>> > b/drivers/gpu/drm/i915/i915_gem_context.c
>> > index 4b1161d..e4de433 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device
>> > *dev, size_t size)
>> >  	/*
>> >  	 * Try to make the context utilize L3 as well as LLC.
>> >  	 *
>> > -	 * On VLV we don't have L3 controls in the PTEs so we
>> > -	 * shouldn't touch the cache level, especially as that
>> > -	 * would make the object snooped which might have a
>> > -	 * negative performance impact.
>> > +	 * This is only applicable for Ivy Bridge devices since
>> > +	 * later platforms don't have L3 control bits in the PTE.
>> >  	 */
>> > -	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) &&
>> > !IS_CHERRYVIEW(dev)) {
>> > +	if (IS_IVYBRIDGE(dev)) {
>> 
>> This would actually change the snoop setting on BXT, but that
>> should be fine since BXT still has the GTT -> PAT 0 thing which
>> means we get snooping anyway IIRC.
>> 
>> Imre, we discussed this stuff at some point, and I hope I'm not
>> forgetting something crucial here that would break BXT. Probably best
>> if you sanity check my thinking here...
>
>Yes, AFAIU it doesn't matter what level we set, because of the PAT
>mapping constraint you describe above. Atm we also don't use this
>function on BXT, b/c of the different codepaths for LRC vs. ringbuffer
>mode. But that could/should be unified at one point, so yea it's better
>to keep things working for BXT too here. In the corresponding
>intel_lr_context_deferred_alloc() we don't set the cache level, which
>also agrees with the above, so I think this change is fine.
>
>> We should probably mention this PAT 0 business in a comment somewhere
>> here since it's a very easy thing to miss.
>
>Yep, a comment about this would be in order.

Ville, Imre, how does this look for an updated comment?

/*
 * Try to make the context utilize L3 as well as LLC.
 *
 * On VLV we don't have L3 controls in the PTE so we
 * shouldn't touch the cache level, especially as that
 * would make the object snooped which might have a
 * negative performance impact.
 *
 * On CHV and BXT we set PPAT 0 in chv_setup_private_ppat()
 * which means we get snooping anyway.
 *
 * This is only applicable for Ivy Bridge devices since
 * later platforms don't have L3 control bits in the PTE.
 */