[Intel-gfx] [PATCH 1/2] drm/i915/execlists: HWS is uncached on !llc platforms

Ville Syrjälä ville.syrjala at linux.intel.com
Fri Oct 23 12:36:46 PDT 2015


On Fri, Oct 23, 2015 at 08:08:34PM +0100, Chris Wilson wrote:
> On Fri, Oct 23, 2015 at 09:41:29PM +0300, Ville Syrjälä wrote:
> > On Fri, Oct 23, 2015 at 07:29:08PM +0100, Chris Wilson wrote:
> > > On Fri, Oct 23, 2015 at 09:22:38PM +0300, Ville Syrjälä wrote:
> > > > On Fri, Oct 23, 2015 at 06:56:41PM +0100, Chris Wilson wrote:
> > > > > On Fri, Oct 23, 2015 at 08:50:42PM +0300, Ville Syrjälä wrote:
> > > > > > On Fri, Oct 23, 2015 at 06:43:31PM +0100, Chris Wilson wrote:
> > > > > > > As the HWS is mapped into the GPU as uncached,
> > > > > > 
> > > > > > Since when?
> > > > > 
> > > > > Since it is embedded into execlists' default context which is allocated
> > > > > using the system default cache level, i.e. uncached on !llc. See
> > > > > intel_lr_context_deferred_alloc()
> > > > 
> > > > Oh right. That doesn't actually matter since it's mapped through ggtt
> > > > which means it always goes through PAT 0.
> > > 
> > > Oh, that again. Doesn't that mean we broke i915.enable_ppgtt=0?
> > 
> > Just means everything is snooped in that case. As long as the hardware
> > doesn't get too upset about all the snooping it should work. Can't
> > really recall if I actually tried it though. I think I did.
> 
> Just wondering if it means we start getting cacheline dirt on the
> scanout. Though since snooping only occurs on flushes, I guess it
> actually means it hits the backing storage and then is pushed into the
> cpu cache. The other worry is whether we are then generating fsb snoop
> traffic on every context switch. Just idle thoughts as I realise I don't
> know as much about snooping as I'd like.

TBH I never gave !ppgtt too much thought.

The snoops for the context switches have crossed my mind. I was
thinking that maybe we could map the status page through ppgtt and
the rest of the context through ggtt, and then we could make the
PAT 0 non-snooped. But looks like the per-process status page still
needs to be mapped through the ggtt.

But maybe we could also map it through the ppgtt and use the ppgtt
mapping for seqno writes? That's assuming we don't need to look at
whatever else gets stored in the status page through the ggtt mapping.

Or I suppose we could take your approach and just make ggtt non-snooped
and take the clflush hit for seqno reads. No idea which is worse. Would
need to gather some numbers I suppose.
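
For reference, the clflush cost on seqno reads would look roughly like
the following. This is a minimal userspace sketch under stated
assumptions, not the i915 code: read_seqno(), flush_cacheline(),
HWS_DWORDS and HWS_SEQNO_INDEX are made-up names, and the slot index is
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HWS_DWORDS      1024u  /* one 4 KiB page of 32-bit slots */
#define HWS_SEQNO_INDEX 0x30u  /* hypothetical slot, illustration only */

/* Evict the cacheline holding addr so the next load comes from memory. */
static inline void flush_cacheline(volatile void *addr)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("clflush %0" : "+m" (*(volatile char *)addr));
#else
	(void)addr;	/* no clflush here; the caller's barrier still runs */
#endif
}

/*
 * Read one dword from the status page.
 *
 * snooped == true:  GPU writes are coherent with the CPU cache, so a
 *                   plain load is enough.
 * snooped == false: the CPU may hold a stale line, so flush it first
 *                   and pay the extra cost on every read.
 */
static uint32_t read_seqno(volatile uint32_t *hws, size_t index, bool snooped)
{
	assert(index < HWS_DWORDS);

	if (!snooped) {
		flush_cacheline(&hws[index]);
		__sync_synchronize();	/* order the flush before the load */
	}
	return hws[index];
}
```

Timing a loop of read_seqno() calls with snooped set each way would
give the CPU half of the comparison. The snoop traffic generated by
context switches would still have to be measured on real hardware.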

-- 
Ville Syrjälä
Intel OTC
