[Mesa-dev] i965: L3 cache partitioning.
Ben Widawsky
ben at bwidawsk.net
Fri Sep 11 12:00:52 PDT 2015
On Fri, Sep 11, 2015 at 11:37:21AM -0700, Ben Widawsky wrote:
> On Fri, Sep 11, 2015 at 10:24:29AM -0700, Ben Widawsky wrote:
> > On Sun, Sep 06, 2015 at 06:12:38PM +0200, Francisco Jerez wrote:
> > > This series implements dynamic partitioning of the L3 cache space
> > > among its clients. It serves several purposes:
> > >
> > > - Steal a chunk of L3 space when necessary and reserve it for SLM as
> > > required to support compute shaders with shared variables.
> > >
> > > - Allow L3 caching of dataport DC memory access where the default L3
> > > partitioning doesn't have any space reserved for it (pre-Gen8) --
> > > Should improve performance of scratch access (register spills and
> > > fills and some forms of indirect array indexing), atomic counters
> > > and images.
> > >
> > > - Allow dynamic changes of the L3 configuration for work-loads that
> > > could benefit from a partitioning other than the default
> > > (e.g. reduce URB size to gain some additional cache space on
> > > heavily fragment-bound workloads, or split the L3 allocation of
> > > different clients to reduce thrashing). The basic infrastructure
> > > to achieve this is implemented here but no specific heuristics are
> > > included yet in this series.
> >
> > I admit to not knowing how this stuff works pre-GEN8, but it was my impression that
> > on GEN8+ these kinds of tweaks will make no difference to 3D clients other than
> > for constant buffers and scratch space. Every other client of the L3 uses a
> > fixed size. Therefore I am skeptical of your last claim and I'd very much like
> > it if you could help me find where the theory came from and certainly some
> > amount of performance data would be very welcome as well.
> >
> > I certainly believe the partitioning is critical for optimal usage of SLM, and
> > as you mention, ensuring that other users of the dynamic partitioning don't
> > screw us over. It's the rest that I'm unsure of.
> >
>
> Interesting. My information seems to be GEN9+. GEN8 does seem to have a balance
> with the L3 D$
Yeah, I retract my statement now. I think there is value in having this
programmability. I looked at this before too, not sure how I forgot we can
actually split the DC and RO.
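
To make the DC/RO split concrete, here is a rough sketch of how a table of
L3 configurations could be modelled and searched. Everything in it is an
assumption made for illustration: the names (l3_partition, l3_config,
pick_l3_config) are hypothetical and not taken from the series, and the way
counts are invented, not real hardware values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition names, for illustration only. */
enum l3_partition { L3P_SLM, L3P_URB, L3P_DC, L3P_RO, L3P_COUNT };

/* Number of cache ways handed to each partition. */
struct l3_config {
   unsigned ways[L3P_COUNT];
};

/* Invented example table; these are NOT real hardware configurations. */
static const struct l3_config example_configs[] = {
   /* SLM URB  DC  RO */
   {{  0, 32,  0, 32 }},   /* default-like: no space reserved for DC */
   {{  0, 32,  8, 24 }},   /* DC and RO split */
   {{ 16, 16,  8, 24 }},   /* SLM carved out as well */
};

/*
 * Return the index of the first configuration that reserves space for
 * every partition the caller needs, or -1 if none does.
 */
static int
pick_l3_config(const struct l3_config *configs, size_t n,
               int need_slm, int need_dc)
{
   assert(configs != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      const unsigned *w = configs[i].ways;

      if (need_slm && w[L3P_SLM] == 0)
         continue;
      if (need_dc && w[L3P_DC] == 0)
         continue;

      return (int)i;
   }

   return -1;
}
```

With this made-up table, a compute shader with shared variables (need_slm
set) would land on the third entry, while a draw that only spills to
scratch (need_dc set) would get the second one.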