[Mesa-dev] i965: L3 cache partitioning.
Ben Widawsky
ben at bwidawsk.net
Fri Sep 11 11:37:21 PDT 2015
On Fri, Sep 11, 2015 at 10:24:29AM -0700, Ben Widawsky wrote:
> On Sun, Sep 06, 2015 at 06:12:38PM +0200, Francisco Jerez wrote:
> > This series implements dynamic partitioning of the L3 cache space
> > among its clients. It serves several purposes:
> >
> > - Steal a chunk of L3 space when necessary and reserve it for SLM as
> > required to support compute shaders with shared variables.
> >
> > - Allow L3 caching of dataport DC memory access where the default L3
> > partitioning doesn't have any space reserved for it (pre-Gen8). This
> > should improve performance of scratch access (register spills and
> > fills and some forms of indirect array indexing), atomic counters
> > and images.
> >
> > - Allow dynamic changes of the L3 configuration for workloads that
> > could benefit from a partitioning other than the default
> > (e.g. reduce URB size to gain some additional cache space on
> > heavily fragment-bound workloads, or split the L3 allocation of
> > different clients to reduce thrashing). The basic infrastructure
> > to achieve this is implemented here but no specific heuristics are
> > included yet in this series.
>
> I admit to not knowing how this stuff works pre-GEN8, but it was my impression
> that on GEN8+ these kinds of tweaks will make no difference to 3D clients other
> than for constant buffers and scratch space. Every other client of the L3 uses a
> fixed size. Therefore I am skeptical of your last claim, and I'd very much like
> it if you could help me find where the theory came from. Some amount of
> performance data would certainly be very welcome as well.
>
> I certainly believe the partitioning is critical for optimal usage of SLM, and
> as you mention, ensuring that other users of the dynamic partitioning don't
> screw us over. It's the rest that I'm unsure of.
>
Interesting. My information seems to apply to GEN9+. GEN8 does seem to have a
balance with the L3 D$.
> >
> > The series can be found here in a testable form:
> > http://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-l3-partitioning
> >
> > [PATCH 01/13] i965: Define symbolic constants for some useful L3 cache control registers.
> > [PATCH 02/13] i965: Keep track of whether LRI is allowed in the context struct.
> > [PATCH 03/13] i965: Define state flag to signal that the URB size has been altered.
> > [PATCH 04/13] i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC flush is set.
> > [PATCH 05/13] i965: Import tables enumerating the set of validated L3 configurations.
> > [PATCH 06/13] i965: Implement programming of the L3 configuration.
> > [PATCH 07/13] i965/hsw: Enable L3 atomics.
> > [PATCH 08/13] i965: Implement selection of the closest L3 configuration based on a vector of weights.
> > [PATCH 09/13] i965: Calculate appropriate L3 partition weights for the current pipeline state.
> > [PATCH 10/13] i965: Implement L3 state atom.
> > [PATCH 11/13] i965: Add debug flag to print out the new L3 state during transitions.
> > [PATCH 12/13] i965: Work around L3 state leaks during context switches.
> > [PATCH 13/13] i965: Hook up L3 partitioning state atom.
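
For readers unfamiliar with the idea behind patch 08, here is a minimal sketch
of picking the closest configuration from a table given a vector of weights.
Everything in it is an assumption made for illustration: the partition names,
the struct layout, the table entries and the choice of an L1 distance over
normalized shares are hypothetical and are not taken from the series or from
any validated hardware configuration.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical partition classes; not the names used by the series. */
enum part { PART_SLM, PART_URB, PART_DC, PART_REST, NUM_PARTS };

/* Number of cache ways assigned to each partition in one configuration. */
struct l3_config {
   unsigned ways[NUM_PARTS];
};

/* Made-up table for illustration only; these are NOT validated configs. */
static const struct l3_config example_configs[] = {
   {{  0, 32, 0, 32 }},
   {{  0, 32, 8, 24 }},
   {{ 16, 16, 0, 32 }},
   {{ 16, 16, 8, 24 }},
};

/* L1 distance between the normalized shares of a config and the weights. */
static float
config_distance(const struct l3_config *cfg, const float w[NUM_PARTS])
{
   unsigned total = 0;
   float wsum = 0.0f, dist = 0.0f;

   for (int i = 0; i < NUM_PARTS; i++) {
      total += cfg->ways[i];
      wsum += w[i];
   }

   /* Degenerate input: treat as infinitely far away. */
   if (total == 0 || wsum <= 0.0f)
      return FLT_MAX;

   for (int i = 0; i < NUM_PARTS; i++) {
      const float d = (float)cfg->ways[i] / (float)total - w[i] / wsum;
      dist += d < 0.0f ? -d : d;
   }

   return dist;
}

/* Return the table entry closest to the weights, or NULL if none qualifies.
 * On ties the earliest entry in the table wins.
 */
static const struct l3_config *
select_closest(const struct l3_config *configs, size_t n,
               const float w[NUM_PARTS])
{
   const struct l3_config *best = NULL;
   float best_dist = FLT_MAX;

   for (size_t i = 0; i < n; i++) {
      const float d = config_distance(&configs[i], w);
      if (d < best_dist) {
         best_dist = d;
         best = &configs[i];
      }
   }

   return best;
}
```

With this made-up table, weights asking for some SLM and no DC space would pick
the third entry, while weights asking for DC space and no SLM would pick the
second. Normalizing both sides means only the relative sizes of the weights
matter, not their absolute scale.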
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev