[Mesa-dev] i965: L3 cache partitioning.
Ben Widawsky
ben at bwidawsk.net
Fri Sep 11 10:24:29 PDT 2015
On Sun, Sep 06, 2015 at 06:12:38PM +0200, Francisco Jerez wrote:
> This series implements dynamic partitioning of the L3 cache space
> among its clients, the purpose is multiple:
>
> - Steal a chunk of L3 space when necessary and reserve it for SLM as
> required to support compute shaders with shared variables.
>
> - Allow L3 caching of dataport DC memory access where the default L3
> partitioning doesn't have any space reserved for it (pre-Gen8) --
> Should improve performance of scratch access (register spills and
> fills and some forms of indirect array indexing), atomic counters
> and images.
>
> - Allow dynamic changes of the L3 configuration for work-loads that
> could benefit from a partitioning other than the default
> (e.g. reduce URB size to gain some additional cache space on
> heavily fragment-bound workloads, or split the L3 allocation of
> different clients to reduce thrashing). The basic infrastructure
> to achieve this is implemented here but no specific heuristics are
> included yet in this series.
I admit to not knowing how this stuff works pre-GEN8, but it was my impression
that on GEN8+ these kinds of tweaks will make no difference to 3D clients other
than for constant buffers and scratch space. Every other client of the L3 uses a
fixed size. Therefore I am skeptical of your last claim, and I'd very much like
it if you could help me find where the theory came from. Some amount of
performance data would be very welcome as well.
I certainly believe the partitioning is critical for optimal usage of SLM, and
as you mention, ensuring that other users of the dynamic partitioning don't
screw us over. It's the rest that I'm unsure of.
>
> The series can be found here in a testable form:
> http://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-l3-partitioning
>
> [PATCH 01/13] i965: Define symbolic constants for some useful L3 cache control registers.
> [PATCH 02/13] i965: Keep track of whether LRI is allowed in the context struct.
> [PATCH 03/13] i965: Define state flag to signal that the URB size has been altered.
> [PATCH 04/13] i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC flush is set.
> [PATCH 05/13] i965: Import tables enumerating the set of validated L3 configurations.
> [PATCH 06/13] i965: Implement programming of the L3 configuration.
> [PATCH 07/13] i965/hsw: Enable L3 atomics.
> [PATCH 08/13] i965: Implement selection of the closest L3 configuration based on a vector of weights.
> [PATCH 09/13] i965: Calculate appropriate L3 partition weights for the current pipeline state.
> [PATCH 10/13] i965: Implement L3 state atom.
> [PATCH 11/13] i965: Add debug flag to print out the new L3 state during transitions.
> [PATCH 12/13] i965: Work around L3 state leaks during context switches.
> [PATCH 13/13] i965: Hook up L3 partitioning state atom.
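For readers trying to picture what "selection of the closest L3 configuration
based on a vector of weights" (patch 08) could look like, here is a minimal
sketch in C. It is not the code from the series. The partition names, the
table contents, and the distance metric (L1 distance between normalized
vectors) are all assumptions made up for illustration; the real validated
configurations come from the tables imported in patch 05.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition names; the actual series may split the L3
 * differently. */
enum l3_partition { L3P_SLM, L3P_URB, L3P_DC, L3P_REST, L3P_COUNT };

/* One candidate configuration: number of cache ways per partition. */
struct l3_config {
   unsigned ways[L3P_COUNT];
};

/* Made-up example table, NOT taken from hardware documentation. */
static const struct l3_config example_configs[] = {
   /* SLM URB  DC REST */
   {{  0, 16,  0, 16 }},
   {{  0,  8,  8, 16 }},
   {{  8,  8,  0, 16 }},
   {{  8,  8,  8,  8 }},
};

/* L1 distance between the normalized way allocation of a configuration
 * and the normalized weight vector (assumed metric). */
static float
config_distance(const struct l3_config *cfg, const float w[L3P_COUNT])
{
   unsigned total_ways = 0;
   float total_w = 0.0f, dist = 0.0f;

   for (int i = 0; i < L3P_COUNT; i++) {
      total_ways += cfg->ways[i];
      total_w += w[i];
   }
   assert(total_ways > 0 && total_w > 0.0f);

   for (int i = 0; i < L3P_COUNT; i++) {
      const float d = cfg->ways[i] / (float)total_ways - w[i] / total_w;
      dist += d < 0.0f ? -d : d;
   }
   return dist;
}

/* Return the index of the table entry closest to the requested weights.
 * Ties are resolved in favor of the earliest entry. */
static size_t
select_closest_config(const struct l3_config *cfgs, size_t n,
                      const float w[L3P_COUNT])
{
   assert(n > 0);
   size_t best = 0;
   float best_dist = config_distance(&cfgs[0], w);

   for (size_t i = 1; i < n; i++) {
      const float d = config_distance(&cfgs[i], w);
      if (d < best_dist) {
         best_dist = d;
         best = i;
      }
   }
   return best;
}
```

With this made-up table, a weight vector that asks for some SLM and no DC
space, such as {1, 1, 0, 2}, selects the third entry, while {0, 1, 0, 1}
selects the first. Whether the real heuristics would ever produce weights that
differ usefully from the default on GEN8+ is exactly the open question above.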