[Mesa-dev] [PATCH 1/3] i965: Fix cache pollution race during L3 partitioning set-up.
currojerez at riseup.net
Thu Jan 14 18:37:01 PST 2016
We need to split the stalling flush from the RO cache invalidation
into a different PIPE_CONTROL command to make sure that the top-of-pipe
invalidation happens only after any previous rendering is complete.
Otherwise it's possible for previous rendering to pollute the L3 cache
in the short window of time between RO invalidation and the completion
of the stalling flush. Fixes rendering artifacts on Unigine Heaven,
Metro Last Light Redux and Metro 2033 Redux.
Tested-by: Darius Spitznagel <d.spitznagel at goodbytez.de>
src/mesa/drivers/dri/i965/gen7_l3_state.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index b63e61c..85f18d0 100644
@@ -330,20 +330,35 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
/* According to the hardware docs, the L3 partitioning can only be changed
* while the pipeline is completely drained and the caches are flushed,
- * which involves a first PIPE_CONTROL flush which stalls the pipeline and
- * initiates invalidation of the relevant caches...
+ * which involves a first PIPE_CONTROL flush which stalls the pipeline...
- PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
- PIPE_CONTROL_CONST_CACHE_INVALIDATE |
- PIPE_CONTROL_INSTRUCTION_INVALIDATE |
- /* ...followed by a second stalling flush which guarantees that
- * invalidation is complete when the L3 configuration registers are
- * modified.
+ /* ...followed by a second pipelined PIPE_CONTROL that initiates
+ * invalidation of the relevant caches. Note that because RO invalidation
+ * happens at the top of the pipeline (i.e. right away as the PIPE_CONTROL
+ * command is processed by the CS) we cannot combine it with the previous
+ * stalling flush as the hardware documentation suggests, because that
+ * would cause the CS to stall on previous rendering *after* RO
+ * invalidation and wouldn't prevent the RO caches from being polluted by
+ * concurrent rendering before the stall completes. This intentionally
+ * doesn't implement the SKL+ hardware workaround suggesting to enable CS
+ * stall on PIPE_CONTROLs with the texture cache invalidation bit set for
+ * GPGPU workloads because the previous and subsequent PIPE_CONTROLs
+ * already guarantee that there is no concurrent GPGPU kernel execution
+ * (see SKL HSD 2132585).
+ PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
+ PIPE_CONTROL_CONST_CACHE_INVALIDATE |
+ PIPE_CONTROL_INSTRUCTION_INVALIDATE |
+ /* Now send a third stalling flush to make sure that invalidation is
+ * complete when the L3 configuration registers are modified.
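For readers following the comments above without the surrounding code, the ordering can be sketched roughly as below. This is only an illustration under stated assumptions, not the driver code: emit_pipe_control(), struct cmd_seq, RO_INVALIDATE_BITS and sequence_is_race_free() are hypothetical helpers, the bit values are placeholders rather than the hardware encodings, and any other flags the real flushes carry are left out. Only the three *_INVALIDATE names appear in the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; the real definitions live
 * in the i965 driver headers and are different. */
enum {
   PIPE_CONTROL_CS_STALL                 = 1u << 0,
   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE = 1u << 1,
   PIPE_CONTROL_CONST_CACHE_INVALIDATE   = 1u << 2,
   PIPE_CONTROL_INSTRUCTION_INVALIDATE   = 1u << 3,
};

/* Hypothetical shorthand for the read-only (RO) cache invalidation bits. */
#define RO_INVALIDATE_BITS (PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
                            PIPE_CONTROL_CONST_CACHE_INVALIDATE |   \
                            PIPE_CONTROL_INSTRUCTION_INVALIDATE)

#define MAX_CMDS 8

/* Hypothetical stand-in for a batch buffer: it only records the flags of
 * each PIPE_CONTROL in submission order. */
struct cmd_seq {
   uint32_t flags[MAX_CMDS];
   size_t count;
};

static void
emit_pipe_control(struct cmd_seq *seq, uint32_t flags)
{
   assert(seq->count < MAX_CMDS);
   seq->flags[seq->count++] = flags;
}

/* The three-step ordering described in the patch comments. */
static void
emit_l3_reconfig_flushes(struct cmd_seq *seq)
{
   /* 1. Stalling flush: wait for all previous rendering to complete. */
   emit_pipe_control(seq, PIPE_CONTROL_CS_STALL);

   /* 2. Pipelined RO invalidation, deliberately without a CS stall, so
    *    that it is processed only after step 1 has drained the pipeline. */
   emit_pipe_control(seq, RO_INVALIDATE_BITS);

   /* 3. Stalling flush: invalidation must be complete before the L3
    *    configuration registers are written. */
   emit_pipe_control(seq, PIPE_CONTROL_CS_STALL);
}

/* Returns 1 if every RO invalidation sits in its own command, directly
 * between two stalling flushes; returns 0 otherwise. */
static int
sequence_is_race_free(const struct cmd_seq *seq)
{
   for (size_t i = 0; i < seq->count; i++) {
      if (!(seq->flags[i] & RO_INVALIDATE_BITS))
         continue;
      if (seq->flags[i] & PIPE_CONTROL_CS_STALL)
         return 0; /* combined with a stall: the race this patch fixes */
      if (i == 0 || !(seq->flags[i - 1] & PIPE_CONTROL_CS_STALL))
         return 0; /* no stalling flush immediately before */
      if (i + 1 >= seq->count || !(seq->flags[i + 1] & PIPE_CONTROL_CS_STALL))
         return 0; /* no stalling flush immediately after */
   }
   return 1;
}
```

Under this toy model, the old two-command ordering (a stall combined with RO invalidation, then a second stall) is rejected by sequence_is_race_free(), while the three-command ordering passes.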