[Mesa-dev] [PATCH] i965/hsw: Implement end of batch workaround
Kenneth Graunke
kenneth at whitecape.org
Thu Jul 9 10:35:19 PDT 2015
From: Ben Widawsky <benjamin.widawsky at intel.com>
This patch can cause an infinite recursion if the previous patch titled, "i965:
Track finished batch state" isn't present (backporters take notice).
v2: Sent out the wrong patch originally. This patches switches the order of
flushes, doing the generic flush before the CC_STATE, and the required
workaround flush afterwards
v3: Only perform workaround for render ring
Add text to the BATCH_RESERVE comments
v4: Rebase; update citation to mention PRM and Wa name; combine two blocks.
Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 27 +++++++++++++++++++++++++--
src/mesa/drivers/dri/i965/intel_batchbuffer.h | 4 ++++
2 files changed, 29 insertions(+), 2 deletions(-)
Hey Ben,
I was going to suggest a few minor changes, and then realized it'd save us both
time if I just typed them up. Here's a v4 of your patch. If it passes Jenkins,
I'd say let's push it. Thanks for remembering this!
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 969d92c..d93ee6e 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -32,6 +32,7 @@
#include "intel_buffers.h"
#include "intel_fbo.h"
#include "brw_context.h"
+#include "brw_defines.h"
#include <xf86drm.h>
#include <i915_drm.h>
@@ -206,10 +207,32 @@ brw_finish_batch(struct brw_context *brw)
*/
brw_emit_query_end(brw);
- /* We may also need to snapshot and disable OA counters. */
- if (brw->batch.ring == RENDER_RING)
+ if (brw->batch.ring == RENDER_RING) {
+ /* We may also need to snapshot and disable OA counters. */
brw_perf_monitor_finish_batch(brw);
+ if (brw->is_haswell) {
+ /* From the Haswell PRM, Volume 2b, Command Reference: Instructions,
+ * 3DSTATE_CC_STATE_POINTERS > "Note":
+ *
+ * "SW must program 3DSTATE_CC_STATE_POINTERS command at the end of every
+ * 3D batch buffer followed by a PIPE_CONTROL with RC flush and CS stall."
+ *
+ * From the example in the docs, it seems to expect a regular pipe control
+ * flush here as well. We may have done it already, but meh.
+ *
+ * See also WaAvoidRCZCounterRollover.
+ */
+ brw_emit_mi_flush(brw);
+ BEGIN_BATCH(2);
+ OUT_BATCH(_3DSTATE_CC_STATE_POINTERS << 16 | (2 - 2));
+ OUT_BATCH(brw->cc.state_offset | 1);
+ ADVANCE_BATCH();
+ brw_emit_pipe_control_flush(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH |
+ PIPE_CONTROL_CS_STALL);
+ }
+ }
+
/* Mark that the current program cache BO has been used by the GPU.
* It will be reallocated if we need to put new programs in for the
* next batch.
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index fdd07e0..8eaedd1 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -26,6 +26,10 @@ extern "C" {
* - 3 DWords for MI_REPORT_PERF_COUNT itself on Gen6+. ==> 12 bytes.
* On Ironlake, it's 6 DWords, but we have some slack due to the lack of
* Sandybridge PIPE_CONTROL madness.
+ * - CC_STATE workaround on HSW (12 * 4 = 48 bytes)
+ * - 5 dwords for initial mi_flush
+ * - 2 dwords for CC state setup
+ * - 5 dwords for the required pipe control at the end
*/
#define BATCH_RESERVED 152
--
2.4.5
More information about the mesa-dev
mailing list