[Mesa-dev] [PATCH] i965: Emit Ivybridge VS workaround flushes.
Ian Romanick
idr at freedesktop.org
Sat Feb 11 00:31:51 PST 2012
On 02/10/2012 09:25 PM, Kenneth Graunke wrote:
> I recently discovered this text in the BSpec. It seems wise to comply,
> though I haven't observed it to fix anything yet.
>
> Fixes a regression in glean/fbo since 28cfa1fa213fe.
Eh... how do you know it fixes the regression if you haven't observed it
yet? :p
> NOTE: This is a candidate for stable release branches.
>
> Cc: Eric Anholt <eric at anholt.net>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45221
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/gen7_urb.c | 2 +
> src/mesa/drivers/dri/i965/gen7_vs_state.c | 2 +
> src/mesa/drivers/dri/intel/intel_batchbuffer.c | 26 ++++++++++++++++++++++-
> src/mesa/drivers/dri/intel/intel_batchbuffer.h | 1 +
> 4 files changed, 29 insertions(+), 2 deletions(-)
>
> This also obsoletes Eric's uncommitted patch
> i965/gen7: Always allocate push constant space before uploading.
> as it fixes the same problem without such a large performance hit.
>
> My theory is that since 3DSTATE_PUSH_CONSTANT_ALLOC_VS is a non-pipelined
> command, emitting it at the top of upload_vs_state effectively served the
> same purpose (only a much harsher flush).
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
> index e6cf1eb..920c9fc 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -99,6 +99,8 @@ gen7_upload_urb(struct brw_context *brw)
> /* GS requirement */
> assert(!brw->gs.prog_active);
>
> + gen7_emit_vs_workaround_flush(intel);
> +
> BEGIN_BATCH(2);
> OUT_BATCH(_3DSTATE_URB_VS << 16 | (2 - 2));
> OUT_BATCH(brw->urb.nr_vs_entries |
> diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c b/src/mesa/drivers/dri/i965/gen7_vs_state.c
> index 0746e6c..a3d652c 100644
> --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
> @@ -35,6 +35,8 @@ upload_vs_state(struct brw_context *brw)
> struct intel_context *intel = &brw->intel;
> uint32_t floating_point_mode = 0;
>
> + gen7_emit_vs_workaround_flush(intel);
> +
> BEGIN_BATCH(2);
> OUT_BATCH(_3DSTATE_BINDING_TABLE_POINTERS_VS << 16 | (2 - 2));
> OUT_BATCH(brw->bind.bo_offset);
> diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> index c58dee8..d10e008 100644
> --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> @@ -56,13 +56,13 @@ intel_batchbuffer_init(struct intel_context *intel)
> {
> intel_batchbuffer_reset(intel);
>
> - if (intel->gen == 6) {
> + if (intel->gen >= 6) {
> /* We can't just use brw_state_batch to get a chunk of space for
> * the gen6 workaround because it involves actually writing to
> * the buffer, and the kernel doesn't let us write to the batch.
> */
> intel->batch.workaround_bo = drm_intel_bo_alloc(intel->bufmgr,
> - "gen6 workaround",
> + "pipe_control workaround",
> 4096, 4096);
> }
> }
> @@ -394,6 +394,28 @@ intel_emit_depth_stall_flushes(struct intel_context *intel)
> }
>
> /**
> + * From the BSpec, volume 2a.03: VS Stage Input / State:
> + * "[DevIVB] A PIPE_CONTROL with Post-Sync Operation set to 1h and a depth
> + * stall needs to be sent just prior to any 3DSTATE_VS, 3DSTATE_URB_VS,
> + * 3DSTATE_CONSTANT_VS, 3DSTATE_BINDING_TABLE_POINTER_VS,
> + * 3DSTATE_SAMPLER_STATE_POINTER_VS command. Only one PIPE_CONTROL needs
> + * to be sent before any combination of VS associated 3DSTATE."
> + */
> +void
> +gen7_emit_vs_workaround_flush(struct intel_context *intel)
> +{
> + assert(intel->gen == 7);
> +
> + BEGIN_BATCH(4);
> + OUT_BATCH(_3DSTATE_PIPE_CONTROL);
> + OUT_BATCH(PIPE_CONTROL_DEPTH_STALL | PIPE_CONTROL_WRITE_IMMEDIATE);
> + OUT_RELOC(intel->batch.workaround_bo,
> + I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION, 0);
> + OUT_BATCH(0); /* write data */
> + ADVANCE_BATCH();
> +}
> +
> +/**
> * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
> * implementing two workarounds on gen6. From section 1.4.7.1
> * "PIPE_CONTROL" of the Sandy Bridge PRM volume 2 part 1:
> diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.h b/src/mesa/drivers/dri/intel/intel_batchbuffer.h
> index e5e5bd4..751ec99 100644
> --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.h
> +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.h
> @@ -43,6 +43,7 @@ bool intel_batchbuffer_emit_reloc_fenced(struct intel_context *intel,
> void intel_batchbuffer_emit_mi_flush(struct intel_context *intel);
> void intel_emit_post_sync_nonzero_flush(struct intel_context *intel);
> void intel_emit_depth_stall_flushes(struct intel_context *intel);
> +void gen7_emit_vs_workaround_flush(struct intel_context *intel);
>
> static INLINE uint32_t float_as_int(float f)
> {
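
For anyone reading along without the driver tree at hand, here is a rough standalone sketch of the four dwords that the quoted gen7_emit_vs_workaround_flush() places in the batch. It is only a model: the sketch_* names, the toy batch struct, and the numeric encodings are placeholders assumed for illustration and are not taken from the patch or from the driver headers, and the plain address stands in for what OUT_RELOC would emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder encodings, assumed for illustration only; the real
 * definitions live in the driver's register headers. */
#define SKETCH_PIPE_CONTROL     0x7a000002u
#define SKETCH_DEPTH_STALL      (1u << 13)
#define SKETCH_WRITE_IMMEDIATE  (1u << 14)

/* Toy stand-in for the batchbuffer: a fixed array of dwords. */
struct sketch_batch {
   uint32_t map[64];
   size_t used;   /* number of dwords written so far */
};

/* Append one dword, like OUT_BATCH() in the quoted patch. */
static void
sketch_out(struct sketch_batch *b, uint32_t dw)
{
   assert(b->used < sizeof(b->map) / sizeof(b->map[0]));
   b->map[b->used++] = dw;
}

/* Models the packet layout: header, flags, write address, write data. */
static void
sketch_emit_vs_workaround_flush(struct sketch_batch *b, uint32_t scratch_addr)
{
   sketch_out(b, SKETCH_PIPE_CONTROL);
   sketch_out(b, SKETCH_DEPTH_STALL | SKETCH_WRITE_IMMEDIATE);
   sketch_out(b, scratch_addr);   /* stands in for the OUT_RELOC target */
   sketch_out(b, 0);              /* write data */
}
```

Calling sketch_emit_vs_workaround_flush() once ahead of a run of VS state packets would model the "only one PIPE_CONTROL" wording in the quoted BSpec text; the patch itself calls the real function at the top of both gen7_upload_urb() and upload_vs_state().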