[Mesa-stable] [Mesa-dev] [PATCH] i965/gen6: Fix HiZ hang in WebGL Google maps

Fri Dec 20 13:36:15 PST 2013

On 12/20/2013 04:47 AM, Chad Versace wrote:
> We need to emit depth stall flushes before depth and hiz resolves.
> Placing them at the top of blorp's state emission fixes the hang.
> 
> Fixes HiZ hang in the new WebGL Google maps on Sandybridge Chrome OS.
> Tested by zooming in and out continuously for 2 hours.
> 
> This patch is based on
> https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd
> 
> CC: mesa-stable at lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
> Signed-off-by: Stéphane Marchesin <marcheu at chromium.org>
> Signed-off-by: Chad Versace <chad.versace at linux.intel.com>
> ---
>  src/mesa/drivers/dri/i965/gen6_blorp.cpp | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
> index 6a5841f..3a0e7ec 100644
> --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
> @@ -1012,6 +1012,16 @@ gen6_blorp_emit_primitive(struct brw_context *brw,
>     ADVANCE_BATCH();
>  }
>  
> +static void
> +gen6_emit_hiz_workaround(struct brw_context *brw, enum gen6_hiz_op hiz_op)
> +{
> +   if (hiz_op == GEN6_HIZ_OP_DEPTH_RESOLVE ||
> +       hiz_op == GEN6_HIZ_OP_HIZ_RESOLVE) {
> +      brw->batch.need_workaround_flush = true;
> +      intel_emit_post_sync_nonzero_flush(brw);
> +      intel_emit_depth_stall_flushes(brw);
> +   }
> +}
>  
>  /**
>   * \brief Execute a blit or render pass operation.
> @@ -1034,6 +1044,8 @@ gen6_blorp_exec(struct brw_context *brw,
>     uint32_t wm_bind_bo_offset = 0;
>  
>     uint32_t prog_offset = params->get_wm_prog(brw, &prog_data);
> +
> +   gen6_emit_hiz_workaround(brw, params->hiz_op);
>     gen6_emit_3dstate_multisample(brw, params->dst.num_samples);
>     gen6_emit_3dstate_sample_mask(brw,
>                                   params->dst.num_samples > 1 ?
> 

I'm fine with landing this as is, since it's trivial and getting
Sandybridge stable is really critical.  I'm glad to see this won't
happen on Gen7+.

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

That said, it would be nice to refine it a little.

The intel_emit_post_sync_nonzero_flush(brw) call should not be
necessary.  The very first BLORP call, gen6_emit_3dstate_multisample,
already calls it.  However, it may not have taken effect in the past,
since intel_emit_post_sync_nonzero_flush checks need_workaround_flush.

I also am dubious whether we need the depth stall flushes, since we
already do them before 3DSTATE_DEPTH_BUFFER, which is documented as
necessary.

So here's what I think is actually going on:

We've seen strong evidence that it's 100% required to do the post-sync
nonzero workaround before non-pipelined commands.  (My patches which
added missing post-sync workarounds greatly reduced the number of GPU
hangs.)

I believe that the root of the problem is that needs_workaround_flush is
not getting set to true sufficiently often, so the post-sync non-zero
workaround isn't happening when necessary.  Currently, we flag it at the
start of each batch, and /after/ each BLORP operation.

That means it's missing in two cases:
1. Switching from normal rendering to BLORP drawing.
   (...fixed by this patch.)
2. Between successive draws (3DPRIMITIVEs) within a batch.
   (I believe this is also necessary, and is not fixed by this patch.)

I would love to see an alternate patch which sets need_workaround_flush
in two places:
1. Batchbuffer init (for the very first draw)
2. After each 3DPRIMITIVE command.

I believe that would fix this hang and potentially other future hangs.

--Ken