On 6 February 2012 00:33, Chad Versace <span dir="ltr">&lt;<a href="mailto:chad.versace@linux.intel.com">chad.versace@linux.intel.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
The HiZ op was implemented as a meta-op. This patch reimplements it by<br>
emitting a special HiZ batch. This fixes several known bugs, and likely<br>
a lot of undiscovered ones too.<br>
<br>
==== Why the HiZ meta-op needed to die ====<br>
<br>
The HiZ op was implemented as a meta-op, which caused lots of trouble. All<br>
other meta-ops occur as a result of some GL call (for example, glClear and<br>
glGenerateMipmap), but the HiZ meta-op was special. It was called in<br>
places that Mesa (in particular, the vbo and swrast modules) did not<br>
expect---and were not prepared for---state changes to occur (for example:<br>
glDraw; glCallList; within glBegin/End blocks; and within<br>
swrast_prepare_render as a result of intel_miptree_map).<br>
<br>
In an attempt to work around these unexpected state changes, I added two<br>
hooks in i965:<br>
  - A hook for glDraw, located in brw_predraw_resolve_buffers (which is<br>
    called in the glDraw path). This hook detected if a predraw resolve<br>
    meta-op had occurred, and would hackishly repropagate some GL state<br>
    if necessary. This ensured that the meta-op state changes would not<br>
    interfere with the vbo module&#39;s subsequent execution of glDraw.<br>
  - A hook for glBegin, implemented by brwPrepareExecBegin. This hook<br>
    resolved all buffers before entering a glBegin/End block, thus<br>
    preventing an infinitely recurring call to<br>
    vbo_exec_FlushVertices. The vbo module calls vbo_exec_FlushVertices to<br>
    flush its vertex queue in response to GL state changes.<br>
<br>
Unfortunately, these hooks were not sufficient. The meta-op state changes<br>
still interacted badly with glPopAttrib (as discovered in bug 44927) and<br>
with swrast rendering (as discovered by debugging gen6&#39;s swrast fallback<br>
for glBitmap). I expect there are more undiscovered bugs. Rather than play<br>
whack-a-mole in a minefield, the sane approach is to replace the HiZ<br>
meta-op with something safer.<br>
<br>
==== How it was killed ====<br>
<br>
This patch consists of several logical components:<br>
  1. Rewrite the HiZ op by replacing function gen6_resolve_slice with<br>
     gen6_hiz_exec and gen7_hiz_exec. The new functions do not call<br>
     a meta-op, but instead manually construct and emit a batch to &quot;draw&quot;<br>
     the HiZ op&#39;s rectangle primitive. The new functions alter no GL<br>
     state.<br>
  2. Add fields to brw_context::hiz for the new HiZ op.<br>
  3. Emit a workaround flush when toggling 3DSTATE_VS.VsFunctionEnable.<br>
  4. Kill all dead HiZ code:<br>
     - the function gen6_resolve_slice<br>
     - the dirty flag BRW_NEW_HIZ<br>
     - the dead fields in brw_context::hiz<br>
     - the state packet manipulation triggered by the now removed<br>
       brw_context::hiz::op<br>
     - the meta-op workaround in brw_predraw_resolve_buffers (discussed<br>
       above)<br>
     - the meta-op workaround brwPrepareExecBegin (discussed above)<br>
<br>
Note: This is a candidate for the 8.0 branch.<br>
CC: Eric Anholt &lt;<a href="mailto:eric@anholt.net">eric@anholt.net</a>&gt;<br>
CC: Kenneth Graunke &lt;<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>&gt;<br>
CC: Paul Berry &lt;<a href="mailto:stereotype441@gmail.com">stereotype441@gmail.com</a>&gt;<br>
Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=43327" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=43327</a><br>
Reported-by: <a href="mailto:xunx.fang@intel.com">xunx.fang@intel.com</a><br>
Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=44927" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=44927</a><br>
Reported-by: <a href="mailto:chao.a.chen@intel.com">chao.a.chen@intel.com</a><br>
Signed-off-by: Chad Versace &lt;<a href="mailto:chad.versace@linux.intel.com">chad.versace@linux.intel.com</a>&gt;<br>
---<br>
 src/mesa/drivers/dri/i965/Makefile.sources    |    1 +<br>
 src/mesa/drivers/dri/i965/brw_context.c       |   55 --<br>
 src/mesa/drivers/dri/i965/brw_context.h       |   40 +-<br>
 src/mesa/drivers/dri/i965/brw_draw.c          |   47 +--<br>
 src/mesa/drivers/dri/i965/brw_state_upload.c  |    1 -<br>
 src/mesa/drivers/dri/i965/brw_vtbl.c          |   14 +-<br>
 src/mesa/drivers/dri/i965/gen6_clip_state.c   |   20 +-<br>
 src/mesa/drivers/dri/i965/gen6_depthstencil.c |    9 +-<br>
 src/mesa/drivers/dri/i965/gen6_hiz.c          |  830 ++++++++++++++++--------<br>
 src/mesa/drivers/dri/i965/gen6_hiz.h          |   38 ++<br>
 src/mesa/drivers/dri/i965/gen6_sf_state.c     |   16 +-<br>
 src/mesa/drivers/dri/i965/gen6_vs_state.c     |    9 +<br>
 src/mesa/drivers/dri/i965/gen6_wm_state.c     |   20 +-<br>
 src/mesa/drivers/dri/i965/gen7_clip_state.c   |   20 +-<br>
 src/mesa/drivers/dri/i965/gen7_hiz.c          |  471 ++++++++++++++<br>
 src/mesa/drivers/dri/i965/gen7_hiz.h          |   43 ++<br>
 src/mesa/drivers/dri/i965/gen7_sf_state.c     |   19 +-<br>
 src/mesa/drivers/dri/i965/gen7_wm_state.c     |   18 -<br>
 18 files changed, 1154 insertions(+), 517 deletions(-)<br>
 create mode 100644 src/mesa/drivers/dri/i965/gen7_hiz.c<br>
 create mode 100644 src/mesa/drivers/dri/i965/gen7_hiz.h<br>
 create mode 100644 src/mesa/drivers/dri/i965/junk<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources<br>
index 355bfe2..750be51 100644<br>
--- a/src/mesa/drivers/dri/i965/Makefile.sources<br>
+++ b/src/mesa/drivers/dri/i965/Makefile.sources<br>
@@ -100,6 +100,7 @@ i965_C_FILES := \<br>
        gen7_cc_state.c \<br>
        gen7_clip_state.c \<br>
        gen7_disable.c \<br>
+       gen7_hiz.c \<br>
        gen7_misc_state.c \<br>
        gen7_sampler_state.c \<br>
        gen7_sf_state.c \<br>
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c<br>
index 1ab6310..65de260 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_context.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_context.c<br>
@@ -41,8 +41,6 @@<br>
 #include &quot;brw_draw.h&quot;<br>
 #include &quot;brw_state.h&quot;<br>
<br>
-#include &quot;gen6_hiz.h&quot;<br>
-<br>
 #include &quot;intel_fbo.h&quot;<br>
 #include &quot;intel_mipmap_tree.h&quot;<br>
 #include &quot;intel_regions.h&quot;<br>
@@ -57,58 +55,6 @@<br>
  * Mesa&#39;s Driver Functions<br>
  ***************************************/<br>
<br>
-/**<br>
- * \brief Prepare for entry into glBegin/glEnd block.<br>
- *<br>
- * Resolve buffers before entering a glBegin/glEnd block. This is<br>
- * necessary to prevent recursive calls to FLUSH_VERTICES.<br>
- *<br>
- * This resolves the depth buffer of each enabled depth texture and the HiZ<br>
- * buffer of the attached depth renderbuffer.<br>
- *<br>
- * Details<br>
- * -------<br>
- * When vertices are queued during a glBegin/glEnd block, those vertices must<br>
- * be drawn before any rendering state changes. To ensure this, Mesa calls<br>
- * FLUSH_VERTICES as a prehook to such state changes. Therefore,<br>
- * FLUSH_VERTICES itself cannot change rendering state without falling into a<br>
- * recursive trap.<br>
- *<br>
- * This precludes meta-ops, namely buffer resolves, from occurring while any<br>
- * vertices are queued. To prevent that situation, we resolve some buffers on<br>
- * entering a glBegin/glEnd<br>
- *<br>
- * \see brwCleanupExecEnd()<br>
- */<br>
-static void brwPrepareExecBegin(struct gl_context *ctx)<br>
-{<br>
-   struct brw_context *brw = brw_context(ctx);<br>
-   struct intel_context *intel = &amp;brw-&gt;intel;<br>
-   struct intel_renderbuffer *draw_irb;<br>
-   struct intel_texture_object *tex_obj;<br>
-<br>
-   if (!intel-&gt;has_hiz) {<br>
-      /* The context uses no feature that requires buffer resolves. */<br>
-      return;<br>
-   }<br>
-<br>
-   /* Resolve each enabled texture. */<br>
-   for (int i = 0; i &lt; ctx-&gt;Const.MaxTextureImageUnits; i++) {<br>
-      if (!ctx-&gt;Texture.Unit[i]._ReallyEnabled)<br>
-        continue;<br>
-      tex_obj = intel_texture_object(ctx-&gt;Texture.Unit[i]._Current);<br>
-      if (!tex_obj || !tex_obj-&gt;mt)<br>
-        continue;<br>
-      intel_miptree_all_slices_resolve_depth(intel, tex_obj-&gt;mt);<br>
-   }<br>
-<br>
-   /* Resolve the attached depth buffer. */<br>
-   draw_irb = intel_get_renderbuffer(ctx-&gt;DrawBuffer, BUFFER_DEPTH);<br>
-   if (draw_irb) {<br>
-      intel_renderbuffer_resolve_hiz(intel, draw_irb);<br>
-   }<br>
-}<br>
-<br>
 static void brwInitDriverFunctions(struct intel_screen *screen,<br>
                                   struct dd_function_table *functions)<br>
 {<br>
@@ -117,7 +63,6 @@ static void brwInitDriverFunctions(struct intel_screen *screen,<br>
    brwInitFragProgFuncs( functions );<br>
    brw_init_queryobj_functions(functions);<br>
<br>
-   functions-&gt;PrepareExecBegin = brwPrepareExecBegin;<br>
    functions-&gt;BeginTransformFeedback = brw_begin_transform_feedback;<br>
<br>
    if (screen-&gt;gen &gt;= 7)<br>
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h<br>
index c027bef..72e5059 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_context.h<br>
+++ b/src/mesa/drivers/dri/i965/brw_context.h<br>
@@ -119,6 +119,10 @@<br>
 #define BRW_MAX_CURBE                    (32*16)<br>
<br>
 struct brw_context;<br>
+struct brw_instruction;<br>
+struct brw_vs_prog_key;<br>
+struct brw_wm_prog_key;<br>
+struct brw_wm_prog_data;<br>
<br>
 enum brw_state_id {<br>
    BRW_STATE_URB_FENCE,<br>
@@ -144,7 +148,6 @@ enum brw_state_id {<br>
    BRW_STATE_VS_CONSTBUF,<br>
    BRW_STATE_PROGRAM_CACHE,<br>
    BRW_STATE_STATE_BASE_ADDRESS,<br>
-   BRW_STATE_HIZ,<br>
    BRW_STATE_SOL_INDICES,<br>
 };<br>
<br>
@@ -174,7 +177,6 @@ enum brw_state_id {<br>
 #define BRW_NEW_VS_CONSTBUF            (1 &lt;&lt; BRW_STATE_VS_CONSTBUF)<br>
 #define BRW_NEW_PROGRAM_CACHE          (1 &lt;&lt; BRW_STATE_PROGRAM_CACHE)<br>
 #define BRW_NEW_STATE_BASE_ADDRESS     (1 &lt;&lt; BRW_STATE_STATE_BASE_ADDRESS)<br>
-#define BRW_NEW_HIZ                    (1 &lt;&lt; BRW_STATE_HIZ)<br>
 #define BRW_NEW_SOL_INDICES            (1 &lt;&lt; BRW_STATE_SOL_INDICES)<br>
<br>
 struct brw_state_flags {<br>
@@ -950,38 +952,18 @@ struct brw_context<br>
    int state_batch_count;<br>
<br>
    /**<br>
-    * \brief State needed to execute HiZ meta-ops<br>
+    * \brief State needed to execute HiZ ops.<br>
     *<br>
-    * All fields except \c op are initialized by gen6_hiz_init().<br>
+    * \see gen6_hiz_init()<br>
+    * \see gen6_hiz_exec()<br>
     */<br>
    struct brw_hiz_state {<br>
-      /**<br>
-       * \brief Indicates which HiZ operation is in progress.<br>
+      /** \brief VBO for rectangle primitive.<br>
        *<br>
-       * See the following sections of the Sandy Bridge PRM, Volume 1, Part2:<br>
-       *   - 7.5.3.1 Depth Buffer Clear<br>
-       *   - 7.5.3.2 Depth Buffer Resolve<br>
-       *   - 7.5.3.3 Hierarchical Depth Buffer Resolve<br>
+       * Rather than using glGenBuffers(), we allocate the VBO directly<br>
+       * through drm.<br>
        */<br>
-      enum brw_hiz_op {<br>
-        BRW_HIZ_OP_NONE = 0,<br>
-        BRW_HIZ_OP_DEPTH_CLEAR,<br>
-        BRW_HIZ_OP_DEPTH_RESOLVE,<br>
-        BRW_HIZ_OP_HIZ_RESOLVE,<br>
-      } op;<br>
-<br>
-      /** \brief Shader state */<br>
-      struct {<br>
-        GLuint program;<br>
-        GLuint position_vbo;<br>
-        GLint position_location;<br>
-      } shader;<br>
-<br>
-      /** \brief VAO for the rectangle primitive&#39;s vertices. */<br>
-      GLuint vao;<br>
-<br>
-      GLuint fbo;<br>
-      struct gl_renderbuffer *depth_rb;<br>
+      drm_intel_bo *vertex_bo;<br>
    } hiz;<br>
<br>
    struct brw_sol_state {<br>
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c<br>
index f50fffd..e919f3e 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_draw.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_draw.c<br>
@@ -126,12 +126,7 @@ static void gen6_set_prim(struct brw_context *brw,<br>
<br>
    DBG(&quot;PRIM: %s\n&quot;, _mesa_lookup_enum_by_nr(prim-&gt;mode));<br>
<br>
-   if (brw-&gt;hiz.op) {<br>
-      assert(prim-&gt;mode == GL_TRIANGLES);<br>
-      hw_prim = _3DPRIM_RECTLIST;<br>
-   } else {<br>
-      hw_prim = prim_to_hw_prim[prim-&gt;mode];<br>
-   }<br>
+   hw_prim = prim_to_hw_prim[prim-&gt;mode];<br>
<br>
    if (hw_prim != brw-&gt;primitive) {<br>
       brw-&gt;primitive = hw_prim;<br>
@@ -307,17 +302,11 @@ brw_predraw_resolve_buffers(struct brw_context *brw)<br>
    struct intel_context *intel = &amp;brw-&gt;intel;<br>
    struct intel_renderbuffer *depth_irb;<br>
    struct intel_texture_object *tex_obj;<br>
-   bool did_resolve = false;<br>
-<br>
-   /* Avoid recursive HiZ op. */<br>
-   if (brw-&gt;hiz.op) {<br>
-      return;<br>
-   }<br>
<br>
    /* Resolve the depth buffer&#39;s HiZ buffer. */<br>
    depth_irb = intel_get_renderbuffer(ctx-&gt;DrawBuffer, BUFFER_DEPTH);<br>
    if (depth_irb &amp;&amp; depth_irb-&gt;mt) {<br>
-      did_resolve |= intel_renderbuffer_resolve_hiz(intel, depth_irb);<br>
+      intel_renderbuffer_resolve_hiz(intel, depth_irb);<br>
    }<br>
<br>
    /* Resolve depth buffer of each enabled depth texture. */<br>
@@ -327,33 +316,7 @@ brw_predraw_resolve_buffers(struct brw_context *brw)<br>
       tex_obj = intel_texture_object(ctx-&gt;Texture.Unit[i]._Current);<br>
       if (!tex_obj || !tex_obj-&gt;mt)<br>
         continue;<br>
-      did_resolve |= intel_miptree_all_slices_resolve_depth(intel, tex_obj-&gt;mt);<br>
-   }<br>
-<br>
-   if (did_resolve) {<br>
-      /* Call vbo_bind_array() to synchronize the vbo module&#39;s vertex<br>
-       * attributes to the gl_context&#39;s.<br>
-       *<br>
-       * Details<br>
-       * -------<br>
-       * The vbo module tracks vertex attributes separately from the<br>
-       * gl_context.  Specifically, the vbo module maintins vertex attributes<br>
-       * in vbo_exec_context::array::inputs, which is synchronized with<br>
-       * gl_context::Array::ArrayObj::VertexAttrib by vbo_bind_array().<br>
-       * vbo_draw_arrays() calls vbo_bind_array() to perform the<br>
-       * synchronization before calling the real draw call,<br>
-       * vbo_context::draw_arrays.<br>
-       *<br>
-       * At this point (after performing a resolve meta-op but before calling<br>
-       * vbo_bind_array), the gl_context&#39;s vertex attributes have been<br>
-       * restored to their original state (that is, their state before the<br>
-       * meta-op began), but the vbo module&#39;s vertex attribute are those used<br>
-       * in the last meta-op. Therefore we must manually synchronize the two with<br>
-       * vbo_bind_array() before continuing with the original draw command.<br>
-       */<br>
-      _mesa_update_state(ctx);<br>
-      vbo_bind_arrays(ctx);<br>
-      _mesa_update_state(ctx);<br>
+      intel_miptree_all_slices_resolve_depth(intel, tex_obj-&gt;mt);<br>
    }<br>
 }<br>
<br>
@@ -372,9 +335,7 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)<br>
    struct intel_renderbuffer *depth_irb =<br>
         intel_get_renderbuffer(fb, BUFFER_DEPTH);<br>
<br>
-   if (depth_irb &amp;&amp;<br>
-       ctx-&gt;Depth.Mask &amp;&amp;<br>
-       !brw-&gt;hiz.op) {<br>
+   if (depth_irb &amp;&amp; ctx-&gt;Depth.Mask) {<br>
       intel_renderbuffer_set_needs_depth_resolve(depth_irb);<br>
    }<br>
 }<br>
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c<br>
index d071f87..f5e6fdc 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c<br>
@@ -372,7 +372,6 @@ static struct dirty_bit_map brw_bits[] = {<br>
    DEFINE_BIT(BRW_NEW_GS_BINDING_TABLE),<br>
    DEFINE_BIT(BRW_NEW_PS_BINDING_TABLE),<br>
    DEFINE_BIT(BRW_NEW_STATE_BASE_ADDRESS),<br>
-   DEFINE_BIT(BRW_NEW_HIZ),<br>
    {0, 0, 0}<br>
 };<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_vtbl.c b/src/mesa/drivers/dri/i965/brw_vtbl.c<br>
index be975d1..724111c 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_vtbl.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_vtbl.c<br>
@@ -50,6 +50,7 @@<br>
 #include &quot;brw_wm.h&quot;<br>
<br>
 #include &quot;gen6_hiz.h&quot;<br>
+#include &quot;gen7_hiz.h&quot;<br>
<br>
 #include &quot;glsl/ralloc.h&quot;<br>
<br>
@@ -70,9 +71,11 @@ static void brw_destroy_context( struct intel_context *intel )<br>
<br>
    brw_destroy_state(brw);<br>
    brw_draw_destroy( brw );<br>
+<br>
    ralloc_free(brw-&gt;wm.compile_data);<br>
<br>
    dri_bo_release(&amp;brw-&gt;curbe.curbe_bo);<br>
+   dri_bo_release(&amp;brw-&gt;hiz.vertex_bo);<br>
    dri_bo_release(&amp;brw-&gt;vs.const_bo);<br>
    dri_bo_release(&amp;brw-&gt;wm.const_bo);<br>
<br>
@@ -236,8 +239,15 @@ void brwInitVtbl( struct brw_context *brw )<br>
    brw-&gt;intel.vtbl.is_hiz_depth_format = brw_is_hiz_depth_format;<br>
<br>
    if (brw-&gt;intel.has_hiz) {<br>
-      brw-&gt;intel.vtbl.resolve_depth_slice = gen6_resolve_depth_slice;<br>
-      brw-&gt;intel.vtbl.resolve_hiz_slice = gen6_resolve_hiz_slice;<br>
+      if (brw-&gt;intel.gen == 7) {<br>
+         brw-&gt;intel.vtbl.resolve_depth_slice = gen7_resolve_depth_slice;<br>
+         brw-&gt;intel.vtbl.resolve_hiz_slice = gen7_resolve_hiz_slice;<br>
+      } else if (brw-&gt;intel.gen == 6) {<br>
+         brw-&gt;intel.vtbl.resolve_depth_slice = gen6_resolve_depth_slice;<br>
+         brw-&gt;intel.vtbl.resolve_hiz_slice = gen6_resolve_hiz_slice;<br>
+      } else {<br>
+         assert(0);<br>
+      }<br>
    }<br>
<br>
    if (brw-&gt;intel.gen &gt;= 7) {<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c b/src/mesa/drivers/dri/i965/gen6_clip_state.c<br>
index d2a5f75..b3bb8ae 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_clip_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c<br>
@@ -67,23 +67,6 @@ upload_clip_state(struct brw_context *brw)<br>
          GEN6_CLIP_NON_PERSPECTIVE_BARYCENTRIC_ENABLE;<br>
    }<br>
<br>
-   if (brw-&gt;hiz.op) {<br>
-      /* HiZ operations emit a rectangle primitive, which requires clipping to<br>
-       * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1<br>
-       * Section 1.3 3D Primitives Overview:<br>
-       *    RECTLIST:<br>
-       *    Either the CLIP unit should be DISABLED, or the CLIP unit&#39;s Clip<br>
-       *    Mode should be set to a value other than CLIPMODE_NORMAL.<br>
-       */<br>
-      BEGIN_BATCH(4);<br>
-      OUT_BATCH(_3DSTATE_CLIP &lt;&lt; 16 | (4 - 2));<br>
-      OUT_BATCH(0);<br>
-      OUT_BATCH(0);<br>
-      OUT_BATCH(0);<br>
-      ADVANCE_BATCH();<br>
-      return;<br>
-   }<br>
-<br>
    if (!ctx-&gt;Transform.DepthClamp)<br>
       depth_clamp = GEN6_CLIP_Z_TEST;<br>
<br>
@@ -124,8 +107,7 @@ const struct brw_tracked_state gen6_clip_state = {<br>
    .dirty = {<br>
       .mesa  = _NEW_TRANSFORM | _NEW_LIGHT,<br>
       .brw   = (BRW_NEW_CONTEXT |<br>
-                BRW_NEW_FRAGMENT_PROGRAM |<br>
-                BRW_NEW_HIZ),<br>
+                BRW_NEW_FRAGMENT_PROGRAM),<br>
       .cache = 0<br>
    },<br>
    .emit = upload_clip_state,<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_depthstencil.c b/src/mesa/drivers/dri/i965/gen6_depthstencil.c<br>
index d9f686a..4ea517f 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_depthstencil.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_depthstencil.c<br>
@@ -82,11 +82,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)<br>
    }<br>
<br>
    /* _NEW_DEPTH */<br>
-   if ((ctx-&gt;Depth.Test || brw-&gt;hiz.op) &amp;&amp; depth_irb) {<br>
-      assert(brw-&gt;hiz.op != BRW_HIZ_OP_DEPTH_RESOLVE || ctx-&gt;Depth.Test);<br>
-      assert(brw-&gt;hiz.op != BRW_HIZ_OP_HIZ_RESOLVE   || !ctx-&gt;Depth.Test);<br>
-      assert(brw-&gt;hiz.op != BRW_HIZ_OP_DEPTH_CLEAR   || !ctx-&gt;Depth.Test);<br>
-<br>
+   if (ctx-&gt;Depth.Test &amp;&amp; depth_irb) {<br>
       ds-&gt;ds2.depth_test_enable = ctx-&gt;Depth.Test;<br>
       ds-&gt;ds2.depth_test_func = intel_translate_compare_func(ctx-&gt;Depth.Func);<br>
       ds-&gt;ds2.depth_write_enable = ctx-&gt;Depth.Mask;<br>
@@ -98,8 +94,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)<br>
 const struct brw_tracked_state gen6_depth_stencil_state = {<br>
    .dirty = {<br>
       .mesa = _NEW_DEPTH | _NEW_STENCIL | _NEW_BUFFERS,<br>
-      .brw  = (BRW_NEW_BATCH |<br>
-              BRW_NEW_HIZ),<br>
+      .brw  = BRW_NEW_BATCH,<br>
       .cache = 0,<br>
    },<br>
    .emit = gen6_upload_depth_stencil_state,<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_hiz.c b/src/mesa/drivers/dri/i965/gen6_hiz.c<br>
index d7698ed..2b19100 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_hiz.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_hiz.c<br>
@@ -21,345 +21,621 @@<br>
  * IN THE SOFTWARE.<br>
  */<br>
<br>
-#include &quot;gen6_hiz.h&quot;<br>
-<br>
 #include &lt;assert.h&gt;<br>
<br>
-#include &quot;mesa/drivers/common/meta.h&quot;<br>
-<br>
-#include &quot;mesa/main/arrayobj.h&quot;<br>
-#include &quot;mesa/main/bufferobj.h&quot;<br>
-#include &quot;mesa/main/depth.h&quot;<br>
-#include &quot;mesa/main/enable.h&quot;<br>
-#include &quot;mesa/main/fbobject.h&quot;<br>
-#include &quot;mesa/main/framebuffer.h&quot;<br>
-#include &quot;mesa/main/get.h&quot;<br>
-#include &quot;mesa/main/renderbuffer.h&quot;<br>
-#include &quot;mesa/main/shaderapi.h&quot;<br>
-#include &quot;mesa/main/varray.h&quot;<br>
-<br>
+#include &quot;intel_batchbuffer.h&quot;<br>
 #include &quot;intel_fbo.h&quot;<br>
 #include &quot;intel_mipmap_tree.h&quot;<br>
-#include &quot;intel_regions.h&quot;<br>
-#include &quot;intel_tex.h&quot;<br>
<br>
 #include &quot;brw_context.h&quot;<br>
 #include &quot;brw_defines.h&quot;<br>
+#include &quot;brw_state.h&quot;<br>
<br>
-static const uint32_t gen6_hiz_meta_save =<br>
-<br>
-      /* Disable alpha, depth, and stencil test.<br>
-       *<br>
-       * See the following sections of the Sandy Bridge PRM, Volume 1, Part2:<br>
-       *   - 7.5.3.1 Depth Buffer Clear<br>
-       *   - 7.5.3.2 Depth Buffer Resolve<br>
-       *   - 7.5.3.3 Hierarchical Depth Buffer Resolve<br>
-       */<br>
-      MESA_META_ALPHA_TEST |<br>
-      MESA_META_DEPTH_TEST |<br>
-      MESA_META_STENCIL_TEST |<br>
-<br>
-      /* Disable viewport mapping.<br>
-       *<br>
-       * From page 11 of the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3<br>
-       * 3D Primitives Overview:<br>
-       *    RECTLIST:<br>
-       *    Viewport Mapping must be DISABLED (as is typical with the use of<br>
-       *    screen- space coordinates).<br>
-       *<br>
-       * We must also manually disable 3DSTATE_SF.Viewport_Transform_Enable.<br>
-       */<br>
-      MESA_META_VIEWPORT |<br>
-<br>
-      /* Disable clipping.<br>
-       *<br>
-       * From page 11 of the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3<br>
-       * 3D Primitives Overview:<br>
-       *     Either the CLIP unit should be DISABLED, or the CLIP unit’s Clip<br>
-       *     Mode should be set to a value other than CLIPMODE_NORMAL.<br>
-       */<br>
-      MESA_META_CLIP |<br>
-<br>
-      /* Render a solid rectangle (set 3DSTATE_SF.FrontFace_Fill_Mode).<br>
-       *<br>
-       * From page 249 of the Sandy Bridge PRM, Volume 2, Part 1, Section<br>
-       * 6.4.1.1 3DSTATE_SF, FrontFace_Fill_Mode:<br>
-       *     SOLID: Any triangle or rectangle object found to be front-facing<br>
-       *     is rendered as a solid object. This setting is required when<br>
-       *     (rendering rectangle (RECTLIST) objects.<br>
-       * Also see field BackFace_Fill_Mode.<br>
-       *<br>
-       * Note: MESA_META_RASTERIZAION also disables culling, but that is<br>
-       * irrelevant. See 3DSTATE_SF.Cull_Mode.<br>
-       */<br>
-      MESA_META_RASTERIZATION |<br>
-<br>
-      /* Each HiZ operation uses a vertex shader and VAO. */<br>
-      MESA_META_SHADER |<br>
-      MESA_META_VERTEX |<br>
-<br>
-      /* Disable scissoring.<br>
-       *<br>
-       * Scissoring is disabled for resolves because a resolve operation<br>
-       * should resolve the entire buffer. Scissoring is disabled for depth<br>
-       * clears because, if we are performing a partial depth clear, then we<br>
-       * specify the clear region with the RECTLIST vertices.<br>
-       */<br>
-      MESA_META_SCISSOR |<br>
-<br>
-      MESA_META_SELECT_FEEDBACK;<br>
+#include &quot;gen6_hiz.h&quot;<br>
<br>
-static void<br>
-gen6_hiz_get_framebuffer_enum(struct gl_context *ctx,<br>
-                              GLenum *bind_enum,<br>
-                              GLenum *get_enum)<br>
-{<br>
-   if (ctx-&gt;Extensions.EXT_framebuffer_blit &amp;&amp; ctx-&gt;API == API_OPENGL) {<br>
-      /* Different buffers may be bound to GL_DRAW_FRAMEBUFFER and<br>
-       * GL_READ_FRAMEBUFFER. Take care to not disrupt the read buffer.<br>
-       */<br>
-      *bind_enum = GL_DRAW_FRAMEBUFFER;<br>
-      *get_enum = GL_DRAW_FRAMEBUFFER_BINDING;<br>
-   } else {<br>
-      /* The enums GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER do not exist.<br>
-       * The bound framebuffer is both the read and draw buffer.<br>
-       */<br>
-      *bind_enum = GL_FRAMEBUFFER;<br>
-      *get_enum = GL_FRAMEBUFFER_BINDING;<br>
-   }<br>
-}<br>
+/**<br>
+ * \name Constants for HiZ VBO<br>
+ * \{<br>
+ *<br>
+ * \see brw_context::hiz::vertex_bo<br>
+ */<br>
+#define GEN6_HIZ_NUM_VERTICES 3<br>
+#define GEN6_HIZ_NUM_VUE_ELEMS 8<br>
+#define GEN6_HIZ_VBO_SIZE (GEN6_HIZ_NUM_VERTICES \<br>
+                           * GEN6_HIZ_NUM_VUE_ELEMS \<br>
+                           * sizeof(float))<br>
+/** \} */<br>
<br>
 /**<br>
- * Initialize static data needed for HiZ operations.<br>
+ * \brief Initialize data needed for the HiZ op.<br>
+ *<br>
+ * This is called when executing the first HiZ op.<br>
+ * \see brw_context::hiz<br>
  */<br>
-static void<br>
+void<br>
 gen6_hiz_init(struct brw_context *brw)<br>
 {<br>
    struct gl_context *ctx = &amp;brw-&gt;intel.ctx;<br>
+   struct intel_context *intel = &amp;brw-&gt;intel;<br>
    struct brw_hiz_state *hiz = &amp;brw-&gt;hiz;<br>
-   GLenum fb_bind_enum, fb_get_enum;<br>
<br>
-   if (hiz-&gt;fbo != 0)<br>
-      return;<br>
+   hiz-&gt;vertex_bo = drm_intel_bo_alloc(intel-&gt;bufmgr, &quot;bufferobj&quot;,<br>
+                                       GEN6_HIZ_VBO_SIZE, /* size */<br>
+                                       64); /* alignment */<br>
<br>
-   gen6_hiz_get_framebuffer_enum(ctx, &amp;fb_bind_enum, &amp;fb_get_enum);<br>
+   if (!hiz-&gt;vertex_bo)<br>
+      _mesa_error(ctx, GL_OUT_OF_MEMORY, &quot;failed to allocate internal VBO&quot;);<br>
+}<br>
<br>
-   /* Create depthbuffer.<br>
+void<br>
+gen6_hiz_emit_batch_head(struct brw_context *brw)<br>
+{<br>
+   struct gl_context *ctx = &amp;brw-&gt;intel.ctx;<br>
+   struct intel_context *intel = &amp;brw-&gt;intel;<br>
+   struct brw_hiz_state *hiz = &amp;brw-&gt;hiz;<br>
+<br>
+   /* To ensure that the batch contains only the resolve, flush the batch<br>
+    * before beginning and after finishing emitting the resolve packets.<br>
     *<br>
-    * Until glRenderbufferStorage is called, the renderbuffer hash table<br>
-    * maps the renderbuffer name to a dummy renderbuffer. We need the<br>
-    * renderbuffer to be registered in the hash table so that framebuffer<br>
-    * validation succeeds, so we hackishly allocate storage then immediately<br>
-    * discard it.<br>
+    * Ideally, we would not need to flush for the resolve op. But, I suspect<br>
+    * that it&#39;s unsafe for CMD_PIPELINE_SELECT to occur multiple times in<br>
+    * a single batch, and there is no safe way to ensure that other than by<br>
+    * fencing the resolve with flushes. Ideally, we would just detect if<br>
+    * a batch is in progress and do the right thing, but that would require<br>
+    * the ability to *safely* access brw_context::state::dirty::brw<br>
+    * outside of the brw_state_init() codepath.<br>
     */<br>
-   GLuint depth_rb_name;<br>
-   _mesa_GenRenderbuffersEXT(1, &amp;depth_rb_name);<br>
-   _mesa_BindRenderbufferEXT(GL_RENDERBUFFER, depth_rb_name);<br>
-   _mesa_RenderbufferStorageEXT(GL_RENDERBUFFER, GL_DEPTH_COMPONENT, 32, 32);<br>
-   _mesa_reference_renderbuffer(&amp;hiz-&gt;depth_rb,<br>
-                                _mesa_lookup_renderbuffer(ctx, depth_rb_name));<br>
-   intel_miptree_release(&amp;((struct intel_renderbuffer*) hiz-&gt;depth_rb)-&gt;mt);<br>
-<br>
-   /* Setup FBO. */<br>
-   _mesa_GenFramebuffersEXT(1, &amp;hiz-&gt;fbo);<br>
-   _mesa_BindFramebufferEXT(fb_bind_enum, hiz-&gt;fbo);<br>
-   _mesa_FramebufferRenderbufferEXT(fb_bind_enum,<br>
-                                    GL_DEPTH_ATTACHMENT,<br>
-                                    GL_RENDERBUFFER,<br>
-                                    hiz-&gt;depth_rb-&gt;Name);<br>
-<br>
-   /* Compile vertex shader. */<br>
-   const char *vs_source =<br>
-      &quot;attribute vec4 position;\n&quot;<br>
-      &quot;void main()\n&quot;<br>
-      &quot;{\n&quot;<br>
-      &quot;   gl_Position = position;\n&quot;<br>
-      &quot;}\n&quot;;<br>
-   GLuint vs = _mesa_CreateShaderObjectARB(GL_VERTEX_SHADER);<br>
-   _mesa_ShaderSourceARB(vs, 1, &amp;vs_source, NULL);<br>
-   _mesa_CompileShaderARB(vs);<br>
-<br>
-   /* Compile fragment shader. */<br>
-   const char *fs_source = &quot;void main() {}&quot;;<br>
-   GLuint fs = _mesa_CreateShaderObjectARB(GL_FRAGMENT_SHADER);<br>
-   _mesa_ShaderSourceARB(fs, 1, &amp;fs_source, NULL);<br>
-   _mesa_CompileShaderARB(fs);<br>
-<br>
-   /* Link and use program. */<br>
-   hiz-&gt;shader.program = _mesa_CreateProgramObjectARB();<br>
-   _mesa_AttachShader(hiz-&gt;shader.program, vs);<br>
-   _mesa_AttachShader(hiz-&gt;shader.program, fs);<br>
-   _mesa_LinkProgramARB(hiz-&gt;shader.program);<br>
-   _mesa_UseProgramObjectARB(hiz-&gt;shader.program);<br>
-<br>
-   /* Create and bind VAO. */<br>
-   _mesa_GenVertexArrays(1, &amp;hiz-&gt;vao);<br>
-   _mesa_BindVertexArray(hiz-&gt;vao);<br>
-<br>
-   /* Setup VBO for &#39;position&#39;. */<br>
-   hiz-&gt;shader.position_location =<br>
-      _mesa_GetAttribLocationARB(hiz-&gt;shader.program, &quot;position&quot;);<br>
-   _mesa_GenBuffersARB(1, &amp;hiz-&gt;shader.position_vbo);<br>
-   _mesa_BindBufferARB(GL_ARRAY_BUFFER_ARB, hiz-&gt;shader.position_vbo);<br>
-   _mesa_VertexAttribPointerARB(hiz-&gt;shader.position_location,<br>
-                               2, /*components*/<br>
-                               GL_FLOAT,<br>
-                               GL_FALSE, /*normalized?*/<br>
-                               0, /*stride*/<br>
-                               NULL);<br>
-   _mesa_EnableVertexAttribArrayARB(hiz-&gt;shader.position_location);<br>
-<br>
-   /* Cleanup. */<br>
-   _mesa_DeleteShader(vs);<br>
-   _mesa_DeleteShader(fs);<br>
+   intel_flush(ctx);<br>
+<br>
+   /* CMD_PIPELINE_SELECT<br>
+    *<br>
+    * Select the 3D pipeline, as opposed to the media pipeline.<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(1);<br>
+      OUT_BATCH(brw-&gt;CMD_PIPELINE_SELECT &lt;&lt; 16);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_MULTISAMPLE */<br>
+   {<br>
+      int length = intel-&gt;gen == 7 ? 4 : 3;<br>
+<br>
+      BEGIN_BATCH(length);<br>
+      OUT_BATCH(_3DSTATE_MULTISAMPLE &lt;&lt; 16 | (3 - 2));<br>
+      OUT_BATCH(MS_PIXEL_LOCATION_CENTER |<br>
+                MS_NUMSAMPLES_1);<br>
+      OUT_BATCH(0);<br>
+      if (length &gt;= 4)<br>
+         OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+<br>
+   }<br>
+<br>
+   /* 3DSTATE_SAMPLE_MASK */<br>
+   {<br>
+      BEGIN_BATCH(2);<br>
+      OUT_BATCH(_3DSTATE_SAMPLE_MASK &lt;&lt; 16 | (2 - 2));<br>
+      OUT_BATCH(1);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* CMD_STATE_BASE_ADDRESS<br>
+    *<br>
+    * From the Sandy Bridge PRM, Volume 1, Part 1, Table STATE_BASE_ADDRESS:<br>
+    *     The following commands must be reissued following any change to the<br>
+    *     base addresses:<br>
+    *         3DSTATE_CC_POINTERS<br>
+    *         3DSTATE_BINDING_TABLE_POINTERS<br>
+    *         3DSTATE_SAMPLER_STATE_POINTERS<br>
+    *         3DSTATE_VIEWPORT_STATE_POINTERS<br>
+    *         MEDIA_STATE_POINTERS<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(10);<br>
+      OUT_BATCH(CMD_STATE_BASE_ADDRESS &lt;&lt; 16 | (10 - 2));<br>
+      OUT_BATCH(1); /* GeneralStateBaseAddressModifyEnable */<br>
+      /* SurfaceStateBaseAddress */<br>
+      OUT_RELOC(intel-&gt;batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);<br>
+      /* DynamicStateBaseAddress */<br>
+      OUT_RELOC(intel-&gt;batch.bo, (I915_GEM_DOMAIN_RENDER |<br>
+                                  I915_GEM_DOMAIN_INSTRUCTION), 0, 1);<br>
+      OUT_BATCH(1); /* IndirectObjectBaseAddress */<br>
+      OUT_BATCH(1); /* InstructionBaseAddress */<br>
+      OUT_BATCH(1); /* GeneralStateUpperBound */<br>
+      OUT_BATCH(1); /* DynamicStateUpperBound */<br>
+      OUT_BATCH(1); /* IndirectObjectUpperBound*/<br>
+      OUT_BATCH(1); /* InstructionAccessUpperBound */<br>
+      ADVANCE_BATCH();<br>
+   }<br>
 }<br>
<br>
-/**<br>
- * Wrap \c brw-&gt;hiz.depth_rb around a miptree.<br>
- *<br>
- * \see gen6_hiz_teardown_depth_buffer()<br>
- */<br>
-static void<br>
-gen6_hiz_setup_depth_buffer(struct brw_context *brw,<br>
-                           struct intel_mipmap_tree *mt,<br>
-                           unsigned int level,<br>
-                           unsigned int layer)<br>
+void<br>
+gen6_hiz_emit_vertices(struct brw_context *brw,<br>
+                       struct intel_mipmap_tree *mt,<br>
+                       unsigned int level,<br>
+                       unsigned int layer)<br>
 {<br>
-   struct gl_renderbuffer *rb = brw-&gt;hiz.depth_rb;<br>
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);<br>
+   struct intel_context *intel = &amp;brw-&gt;intel;<br>
+   struct brw_hiz_state *hiz = &amp;brw-&gt;hiz;<br>
<br>
-   rb-&gt;Format = mt-&gt;format;<br>
-   rb-&gt;_BaseFormat = _mesa_get_format_base_format(rb-&gt;Format);<br>
-   rb-&gt;InternalFormat = rb-&gt;_BaseFormat;<br>
-   rb-&gt;Width = mt-&gt;level[level].width;<br>
-   rb-&gt;Height = mt-&gt;level[level].height;<br>
+   /* Set up the VBO for the rectangle primitive.<br>
+    *<br>
+    * A rectangle primitive (3DPRIM_RECTLIST) consists of only three<br>
+    * vertices. The vertices reside in screen space with DirectX coordinates<br>
+    * (that is, (0, 0) is the upper left corner).<br>
+    *<br>
+    *   v2 ------ implied<br>
+    *    |        |<br>
+    *    |        |<br>
+    *   v0 ----- v1<br>
+    *<br>
+    * Since the VS is disabled, the clipper loads each VUE directly from<br>
+    * the URB. This is controlled by the 3DSTATE_VERTEX_BUFFERS and<br>
+    * 3DSTATE_VERTEX_ELEMENTS packets below. The VUE contents are as follows:<br>
+    *   dw0: Reserved, MBZ.<br>
+    *   dw1: Render Target Array Index. The HiZ op does not use indexed<br>
+    *        vertices, so set the dword to 0.<br>
+    *   dw2: Viewport Index. The HiZ op disables viewport mapping and<br>
+    *        scissoring, so set the dword to 0.<br>
+    *   dw3: Point Width. The HiZ op does not emit the POINTLIST primitive, so<br>
+    *        set the dword to 0.<br>
+    *   dw4: Vertex Position X.<br>
+    *   dw5: Vertex Position Y.<br>
+    *   dw6: Vertex Position Z.<br>
+    *   dw7: Vertex Position W.<br>
+    *<br>
+    * For details, see the Sandybridge PRM, Volume 2, Part 1, Section 1.5.1<br>
+    * &quot;Vertex URB Entry (VUE) Formats&quot;.<br>
+    */<br>
+   {<br>
+      const int width = mt-&gt;level[level].width;<br>
+      const int height = mt-&gt;level[level].height;<br>
<br>
-   irb-&gt;mt_level = level;<br>
-   irb-&gt;mt_layer = layer;<br>
+      const float vertices[GEN6_HIZ_VBO_SIZE] = {<br>
+         /* v0 */ 0, 0, 0, 0,         0, height, 0, 1,<br>
+         /* v1 */ 0, 0, 0, 0,     width, height, 0, 1,<br>
+         /* v2 */ 0, 0, 0, 0,         0,      0, 0, 1,<br>
+      };<br>
<br>
-   intel_miptree_reference(&amp;irb-&gt;mt, mt);<br>
-   intel_renderbuffer_set_draw_offset(irb);<br>
+      drm_intel_bo_subdata(hiz-&gt;vertex_bo, 0, GEN6_HIZ_VBO_SIZE, vertices);<br>
+   }<br>
+<br>
+   /* 3DSTATE_VERTEX_BUFFERS */<br>
+   {<br>
+      const int num_buffers = 1;<br>
+      const int batch_length = 1 + 4 * num_buffers;<br>
+<br>
+      uint32_t dw0 = GEN6_VB0_ACCESS_VERTEXDATA |<br>
+                     (GEN6_HIZ_NUM_VUE_ELEMS * sizeof(float)) &lt;&lt; BRW_VB0_PITCH_SHIFT;<br>
+<br>
+      if (intel-&gt;gen &gt;= 7)<br>
+         dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;<br>
+<br>
+      BEGIN_BATCH(batch_length);<br>
+      OUT_BATCH((_3DSTATE_VERTEX_BUFFERS &lt;&lt; 16) | (batch_length - 2));<br>
+      OUT_BATCH(dw0);<br>
+      /* start address */<br>
+      OUT_RELOC(hiz-&gt;vertex_bo, I915_GEM_DOMAIN_VERTEX, 0, 0);<br>
+      /* end address */<br>
+      OUT_RELOC(hiz-&gt;vertex_bo, I915_GEM_DOMAIN_VERTEX,<br>
+                0, hiz-&gt;vertex_bo-&gt;size - 1);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_VERTEX_ELEMENTS<br>
+    *<br>
+    * Fetch dwords 0 - 7 from each VUE. See the comments above where<br>
+    * hiz-&gt;vertex_bo is filled with data.<br>
+    */<br>
+   {<br>
+      const int num_elements = 2;<br>
+      const int batch_length = 1 + 2 * num_elements;<br>
+<br>
+      BEGIN_BATCH(batch_length);<br>
+      OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS &lt;&lt; 16) | (batch_length - 2));<br>
+      /* Element 0 */<br>
+      OUT_BATCH(GEN6_VE0_VALID |<br>
+                BRW_SURFACEFORMAT_R32G32B32A32_FLOAT &lt;&lt; BRW_VE0_FORMAT_SHIFT |<br>
+                0 &lt;&lt; BRW_VE0_SRC_OFFSET_SHIFT);<br>
+      OUT_BATCH(BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_0_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_1_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_2_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_3_SHIFT);<br>
+      /* Element 1 */<br>
+      OUT_BATCH(GEN6_VE0_VALID |<br>
+                BRW_SURFACEFORMAT_R32G32B32A32_FLOAT &lt;&lt; BRW_VE0_FORMAT_SHIFT |<br>
+                16 &lt;&lt; BRW_VE0_SRC_OFFSET_SHIFT);<br>
+      OUT_BATCH(BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_0_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_1_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_2_SHIFT |<br>
+                BRW_VE1_COMPONENT_STORE_SRC &lt;&lt; BRW_VE1_COMPONENT_3_SHIFT);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
 }<br>
<br>
 /**<br>
- * Release the region from \c brw-&gt;hiz.depth_rb.<br>
+ * \brief Execute a HiZ op on a miptree slice.<br>
+ *<br>
+ * To execute the HiZ op, this function manually constructs and emits a batch<br>
+ * to &quot;draw&quot; the HiZ op&#39;s rectangle primitive. The batchbuffer is flushed<br>
+ * before constructing and after emitting the batch.<br>
  *<br>
- * \see gen6_hiz_setup_depth_buffer()<br>
+ * This function alters no GL state.<br>
+ *<br>
+ * For an overview of HiZ ops, see the following sections of the Sandy Bridge<br>
+ * PRM, Volume 1, Part 2:<br>
+ *   - 7.5.3.1 Depth Buffer Clear<br>
+ *   - 7.5.3.2 Depth Buffer Resolve<br>
+ *   - 7.5.3.3 Hierarchical Depth Buffer Resolve<br>
  */<br>
 static void<br>
-gen6_hiz_teardown_depth_buffer(struct gl_renderbuffer *rb)<br>
-{<br>
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);<br>
-   intel_miptree_release(&amp;irb-&gt;mt);<br>
-}<br>
-<br>
-static void<br>
-gen6_resolve_slice(struct intel_context *intel,<br>
-                struct intel_mipmap_tree *mt,<br>
-                unsigned int level,<br>
-                unsigned int layer,<br>
-                 enum brw_hiz_op op)<br>
+gen6_hiz_exec(struct intel_context *intel,<br>
+              struct intel_mipmap_tree *mt,<br>
+              unsigned int level,<br>
+              unsigned int layer,<br>
+              enum gen6_hiz_op op)<br>
 {<br>
    struct gl_context *ctx = &amp;intel-&gt;ctx;<br>
    struct brw_context *brw = brw_context(ctx);<br>
    struct brw_hiz_state *hiz = &amp;brw-&gt;hiz;<br>
-   GLenum fb_bind_enum, fb_get_enum;<br>
-<br>
-   /* Do not recurse. */<br>
-   assert(!brw-&gt;hiz.op);<br>
<br>
+   assert(op != GEN6_HIZ_OP_DEPTH_CLEAR); /* Not implemented yet. */<br>
    assert(mt-&gt;hiz_mt != NULL);<br>
-   assert(level &gt;= mt-&gt;first_level);<br>
-   assert(level &lt;= mt-&gt;last_level);<br>
-   assert(layer &lt; mt-&gt;level[level].depth);<br>
-<br>
-   gen6_hiz_get_framebuffer_enum(ctx, &amp;fb_bind_enum, &amp;fb_get_enum);<br>
-<br>
-   /* Save state. */<br>
-   GLint save_drawbuffer;<br>
-   GLint save_renderbuffer;<br>
-   _mesa_meta_begin(ctx, gen6_hiz_meta_save);<br>
-   _mesa_GetIntegerv(fb_get_enum, &amp;save_drawbuffer);<br>
-   _mesa_GetIntegerv(GL_RENDERBUFFER_BINDING, &amp;save_renderbuffer);<br>
-<br>
-   /* Initialize context data for HiZ operations. */<br>
-   gen6_hiz_init(brw);<br>
-<br>
-   /* Set depth state. */<br>
-   if (!ctx-&gt;Depth.Mask) {<br>
-      /* This sets 3DSTATE_WM.Depth_Buffer_Write_Enable. */<br>
-      _mesa_DepthMask(GL_TRUE);<br>
+   intel_miptree_check_level_layer(mt, level, layer);<br>
+<br>
+   if (hiz-&gt;vertex_bo == NULL)<br>
+      gen6_hiz_init(brw);<br>
+<br>
+   if (hiz-&gt;vertex_bo == NULL) {<br>
+      /* Ouch. Give up. */<br>
+      return;<br>
    }<br>
-   if (op == BRW_HIZ_OP_DEPTH_RESOLVE) {<br>
-      _mesa_set_enable(ctx, GL_DEPTH_TEST, GL_TRUE);<br>
-      _mesa_DepthFunc(GL_NEVER);<br>
+<br>
+   gen6_hiz_emit_batch_head(brw);<br>
+   gen6_hiz_emit_vertices(brw, mt, level, layer);<br>
+<br>
+   /* 3DSTATE_URB<br>
+    *<br>
+    * Assign the entire URB to the VS. Even though the VS is disabled, URB<br>
+    * space is still needed because the clipper loads the VUEs from the URB.<br>
+    * From the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE_URB,<br>
+    * Dword 1.15:0 &quot;VS Number of URB Entries&quot;:<br>
+    *     This field is always used (even if VS Function Enable is DISABLED).<br>
+    *<br>
+    * The warning below appears in the PRM (Section 3DSTATE_URB), but we can<br>
+    * safely ignore it because this batch contains only one draw call.<br>
+    *     Because of URB corruption caused by allocating a previous GS unit<br>
+    *     URB entry to the VS unit, software is required to send a “GS NULL<br>
+    *     Fence” (Send URB fence with VS URB size == 1 and GS URB size == 0)<br>
+    *     plus a dummy DRAW call before any case where VS will be taking over<br>
+    *     GS URB space.<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(3);<br>
+      OUT_BATCH(_3DSTATE_URB &lt;&lt; 16 | (3 - 2));<br>
+      OUT_BATCH(brw-&gt;urb.max_vs_entries &lt;&lt; GEN6_URB_VS_ENTRIES_SHIFT);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_CC_STATE_POINTERS<br>
+    *<br>
+    * The pointer offsets are relative to<br>
+    * CMD_STATE_BASE_ADDRESS.DynamicStateBaseAddress.<br>
+    *<br>
+    * The HiZ op doesn&#39;t use BLEND_STATE or COLOR_CALC_STATE.<br>
+    */<br>
+   {<br>
+      uint32_t depthstencil_offset;<br>
+      gen6_hiz_emit_depth_stencil_state(brw, op, &amp;depthstencil_offset);<br>
+<br>
+      BEGIN_BATCH(4);<br>
+      OUT_BATCH(_3DSTATE_CC_STATE_POINTERS &lt;&lt; 16 | (4 - 2));<br>
+      OUT_BATCH(1); /* BLEND_STATE offset */<br>
+      OUT_BATCH(depthstencil_offset | 1); /* DEPTH_STENCIL_STATE offset */<br>
+      OUT_BATCH(1); /* COLOR_CALC_STATE offset */<br>
+      ADVANCE_BATCH();<br>
    }<br>
<br>
-   /* Setup FBO. */<br>
-   gen6_hiz_setup_depth_buffer(brw, mt, level, layer);<br>
-   _mesa_BindFramebufferEXT(fb_bind_enum, hiz-&gt;fbo);<br>
+   /* 3DSTATE_VS<br>
+    *<br>
+    * Disable vertex shader.<br>
+    */<br>
+   {<br>
+      /* From the BSpec, Volume 2a, Part 3 &quot;Vertex Shader&quot;, Section<br>
+       * 3DSTATE_VS, Dword 5.0 &quot;VS Function Enable&quot;:<br>
+       *   [DevSNB] A pipeline flush must be programmed prior to a 3DSTATE_VS<br>
+       *   command that causes the VS Function Enable to toggle. Pipeline<br>
+       *   flush can be executed by sending a PIPE_CONTROL command with CS<br>
+       *   stall bit set and a post sync operation.<br>
+       */<br>
+      intel_emit_post_sync_nonzero_flush(intel);<br>
+<br>
+      BEGIN_BATCH(6);<br>
+      OUT_BATCH(_3DSTATE_VS &lt;&lt; 16 | (6 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
<br>
+   /* 3DSTATE_GS<br>
+    *<br>
+    * Disable the geometry shader.<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(7);<br>
+      OUT_BATCH(_3DSTATE_GS &lt;&lt; 16 | (7 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
<br>
-   /* A rectangle primitive (3DPRIM_RECTLIST) consists of only three vertices.<br>
-    * The vertices reside in screen space with DirectX coordinates (this is,<br>
-    * (0, 0) is the upper left corner).<br>
+   /* 3DSTATE_CLIP<br>
     *<br>
-    *   v2 ------ implied<br>
-    *    |        |<br>
-    *    |        |<br>
-    *   v0 ----- v1<br>
+    * Disable the clipper.<br>
+    *<br>
+    * The HiZ op emits a rectangle primitive, which requires clipping to<br>
+    * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1<br>
+    * Section 1.3 &quot;3D Primitives Overview&quot;:<br>
+    *    RECTLIST:<br>
+    *    Either the CLIP unit should be DISABLED, or the CLIP unit&#39;s Clip<br>
+    *    Mode should be set to a value other than CLIPMODE_NORMAL.<br>
+    *<br>
+    * Also disable perspective divide. This doesn&#39;t change the clipper&#39;s<br>
+    * output, but does spare a few electrons.<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(4);<br>
+      OUT_BATCH(_3DSTATE_CLIP &lt;&lt; 16 | (4 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(GEN6_CLIP_PERSPECTIVE_DIVIDE_DISABLE);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_SF<br>
+    *<br>
+    * Disable ViewportTransformEnable (dw2.1)<br>
+    *<br>
+    * From the SandyBridge PRM, Volume 2, Part 1, Section 1.3, &quot;3D<br>
+    * Primitives Overview&quot;:<br>
+    *     RECTLIST: Viewport Mapping must be DISABLED (as is typical with the<br>
+    *     use of screen-space coordinates).<br>
+    *<br>
+    * A solid rectangle must be rendered, so set FrontFaceFillMode (dw2.4:3)<br>
+    * and BackFaceFillMode (dw2.5:6) to SOLID(0).<br>
+    *<br>
+    * From the Sandy Bridge PRM, Volume 2, Part 1, Section<br>
+    * 6.4.1.1 3DSTATE_SF, Field FrontFaceFillMode:<br>
+    *     SOLID: Any triangle or rectangle object found to be front-facing<br>
+    *     is rendered as a solid object. This setting is required when<br>
+    *     rendering rectangle (RECTLIST) objects.<br>
+    */<br>
+   {<br>
+      BEGIN_BATCH(20);<br>
+      OUT_BATCH(_3DSTATE_SF &lt;&lt; 16 | (20 - 2));<br>
+      OUT_BATCH((1 - 1) &lt;&lt; GEN6_SF_NUM_OUTPUTS_SHIFT | /* only position */<br>
+                1 &lt;&lt; GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |<br>
+                0 &lt;&lt; GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT);<br>
+      for (int i = 0; i &lt; 18; ++i)<br>
+         OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_WM<br>
+    *<br>
+    * Disable thread dispatch (dw5.19) and enable the HiZ op.<br>
+    *<br>
+    * Even though thread dispatch is disabled, max threads (dw5.25:31) must be<br>
+    * nonzero to prevent the GPU from hanging. See the valid ranges in the<br>
+    * BSpec, Volume 2a.11 Windower, Section 3DSTATE_WM, Dword 5.25:31<br>
+    * &quot;Maximum Number Of Threads&quot;.<br>
     */<br>
-   const int width = hiz-&gt;depth_rb-&gt;Width;<br>
-   const int height = hiz-&gt;depth_rb-&gt;Height;<br>
-   const GLfloat positions[] = {<br>
-          0, height,<br>
-      width, height,<br>
-          0,      0,<br>
-   };<br>
-<br>
-   /* Setup program and vertex attributes. */<br>
-   _mesa_UseProgramObjectARB(hiz-&gt;shader.program);<br>
-   _mesa_BindVertexArray(hiz-&gt;vao);<br>
-   _mesa_BindBufferARB(GL_ARRAY_BUFFER, hiz-&gt;shader.position_vbo);<br>
-   _mesa_BufferDataARB(GL_ARRAY_BUFFER_ARB, sizeof(positions), positions,<br>
-                      GL_DYNAMIC_DRAW_ARB);<br>
-<br>
-   /* Execute the HiZ operation. */<br>
-   brw-&gt;hiz.op = op;<br>
-   brw-&gt;state.dirty.brw |= BRW_NEW_HIZ;<br>
-   _mesa_DrawArrays(GL_TRIANGLES, 0, 3);<br>
-   brw-&gt;state.dirty.brw |= BRW_NEW_HIZ;<br>
-   brw-&gt;hiz.op = BRW_HIZ_OP_NONE;<br>
-<br>
-   /* Restore state.<br>
+   {<br>
+      uint32_t dw4 = 0;<br>
+<br>
+      switch (op) {<br>
+      case GEN6_HIZ_OP_DEPTH_CLEAR:<br>
+         assert(!&quot;not implemented&quot;);<br>
+         dw4 |= GEN6_WM_DEPTH_CLEAR;<br>
+         break;<br>
+      case GEN6_HIZ_OP_DEPTH_RESOLVE:<br>
+         dw4 |= GEN6_WM_DEPTH_RESOLVE;<br>
+         break;<br>
+      case GEN6_HIZ_OP_HIZ_RESOLVE:<br>
+         dw4 |= GEN6_WM_HIERARCHICAL_DEPTH_RESOLVE;<br>
+         break;<br>
+      default:<br>
+         assert(0);<br>
+         break;<br>
+      }<br>
+<br>
+      BEGIN_BATCH(9);<br>
+      OUT_BATCH(_3DSTATE_WM &lt;&lt; 16 | (9 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(dw4);<br>
+      OUT_BATCH((brw-&gt;max_wm_threads - 1) &lt;&lt; GEN6_WM_MAX_THREADS_SHIFT);<br>
+      OUT_BATCH((1 - 1) &lt;&lt; GEN6_WM_NUM_SF_OUTPUTS_SHIFT); /* only position */<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_DEPTH_BUFFER */<br>
+   {<br>
+      uint32_t width = mt-&gt;level[level].width;<br>
+      uint32_t height = mt-&gt;level[level].height;<br>
+<br>
+      uint32_t tile_x;<br>
+      uint32_t tile_y;<br>
+      uint32_t offset;<br>
+      {<br>
+         /* Construct a dummy renderbuffer just to extract tile offsets. */<br>
+         struct intel_renderbuffer rb;<br>
+         rb.mt = mt;<br>
+         rb.mt_level = level;<br>
+         rb.mt_layer = layer;<br>
+         intel_renderbuffer_set_draw_offset(&amp;rb);<br>
+         offset = intel_renderbuffer_tile_offsets(&amp;rb, &amp;tile_x, &amp;tile_y);<br>
+      }<br>
+<br>
+      uint32_t format;<br>
+      switch (mt-&gt;format) {<br>
+      case MESA_FORMAT_Z16:       format = BRW_DEPTHFORMAT_D16_UNORM; break;<br>
+      case MESA_FORMAT_Z32_FLOAT: format = BRW_DEPTHFORMAT_D32_FLOAT; break;<br>
+      case MESA_FORMAT_X8_Z24:    format = BRW_DEPTHFORMAT_D24_UNORM_X8_UINT; break;<br>
+      default:                    assert(0); break;<br>
+      }<br>
+<br>
+      intel_emit_post_sync_nonzero_flush(intel);<br>
+      intel_emit_depth_stall_flushes(intel);<br>
+<br>
+      BEGIN_BATCH(7);<br>
+      OUT_BATCH(_3DSTATE_DEPTH_BUFFER &lt;&lt; 16 | (7 - 2));<br>
+      OUT_BATCH(((mt-&gt;region-&gt;pitch * mt-&gt;region-&gt;cpp) - 1) |<br>
+                format &lt;&lt; 18 |<br>
+                1 &lt;&lt; 21 | /* separate stencil enable */<br>
+                1 &lt;&lt; 22 | /* hiz enable */<br>
+                BRW_TILEWALK_YMAJOR &lt;&lt; 26 |<br>
+                1 &lt;&lt; 27 | /* y-tiled */<br>
+                BRW_SURFACE_2D &lt;&lt; 29);<br>
+      OUT_RELOC(mt-&gt;region-&gt;bo,<br>
+                I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,<br>
+                offset);<br>
+      OUT_BATCH(BRW_SURFACE_MIPMAPLAYOUT_BELOW &lt;&lt; 1 |<br>
+                (width + tile_x - 1) &lt;&lt; 6 |<br>
+                (height + tile_y - 1) &lt;&lt; 19);<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(tile_x |<br>
+                tile_y &lt;&lt; 16);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_HIER_DEPTH_BUFFER */<br>
+   {<br>
+      struct intel_region *hiz_region = mt-&gt;hiz_mt-&gt;region;<br>
+<br>
+      BEGIN_BATCH(3);<br>
+      OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER &lt;&lt; 16) | (3 - 2));<br>
+      OUT_BATCH(hiz_region-&gt;pitch * hiz_region-&gt;cpp - 1);<br>
+      OUT_RELOC(hiz_region-&gt;bo,<br>
+                I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,<br>
+                0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_STENCIL_BUFFER */<br>
+   {<br>
+      BEGIN_BATCH(3);<br>
+      OUT_BATCH((_3DSTATE_STENCIL_BUFFER &lt;&lt; 16) | (3 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_CLEAR_PARAMS<br>
     *<br>
-    * The order in which state is restored is significant. The draw buffer<br>
-    * used for the HiZ op has no stencil buffer, and glStencilFunc() clamps<br>
-    * the stencil reference value to the range allowed by the draw buffer&#39;s<br>
-    * number of stencil bits. So, the draw buffer binding must be restored<br>
-    * before the stencil state, or else the stencil ref will be clamped to 0.<br>
+    * From the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE_CLEAR_PARAMS:<br>
+    *   [DevSNB] 3DSTATE_CLEAR_PARAMS packet must follow the DEPTH_BUFFER_STATE<br>
+    *   packet when HiZ is enabled and the DEPTH_BUFFER_STATE changes.<br>
     */<br>
-   gen6_hiz_teardown_depth_buffer(hiz-&gt;depth_rb);<br>
-   _mesa_BindRenderbufferEXT(GL_RENDERBUFFER, save_renderbuffer);<br>
-   _mesa_BindFramebufferEXT(fb_bind_enum, save_drawbuffer);<br>
-   _mesa_meta_end(ctx);<br>
+   {<br>
+      BEGIN_BATCH(2);<br>
+      OUT_BATCH(_3DSTATE_CLEAR_PARAMS &lt;&lt; 16 | (2 - 2));<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DSTATE_DRAWING_RECTANGLE */<br>
+   {<br>
+      BEGIN_BATCH(4);<br>
+      OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE &lt;&lt; 16 | (4 - 2));<br>
+      OUT_BATCH(0);<br>
+      OUT_BATCH(((mt-&gt;level[level].width - 1) &amp; 0xffff) |<br>
+                ((mt-&gt;level[level].height - 1) &lt;&lt; 16));<br>
+      OUT_BATCH(0);<br>
+      ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* 3DPRIMITIVE */<br>
+   {<br>
+     BEGIN_BATCH(6);<br>
+     OUT_BATCH(CMD_3D_PRIM &lt;&lt; 16 | (6 - 2) |<br>
+               _3DPRIM_RECTLIST &lt;&lt; GEN4_3DPRIM_TOPOLOGY_TYPE_SHIFT |<br>
+               GEN4_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL);<br>
+     OUT_BATCH(3); /* vertex count per instance */<br>
+     OUT_BATCH(0);<br>
+     OUT_BATCH(1); /* instance count */<br>
+     OUT_BATCH(0);<br>
+     OUT_BATCH(0);<br>
+     ADVANCE_BATCH();<br>
+   }<br>
+<br>
+   /* See comments above at first invocation of intel_flush() in<br>
+    * gen6_hiz_emit_batch_head().<br>
+    */<br>
+   intel_flush(ctx);<br>
+<br>
+   /* Be safe. */<br>
+   brw-&gt;state.dirty.brw = ~0;<br>
+   brw-&gt;state.dirty.cache = ~0;<br>
 }<br>
<br>
+/**<br>
+ * \param out_offset is relative to<br>
+ *        CMD_STATE_BASE_ADDRESS.DynamicStateBaseAddress.<br>
+ */<br>
+void<br>
+gen6_hiz_emit_depth_stencil_state(struct brw_context *brw,<br>
+                                  enum gen6_hiz_op op,<br>
+                                  uint32_t *out_offset)<br>
+{<br>
+   struct gen6_depth_stencil_state *state;<br>
+   state = brw_state_batch(brw, AUB_TRACE_DEPTH_STENCIL_STATE,<br>
+                              sizeof(*state), 64,<br>
+                              out_offset);<br>
+   memset(state, 0, sizeof(*state));<br>
+<br>
+   /* See the following sections of the Sandy Bridge PRM, Volume 1, Part 2:<br>
+    *   - 7.5.3.1 Depth Buffer Clear<br>
+    *   - 7.5.3.2 Depth Buffer Resolve<br>
+    *   - 7.5.3.3 Hierarchical Depth Buffer Resolve<br>
+    */<br>
+   state-&gt;ds2.depth_write_enable = 1;<br>
+   if (op == GEN6_HIZ_OP_DEPTH_RESOLVE) {<br>
+      state-&gt;ds2.depth_test_enable = 1;<br>
+      state-&gt;ds2.depth_test_func = COMPAREFUNC_NEVER;<br>
+   }<br>
+}<br>
+<br>
+/** \see intel_context::vtbl::resolve_hiz_slice */<br>
 void<br>
 gen6_resolve_hiz_slice(struct intel_context *intel,<br>
                        struct intel_mipmap_tree *mt,<br>
                        uint32_t level,<br>
                        uint32_t layer)<br>
 {<br>
-   gen6_resolve_slice(intel, mt, level, layer, BRW_HIZ_OP_HIZ_RESOLVE);<br>
+   gen6_hiz_exec(intel, mt, level, layer, GEN6_HIZ_OP_HIZ_RESOLVE);<br>
 }<br>
<br>
-<br>
+/** \see intel_context::vtbl::resolve_depth_slice */<br>
 void<br>
 gen6_resolve_depth_slice(struct intel_context *intel,<br>
                          struct intel_mipmap_tree *mt,<br>
                          uint32_t level,<br>
                          uint32_t layer)<br>
 {<br>
-   gen6_resolve_slice(intel, mt, level, layer, BRW_HIZ_OP_DEPTH_RESOLVE);<br>
+   gen6_hiz_exec(intel, mt, level, layer, GEN6_HIZ_OP_DEPTH_RESOLVE);<br>
 }<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_hiz.h b/src/mesa/drivers/dri/i965/gen6_hiz.h<br>
index 4929012..0a13ba0 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_hiz.h<br>
+++ b/src/mesa/drivers/dri/i965/gen6_hiz.h<br>
@@ -28,6 +28,44 @@<br>
 struct intel_context;<br>
 struct intel_mipmap_tree;<br>
<br>
+/**<br>
+ * For an overview of the HiZ operations, see the following sections of the<br>
+ * Sandy Bridge PRM, Volume 1, Part 2:<br>
+ *   - 7.5.3.1 Depth Buffer Clear<br>
+ *   - 7.5.3.2 Depth Buffer Resolve<br>
+ *   - 7.5.3.3 Hierarchical Depth Buffer Resolve<br>
+ */<br>
+enum gen6_hiz_op {<br>
+   GEN6_HIZ_OP_DEPTH_CLEAR,<br>
+   GEN6_HIZ_OP_DEPTH_RESOLVE,<br>
+   GEN6_HIZ_OP_HIZ_RESOLVE,<br>
+};<br>
+<br>
+/**<br>
+ * \name HiZ internals<br>
+ * \{<br>
+ *<br>
+ * Used internally by gen6_hiz_exec() and gen7_hiz_exec().<br>
+ */<br>
+<br>
+void<br>
+gen6_hiz_init(struct brw_context *brw);<br>
+<br>
+void<br>
+gen6_hiz_emit_batch_head(struct brw_context *brw);<br>
+<br>
+void<br>
+gen6_hiz_emit_vertices(struct brw_context *brw,<br>
+                       struct intel_mipmap_tree *mt,<br>
+                       unsigned int level,<br>
+                       unsigned int layer);<br>
+<br>
+void<br>
+gen6_hiz_emit_depth_stencil_state(struct brw_context *brw,<br>
+                                  enum gen6_hiz_op op,<br>
+                                  uint32_t *out_offset);<br>
+/** \} */<br>
+<br>
 void<br>
 gen6_resolve_hiz_slice(struct intel_context *intel,<br>
                        struct intel_mipmap_tree *mt,<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c<br>
index 163b54c..07b8e6d 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c<br>
@@ -149,17 +149,8 @@ upload_sf_state(struct brw_context *brw)<br>
       urb_entry_read_length &lt;&lt; GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |<br>
       urb_entry_read_offset &lt;&lt; GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;<br>
<br>
-   dw2 = GEN6_SF_STATISTICS_ENABLE;<br>
-<br>
-   /* Enable viewport transform only if no HiZ operation is progress<br>
-    *<br>
-    * From page 11 of the SandyBridge PRM, Volume 2, Part 1, Section 1.3, &quot;3D<br>
-    * Primitives Overview&quot;:<br>
-    *     RECTLIST: Viewport Mapping must be DISABLED (as is typical with the<br>
-    *     use of screen- space coordinates).<br>
-    */<br>
-   if (!brw-&gt;hiz.op)<br>
-      dw2 |= GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;<br>
+   dw2 = GEN6_SF_STATISTICS_ENABLE |<br>
+         GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;<br>
<br>
    dw3 = 0;<br>
    dw4 = 0;<br>
@@ -354,8 +345,7 @@ const struct brw_tracked_state gen6_sf_state = {<br>
                _NEW_POINT |<br>
                _NEW_TRANSFORM),<br>
       .brw   = (BRW_NEW_CONTEXT |<br>
-               BRW_NEW_FRAGMENT_PROGRAM |<br>
-               BRW_NEW_HIZ),<br>
+               BRW_NEW_FRAGMENT_PROGRAM),<br>
       .cache = CACHE_NEW_VS_PROG<br>
    },<br>
    .emit = upload_sf_state,<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c<br>
index 63efaa4..3392a9f 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c<br>
@@ -133,6 +133,15 @@ upload_vs_state(struct brw_context *brw)<br>
    struct intel_context *intel = &amp;brw-&gt;intel;<br>
    uint32_t floating_point_mode = 0;<br>
<br>
+   /* From the BSpec, Volume 2a, Part 3 &quot;Vertex Shader&quot;, Section<br>
+    * 3DSTATE_VS, Dword 5.0 &quot;VS Function Enable&quot;:<br>
+    *   [DevSNB] A pipeline flush must be programmed prior to a 3DSTATE_VS<br>
+    *   command that causes the VS Function Enable to toggle. Pipeline<br>
+    *   flush can be executed by sending a PIPE_CONTROL command with CS<br>
+    *   stall bit set and a post sync operation.<br>
+    */<br>
+   intel_emit_post_sync_nonzero_flush(intel);<br>
+<br>
    if (brw-&gt;vs.push_const_size == 0) {<br>
       /* Disable the push constant buffers. */<br>
       BEGIN_BATCH(5);<br>
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c b/src/mesa/drivers/dri/i965/gen6_wm_state.c<br>
index 3669811..205e648 100644<br>
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c<br>
@@ -149,23 +149,6 @@ upload_wm_state(struct brw_context *brw)<br>
    dw4 |= (brw-&gt;wm.prog_data-&gt;first_curbe_grf_16 &lt;&lt;<br>
           GEN6_WM_DISPATCH_START_GRF_SHIFT_2);<br>
<br>
-   switch (brw-&gt;hiz.op) {<br>
-   case BRW_HIZ_OP_NONE:<br>
-      break;<br>
-   case BRW_HIZ_OP_DEPTH_CLEAR:<br>
-      dw4 |= GEN6_WM_DEPTH_CLEAR;<br>
-      break;<br>
-   case BRW_HIZ_OP_DEPTH_RESOLVE:<br>
-      dw4 |= GEN6_WM_DEPTH_RESOLVE;<br>
-      break;<br>
-   case BRW_HIZ_OP_HIZ_RESOLVE:<br>
-      dw4 |= GEN6_WM_HIERARCHICAL_DEPTH_RESOLVE;<br>
-      break;<br>
-   default:<br>
-      assert(0);<br>
-      break;<br>
-   }<br>
-<br>
    dw5 |= (brw-&gt;max_wm_threads - 1) &lt;&lt; GEN6_WM_MAX_THREADS_SHIFT;<br>
<br>
    /* CACHE_NEW_WM_PROG */<br>
@@ -233,8 +216,7 @@ const struct brw_tracked_state gen6_wm_state = {<br>
                _NEW_PROGRAM_CONSTANTS |<br>
                _NEW_POLYGON),<br>
       .brw   = (BRW_NEW_FRAGMENT_PROGRAM |<br>
-               BRW_NEW_BATCH |<br>
-               BRW_NEW_HIZ),<br>
+               BRW_NEW_BATCH),<br>
       .cache = (CACHE_NEW_SAMPLER |<br>
                CACHE_NEW_WM_PROG)<br>
    },<br>
diff --git a/src/mesa/drivers/dri/i965/gen7_clip_state.c b/src/mesa/drivers/dri/i965/gen7_clip_state.c<br>
index 9be3ce9..c32cd98 100644<br>
--- a/src/mesa/drivers/dri/i965/gen7_clip_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen7_clip_state.c<br>
@@ -39,23 +39,6 @@ upload_clip_state(struct brw_context *brw)<br>
    /* BRW_NEW_FRAGMENT_PROGRAM */<br>
    const struct gl_fragment_program *fprog = brw-&gt;fragment_program;<br>
<br>
-   if (brw-&gt;hiz.op) {<br>
-      /* HiZ operations emit a rectangle primitive, which requires clipping to<br>
-       * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1<br>
-       * Section 1.3 3D Primitives Overview:<br>
-       *    RECTLIST:<br>
-       *    Either the CLIP unit should be DISABLED, or the CLIP unit&#39;s Clip<br>
-       *    Mode should be set to a value other than CLIPMODE_NORMAL.<br>
-       */<br>
-      BEGIN_BATCH(4);<br>
-      OUT_BATCH(_3DSTATE_CLIP &lt;&lt; 16 | (4 - 2));<br>
-      OUT_BATCH(0);<br>
-      OUT_BATCH(0);<br>
-      OUT_BATCH(0);<br>
-      ADVANCE_BATCH();<br>
-      return;<br>
-   }<br>
-<br>
    /* _NEW_BUFFERS */<br>
    bool render_to_fbo = brw-&gt;intel.ctx.DrawBuffer-&gt;Name != 0;<br>
<br>
@@ -133,8 +116,7 @@ const struct brw_tracked_state gen7_clip_state = {<br>
                 _NEW_LIGHT |<br>
                 _NEW_TRANSFORM),<br>
       .brw   = (BRW_NEW_CONTEXT |<br>
-                BRW_NEW_FRAGMENT_PROGRAM |<br>
-                BRW_NEW_HIZ),<br>
+                BRW_NEW_FRAGMENT_PROGRAM),<br>
       .cache = 0<br>
    },<br>
    .emit = upload_clip_state,<br>
diff --git a/src/mesa/drivers/dri/i965/gen7_hiz.c b/src/mesa/drivers/dri/i965/gen7_hiz.c<br>
new file mode 100644<br>
index 0000000..fd50df9<br>
--- /dev/null<br>
+++ b/src/mesa/drivers/dri/i965/gen7_hiz.c<br>
@@ -0,0 +1,471 @@<br>
+/*<br>
+ * Copyright © 2011 Intel Corporation<br>
+ *<br>
+ * Permission is hereby granted, free of charge, to any person obtaining a<br>
+ * copy of this software and associated documentation files (the &quot;Software&quot;),<br>
+ * to deal in the Software without restriction, including without limitation<br>
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,<br>
+ * and/or sell copies of the Software, and to permit persons to whom the<br>
+ * Software is furnished to do so, subject to the following conditions:<br>
+ *<br>
+ * The above copyright notice and this permission notice (including the next<br>
+ * paragraph) shall be included in all copies or substantial portions of the<br>
+ * Software.<br>
+ *<br>
+ * THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL<br>
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING<br>
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS<br>
+ * IN THE SOFTWARE.<br>
+ */<br>
+<br>
+#include &lt;assert.h&gt;<br>
+<br>
+#include &quot;intel_batchbuffer.h&quot;<br>
+#include &quot;intel_fbo.h&quot;<br>
+#include &quot;intel_mipmap_tree.h&quot;<br>
+<br>
+#include &quot;brw_context.h&quot;<br>
+#include &quot;brw_defines.h&quot;<br>
+#include &quot;brw_state.h&quot;<br>
+<br>
+#include &quot;gen6_hiz.h&quot;<br>
+#include &quot;gen7_hiz.h&quot;<br>
+<br>
+/**<br>
+ * \copydoc gen6_hiz_exec()<br>
+ */<br>
+static void<br>
+gen7_hiz_exec(struct intel_context *intel,<br>
+              struct intel_mipmap_tree *mt,<br>
+              unsigned int level,<br>
+              unsigned int layer,<br>
+              enum gen6_hiz_op op)<br>
+{<br>
+   struct gl_context *ctx = &amp;intel-&gt;ctx;<br>
+   struct brw_context *brw = brw_context(ctx);<br>
+   struct brw_hiz_state *hiz = &amp;brw-&gt;hiz;<br>
+<br>
+   assert(op != GEN6_HIZ_OP_DEPTH_CLEAR); /* Not implemented yet. */<br>
+   assert(mt-&gt;hiz_mt != NULL);<br>
+   intel_miptree_check_level_layer(mt, level, layer);<br>
+<br>
+   if (hiz-&gt;vertex_bo == NULL)<br>
+      gen6_hiz_init(brw);<br>
+<br>
+   if (hiz-&gt;vertex_bo == NULL) {<br>
+      /* Ouch. Give up. */<br>
+      return;<br>
+   }<br>
+<br>
+   gen6_hiz_emit_batch_head(brw);<br>
+   gen6_hiz_emit_vertices(brw, mt, level, layer);<br>
+<br>
+   /* 3DSTATE_URB_VS<br>
+    * 3DSTATE_URB_HS<br>
+    * 3DSTATE_URB_DS<br>
+    * 3DSTATE_URB_GS<br>
+    *<br>
+    * If the 3DSTATE_URB_VS is emitted, then the others must be also.