[Mesa-dev] [PATCH] [RFC] i965: Rewrite the HiZ op

Chad Versace chad.versace at linux.intel.com
Fri Feb 3 15:58:18 PST 2012


==== RFC notes ====

This patch needs to land in 8.0, so please share your comments. You can
find this on my hiz-kill-meta-op-v7 branch.

The patch is complete for gen6, and everything I test seems to work well.
It introduces no Piglit delta; it fixes the remaining oglc failures in bug
43327; it fixes bug 44927.  I haven't yet tested it with Egypt, Nexuiz,
and Citybench, but I plan to do that this afternoon.

You may notice that /*TODO: gen7*/ is sprinkled about. I'm working on that
now.

What follows is the real commit message.

==== Summary ====

The HiZ op was implemented as a meta-op. This patch reimplements it by
constructing and emitting a special batch. This fixes several known bugs,
and likely a lot of undiscovered ones too.

==== Why the HiZ meta-op needed to die ====

The HiZ op was implemented as a meta-op, which caused lots of trouble. All
other meta-ops occur as a result of some GL call (for example, glClear and
glGenerateMipmap), but the HiZ meta-op was special. It was called in
places where Mesa (in particular, the vbo and swrast modules) neither
expected nor was prepared for state changes to occur (for example:
glDraw; glCallList; within glBegin/End blocks; and within
swrast_prepare_render as a result of intel_miptree_map).

In an attempt to work around these unexpected state changes, I added two
hooks in i965:
  - A hook for glDraw, located in brw_predraw_resolve_buffers (which is
    called in the glDraw path). This hook detected if a predraw resolve
    meta-op had occurred, and would hackishly repropagate some GL state
    if necessary. This ensured that the meta-op state changes would not
    interfere with the vbo module's subsequent execution of glDraw.
  - A hook for glBegin, implemented by brwPrepareExecBegin. This hook
    resolved all buffers before entering a glBegin/End block, thus
    preventing an infinitely recurring call to vbo_exec_FlushVertices.
    The vbo module called vbo_exec_FlushVertices to flush its vertex
    queue in response to GL state changes.
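
To illustrate the re-entrancy that the glBegin hook guarded against, here
is a toy model in C. It is not Mesa code: toy_ctx, toy_set_state,
toy_resolve, toy_flush_vertices, and toy_max_flush_depth are hypothetical
names. The model assumes only the two rules described above: a state
change flushes queued vertices, and a flush may trigger a resolve.

```c
#include <assert.h>

/* Toy model only; every name here is hypothetical, not a Mesa symbol. */
struct toy_ctx {
   int queued_vertices;       /* vertices queued inside a Begin/End block */
   int needs_resolve;         /* a buffer resolve is pending */
   int resolve_changes_state; /* 1: meta-op style, 0: private batch style */
   int flush_depth;           /* current nesting of toy_flush_vertices() */
   int max_flush_depth;       /* deepest nesting observed */
};

static void toy_flush_vertices(struct toy_ctx *ctx);

/* Rule 1: any state change must first flush queued vertices. */
static void toy_set_state(struct toy_ctx *ctx)
{
   if (ctx->queued_vertices > 0)
      toy_flush_vertices(ctx);
}

/* A resolve implemented as a meta-op touches state; a private batch
 * does not. */
static void toy_resolve(struct toy_ctx *ctx)
{
   ctx->needs_resolve = 0;
   if (ctx->resolve_changes_state)
      toy_set_state(ctx);
}

/* Rule 2: drawing the queued vertices may require a resolve first. */
static void toy_flush_vertices(struct toy_ctx *ctx)
{
   ctx->flush_depth++;
   assert(ctx->flush_depth < 8); /* safety net for the toy model */
   if (ctx->flush_depth > ctx->max_flush_depth)
      ctx->max_flush_depth = ctx->flush_depth;

   if (ctx->needs_resolve)
      toy_resolve(ctx);

   ctx->queued_vertices = 0;
   ctx->flush_depth--;
}

/* Run one flush and report how deeply it nested. */
static int toy_max_flush_depth(int resolve_changes_state)
{
   struct toy_ctx ctx = {0};
   ctx.queued_vertices = 3;
   ctx.needs_resolve = 1;
   ctx.resolve_changes_state = resolve_changes_state;
   toy_flush_vertices(&ctx);
   return ctx.max_flush_depth;
}
```

In this model the meta-op style resolve re-enters the flush once, while
the private batch style never does. The toy model stops after one level
only because it clears needs_resolve before changing state; the recursion
in the real vbo module had no such bound.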

Unfortunately, these hooks were not sufficient. The meta-op state changes
still interacted badly with glPopAttrib (as discovered in bug 44927) and
with swrast rendering (as discovered by debugging gen6's swrast fallback
for glBitmap). I expect there are more undiscovered bugs. Rather than play
whack-a-mole in a minefield, the sane approach is to replace the HiZ
meta-op with something safer.

==== How it was killed ====

This patch consists of several logical components:
  1. Rewrite the HiZ op by replacing gen6_resolve_slice with
     gen6_hiz_exec. The new function does not call a meta-op, but instead
     manually constructs and emits a batch to "draw" the HiZ op's
     rectangle primitive. The new function alters no GL state.
  2. Add fields to brw_context::hiz for the new HiZ op.
  3. Rewrite gen6_hiz_init, and invoke it in the eglMakeCurrent codepath.
  4. Kill all dead HiZ code:
     - the function gen6_resolve_slice
     - the dirty flag BRW_NEW_HIZ
     - the dead fields in brw_context::hiz
     - the state packet manipulation triggered by the now removed
       brw_context::hiz::op
     - the meta-op workaround in brw_predraw_resolve_buffers (discussed above)
     - the meta-op workaround brwPrepareExecBegin (discussed above)
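
As a rough sketch of the rectangle primitive mentioned in item 1, the
following standalone C shows how the three rectlist vertices for a
miptree slice can be laid out. The helper names (hiz_rect_vertices,
hiz_rect_vbo_size) and the HIZ_* macros are hypothetical and exist only
for this illustration; the layout itself follows the comments in the
patch (screen space, (0, 0) at the upper left, fourth corner implied).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of the patch. */
#define HIZ_NUM_VERTICES 3
#define HIZ_COMPONENTS_PER_VERTEX 2

/*
 * Fill `out` with the rectlist vertices for a width x height slice.
 *
 *   v2 ------ implied
 *    |        |
 *    |        |
 *   v0 ----- v1
 */
static void
hiz_rect_vertices(int width, int height,
                  float out[HIZ_NUM_VERTICES * HIZ_COMPONENTS_PER_VERTEX])
{
   assert(width > 0 && height > 0);

   out[0] = 0.0f;          out[1] = (float) height; /* v0: lower left  */
   out[2] = (float) width; out[3] = (float) height; /* v1: lower right */
   out[4] = 0.0f;          out[5] = 0.0f;           /* v2: upper left  */
}

/* Size in bytes of the vertex data: 3 vertices * 2 floats each. */
static size_t
hiz_rect_vbo_size(void)
{
   return HIZ_NUM_VERTICES * HIZ_COMPONENTS_PER_VERTEX * sizeof(float);
}
```

For a 64x32 slice this yields (0, 32), (64, 32), (0, 0). With 4-byte
floats the data occupies 24 bytes, matching the size noted for
brw_context::hiz::vertex_bo.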

TODO: write gen7 variant for gen6_hiz_exec
TODO: test egypt
TODO: test nexuiz
TODO: test citybench
Note: This is a candidate for the 8.0 branch.
CC: Eric Anholt <eric at anholt.net>
CC: Kenneth Graunke <kenneth at whitecape.org>
CC: Paul Berry <stereotype441 at gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43327
Reported-by: xunx.fang at intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44927
Reported-by: chao.a.chen at intel.com
Signed-off-by: Chad Versace <chad.versace at linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_context.c       |   55 --
 src/mesa/drivers/dri/i965/brw_context.h       |   75 ++-
 src/mesa/drivers/dri/i965/brw_draw.c          |   47 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c  |    1 -
 src/mesa/drivers/dri/i965/brw_vtbl.c          |    4 +
 src/mesa/drivers/dri/i965/gen6_clip_state.c   |   20 +-
 src/mesa/drivers/dri/i965/gen6_depthstencil.c |    9 +-
 src/mesa/drivers/dri/i965/gen6_hiz.c          | 1070 ++++++++++++++++++-------
 src/mesa/drivers/dri/i965/gen6_hiz.h          |    8 +
 src/mesa/drivers/dri/i965/gen6_sf_state.c     |   16 +-
 src/mesa/drivers/dri/i965/gen6_wm_state.c     |   20 +-
 src/mesa/drivers/dri/i965/gen7_clip_state.c   |   20 +-
 src/mesa/drivers/dri/i965/gen7_sf_state.c     |   19 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c     |   18 -
 src/mesa/drivers/dri/intel/intel_context.c    |    5 +
 src/mesa/drivers/dri/intel/intel_context.h    |    4 +
 16 files changed, 882 insertions(+), 509 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 1ab6310..65de260 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -41,8 +41,6 @@
 #include "brw_draw.h"
 #include "brw_state.h"
 
-#include "gen6_hiz.h"
-
 #include "intel_fbo.h"
 #include "intel_mipmap_tree.h"
 #include "intel_regions.h"
@@ -57,58 +55,6 @@
  * Mesa's Driver Functions
  ***************************************/
 
-/**
- * \brief Prepare for entry into glBegin/glEnd block.
- *
- * Resolve buffers before entering a glBegin/glEnd block. This is
- * necessary to prevent recursive calls to FLUSH_VERTICES.
- *
- * This resolves the depth buffer of each enabled depth texture and the HiZ
- * buffer of the attached depth renderbuffer.
- *
- * Details
- * -------
- * When vertices are queued during a glBegin/glEnd block, those vertices must
- * be drawn before any rendering state changes. To ensure this, Mesa calls
- * FLUSH_VERTICES as a prehook to such state changes. Therefore,
- * FLUSH_VERTICES itself cannot change rendering state without falling into a
- * recursive trap.
- *
- * This precludes meta-ops, namely buffer resolves, from occurring while any
- * vertices are queued. To prevent that situation, we resolve some buffers on
- * entering a glBegin/glEnd
- *
- * \see brwCleanupExecEnd()
- */
-static void brwPrepareExecBegin(struct gl_context *ctx)
-{
-   struct brw_context *brw = brw_context(ctx);
-   struct intel_context *intel = &brw->intel;
-   struct intel_renderbuffer *draw_irb;
-   struct intel_texture_object *tex_obj;
-
-   if (!intel->has_hiz) {
-      /* The context uses no feature that requires buffer resolves. */
-      return;
-   }
-
-   /* Resolve each enabled texture. */
-   for (int i = 0; i < ctx->Const.MaxTextureImageUnits; i++) {
-      if (!ctx->Texture.Unit[i]._ReallyEnabled)
-	 continue;
-      tex_obj = intel_texture_object(ctx->Texture.Unit[i]._Current);
-      if (!tex_obj || !tex_obj->mt)
-	 continue;
-      intel_miptree_all_slices_resolve_depth(intel, tex_obj->mt);
-   }
-
-   /* Resolve the attached depth buffer. */
-   draw_irb = intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
-   if (draw_irb) {
-      intel_renderbuffer_resolve_hiz(intel, draw_irb);
-   }
-}
-
 static void brwInitDriverFunctions(struct intel_screen *screen,
 				   struct dd_function_table *functions)
 {
@@ -117,7 +63,6 @@ static void brwInitDriverFunctions(struct intel_screen *screen,
    brwInitFragProgFuncs( functions );
    brw_init_queryobj_functions(functions);
 
-   functions->PrepareExecBegin = brwPrepareExecBegin;
    functions->BeginTransformFeedback = brw_begin_transform_feedback;
 
    if (screen->gen >= 7)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index c027bef..981822c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -119,6 +119,10 @@
 #define BRW_MAX_CURBE                    (32*16)
 
 struct brw_context;
+struct brw_instruction;
+struct brw_vs_prog_key;
+struct brw_wm_prog_key;
+struct brw_wm_prog_data;
 
 enum brw_state_id {
    BRW_STATE_URB_FENCE,
@@ -144,7 +148,6 @@ enum brw_state_id {
    BRW_STATE_VS_CONSTBUF,
    BRW_STATE_PROGRAM_CACHE,
    BRW_STATE_STATE_BASE_ADDRESS,
-   BRW_STATE_HIZ,
    BRW_STATE_SOL_INDICES,
 };
 
@@ -174,7 +177,6 @@ enum brw_state_id {
 #define BRW_NEW_VS_CONSTBUF            (1 << BRW_STATE_VS_CONSTBUF)
 #define BRW_NEW_PROGRAM_CACHE		(1 << BRW_STATE_PROGRAM_CACHE)
 #define BRW_NEW_STATE_BASE_ADDRESS	(1 << BRW_STATE_STATE_BASE_ADDRESS)
-#define BRW_NEW_HIZ			(1 << BRW_STATE_HIZ)
 #define BRW_NEW_SOL_INDICES		(1 << BRW_STATE_SOL_INDICES)
 
 struct brw_state_flags {
@@ -950,38 +952,57 @@ struct brw_context
    int state_batch_count;
 
    /**
-    * \brief State needed to execute HiZ meta-ops
+    * \brief State needed to execute HiZ ops.
     *
-    * All fields except \c op are initialized by gen6_hiz_init().
+    * All fields are initialized by gen6_hiz_init().
     */
    struct brw_hiz_state {
-      /**
-       * \brief Indicates which HiZ operation is in progress.
+
+      /** \brief ralloc context for all brw_hiz_state fields
        *
-       * See the following sections of the Sandy Bridge PRM, Volume 1, Part2:
-       *   - 7.5.3.1 Depth Buffer Clear
-       *   - 7.5.3.2 Depth Buffer Resolve
-       *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
+       * This is initialized by gen6_hiz_init() and is destroyed at GL context
+       * destruction, brwDestroyContext().
        */
-      enum brw_hiz_op {
-	 BRW_HIZ_OP_NONE = 0,
-	 BRW_HIZ_OP_DEPTH_CLEAR,
-	 BRW_HIZ_OP_DEPTH_RESOLVE,
-	 BRW_HIZ_OP_HIZ_RESOLVE,
-      } op;
-
-      /** \brief Shader state */
-      struct {
-	 GLuint program;
-	 GLuint position_vbo;
-	 GLint position_location;
-      } shader;
+      void *mem_ctx;
 
-      /** \brief VAO for the rectangle primitive's vertices. */
-      GLuint vao;
+      /** \brief VBO for rectlist vertices.
+       *
+       * The HiZ op emits a rectangle primitive (3DPRIM_RECTLIST), which
+       * consists of only three vertices. The vertices reside in screen space
+       * with DirectX coordinates (that is, (0, 0) is the upper left corner).
+       * The total size of the bo is 24 bytes (3 vertices * 2 components *
+       * sizeof(float)).
+       *
+       *   v2 ------ implied
+       *    |        |
+       *    |        |
+       *   v0 ----- v1
+       *
+       * Rather than using glGenBuffers(), we allocate the VBO directly
+       * through drm.
+       *
+       * \see gen6_hiz_exec()
+       */
+      drm_intel_bo *vertex_bo;
 
-      GLuint fbo;
-      struct gl_renderbuffer *depth_rb;
+      /**
+       * \name HiZ shader programs
+       * \{
+       *
+       * These programs execute the "drawing" of the HiZ op's rectangle
+       * primitive.  They are compiled in gen6_hiz_init() and are not changed
+       * afterwards.
+       */
+      struct brw_vs_prog_key *vs_key;
+      struct brw_vs_prog_data *vs_data;
+      const void *vs_instr;
+      uint32_t vs_size;
+
+      struct brw_wm_prog_key *wm_key;
+      struct brw_wm_prog_data *wm_data;
+      const void *wm_instr;
+      uint32_t wm_size;
+      /** \} */
    } hiz;
 
    struct brw_sol_state {
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index f50fffd..e919f3e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -126,12 +126,7 @@ static void gen6_set_prim(struct brw_context *brw,
 
    DBG("PRIM: %s\n", _mesa_lookup_enum_by_nr(prim->mode));
 
-   if (brw->hiz.op) {
-      assert(prim->mode == GL_TRIANGLES);
-      hw_prim = _3DPRIM_RECTLIST;
-   } else {
-      hw_prim = prim_to_hw_prim[prim->mode];
-   }
+   hw_prim = prim_to_hw_prim[prim->mode];
 
    if (hw_prim != brw->primitive) {
       brw->primitive = hw_prim;
@@ -307,17 +302,11 @@ brw_predraw_resolve_buffers(struct brw_context *brw)
    struct intel_context *intel = &brw->intel;
    struct intel_renderbuffer *depth_irb;
    struct intel_texture_object *tex_obj;
-   bool did_resolve = false;
-
-   /* Avoid recursive HiZ op. */
-   if (brw->hiz.op) {
-      return;
-   }
 
    /* Resolve the depth buffer's HiZ buffer. */
    depth_irb = intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
    if (depth_irb && depth_irb->mt) {
-      did_resolve |= intel_renderbuffer_resolve_hiz(intel, depth_irb);
+      intel_renderbuffer_resolve_hiz(intel, depth_irb);
    }
 
    /* Resolve depth buffer of each enabled depth texture. */
@@ -327,33 +316,7 @@ brw_predraw_resolve_buffers(struct brw_context *brw)
       tex_obj = intel_texture_object(ctx->Texture.Unit[i]._Current);
       if (!tex_obj || !tex_obj->mt)
 	 continue;
-      did_resolve |= intel_miptree_all_slices_resolve_depth(intel, tex_obj->mt);
-   }
-
-   if (did_resolve) {
-      /* Call vbo_bind_array() to synchronize the vbo module's vertex
-       * attributes to the gl_context's.
-       *
-       * Details
-       * -------
-       * The vbo module tracks vertex attributes separately from the
-       * gl_context.  Specifically, the vbo module maintins vertex attributes
-       * in vbo_exec_context::array::inputs, which is synchronized with
-       * gl_context::Array::ArrayObj::VertexAttrib by vbo_bind_array().
-       * vbo_draw_arrays() calls vbo_bind_array() to perform the
-       * synchronization before calling the real draw call,
-       * vbo_context::draw_arrays.
-       *
-       * At this point (after performing a resolve meta-op but before calling
-       * vbo_bind_array), the gl_context's vertex attributes have been
-       * restored to their original state (that is, their state before the
-       * meta-op began), but the vbo module's vertex attribute are those used
-       * in the last meta-op. Therefore we must manually synchronize the two with
-       * vbo_bind_array() before continuing with the original draw command.
-       */
-      _mesa_update_state(ctx);
-      vbo_bind_arrays(ctx);
-      _mesa_update_state(ctx);
+      intel_miptree_all_slices_resolve_depth(intel, tex_obj->mt);
    }
 }
 
@@ -372,9 +335,7 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
    struct intel_renderbuffer *depth_irb =
 	 intel_get_renderbuffer(fb, BUFFER_DEPTH);
 
-   if (depth_irb &&
-       ctx->Depth.Mask &&
-       !brw->hiz.op) {
+   if (depth_irb && ctx->Depth.Mask) {
       intel_renderbuffer_set_needs_depth_resolve(depth_irb);
    }
 }
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
index d071f87..f5e6fdc 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -372,7 +372,6 @@ static struct dirty_bit_map brw_bits[] = {
    DEFINE_BIT(BRW_NEW_GS_BINDING_TABLE),
    DEFINE_BIT(BRW_NEW_PS_BINDING_TABLE),
    DEFINE_BIT(BRW_NEW_STATE_BASE_ADDRESS),
-   DEFINE_BIT(BRW_NEW_HIZ),
    {0, 0, 0}
 };
 
diff --git a/src/mesa/drivers/dri/i965/brw_vtbl.c b/src/mesa/drivers/dri/i965/brw_vtbl.c
index be975d1..87d55db 100644
--- a/src/mesa/drivers/dri/i965/brw_vtbl.c
+++ b/src/mesa/drivers/dri/i965/brw_vtbl.c
@@ -70,9 +70,12 @@ static void brw_destroy_context( struct intel_context *intel )
 
    brw_destroy_state(brw);
    brw_draw_destroy( brw );
+
+   ralloc_free(brw->hiz.mem_ctx);
    ralloc_free(brw->wm.compile_data);
 
    dri_bo_release(&brw->curbe.curbe_bo);
+   dri_bo_release(&brw->hiz.vertex_bo);
    dri_bo_release(&brw->vs.const_bo);
    dri_bo_release(&brw->wm.const_bo);
 
@@ -236,6 +239,7 @@ void brwInitVtbl( struct brw_context *brw )
    brw->intel.vtbl.is_hiz_depth_format = brw_is_hiz_depth_format;
 
    if (brw->intel.has_hiz) {
+      brw->intel.vtbl.hiz_init = gen6_hiz_init;
       brw->intel.vtbl.resolve_depth_slice = gen6_resolve_depth_slice;
       brw->intel.vtbl.resolve_hiz_slice = gen6_resolve_hiz_slice;
    }
diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c b/src/mesa/drivers/dri/i965/gen6_clip_state.c
index d2a5f75..b3bb8ae 100644
--- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
@@ -67,23 +67,6 @@ upload_clip_state(struct brw_context *brw)
          GEN6_CLIP_NON_PERSPECTIVE_BARYCENTRIC_ENABLE;
    }
 
-   if (brw->hiz.op) {
-      /* HiZ operations emit a rectangle primitive, which requires clipping to
-       * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1
-       * Section 1.3 3D Primitives Overview:
-       *    RECTLIST:
-       *    Either the CLIP unit should be DISABLED, or the CLIP unit's Clip
-       *    Mode should be set to a value other than CLIPMODE_NORMAL.
-       */
-      BEGIN_BATCH(4);
-      OUT_BATCH(_3DSTATE_CLIP << 16 | (4 - 2));
-      OUT_BATCH(0);
-      OUT_BATCH(0);
-      OUT_BATCH(0);
-      ADVANCE_BATCH();
-      return;
-   }
-
    if (!ctx->Transform.DepthClamp)
       depth_clamp = GEN6_CLIP_Z_TEST;
 
@@ -124,8 +107,7 @@ const struct brw_tracked_state gen6_clip_state = {
    .dirty = {
       .mesa  = _NEW_TRANSFORM | _NEW_LIGHT,
       .brw   = (BRW_NEW_CONTEXT |
-                BRW_NEW_FRAGMENT_PROGRAM |
-                BRW_NEW_HIZ),
+                BRW_NEW_FRAGMENT_PROGRAM),
       .cache = 0
    },
    .emit = upload_clip_state,
diff --git a/src/mesa/drivers/dri/i965/gen6_depthstencil.c b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
index d9f686a..4ea517f 100644
--- a/src/mesa/drivers/dri/i965/gen6_depthstencil.c
+++ b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
@@ -82,11 +82,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)
    }
 
    /* _NEW_DEPTH */
-   if ((ctx->Depth.Test || brw->hiz.op) && depth_irb) {
-      assert(brw->hiz.op != BRW_HIZ_OP_DEPTH_RESOLVE || ctx->Depth.Test);
-      assert(brw->hiz.op != BRW_HIZ_OP_HIZ_RESOLVE   || !ctx->Depth.Test);
-      assert(brw->hiz.op != BRW_HIZ_OP_DEPTH_CLEAR   || !ctx->Depth.Test);
-
+   if (ctx->Depth.Test && depth_irb) {
       ds->ds2.depth_test_enable = ctx->Depth.Test;
       ds->ds2.depth_test_func = intel_translate_compare_func(ctx->Depth.Func);
       ds->ds2.depth_write_enable = ctx->Depth.Mask;
@@ -98,8 +94,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)
 const struct brw_tracked_state gen6_depth_stencil_state = {
    .dirty = {
       .mesa = _NEW_DEPTH | _NEW_STENCIL | _NEW_BUFFERS,
-      .brw  = (BRW_NEW_BATCH |
-	       BRW_NEW_HIZ),
+      .brw  = BRW_NEW_BATCH,
       .cache = 0,
    },
    .emit = gen6_upload_depth_stencil_state,
diff --git a/src/mesa/drivers/dri/i965/gen6_hiz.c b/src/mesa/drivers/dri/i965/gen6_hiz.c
index d7698ed..b716ede 100644
--- a/src/mesa/drivers/dri/i965/gen6_hiz.c
+++ b/src/mesa/drivers/dri/i965/gen6_hiz.c
@@ -25,324 +25,848 @@
 
 #include <assert.h>
 
-#include "mesa/drivers/common/meta.h"
-
-#include "mesa/main/arrayobj.h"
-#include "mesa/main/bufferobj.h"
-#include "mesa/main/depth.h"
-#include "mesa/main/enable.h"
-#include "mesa/main/fbobject.h"
-#include "mesa/main/framebuffer.h"
-#include "mesa/main/get.h"
-#include "mesa/main/renderbuffer.h"
+#include "glsl/ralloc.h"
+
 #include "mesa/main/shaderapi.h"
-#include "mesa/main/varray.h"
+#include "mesa/main/shaderobj.h"
 
+#include "intel_batchbuffer.h"
 #include "intel_fbo.h"
 #include "intel_mipmap_tree.h"
-#include "intel_regions.h"
-#include "intel_tex.h"
 
 #include "brw_context.h"
 #include "brw_defines.h"
+#include "brw_state.h"
+#include "brw_vs.h"
+#include "brw_wm.h"
 
-static const uint32_t gen6_hiz_meta_save =
-
-      /* Disable alpha, depth, and stencil test.
-       *
-       * See the following sections of the Sandy Bridge PRM, Volume 1, Part2:
-       *   - 7.5.3.1 Depth Buffer Clear
-       *   - 7.5.3.2 Depth Buffer Resolve
-       *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
-       */
-      MESA_META_ALPHA_TEST |
-      MESA_META_DEPTH_TEST |
-      MESA_META_STENCIL_TEST |
-
-      /* Disable viewport mapping.
-       *
-       * From page 11 of the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3
-       * 3D Primitives Overview:
-       *    RECTLIST:
-       *    Viewport Mapping must be DISABLED (as is typical with the use of
-       *    screen- space coordinates).
-       *
-       * We must also manually disable 3DSTATE_SF.Viewport_Transform_Enable.
-       */
-      MESA_META_VIEWPORT |
-
-      /* Disable clipping.
-       *
-       * From page 11 of the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3
-       * 3D Primitives Overview:
-       *     Either the CLIP unit should be DISABLED, or the CLIP unit’s Clip
-       *     Mode should be set to a value other than CLIPMODE_NORMAL.
-       */
-      MESA_META_CLIP |
-
-      /* Render a solid rectangle (set 3DSTATE_SF.FrontFace_Fill_Mode).
-       *
-       * From page 249 of the Sandy Bridge PRM, Volume 2, Part 1, Section
-       * 6.4.1.1 3DSTATE_SF, FrontFace_Fill_Mode:
-       *     SOLID: Any triangle or rectangle object found to be front-facing
-       *     is rendered as a solid object. This setting is required when
-       *     (rendering rectangle (RECTLIST) objects.
-       * Also see field BackFace_Fill_Mode.
-       *
-       * Note: MESA_META_RASTERIZAION also disables culling, but that is
-       * irrelevant. See 3DSTATE_SF.Cull_Mode.
-       */
-      MESA_META_RASTERIZATION |
-
-      /* Each HiZ operation uses a vertex shader and VAO. */
-      MESA_META_SHADER |
-      MESA_META_VERTEX |
-
-      /* Disable scissoring.
-       *
-       * Scissoring is disabled for resolves because a resolve operation
-       * should resolve the entire buffer. Scissoring is disabled for depth
-       * clears because, if we are performing a partial depth clear, then we
-       * specify the clear region with the RECTLIST vertices.
-       */
-      MESA_META_SCISSOR |
-
-      MESA_META_SELECT_FEEDBACK;
+/**
+ * For an overview of the HiZ operations, see the following sections of the
+ * Sandy Bridge PRM, Volume 1, Part2:
+ *   - 7.5.3.1 Depth Buffer Clear
+ *   - 7.5.3.2 Depth Buffer Resolve
+ *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
+ */
+enum gen6_hiz_op {
+   GEN6_HIZ_OP_DEPTH_CLEAR,
+   GEN6_HIZ_OP_DEPTH_RESOLVE,
+   GEN6_HIZ_OP_HIZ_RESOLVE,
+};
 
-static void
-gen6_hiz_get_framebuffer_enum(struct gl_context *ctx,
-                              GLenum *bind_enum,
-                              GLenum *get_enum)
-{
-   if (ctx->Extensions.EXT_framebuffer_blit && ctx->API == API_OPENGL) {
-      /* Different buffers may be bound to GL_DRAW_FRAMEBUFFER and
-       * GL_READ_FRAMEBUFFER. Take care to not disrupt the read buffer.
-       */
-      *bind_enum = GL_DRAW_FRAMEBUFFER;
-      *get_enum = GL_DRAW_FRAMEBUFFER_BINDING;
-   } else {
-      /* The enums GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER do not exist.
-       * The bound framebuffer is both the read and draw buffer.
-       */
-      *bind_enum = GL_FRAMEBUFFER;
-      *get_enum = GL_FRAMEBUFFER_BINDING;
-   }
-}
+/**
+ * \name Constants for HiZ VBO
+ * \{
+ *
+ * \see brw_context::hiz::vertex_bo
+ */
+#define GEN6_HIZ_VERTEX_SIZE 8
+#define GEN6_HIZ_NUM_VERTICES 3
+#define GEN6_HIZ_VBO_SIZE (GEN6_HIZ_NUM_VERTICES * GEN6_HIZ_VERTEX_SIZE)
+/** \} */
 
 /**
- * Initialize static data needed for HiZ operations.
+ * \brief Initialize the state needed to execute HiZ ops.
  */
-static void
-gen6_hiz_init(struct brw_context *brw)
+void
+gen6_hiz_init(struct intel_context *intel)
 {
-   struct gl_context *ctx = &brw->intel.ctx;
+   struct gl_context *ctx = &intel->ctx;
+   struct brw_context *brw = brw_context(ctx);
    struct brw_hiz_state *hiz = &brw->hiz;
-   GLenum fb_bind_enum, fb_get_enum;
 
-   if (hiz->fbo != 0)
+   bool ok;
+
+   if (hiz->mem_ctx != NULL) {
+      /* Already initialized. */
       return;
+   }
 
-   gen6_hiz_get_framebuffer_enum(ctx, &fb_bind_enum, &fb_get_enum);
+   hiz->mem_ctx = ralloc_context(NULL);
+   if (!hiz->mem_ctx)
+      goto out_of_memory;
+
+   /* Allocate vertex_bo. */
+   {
+      hiz->vertex_bo = drm_intel_bo_alloc(intel->bufmgr,
+                                          "bufferobj",
+                                          GEN6_HIZ_VBO_SIZE,
+                                          GEN6_HIZ_VERTEX_SIZE);
+      if (!hiz->vertex_bo)
+         goto out_of_memory;
+      drm_intel_bo_reference(hiz->vertex_bo);
+   }
 
-   /* Create depthbuffer.
-    *
-    * Until glRenderbufferStorage is called, the renderbuffer hash table
-    * maps the renderbuffer name to a dummy renderbuffer. We need the
-    * renderbuffer to be registered in the hash table so that framebuffer
-    * validation succeeds, so we hackishly allocate storage then immediately
-    * discard it.
-    */
-   GLuint depth_rb_name;
-   _mesa_GenRenderbuffersEXT(1, &depth_rb_name);
-   _mesa_BindRenderbufferEXT(GL_RENDERBUFFER, depth_rb_name);
-   _mesa_RenderbufferStorageEXT(GL_RENDERBUFFER, GL_DEPTH_COMPONENT, 32, 32);
-   _mesa_reference_renderbuffer(&hiz->depth_rb,
-                                _mesa_lookup_renderbuffer(ctx, depth_rb_name));
-   intel_miptree_release(&((struct intel_renderbuffer*) hiz->depth_rb)->mt);
-
-   /* Setup FBO. */
-   _mesa_GenFramebuffersEXT(1, &hiz->fbo);
-   _mesa_BindFramebufferEXT(fb_bind_enum, hiz->fbo);
-   _mesa_FramebufferRenderbufferEXT(fb_bind_enum,
-                                    GL_DEPTH_ATTACHMENT,
-                                    GL_RENDERBUFFER,
-                                    hiz->depth_rb->Name);
-
-   /* Compile vertex shader. */
-   const char *vs_source =
-      "attribute vec4 position;\n"
-      "void main()\n"
-      "{\n"
-      "   gl_Position = position;\n"
-      "}\n";
-   GLuint vs = _mesa_CreateShaderObjectARB(GL_VERTEX_SHADER);
-   _mesa_ShaderSourceARB(vs, 1, &vs_source, NULL);
-   _mesa_CompileShaderARB(vs);
-
-   /* Compile fragment shader. */
-   const char *fs_source = "void main() {}";
-   GLuint fs = _mesa_CreateShaderObjectARB(GL_FRAGMENT_SHADER);
-   _mesa_ShaderSourceARB(fs, 1, &fs_source, NULL);
-   _mesa_CompileShaderARB(fs);
-
-   /* Link and use program. */
-   hiz->shader.program = _mesa_CreateProgramObjectARB();
-   _mesa_AttachShader(hiz->shader.program, vs);
-   _mesa_AttachShader(hiz->shader.program, fs);
-   _mesa_LinkProgramARB(hiz->shader.program);
-   _mesa_UseProgramObjectARB(hiz->shader.program);
-
-   /* Create and bind VAO. */
-   _mesa_GenVertexArrays(1, &hiz->vao);
-   _mesa_BindVertexArray(hiz->vao);
-
-   /* Setup VBO for 'position'. */
-   hiz->shader.position_location =
-      _mesa_GetAttribLocationARB(hiz->shader.program, "position");
-   _mesa_GenBuffersARB(1, &hiz->shader.position_vbo);
-   _mesa_BindBufferARB(GL_ARRAY_BUFFER_ARB, hiz->shader.position_vbo);
-   _mesa_VertexAttribPointerARB(hiz->shader.position_location,
-				2, /*components*/
-				GL_FLOAT,
-				GL_FALSE, /*normalized?*/
-				0, /*stride*/
-				NULL);
-   _mesa_EnableVertexAttribArrayARB(hiz->shader.position_location);
-
-   /* Cleanup. */
-   _mesa_DeleteShader(vs);
-   _mesa_DeleteShader(fs);
-}
+   /* Link GLSL program. */
+   struct gl_shader_program *gl_shader_program;
+   {
+      /* Compile vertex shader. */
+      const char *vs_source =
+         "attribute vec4 position;\n"
+         "void main()\n"
+         "{\n"
+         "   gl_Position = position;\n"
+         "}\n";
+      GLuint vs = _mesa_CreateShaderObjectARB(GL_VERTEX_SHADER);
+      _mesa_ShaderSourceARB(vs, 1, &vs_source, NULL);
+      _mesa_CompileShaderARB(vs);
+
+      /* Compile fragment shader. */
+      const char *fs_source = "void main() {}";
+      GLuint fs = _mesa_CreateShaderObjectARB(GL_FRAGMENT_SHADER);
+      _mesa_ShaderSourceARB(fs, 1, &fs_source, NULL);
+      _mesa_CompileShaderARB(fs);
+
+      /* Link program. */
+      GLuint program = _mesa_CreateProgramObjectARB();
+      _mesa_AttachShader(program, vs);
+      _mesa_AttachShader(program, fs);
+      _mesa_LinkProgramARB(program);
+
+      GLint link_status;
+      _mesa_GetProgramiv(program, GL_LINK_STATUS, &link_status);
+      if (!link_status)
+         goto program_failure;
+
+      gl_shader_program = _mesa_lookup_shader_program(ctx, program);
+
+      /* Clean up. */
+      _mesa_DeleteShader(vs);
+      _mesa_DeleteShader(fs);
+   }
 
-/**
- * Wrap \c brw->hiz.depth_rb around a miptree.
- *
- * \see gen6_hiz_teardown_depth_buffer()
- */
-static void
-gen6_hiz_setup_depth_buffer(struct brw_context *brw,
-			    struct intel_mipmap_tree *mt,
-			    unsigned int level,
-			    unsigned int layer)
-{
-   struct gl_renderbuffer *rb = brw->hiz.depth_rb;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+   /* Compile brw vertex program. */
+   {
+      struct brw_vs_compile c;
+      memset(&c, 0, sizeof(c));
+
+      struct brw_vertex_program *vp =
+         (struct brw_vertex_program*)
+         gl_shader_program->_LinkedShaders[MESA_SHADER_VERTEX]->Program;
+
+      brw_init_compile(brw, &c.func, hiz->mem_ctx);
+      c.vp = vp;
+      /* FIXME: The two fields below should be set by brw_vs_emit(). */
+      c.prog_data.outputs_written = vp->program.Base.OutputsWritten;
+      c.prog_data.inputs_read = vp->program.Base.InputsRead;
+      ok = brw_vs_emit(gl_shader_program, &c);
+      if (!ok)
+         goto program_failure;
+      hiz->vs_instr = brw_get_program(&c.func, &hiz->vs_size);
+
+      hiz->vs_key = rzalloc_size(hiz->mem_ctx, sizeof(*hiz->vs_key));
+      if (!hiz->vs_key)
+         goto out_of_memory;
+      hiz->vs_key->program_string_id = vp->id;
+
+      hiz->vs_data = ralloc_size(hiz->mem_ctx, sizeof(*hiz->vs_data));
+      if (!hiz->vs_data)
+         goto out_of_memory;
+      memcpy(hiz->vs_data, &c.prog_data, sizeof(*hiz->vs_data));
+   }
 
-   rb->Format = mt->format;
-   rb->_BaseFormat = _mesa_get_format_base_format(rb->Format);
-   rb->InternalFormat = rb->_BaseFormat;
-   rb->Width = mt->level[level].width;
-   rb->Height = mt->level[level].height;
+   /* Compile brw fragment program. */
+   {
+      struct brw_wm_compile c;
+      memset(&c, 0, sizeof(c));
+
+      struct brw_fragment_program *fp =
+         (struct brw_fragment_program*)
+         gl_shader_program->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program;
+
+      c.fp = fp;
+      brw_init_compile(brw, &c.func, hiz->mem_ctx);
+      ok = brw_wm_fs_emit(brw, &c, gl_shader_program);
+      if (!ok)
+         goto program_failure;
+      hiz->wm_instr = brw_get_program(&c.func, &hiz->wm_size);
+
+      hiz->wm_key = rzalloc_size(hiz->mem_ctx, sizeof(*hiz->wm_key));
+      if (!hiz->wm_key)
+         goto out_of_memory;
+      hiz->wm_key->program_string_id = fp->id;
+
+      hiz->wm_data = ralloc_size(hiz->mem_ctx, sizeof(*hiz->wm_data));
+      if (!hiz->wm_data)
+         goto out_of_memory;
+      memcpy(hiz->wm_data, &c.prog_data, sizeof(*hiz->wm_data));
+   }
+
+   return;
+
+out_of_memory:
+   _mesa_error(ctx, GL_OUT_OF_MEMORY, "allocation failed");
+   goto clean_up;
 
-   irb->mt_level = level;
-   irb->mt_layer = layer;
+program_failure:
+   _mesa_problem(ctx, "%s: failed to create HiZ program", __FILE__);
+   goto clean_up;
 
-   intel_miptree_reference(&irb->mt, mt);
-   intel_renderbuffer_set_draw_offset(irb);
+clean_up:
+   if (hiz->mem_ctx) {
+      ralloc_free(hiz->mem_ctx);
+      hiz->mem_ctx = NULL;
+   }
 }
 
 /**
- * Release the region from \c brw->hiz.depth_rb.
+ * \brief Execute a HiZ op on a miptree slice.
+ *
+ * To execute the HiZ op, this function manually constructs and emits a batch
+ * to "draw" the HiZ op's rectangle primitive. The batchbuffer is flushed
+ * before constructing and after emitting the batch.
  *
- * \see gen6_hiz_setup_depth_buffer()
+ * This function alters no GL state.
+ *
+ * For an overview of HiZ ops, see the following sections of the Sandy Bridge
+ * PRM, Volume 1, Part2:
+ *   - 7.5.3.1 Depth Buffer Clear
+ *   - 7.5.3.2 Depth Buffer Resolve
+ *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
  */
 static void
-gen6_hiz_teardown_depth_buffer(struct gl_renderbuffer *rb)
-{
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   intel_miptree_release(&irb->mt);
-}
-
-static void
-gen6_resolve_slice(struct intel_context *intel,
-	         struct intel_mipmap_tree *mt,
-		 unsigned int level,
-		 unsigned int layer,
-                 enum brw_hiz_op op)
+gen6_hiz_exec(struct intel_context *intel,
+              struct intel_mipmap_tree *mt,
+              unsigned int level,
+              unsigned int layer,
+              enum gen6_hiz_op op)
 {
    struct gl_context *ctx = &intel->ctx;
    struct brw_context *brw = brw_context(ctx);
    struct brw_hiz_state *hiz = &brw->hiz;
-   GLenum fb_bind_enum, fb_get_enum;
-
-   /* Do not recurse. */
-   assert(!brw->hiz.op);
 
+   assert(op != GEN6_HIZ_OP_DEPTH_CLEAR); /* Not implemented yet. */
    assert(mt->hiz_mt != NULL);
-   assert(level >= mt->first_level);
-   assert(level <= mt->last_level);
-   assert(layer < mt->level[level].depth);
+   intel_miptree_check_level_layer(mt, level, layer);
+
+   /* Set up the VBO for the rectlist.
+    *
+    * A rectangle primitive (3DPRIM_RECTLIST) consists of only three
+    * vertices. The vertices reside in screen space with DirectX coordinates
+    * (that is, (0, 0) is the upper left corner).
+    *
+    *   v2 ------ implied
+    *    |        |
+    *    |        |
+    *   v0 ----- v1
+    */
+   {
+      const int width = mt->level[level].width;
+      const int height = mt->level[level].height;
+      const float vertices[] = {
+             0, height, /* v0 */
+         width, height, /* v1 */
+             0,      0, /* v2 */
+      };
+      drm_intel_bo_subdata(hiz->vertex_bo, 0, GEN6_HIZ_VBO_SIZE, vertices);
+   }
 
-   gen6_hiz_get_framebuffer_enum(ctx, &fb_bind_enum, &fb_get_enum);
+   /* Upload programs into cache.
+    *
+    * The vs_offset and wm_offset are offsets from the cache's base address,
+    * to which CMD_STATE_BASE_ADDRESS.InstructionBaseAddress is set.
+    * Therefore, for the offsets to be valid, the cache must be initialized
+    * before emitting CMD_STATE_BASE_ADDRESS.
+    */
+   uint32_t vs_offset;
+   uint32_t wm_offset;
+   {
+      void *junk;
+      brw_upload_cache(&brw->cache,
+                       BRW_VS_PROG,
+                       hiz->vs_key, sizeof(*hiz->vs_key),
+                       hiz->vs_instr, hiz->vs_size,
+                       NULL, 0,
+                       &vs_offset, &junk);
+      brw_upload_cache(&brw->cache,
+                       BRW_WM_PROG,
+                       hiz->wm_key, sizeof(*hiz->wm_key),
+                       hiz->wm_instr, hiz->wm_size,
+                       NULL, 0,
+                       &wm_offset, &junk);
+   }
 
-   /* Save state. */
-   GLint save_drawbuffer;
-   GLint save_renderbuffer;
-   _mesa_meta_begin(ctx, gen6_hiz_meta_save);
-   _mesa_GetIntegerv(fb_get_enum, &save_drawbuffer);
-   _mesa_GetIntegerv(GL_RENDERBUFFER_BINDING, &save_renderbuffer);
+   /* To ensure that the batch contains only the resolve, flush the batch
+    * before emitting the first resolve packet and after emitting the last.
+    *
+    * Ideally, the resolve op would not need these flushes. However, I suspect
+    * that it's unsafe for CMD_PIPELINE_SELECT to occur multiple times in
+    * a single batch, and the only safe way to prevent that is to fence the
+    * resolve with flushes. A better approach would be to detect whether
+    * a batch is in progress and do the right thing, but that would require
+    * the ability to safely manipulate brw_context::state::dirty::brw outside
+    * of brw_state_init().
+    */
+   intel_flush(ctx);
+
+   /* Emit the following packets:
+    *     CMD_PIPELINE_SELECT
+    *     3DSTATE_MULTISAMPLE
+    *     3DSTATE_SAMPLE_MASK
+    *     3DSTATE_GS_SVB_INDEX
+    *     CMD_STATE_SIP
+    *     CMD_VF_STATISTICS
+    */
+   brw_invariant_state.emit(brw);
 
-   /* Initialize context data for HiZ operations. */
-   gen6_hiz_init(brw);
+   /* CMD_STATE_BASE_ADDRESS
+    *
+    * From the Sandy Bridge PRM, Volume 1, Part 1, Table STATE_BASE_ADDRESS:
+    *
+    *     The following commands must be reissued following any change to the
+    *     base addresses:
+    *         3DSTATE_CC_POINTERS
+    *         3DSTATE_BINDING_TABLE_POINTERS
+    *         3DSTATE_SAMPLER_STATE_POINTERS
+    *         3DSTATE_VIEWPORT_STATE_POINTERS
+    *         MEDIA_STATE_POINTERS
+    */
+   {
+      BEGIN_BATCH(10);
+      OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
+      OUT_BATCH(1); /* GeneralStateBaseAddressModifyEnable */
+      /* SurfaceStateBaseAddress */
+      OUT_RELOC(intel->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);
+      /* DynamicStateBaseAddress */
+      OUT_RELOC(intel->batch.bo, (I915_GEM_DOMAIN_RENDER |
+                                  I915_GEM_DOMAIN_INSTRUCTION), 0, 1);
+      OUT_BATCH(1); /* IndirectObjectBaseAddress */
+      /* InstructionBaseAddress */
+      OUT_RELOC(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0, 1);
+      OUT_BATCH(1); /* GeneralStateUpperBound */
+      OUT_BATCH(1); /* DynamicStateUpperBound */
+      OUT_BATCH(1); /* IndirectObjectUpperBound */
+      OUT_BATCH(1); /* InstructionAccessUpperBound */
+      ADVANCE_BATCH();
+   }
 
-   /* Set depth state. */
-   if (!ctx->Depth.Mask) {
-      /* This sets 3DSTATE_WM.Depth_Buffer_Write_Enable. */
-      _mesa_DepthMask(GL_TRUE);
+   /* 3DSTATE_VIEWPORT_STATE_POINTERS
+    *
+    * Viewport mapping is disabled for the HiZ op, so it's safe to
+    * set all viewport state pointers to null.
+    *
+    * From page 11 of the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3
+    * 3D Primitives Overview:
+    *    RECTLIST:
+    *    Viewport Mapping must be DISABLED (as is typical with the use of
+    *    screen-space coordinates).
+    */
+   {
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_VIEWPORT_STATE_POINTERS << 16 | (4 - 2) |
+                GEN6_CC_VIEWPORT_MODIFY |
+                GEN6_SF_VIEWPORT_MODIFY |
+                GEN6_CLIP_VIEWPORT_MODIFY);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
    }
-   if (op == BRW_HIZ_OP_DEPTH_RESOLVE) {
-      _mesa_set_enable(ctx, GL_DEPTH_TEST, GL_TRUE);
-      _mesa_DepthFunc(GL_NEVER);
+
+   /* 3DSTATE_URB
+    *
+    * Assign the entire URB to the VS.
+    *
+    * A warning appears in the Sandy Bridge PRM, Volume 2, Part 1, Section
+    * 1.4.7 3DSTATE_URB, but here we can safely ignore it:
+    *     Because of URB corruption caused by allocating a previous GS unit
+    *     URB entry to the VS unit, software is required to send a "GS NULL
+    *     Fence" (Send URB fence with VS URB size == 1 and GS URB size == 0)
+    *     plus a dummy DRAW call before any case where VS will be taking over
+    *     GS URB space.
+    * We can ignore it because this batch contains only one draw call.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(3);
+      OUT_BATCH(_3DSTATE_URB << 16 | (3 - 2));
+      OUT_BATCH(brw->urb.max_vs_entries << GEN6_URB_VS_ENTRIES_SHIFT);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
    }
 
-   /* Setup FBO. */
-   gen6_hiz_setup_depth_buffer(brw, mt, level, layer);
-   _mesa_BindFramebufferEXT(fb_bind_enum, hiz->fbo);
+   /* DEPTH_STENCIL_STATE */
+   uint32_t depthstencil_offset;
+   {
+      assert(intel->gen == 6);
 
+      struct gen6_depth_stencil_state *state;
+      state = brw_state_batch(brw, AUB_TRACE_DEPTH_STENCIL_STATE,
+                              sizeof(*state), 64,
+                              &depthstencil_offset);
+      memset(state, 0, sizeof(*state));
 
-   /* A rectangle primitive (3DPRIM_RECTLIST) consists of only three vertices.
-    * The vertices reside in screen space with DirectX coordinates (this is,
-    * (0, 0) is the upper left corner).
+      /* See the following sections of the Sandy Bridge PRM, Volume 1, Part 2:
+       *   - 7.5.3.1 Depth Buffer Clear
+       *   - 7.5.3.2 Depth Buffer Resolve
+       *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
+       */
+      state->ds2.depth_write_enable = 1;
+      if (op == GEN6_HIZ_OP_DEPTH_RESOLVE) {
+         state->ds2.depth_test_enable = 1;
+         state->ds2.depth_test_func = COMPAREFUNC_NEVER;
+      }
+   }
+
+   /* 3DSTATE_CC_STATE_POINTERS
     *
-    *   v2 ------ implied
-    *    |        |
-    *    |        |
-    *   v0 ----- v1
+    * TODO(chad): No longer true.
+    * The resolve batch doesn't use BLEND_STATE or COLOR_CALC_STATE,
+    * so tell the GPU to not fetch them.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_CC_STATE_POINTERS << 16 | (4 - 2));
+      OUT_BATCH(1); /* blend */
+      OUT_BATCH(depthstencil_offset | 1); /* depth stencil */
+      OUT_BATCH(1); /* color calc */
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_SAMPLER_STATE_POINTERS
+    *
+    * The HiZ program uses no samplers.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_SAMPLER_STATE_POINTERS << 16 |
+                VS_SAMPLER_STATE_CHANGE |
+                GS_SAMPLER_STATE_CHANGE |
+                PS_SAMPLER_STATE_CHANGE |
+                (4 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_CONSTANT_VS
+    *
+    * The HiZ program uses no constants.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+      assert(hiz->vs_data->nr_params == 0);
+      BEGIN_BATCH(5);
+      OUT_BATCH(_3DSTATE_CONSTANT_VS << 16 | (5 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_VS */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      assert(hiz->vs_data->total_scratch == 0);
+
+      BEGIN_BATCH(6);
+      OUT_BATCH(_3DSTATE_VS << 16 | (6 - 2));
+      OUT_BATCH(vs_offset);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(1 << GEN6_VS_DISPATCH_START_GRF_SHIFT |
+                hiz->vs_data->urb_read_length << GEN6_VS_URB_READ_LENGTH_SHIFT |
+                0 << GEN6_VS_URB_ENTRY_READ_OFFSET_SHIFT);
+      OUT_BATCH((brw->max_vs_threads - 1) << GEN6_VS_MAX_THREADS_SHIFT |
+                GEN6_VS_STATISTICS_ENABLE |
+                GEN6_VS_ENABLE);
+      ADVANCE_BATCH();
+
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+      OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
+                PIPE_CONTROL_INSTRUCTION_FLUSH |
+                PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_CONSTANT_GS
+    *
+    * The HiZ program has no geometry shader.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(5);
+      OUT_BATCH(_3DSTATE_CONSTANT_GS << 16 | (5 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_GS
+    *
+    * The HiZ program has no geometry shader.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(7);
+      OUT_BATCH(_3DSTATE_GS << 16 | (7 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0 << GEN6_GS_SAMPLER_COUNT_SHIFT |
+                0 << GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT);
+      OUT_BATCH(0);
+      OUT_BATCH(1 << GEN6_GS_DISPATCH_START_GRF_SHIFT |
+                0 << GEN6_GS_URB_READ_LENGTH_SHIFT |
+                0 << GEN6_GS_URB_ENTRY_READ_OFFSET_SHIFT);
+      OUT_BATCH(0 << GEN6_GS_MAX_THREADS_SHIFT |
+                GEN6_GS_STATISTICS_ENABLE |
+                GEN6_GS_RENDERING_ENABLE);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_CLIP
+    *
+    * The HiZ op emits a rectangle primitive, which requires clipping to
+    * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1
+    * Section 1.3 "3D Primitives Overview":
+    *    RECTLIST:
+    *    Either the CLIP unit should be DISABLED, or the CLIP unit's Clip
+    *    Mode should be set to a value other than CLIPMODE_NORMAL.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_CLIP << 16 | (4 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_SF
+    *
+    * Disable ViewportTransformEnable (dw2.1), because the HiZ op's
+    * vertices are already in screen space.
+    *
+    * From the Sandy Bridge PRM, Volume 2, Part 1, Section 1.3, "3D
+    * Primitives Overview":
+    *     RECTLIST: Viewport Mapping must be DISABLED (as is typical with the
+    *     use of screen-space coordinates).
+    *
+    * A solid rectangle must be rendered, so set FrontFaceFillMode (dw2.6:5)
+    * and BackFaceFillMode (dw2.4:3) to SOLID(0).
+    *
+    * From the Sandy Bridge PRM, Volume 2, Part 1, Section
+    * 6.4.1.1 3DSTATE_SF, Field FrontFaceFillMode:
+    *     SOLID: Any triangle or rectangle object found to be front-facing
+    *     is rendered as a solid object. This setting is required when
+    *     rendering rectangle (RECTLIST) objects.
     */
-   const int width = hiz->depth_rb->Width;
-   const int height = hiz->depth_rb->Height;
-   const GLfloat positions[] = {
-          0, height,
-      width, height,
-          0,      0,
-   };
-
-   /* Setup program and vertex attributes. */
-   _mesa_UseProgramObjectARB(hiz->shader.program);
-   _mesa_BindVertexArray(hiz->vao);
-   _mesa_BindBufferARB(GL_ARRAY_BUFFER, hiz->shader.position_vbo);
-   _mesa_BufferDataARB(GL_ARRAY_BUFFER_ARB, sizeof(positions), positions,
-		       GL_DYNAMIC_DRAW_ARB);
-
-   /* Execute the HiZ operation. */
-   brw->hiz.op = op;
-   brw->state.dirty.brw |= BRW_NEW_HIZ;
-   _mesa_DrawArrays(GL_TRIANGLES, 0, 3);
-   brw->state.dirty.brw |= BRW_NEW_HIZ;
-   brw->hiz.op = BRW_HIZ_OP_NONE;
-
-   /* Restore state.
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(20);
+      OUT_BATCH(_3DSTATE_SF << 16 | (20 - 2));
+      OUT_BATCH(GEN6_SF_SWIZZLE_ENABLE |
+                (1 - 1) << GEN6_SF_NUM_OUTPUTS_SHIFT | /* Only gl_Position. */
+                1 << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
+                1 << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT);
+      OUT_BATCH(GEN6_SF_STATISTICS_ENABLE);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_CONSTANT_PS
+    *
+    * Disable the push constant buffer.
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      assert(hiz->wm_data->nr_params == 0);
+      BEGIN_BATCH(5);
+      OUT_BATCH(_3DSTATE_CONSTANT_PS << 16 | (5 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_WM */
+   {
+      uint32_t dw4 = 0;
+      uint32_t dw5 = 0;
+
+      dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
+      switch (op) {
+      case GEN6_HIZ_OP_DEPTH_CLEAR:
+         assert(!"not implemented");
+         dw4 |= GEN6_WM_DEPTH_CLEAR;
+         break;
+      case GEN6_HIZ_OP_DEPTH_RESOLVE:
+         dw4 |= GEN6_WM_DEPTH_RESOLVE;
+         break;
+      case GEN6_HIZ_OP_HIZ_RESOLVE:
+         dw4 |= GEN6_WM_HIERARCHICAL_DEPTH_RESOLVE;
+         break;
+      default:
+         assert(0);
+         break;
+      }
+
+      dw5 |= (brw->max_wm_threads - 1) << GEN6_WM_MAX_THREADS_SHIFT;
+
+      if (hiz->wm_data->dispatch_width == 8) {
+         dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
+      } else {
+         dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
+      }
+
+      if (hiz->wm_data->prog_offset_16) {
+         dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
+      }
+
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      assert(hiz->wm_data->total_scratch == 0);
+      BEGIN_BATCH(9);
+      OUT_BATCH(_3DSTATE_WM << 16 | (9 - 2));
+      OUT_BATCH(wm_offset);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(dw4);
+      OUT_BATCH(dw5);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(wm_offset + hiz->wm_data->prog_offset_16);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_SCISSOR_STATE_POINTERS
     *
-    * The order in which state is restored is significant. The draw buffer
-    * used for the HiZ op has no stencil buffer, and glStencilFunc() clamps
-    * the stencil reference value to the range allowed by the draw buffer's
-    * number of stencil bits. So, the draw buffer binding must be restored
-    * before the stencil state, or else the stencil ref will be clamped to 0.
+    * The HiZ op does not use scissoring, so set the state pointer to null.
     */
-   gen6_hiz_teardown_depth_buffer(hiz->depth_rb);
-   _mesa_BindRenderbufferEXT(GL_RENDERBUFFER, save_renderbuffer);
-   _mesa_BindFramebufferEXT(fb_bind_enum, save_drawbuffer);
-   _mesa_meta_end(ctx);
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(2);
+      OUT_BATCH(_3DSTATE_SCISSOR_STATE_POINTERS << 16 | (2 - 2));
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_BINDING_TABLE_POINTERS
+    *
+    * XXX: Is this right?
+    */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      gen6_binding_table_pointers.emit(brw);
+   }
+
+   /* 3DSTATE_DEPTH_BUFFER */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      /* 3DSTATE_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER are both
+       * non-pipelined state that will need the PIPE_CONTROL workaround.
+       */
+      intel_emit_post_sync_nonzero_flush(intel);
+      intel_emit_depth_stall_flushes(intel);
+
+      uint32_t width = mt->level[level].width;
+      uint32_t height = mt->level[level].height;
+
+      uint32_t tile_x;
+      uint32_t tile_y;
+      uint32_t offset;
+      {
+         /* Construct a dummy renderbuffer just to extract tile offsets. */
+         struct intel_renderbuffer rb;
+         rb.mt = mt;
+         rb.mt_level = level;
+         rb.mt_layer = layer;
+         intel_renderbuffer_set_draw_offset(&rb);
+         offset = intel_renderbuffer_tile_offsets(&rb, &tile_x, &tile_y);
+      }
+
+      uint32_t format;
+      switch (mt->format) {
+      case MESA_FORMAT_Z16:       format = BRW_DEPTHFORMAT_D16_UNORM; break;
+      case MESA_FORMAT_Z32_FLOAT: format = BRW_DEPTHFORMAT_D32_FLOAT; break;
+      case MESA_FORMAT_X8_Z24:    format = BRW_DEPTHFORMAT_D24_UNORM_X8_UINT; break;
+      default:                    assert(0); break;
+      }
+
+      BEGIN_BATCH(7);
+      OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
+      OUT_BATCH(((mt->region->pitch * mt->region->cpp) - 1) |
+                format << 18 |
+                1 << 21 | /* separate stencil enable */
+                1 << 22 | /* hiz enable */
+                BRW_TILEWALK_YMAJOR << 26 |
+                1 << 27 | /* y-tiled */
+                BRW_SURFACE_2D << 29);
+      OUT_RELOC(mt->region->bo,
+                I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+                offset);
+      OUT_BATCH(BRW_SURFACE_MIPMAPLAYOUT_BELOW << 1 |
+                (width + tile_x - 1) << 6 |
+                (height + tile_y - 1) << 19);
+      OUT_BATCH(0);
+      OUT_BATCH(tile_x |
+                tile_y << 16);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_HIER_DEPTH_BUFFER */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      struct intel_region *hiz_region = mt->hiz_mt->region;
+
+      BEGIN_BATCH(3);
+      OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
+      OUT_BATCH(hiz_region->pitch * hiz_region->cpp - 1);
+      OUT_RELOC(hiz_region->bo,
+                I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+                0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_STENCIL_BUFFER */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(3);
+      OUT_BATCH((_3DSTATE_STENCIL_BUFFER << 16) | (3 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_CLEAR_PARAMS */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      intel_emit_post_sync_nonzero_flush(intel);
+      BEGIN_BATCH(2);
+      OUT_BATCH(_3DSTATE_CLEAR_PARAMS << 16 | (2 - 2));
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_DRAWING_RECTANGLE */
+   {
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2));
+      OUT_BATCH(0);
+      OUT_BATCH(((mt->level[level].width - 1) & 0xffff) |
+                ((mt->level[level].height - 1) << 16));
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_VERTEX_BUFFERS */
+   {
+      const int num_buffers = 1;
+      const int batch_length = 1 + 4 * num_buffers;
+
+      uint32_t dw0 = GEN6_VB0_ACCESS_VERTEXDATA |
+                     GEN6_HIZ_VERTEX_SIZE << BRW_VB0_PITCH_SHIFT;
+      if (intel->gen >= 7)
+         dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
+
+      BEGIN_BATCH(batch_length);
+      OUT_BATCH((_3DSTATE_VERTEX_BUFFERS << 16) | (batch_length - 2));
+      OUT_BATCH(dw0);
+      OUT_RELOC(hiz->vertex_bo, I915_GEM_DOMAIN_VERTEX, 0, 0);
+      OUT_RELOC(hiz->vertex_bo, I915_GEM_DOMAIN_VERTEX, 0, hiz->vertex_bo->size - 1);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* 3DSTATE_VERTEX_ELEMENTS */
+   {
+      const int num_elements = 1;
+      const int batch_length = 1 + 2 * num_elements;
+
+      BEGIN_BATCH(batch_length);
+      OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (batch_length - 2));
+      OUT_BATCH(0 << GEN6_VE0_INDEX_SHIFT |
+                GEN6_VE0_VALID |
+                BRW_SURFACEFORMAT_R32G32_FLOAT << BRW_VE0_FORMAT_SHIFT |
+                0 << BRW_VE0_SRC_OFFSET_SHIFT);
+      OUT_BATCH(BRW_VE1_COMPONENT_STORE_SRC   << BRW_VE1_COMPONENT_0_SHIFT |
+                BRW_VE1_COMPONENT_STORE_SRC   << BRW_VE1_COMPONENT_1_SHIFT |
+                BRW_VE1_COMPONENT_STORE_0     << BRW_VE1_COMPONENT_2_SHIFT |
+                BRW_VE1_COMPONENT_STORE_1_FLT << BRW_VE1_COMPONENT_3_SHIFT);
+      ADVANCE_BATCH();
+   }
+
+   /* CMD_3D_PRIM */
+   {
+      /* TODO(chad): gen7 */
+      assert(intel->gen == 6);
+
+      BEGIN_BATCH(6);
+      OUT_BATCH(CMD_3D_PRIM << 16 | (6 - 2) |
+                _3DPRIM_RECTLIST << GEN4_3DPRIM_TOPOLOGY_TYPE_SHIFT |
+                GEN4_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL);
+      OUT_BATCH(3); /* vertex count per instance */
+      OUT_BATCH(0);
+      OUT_BATCH(1); /* instance count */
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
+
+   /* See comments above at first invocation of intel_flush(). */
+   intel_flush(ctx);
+
+   /* Be safe. */
+   brw->state.dirty.brw = ~0;
+   brw->state.dirty.cache = ~0;
 }
 
 void
@@ -351,7 +875,7 @@ gen6_resolve_hiz_slice(struct intel_context *intel,
                        uint32_t level,
                        uint32_t layer)
 {
-   gen6_resolve_slice(intel, mt, level, layer, BRW_HIZ_OP_HIZ_RESOLVE);
+   gen6_hiz_exec(intel, mt, level, layer, GEN6_HIZ_OP_HIZ_RESOLVE);
 }
 
 
@@ -361,5 +885,5 @@ gen6_resolve_depth_slice(struct intel_context *intel,
                          uint32_t level,
                          uint32_t layer)
 {
-   gen6_resolve_slice(intel, mt, level, layer, BRW_HIZ_OP_DEPTH_RESOLVE);
+   gen6_hiz_exec(intel, mt, level, layer, GEN6_HIZ_OP_DEPTH_RESOLVE);
 }
diff --git a/src/mesa/drivers/dri/i965/gen6_hiz.h b/src/mesa/drivers/dri/i965/gen6_hiz.h
index 4929012..c2c4edd 100644
--- a/src/mesa/drivers/dri/i965/gen6_hiz.h
+++ b/src/mesa/drivers/dri/i965/gen6_hiz.h
@@ -28,6 +28,14 @@
 struct intel_context;
 struct intel_mipmap_tree;
 
+/**
+ * \brief Initialize data needed for HiZ ops, contained in brw_context::hiz.
+ *
+ * This function is idempotent.
+ */
+void
+gen6_hiz_init(struct intel_context *intel);
+
 void
 gen6_resolve_hiz_slice(struct intel_context *intel,
                        struct intel_mipmap_tree *mt,
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 163b54c..07b8e6d 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -149,17 +149,8 @@ upload_sf_state(struct brw_context *brw)
       urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
       urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;
 
-   dw2 = GEN6_SF_STATISTICS_ENABLE;
-
-   /* Enable viewport transform only if no HiZ operation is progress
-    *
-    * From page 11 of the SandyBridge PRM, Volume 2, Part 1, Section 1.3, "3D
-    * Primitives Overview":
-    *     RECTLIST: Viewport Mapping must be DISABLED (as is typical with the
-    *     use of screen- space coordinates).
-    */
-   if (!brw->hiz.op)
-      dw2 |= GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
+   dw2 = GEN6_SF_STATISTICS_ENABLE |
+         GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
 
    dw3 = 0;
    dw4 = 0;
@@ -354,8 +345,7 @@ const struct brw_tracked_state gen6_sf_state = {
 		_NEW_POINT |
 		_NEW_TRANSFORM),
       .brw   = (BRW_NEW_CONTEXT |
-		BRW_NEW_FRAGMENT_PROGRAM |
-		BRW_NEW_HIZ),
+		BRW_NEW_FRAGMENT_PROGRAM),
       .cache = CACHE_NEW_VS_PROG
    },
    .emit = upload_sf_state,
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index 3669811..205e648 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -149,23 +149,6 @@ upload_wm_state(struct brw_context *brw)
    dw4 |= (brw->wm.prog_data->first_curbe_grf_16 <<
 	   GEN6_WM_DISPATCH_START_GRF_SHIFT_2);
 
-   switch (brw->hiz.op) {
-   case BRW_HIZ_OP_NONE:
-      break;
-   case BRW_HIZ_OP_DEPTH_CLEAR:
-      dw4 |= GEN6_WM_DEPTH_CLEAR;
-      break;
-   case BRW_HIZ_OP_DEPTH_RESOLVE:
-      dw4 |= GEN6_WM_DEPTH_RESOLVE;
-      break;
-   case BRW_HIZ_OP_HIZ_RESOLVE:
-      dw4 |= GEN6_WM_HIERARCHICAL_DEPTH_RESOLVE;
-      break;
-   default:
-      assert(0);
-      break;
-   }
-
    dw5 |= (brw->max_wm_threads - 1) << GEN6_WM_MAX_THREADS_SHIFT;
 
    /* CACHE_NEW_WM_PROG */
@@ -233,8 +216,7 @@ const struct brw_tracked_state gen6_wm_state = {
 		_NEW_PROGRAM_CONSTANTS |
 		_NEW_POLYGON),
       .brw   = (BRW_NEW_FRAGMENT_PROGRAM |
-		BRW_NEW_BATCH |
-		BRW_NEW_HIZ),
+		BRW_NEW_BATCH),
       .cache = (CACHE_NEW_SAMPLER |
 		CACHE_NEW_WM_PROG)
    },
diff --git a/src/mesa/drivers/dri/i965/gen7_clip_state.c b/src/mesa/drivers/dri/i965/gen7_clip_state.c
index 9be3ce9..c32cd98 100644
--- a/src/mesa/drivers/dri/i965/gen7_clip_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_clip_state.c
@@ -39,23 +39,6 @@ upload_clip_state(struct brw_context *brw)
    /* BRW_NEW_FRAGMENT_PROGRAM */
    const struct gl_fragment_program *fprog = brw->fragment_program;
 
-   if (brw->hiz.op) {
-      /* HiZ operations emit a rectangle primitive, which requires clipping to
-       * be disabled. From page 10 of the Sandy Bridge PRM Volume 2 Part 1
-       * Section 1.3 3D Primitives Overview:
-       *    RECTLIST:
-       *    Either the CLIP unit should be DISABLED, or the CLIP unit's Clip
-       *    Mode should be set to a value other than CLIPMODE_NORMAL.
-       */
-      BEGIN_BATCH(4);
-      OUT_BATCH(_3DSTATE_CLIP << 16 | (4 - 2));
-      OUT_BATCH(0);
-      OUT_BATCH(0);
-      OUT_BATCH(0);
-      ADVANCE_BATCH();
-      return;
-   }
-
    /* _NEW_BUFFERS */
    bool render_to_fbo = brw->intel.ctx.DrawBuffer->Name != 0;
 
@@ -133,8 +116,7 @@ const struct brw_tracked_state gen7_clip_state = {
                 _NEW_LIGHT |
                 _NEW_TRANSFORM),
       .brw   = (BRW_NEW_CONTEXT |
-                BRW_NEW_FRAGMENT_PROGRAM |
-                BRW_NEW_HIZ),
+                BRW_NEW_FRAGMENT_PROGRAM),
       .cache = 0
    },
    .emit = upload_clip_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index da7ef81..b215af2 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -149,8 +149,7 @@ const struct brw_tracked_state gen7_sbe_state = {
 		_NEW_PROGRAM |
 		_NEW_TRANSFORM),
       .brw   = (BRW_NEW_CONTEXT |
-		BRW_NEW_FRAGMENT_PROGRAM |
-		BRW_NEW_HIZ),
+		BRW_NEW_FRAGMENT_PROGRAM),
       .cache = CACHE_NEW_VS_PROG
    },
    .emit = upload_sbe_state,
@@ -166,17 +165,8 @@ upload_sf_state(struct brw_context *brw)
    /* _NEW_BUFFERS */
    bool render_to_fbo = brw->intel.ctx.DrawBuffer->Name != 0;
 
-   dw1 = GEN6_SF_STATISTICS_ENABLE;
-
-   /* Enable viewport transform only if no HiZ operation is progress
-    *
-    * From page 11 of the SandyBridge PRM, Volume 2, Part 1, Section 1.3, "3D
-    * Primitives Overview":
-    *     RECTLIST: Viewport Mapping must be DISABLED (as is typical with the
-    *     use of screen- space coordinates).
-    */
-   if (!brw->hiz.op)
-      dw1 |= GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
+   dw1 = GEN6_SF_STATISTICS_ENABLE |
+         GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
 
    /* _NEW_BUFFERS */
    dw1 |= (brw_depthbuffer_format(brw) << GEN7_SF_DEPTH_BUFFER_SURFACE_FORMAT_SHIFT);
@@ -310,8 +300,7 @@ const struct brw_tracked_state gen7_sf_state = {
 		_NEW_SCISSOR |
 		_NEW_BUFFERS |
 		_NEW_POINT),
-      .brw   = (BRW_NEW_CONTEXT |
-		BRW_NEW_HIZ),
+      .brw   = BRW_NEW_CONTEXT,
       .cache = CACHE_NEW_VS_PROG
    },
    .emit = upload_sf_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index 32222f9..870590f 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -49,23 +49,6 @@ upload_wm_state(struct brw_context *brw)
    dw1 |= GEN7_WM_LINE_AA_WIDTH_1_0;
    dw1 |= GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
-   switch (brw->hiz.op) {
-   case BRW_HIZ_OP_NONE:
-      break;
-   case BRW_HIZ_OP_DEPTH_CLEAR:
-      dw1 |= GEN7_WM_DEPTH_CLEAR;
-      break;
-   case BRW_HIZ_OP_DEPTH_RESOLVE:
-      dw1 |= GEN7_WM_DEPTH_RESOLVE;
-      break;
-   case BRW_HIZ_OP_HIZ_RESOLVE:
-      dw1 |= GEN7_WM_HIERARCHICAL_DEPTH_RESOLVE;
-      break;
-   default:
-      assert(0);
-      break;
-   }
-
    /* _NEW_LINE */
    if (ctx->Line.StippleFlag)
       dw1 |= GEN7_WM_LINE_STIPPLE_ENABLE;
@@ -106,7 +89,6 @@ const struct brw_tracked_state gen7_wm_state = {
       .mesa  = (_NEW_LINE | _NEW_LIGHT | _NEW_POLYGON |
 	        _NEW_COLOR | _NEW_BUFFERS),
       .brw   = (BRW_NEW_FRAGMENT_PROGRAM |
-                BRW_NEW_HIZ |
 		BRW_NEW_BATCH),
       .cache = 0,
    },
diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c
index e1aa67c..a2aa389 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -891,6 +891,11 @@ intelMakeCurrent(__DRIcontext * driContextPriv,
       _mesa_make_current(NULL, NULL, NULL);
    }
 
+   if (intel->has_hiz) {
+      /* Explain why this is here. */
+      intel->vtbl.hiz_init(intel);
+   }
+
    return true;
 }
 
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index fe01186..8961219 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -168,6 +168,10 @@ struct intel_context
        *   - 7.5.3.3 Hierarchical Depth Buffer Resolve
        * \{
        */
+
+      /** \brief Initialize context state for HiZ operations. */
+      void (*hiz_init)(struct intel_context *intel);
+
       void (*resolve_hiz_slice)(struct intel_context *intel,
 				struct intel_mipmap_tree *mt,
 				uint32_t level,
-- 
1.7.7.5
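
Supplementary note on the rectlist vertices in gen6_hiz_exec(): the
patch uploads only three screen-space vertices and leaves the fourth
corner implied. The sketch below is standalone illustration code, not
part of the patch. It assumes the implied corner is the one that
completes the rectangle, as the diagram in the code comment suggests;
struct vec2, hiz_rect_vertices() and rect_implied_vertex() are
hypothetical names that do not exist in Mesa.

```c
#include <assert.h>

/* Hypothetical 2D vertex type, for illustration only. */
struct vec2 { float x, y; };

/* Fill v[0..2] in the same order the patch uses: v0 = lower left,
 * v1 = lower right, v2 = upper left, with (0, 0) at the upper left
 * corner of the screen. */
static void
hiz_rect_vertices(int width, int height, struct vec2 v[3])
{
   assert(width > 0 && height > 0);
   v[0] = (struct vec2){ 0.0f,         (float)height };
   v[1] = (struct vec2){ (float)width, (float)height };
   v[2] = (struct vec2){ 0.0f,         0.0f };
}

/* Assumption: the implied vertex closes the parallelogram spanned by
 * the edges v0->v1 and v0->v2, that is v1 + v2 - v0. */
static struct vec2
rect_implied_vertex(const struct vec2 v[3])
{
   struct vec2 r = { v[1].x + v[2].x - v[0].x,
                     v[1].y + v[2].y - v[0].y };
   return r;
}
```

For a 128x64 miptree level this yields v0 = (0, 64), v1 = (128, 64),
v2 = (0, 0), and an implied upper right corner at (128, 0).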

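Supplementary note on the packet headers: most packets in the patch
start with a DWord of the form `OPCODE << 16 | (n - 2)`, where n is the
total number of DWords passed to BEGIN_BATCH(). The sketch below is
standalone illustration code, not part of the patch. cmd_header() is a
hypothetical helper, and the opcode used in the usage note is a made-up
value rather than a real command opcode.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the first DWord of a command the way the patch writes it by
 * hand: the opcode goes in the upper 16 bits, and the length field
 * holds the total DWord count minus 2. */
static uint32_t
cmd_header(uint32_t opcode, uint32_t total_dwords)
{
   assert(opcode <= 0xffff);      /* must fit in the upper half */
   assert(total_dwords >= 2);     /* length field is biased by 2 */
   return (opcode << 16) | (total_dwords - 2);
}
```

For example, cmd_header(0x1234, 6) gives 0x12340004, which mirrors the
shape of `_3DSTATE_VS << 16 | (6 - 2)` in the patch.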

