<div dir="ltr">On 2 December 2013 11:39, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This will be used to pass image information to the shader when we<br>
cannot use typed surface reads and writes.  All entries except<br>
surface_idx and size are otherwise unused and will get eliminated by<br>
the uniform packing pass.  size will be used for bounds checking with<br>
some image formats and will be useful for ARB_shader_image_size too.<br>
surface_idx is always used.<br>
---<br>
 src/mesa/drivers/dri/i965/brw_context.h           | 42 +++++++++++++<br>
 src/mesa/drivers/dri/i965/brw_program.c           |  5 ++<br>
 src/mesa/drivers/dri/i965/brw_vec4_gs.c           |  4 ++<br>
 src/mesa/drivers/dri/i965/brw_vs.c                |  7 ++-<br>
 src/mesa/drivers/dri/i965/brw_wm.c                |  7 ++-<br>
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 76 +++++++++++++++++++++++<br>
 6 files changed, 138 insertions(+), 3 deletions(-)<br></blockquote><div><br></div><div>There's something going on in this patch that seems really dodgy to me.  I haven't yet figured out whether it will lead to user-visible bugs, but it breaks a key assumption that we make elsewhere in the driver.  That assumption is: the data contained in brw_stage_prog_data (and transitively contained in other data structures owned by brw_stage_prog_data) is initialized at the time the shader is compiled, and is thereafter constant.  The only circumstance in which it's ok for brw_stage_prog_data to point to data that gets modified later is if that data is owned by another data structure (this is what we do for uniforms, and it's the reason why the types of brw_stage_prog_data::param and brw_stage_prog_data::pull_param are "const float **".  What's going on is that brw_stage_prog_data points to an array of const float *'s, which it owns, and then those pointers point to the current values of the uniforms, which are owned by core mesa.<br>

<br></div><div>Your patch introduces data that is owned by brw_stage_prog_data (in the sense that it gets deleted when brw_stage_prog_data gets deleted), but which is not initialized at shader compilation time; instead it is initialized at the time of brw_upload_image_surfaces().  Additionally, pointers in brw_stage_prog_data::param and brw_stage_prog_data::pull_param point to these values.  That leads to an unexpected asymmetry--push and pull parameters that are image parameters are owned by brw_stage_prog_data, and pointed to in two separate ways, whereas all other push and pull parameters are owned by other data structures.<br>

<br></div><div>Another problem is that brw_stage_prog_data_compare() compares the contents of the image data.  However, brw_stage_prog_data_compare() is called just after compilation, to see if the program that has just been compiled matches a program that was previously compiled; if so we can re-use the previous program.  (This allows us to avoid inefficiency in the case where the we thought we had to do a state-dependent recompile, but once the recompile completes we have the same program we had before).  Since the data isn't initialized until brw_upload_image_surfaces(), your patch will cause brw_stage-prog_data_compare() to compare initialized data with uninitialized data, defeating this optimization.<br>

<br></div><div>What I think we should do instead is have a fixed array of BRW_MAX_IMAGES brw_image_param objects in brw_stage_state.  brw_upload_image_surfaces() <br>already has a pointer to the brw_stage_state, so it can just refer to this directly when calling brw->vtbl.update_image_surface().  And backend_visitor can easily be modified to have a pointer to the brw_stage_state as well, so backend_visitor::setup_image_uniform_values() can use that to find the brw_image_param objects.  That way we can remove the brw_image_param pointer from brw_stage_prog_data, and then image params will work exactly the same way that all other pull/push constants work. <br>

</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h<br>
index 0816912..dc606c0f 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_context.h<br>
+++ b/src/mesa/drivers/dri/i965/brw_context.h<br>
@@ -352,6 +352,7 @@ struct brw_stage_prog_data {<br>
<br>
    GLuint nr_params;       /**< number of float params/constants */<br>
    GLuint nr_pull_params;<br>
+   GLuint nr_image_params;<br>
<br>
    /* Pointers to tracked values (only valid once<br>
     * _mesa_load_state_parameters has been called at runtime).<br>
@@ -361,6 +362,47 @@ struct brw_stage_prog_data {<br>
     */<br>
    const float **param;<br>
    const float **pull_param;<br>
+   struct brw_image_param *image_param;<br>
+};<br>
+<br>
+/*<br>
+ * Image meta-data structure as laid out in the shader parameter<br>
+ * buffer.  Entries have to be 16B-aligned for the vec4 back-end to be<br>
+ * able to use them.  That's okay because the padding and any unused<br>
+ * entries [most of them except when we're doing untyped surface<br>
+ * access] will be removed by the uniform packing pass.<br>
+ */<br>
+#define BRW_IMAGE_PARAM_SURFACE_IDX_OFFSET      0<br>
+#define BRW_IMAGE_PARAM_OFFSET_OFFSET           4<br>
+#define BRW_IMAGE_PARAM_SIZE_OFFSET             8<br>
+#define BRW_IMAGE_PARAM_STRIDE_OFFSET           12<br>
+#define BRW_IMAGE_PARAM_TILING_OFFSET           16<br>
+#define BRW_IMAGE_PARAM_SWIZZLING_OFFSET        20<br>
+#define BRW_IMAGE_PARAM_SIZE                    24<br>
+<br>
+struct brw_image_param {<br>
+   /** Surface binding table index. */<br>
+   uint32_t surface_idx;<br>
+<br>
+   /** Surface X, Y and Z dimensions. */<br>
+   uint32_t size[3];<br>
+<br>
+   /** Offset applied to the X and Y surface coordinates. */<br>
+   uint32_t offset[2];<br>
+<br>
+   /** X-stride in bytes, Y-stride in bytes, horizontal Z-stride in<br>
+    * pixels, vertical Z-stride in pixels.<br>
+    */<br></blockquote><div><br></div><div>The terms "X-stride", "Y-stride", "horizontal Z-stride", and "vertical Z-stride" need more explanation.<br></div><div><br>Also, for multi-line Doxygen-style comments the Mesa convention is to do:<br>
<br>/**<br></div><div> * text text<br></div><div> * text text<br> */<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+   uint32_t stride[4];<br></blockquote><div><br></div><div>On first reading, it was not obvious to me why these values are grouped into a single array rather than split out into 4 separate fields.  Can we add a comment to clarify?  Perhaps something like "these values are grouped into an array so that the shader can access them as the components of a single vec4."<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+<br>
+   /** Log2 of the tiling modulus in the X, Y and Z dimension. */<br>
+   uint32_t tiling[3];<br></blockquote><div><br></div><div>The term "tiling modulus" needs more explanation.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

+<br>
+   /** Right shift to apply for surface address swizzling.  Two<br>
+    * different swizzles can be specified and will be applied one<br>
+    * after the other.  Use \c 0xff if any of the swizzles is not<br>
+    * required. */<br></blockquote><div><br></div><div>Let's add a sentence to this comment explaining that these swizzles will only be applied if brw->has_swizzling is true.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

+   uint32_t swizzling[2];<br>
 };<br>
<br>
 /* Data about a particular attempt to compile a program.  Note that<br>
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c<br>
index 908782b..094deeb 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_program.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_program.c<br>
@@ -552,6 +552,10 @@ brw_stage_prog_data_compare(const void *in_a, const void *in_b)<br>
    if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * sizeof(void *)))<br>
       return false;<br>
<br>
+   if (memcmp(a->image_param, b->image_param,<br>
+              a->nr_image_params * sizeof(struct brw_image_param)))<br>
+      return false;<br>
+<br>
    return true;<br>
 }<br>
<br>
@@ -562,4 +566,5 @@ brw_stage_prog_data_free(const void *p)<br>
<br>
    ralloc_free(prog_data->param);<br>
    ralloc_free(prog_data->pull_param);<br>
+   ralloc_free(prog_data->image_param);<br>
 }<br>
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c<br>
index 8dbd1e8..5583bfd 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c<br>
@@ -61,11 +61,15 @@ do_gs_prog(struct brw_context *brw,<br>
<br>
    /* We also upload clip plane data as uniforms */<br>
    param_count += MAX_CLIP_PLANES * 4;<br>
+   param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE;<br>
<br>
    c.prog_data.base.base.param =<br>
       rzalloc_array(NULL, const float *, param_count);<br>
    c.prog_data.base.base.pull_param =<br>
       rzalloc_array(NULL, const float *, param_count);<br>
+   c.prog_data.base.base.image_param =<br>
+      rzalloc_array(NULL, struct brw_image_param, gs->NumImages);<br>
+   c.prog_data.base.base.nr_image_params = gs->NumImages;<br>
<br>
    if (gp->program.OutputType == GL_POINTS) {<br>
       /* When the output type is points, the geometry shader may output data<br>
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c<br>
index e9f92d4..8118b7e 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_vs.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_vs.c<br>
@@ -231,8 +231,9 @@ do_vs_prog(struct brw_context *brw,<br>
        * case being a float value that gets blown up to a vec4, so be<br>
        * conservative here.<br>
        */<br>
-      param_count = vs->num_uniform_components * 4;<br>
-<br>
+      param_count = (vs->num_uniform_components * 4  +<br>
+                     vs->NumImages * BRW_IMAGE_PARAM_SIZE);<br>
+      stage_prog_data->nr_image_params = vs->NumImages;<br>
    } else {<br>
       param_count = vp->program.Base.Parameters->NumParameters * 4;<br>
    }<br>
@@ -243,6 +244,8 @@ do_vs_prog(struct brw_context *brw,<br>
<br>
    stage_prog_data->param = rzalloc_array(NULL, const float *, param_count);<br>
    stage_prog_data->pull_param = rzalloc_array(NULL, const float *, param_count);<br>
+   stage_prog_data->image_param = rzalloc_array(NULL, struct brw_image_param,<br>
+                                                stage_prog_data->nr_image_params);<br>
<br>
    GLbitfield64 outputs_written = vp->program.Base.OutputsWritten;<br>
    prog_data.inputs_read = vp->program.Base.InputsRead;<br>
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c<br>
index d78d8e3..790c98b 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_wm.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_wm.c<br>
@@ -148,7 +148,9 @@ bool do_wm_prog(struct brw_context *brw,<br>
     */<br>
    int param_count;<br>
    if (fs) {<br>
-      param_count = fs->num_uniform_components;<br>
+      param_count = (fs->num_uniform_components +<br>
+                     fs->NumImages * BRW_IMAGE_PARAM_SIZE);<br>
+      c->prog_data.base.nr_image_params = fs->NumImages;<br>
    } else {<br>
       param_count = fp->program.Base.Parameters->NumParameters * 4;<br>
    }<br>
@@ -157,6 +159,9 @@ bool do_wm_prog(struct brw_context *brw,<br>
    c->prog_data.base.param = rzalloc_array(NULL, const float *, param_count);<br>
    c->prog_data.base.pull_param =<br>
       rzalloc_array(NULL, const float *, param_count);<br>
+   c->prog_data.base.image_param =<br>
+      rzalloc_array(NULL, struct brw_image_param,<br>
+                    c->prog_data.base.nr_image_params);<br>
    c->prog_data.base.nr_params = param_count;<br>
<br>
    memcpy(&c->key, key, sizeof(*key));<br>
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c<br>
index 532a18c..38df1be 100644<br>
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c<br>
@@ -678,6 +678,78 @@ get_image_format(struct brw_context *brw, gl_format format)<br>
 }<br>
<br>
 static void<br>
+update_buffer_image_param(struct brw_context *brw,<br>
+                          struct gl_image_unit *u,<br>
+                          struct brw_image_param *param)<br>
+{<br>
+   struct gl_buffer_object *obj = u->TexObj->BufferObject;<br>
+<br>
+   param->size[0] = (obj->Size / _mesa_get_format_bytes(u->_ActualFormat));<br>
+   param->size[1] = 0;<br>
+   param->size[2] = 0;<br>
+   param->offset[0] = 0;<br>
+   param->offset[1] = 0;<br>
+   param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat);<br>
+   param->stride[1] = 0;<br>
+   param->stride[2] = 0;<br>
+   param->stride[3] = 0;<br>
+   param->tiling[0] = 0;<br>
+   param->tiling[1] = 0;<br>
+   param->tiling[2] = 0;<br>
+   param->swizzling[0] = 0xff;<br>
+   param->swizzling[1] = 0xff;<br>
+}<br>
+<br>
+static void<br>
+update_texture_image_param(struct brw_context *brw,<br>
+                           struct gl_image_unit *u,<br>
+                           struct brw_image_param *param)<br>
+{<br>
+   struct intel_mipmap_tree *mt = intel_texture_object(u->TexObj)->mt;<br>
+<br>
+   param->size[0] = minify(mt->logical_width0, u->Level);<br>
+   param->size[1] = minify(mt->logical_height0, u->Level);<br>
+<br>
+   if (u->Layered)<br>
+      param->size[2] = minify(mt->logical_depth0, u->Level);<br></blockquote><div><br></div><div>Minification should only occur if this is a 3D texture. <br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

+   else<br>
+      param->size[2] = 1;<br>
+<br>
+   intel_miptree_get_image_offset(mt, u->Level, u->Layer,<br>
+                                  &param->offset[0],<br>
+                                  &param->offset[1]);<br>
+<br>
+   param->stride[0] = mt->region->cpp;<br>
+   param->stride[1] = mt->region->pitch;<br>
+   param->stride[2] =<br>
+      brw_miptree_get_horizontal_slice_pitch(brw, mt, u->Level);<br>
+   param->stride[3] =<br>
+      brw_miptree_get_vertical_slice_pitch(brw, mt, u->Level);<br>
+<br>
+   if (mt->region->tiling == I915_TILING_X) {<br>
+      param->tiling[0] = (ffs(512 / mt->region->cpp) - 1);<br>
+      param->tiling[1] = 3;<br>
+      param->swizzling[0] = 3;<br>
+      param->swizzling[1] = 4;<br>
+   } else if (mt->region->tiling == I915_TILING_Y) {<br>
+      param->tiling[0] = (ffs(16 / mt->region->cpp) - 1);<br>
+      param->tiling[1] = 5;<br>
+      param->swizzling[0] = 3;<br>
+      param->swizzling[1] = 0xff;<br>
+   } else {<br>
+      param->tiling[0] = 0;<br>
+      param->tiling[1] = 0;<br>
+      param->swizzling[0] = 0xff;<br>
+      param->swizzling[1] = 0xff;<br>
+   }<br>
+<br>
+   if (u->TexObj->Target == GL_TEXTURE_3D)<br>
+      param->tiling[2] = u->Level;<br>
+   else<br>
+      param->tiling[2] = 0;<br>
+}<br>
+<br>
+static void<br>
 gen7_update_image_surface(struct brw_context *brw,<br>
                           struct gl_image_unit *u,<br>
                           GLenum access,<br>
@@ -709,6 +781,8 @@ gen7_update_image_surface(struct brw_context *brw,<br>
                                      0 /* mocs */,<br>
                                      access != GL_READ_ONLY);<br>
<br>
+      update_buffer_image_param(brw, u, param);<br>
+<br>
    } else {<br>
       struct intel_texture_object *intel_obj = intel_texture_object(obj);<br>
       struct intel_mipmap_tree *mt = intel_obj->mt;<br>
@@ -735,6 +809,8 @@ gen7_update_image_surface(struct brw_context *brw,<br>
                                          access != GL_READ_ONLY,<br>
                                          false);<br>
       }<br>
+<br>
+      update_texture_image_param(brw, u, param);<br>
    }<br>
 }<br>
<span><font color="#888888"><br>
--<br>
1.8.3.4<br>
<br>
_______________________________________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</font></span></blockquote></div><br></div></div>