[Mesa-dev] [PATCH 2/2] i965: Consign COORD_REPLACE VS hacks to Pre-Gen6.

Kenneth Graunke kenneth at whitecape.org
Tue Feb 19 15:40:50 PST 2013


On 02/16/2013 07:29 AM, Paul Berry wrote:
> Pre-Gen6, the SF thread requires exact matching between VS output
> slots (aka VUE slots) and FS input slots, even when the corresponding
> VS output slot is unused due to being overwritten by point coordinate
> replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)).
> As a result, we have a special hack in the VS to ensure when any
> texture coordinate is subject to point coordinate replacement, it is
> always allocated space in the VUE, even if it isn't written to by the
> VS.
>
> This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE)
> swizzling has the ability to insert the point coordinate into
> gl_TexCoord[] without needing a corresponding unused VUE slot.
>
> Note that no modification of SF setup code is required for this
> patch--get_attr_override() already does the right thing.  However, we
> make a slight comment change to clarify why this works.
>
> In addition to eliminating unnecessary VS recompiles and saving
> precious URB space on Gen6+, this will save us the trouble of having
> to adjust this hack when we implement geometry shaders.
> ---
>   src/mesa/drivers/dri/i965/brw_vs.c        | 22 ++++++++++++----------
>   src/mesa/drivers/dri/i965/brw_vs.h        | 10 ++++++++++
>   src/mesa/drivers/dri/i965/gen6_sf_state.c | 13 ++++++++++++-
>   3 files changed, 34 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
> index 0810471..64659c0 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -258,15 +258,17 @@ do_vs_prog(struct brw_context *brw,
>         c.prog_data.inputs_read |= VERT_BIT_EDGEFLAG;
>      }
>
> -   /* Put dummy slots into the VUE for the SF to put the replaced
> -    * point sprite coords in.  We shouldn't need these dummy slots,
> -    * which take up precious URB space, but it would mean that the SF
> -    * doesn't get nice aligned pairs of input coords into output
> -    * coords, which would be a pain to handle.
> -    */
> -   for (i = 0; i < 8; i++) {
> -      if (c.key.point_coord_replace & (1 << i))
> -	 c.prog_data.outputs_written |= BITFIELD64_BIT(VERT_RESULT_TEX0 + i);
> +   if (intel->gen < 6) {
> +      /* Put dummy slots into the VUE for the SF to put the replaced
> +       * point sprite coords in.  We shouldn't need these dummy slots,
> +       * which take up precious URB space, but it would mean that the SF
> +       * doesn't get nice aligned pairs of input coords into output
> +       * coords, which would be a pain to handle.
> +       */
> +      for (i = 0; i < 8; i++) {
> +         if (c.key.point_coord_replace & (1 << i))
> +            c.prog_data.outputs_written |= BITFIELD64_BIT(VERT_RESULT_TEX0 + i);
> +      }

This looks good to me.

I wonder, thought, whether we could just move this into 
compute_vue_map()...just assign_vue_slot() some dummy slots in the gen 
4/5 cases.  Perhaps as a follow-on (if it's possible at all)?

As is,
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

>      }
>
>      brw_compute_vue_map(brw, &c);
> @@ -429,7 +431,7 @@ static void brw_upload_vs_prog(struct brw_context *brw)
>      key.clamp_vertex_color = ctx->Light._ClampVertexColor;
>
>      /* _NEW_POINT */
> -   if (ctx->Point.PointSprite) {
> +   if (intel->gen < 6 && ctx->Point.PointSprite) {
>         for (i = 0; i < 8; i++) {
>   	 if (ctx->Point.CoordReplace[i])
>   	    key.point_coord_replace |= (1 << i);
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.h b/src/mesa/drivers/dri/i965/brw_vs.h
> index 75c8a5f..caa8f7c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.h
> +++ b/src/mesa/drivers/dri/i965/brw_vs.h
> @@ -86,7 +86,17 @@ struct brw_vs_prog_key {
>      GLuint userclip_planes_enabled_gen_4_5:MAX_CLIP_PLANES;
>
>      GLuint copy_edgeflag:1;
> +
> +   /**
> +    * For pre-Gen6 hardware, a bitfield indicating which texture coordinates
> +    * are going to be replaced with point coordinates (as a consequence of a
> +    * call to glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)).  Because
> +    * our SF thread requires exact matching between VS outputs and FS inputs,
> +    * these texture coordinates will need to be unconditionally included in
> +    * the VUE, even if they aren't written by the vertex shader.
> +    */
>      GLuint point_coord_replace:8;
> +
>      GLuint clamp_vertex_color:1;
>
>      struct brw_sampler_prog_key_data tex;
> diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> index d88c49a..11c929c 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> @@ -78,7 +78,18 @@ get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
>
>      if (slot == -1) {
>         /* This attribute does not exist in the VUE--that means that the vertex
> -       * shader did not write to it.  Behavior is undefined in this case, so
> +       * shader did not write to it.  This means that either:
> +       *
> +       * (a) This attribute is a texture coordinate, and it is going to be
> +       * replaced with point coordinates (as a consequence of a call to
> +       * glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)), so the
> +       * hardware will ignore whatever attribute override we supply.
> +       *
> +       * (b) This attribute is read by the fragment shader but not written by
> +       * the vertex shader, so its value is undefined.  Therefore the
> +       * attribute override we supply doesn't matter.
> +       *
> +       * In either case the attribute override we supply doesn't matter, so
>          * just reference the first available attribute.
>          */
>         return 0;
>



More information about the mesa-dev mailing list