[Mesa-dev] [PATCH v2] i965: Fix scalar VS float[] and vec2[] output arrays.

Jason Ekstrand jason at jlekstrand.net
Tue Nov 3 20:37:52 PST 2015


On Tue, Nov 3, 2015 at 7:32 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> The scalar VS backend has never handled float[] and vec2[] outputs
> correctly (my original code was broken).  Outputs need to be padded
> out to vec4 slots.
>
> In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot
> by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4.  However,
> this is wrong: type_size_scalar() for a float[2] would return 2, or
> for vec2[2] it would return 4.  This looked like a single slot, even
> though in reality each array element would be stored in separate vec4
> slots.
>
> Because of this bug, outputs[] and output_components[] would not get
> initialized for the second element's VARYING_SLOT, which meant
> emit_urb_writes() would skip writing them.  Nothing used those values,
> and dead code elimination threw a party.
>
> To fix this, we introduce a new type_size_vec4_times_4() function which
> pads array elements correctly, but still counts in scalar components,
> generating correct indices in store_output intrinsics.
>
> Normally, varying packing avoids this problem by turning varyings into
> vec4s.  So this doesn't actually fix any Piglit or dEQP tests today.
> However, if varying packing is disabled, things would be broken.
> Tessellation shaders can't use varying packing, so this fixes various
> tcs-input Piglit tests on a branch of mine.
>
> v2: Shorten the implementation of type_size_4x to a single line (caught
>     by Connor Abbott), and rename it to type_size_vec4_times_4()

That's a good name.

Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>

>     (renaming suggested by Jason Ekstrand).  Use type_size_vec4
>     rather than using type_size_vec4_times_4 and then dividing by 4.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp     | 13 +++++++++++++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
>  src/mesa/drivers/dri/i965/brw_nir.c      |  3 ++-
>  src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
>  4 files changed, 17 insertions(+), 2 deletions(-)
>
> I don't know why I thought this was different than 4 * type_size_vec4,
> but it's exactly that...too close to the problem I suppose :)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4cc9626..3efa415 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -514,6 +514,19 @@ type_size_scalar(const struct glsl_type *type)
>  }
>
>  /**
> + * Returns the number of scalar components needed to store type, assuming
> + * that vectors are padded out to vec4.
> + *
> + * This has the packing rules of type_size_vec4(), but counts components
> + * similar to type_size_scalar().
> + */
> +extern "C" int
> +type_size_vec4_times_4(const struct glsl_type *type)
> +{
> +   return 4 * type_size_vec4(type);
> +}
> +
> +/**
>   * Create a MOV to read the timestamp register.
>   *
>   * The caller is responsible for emitting the MOV.  The return value is
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index b6eab06..f3e6b48 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -104,7 +104,7 @@ fs_visitor::nir_setup_outputs()
>        switch (stage) {
>        case MESA_SHADER_VERTEX:
>        case MESA_SHADER_GEOMETRY:
> -         for (unsigned int i = 0; i < ALIGN(type_size_scalar(var->type), 4) / 4; i++) {
> +         for (int i = 0; i < type_size_vec4(var->type); i++) {
>              int output = var->data.location + i;
>              this->outputs[output] = offset(reg, bld, 4 * i);
>              this->output_components[output] = vector_elements;
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index a7a5eb5..dece208 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -150,7 +150,8 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
>     case MESA_SHADER_GEOMETRY:
>        if (is_scalar) {
>           nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
> -                                  type_size_scalar);
> +                                  type_size_vec4_times_4);
> +         nir_lower_io(nir, nir_var_shader_out, type_size_vec4_times_4);
>        } else {
>           nir_foreach_variable(var, &nir->outputs)
>              var->data.driver_location = var->data.location;
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h
> index 6a2dfc9..29baebf 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -277,6 +277,7 @@ bool brw_cs_precompile(struct gl_context *ctx,
>
>  int type_size_scalar(const struct glsl_type *type);
>  int type_size_vec4(const struct glsl_type *type);
> +int type_size_vec4_times_4(const struct glsl_type *type);
>
>  bool is_scalar_shader_stage(const struct brw_compiler *compiler, int stage);
>
> --
> 2.6.2
>


More information about the mesa-dev mailing list