[Mesa-stable] [PATCH 1/3] i965/vec4: fix vertical stride to avoid breaking region parameter rule

Andres Gomez agomez at igalia.com
Fri May 5 09:46:48 UTC 2017


Samuel, would it make sense to pick this series for 17.0 too?

Br.

On Wed, 2017-04-26 at 13:57 +0200, Samuel Iglesias Gonsálvez wrote:
> From IVB PRM, vol4, part3, "General Restrictions on Regioning
> Parameters":
> 
>   "If ExecSize = Width and HorzStride ≠ 0, VertStride must
>    be set to Width * HorzStride."
> 
> In the next patch, we are going to modify the region parameters
> for uniforms and VGRFs. Uniforms that are sources of DF align1
> instructions will have <0, 4, 1> regioning, and the execsize of
> those instructions will be 4, so they will break the regioning
> rule. The same applies to VGRF sources where we use the
> vstride == 0 exploit.
> 
> As we know we are not going to cross the GRF boundary with that
> execsize and those parameters (not even with the exploit), we just
> fix the vstride here.
> 
> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> Cc: "17.1" <mesa-stable at lists.freedesktop.org>
> ---
>  src/intel/compiler/brw_reg.h    | 15 +++++++++++++++
>  src/intel/compiler/brw_vec4.cpp | 19 +++++++++++++++++++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
> index 17a51fbd655..24e09a84fce 100644
> --- a/src/intel/compiler/brw_reg.h
> +++ b/src/intel/compiler/brw_reg.h
> @@ -914,6 +914,21 @@ static inline unsigned cvt(unsigned val)
>     return 0;
>  }
>  
> +static inline unsigned inv_cvt(unsigned val)
> +{
> +   switch (val) {
> +   case 0: return 0;
> +   case 1: return 1;
> +   case 2: return 2;
> +   case 3: return 4;
> +   case 4: return 8;
> +   case 5: return 16;
> +   case 6: return 32;
> +   }
> +   return 0;
> +}
> +
> +
>  static inline struct brw_reg
>  stride(struct brw_reg reg, unsigned vstride, unsigned width, unsigned hstride)
>  {
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index f9b805ea5a9..95f96ea69c0 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -38,6 +38,8 @@ using namespace brw;
>  
>  namespace brw {
>  
> +static bool is_align1_df(vec4_instruction *inst);
> +
>  void
>  src_reg::init()
>  {
> @@ -2049,6 +2051,23 @@ vec4_visitor::convert_to_hw_regs()
>  
>           apply_logical_swizzle(&reg, inst, i);
>           src = reg;
> +
> +         /* From IVB PRM, vol4, part3, "General Restrictions on Regioning
> +          * Parameters":
> +          *
> +          *   "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set
> +          *    to Width * HorzStride."
> +          *
> +          * DF sources of DF align1 instructions can break this rule,
> +          * because both exec_size and width are 4. As we know we are
> +          * not accessing the next GRF, it is safe to set vstride to
> +          * the value given by the rule itself.
> +          */
> +         if (is_align1_df(inst) && inst->exec_size == inv_cvt(src.width + 1)) {
> +            const unsigned width = inv_cvt(src.width + 1);
> +            const unsigned hstride = inv_cvt(src.hstride);
> +            src.vstride = cvt(width * hstride);
> +         }
>        }
>  
>        if (inst->is_3src(devinfo)) {
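
For readers following the encoding arithmetic in the hunk above, here is a
minimal standalone sketch of the same computation. It is not the Mesa code:
struct region and fix_vstride() are hypothetical names used only for
illustration, cvt() is assumed to be the exact inverse of the inv_cvt()
table in the patch, and the width field is assumed to be stored as
cvt(width) - 1, which is what the inv_cvt(src.width + 1) call implies.
Unlike the patch, the sketch also checks HorzStride != 0 explicitly, as the
PRM rule is worded.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a stride/width value into its hardware field.
 * Assumption: exact inverse of the inv_cvt() table from the patch. */
static unsigned cvt(unsigned val)
{
   switch (val) {
   case 0:  return 0;
   case 1:  return 1;
   case 2:  return 2;
   case 4:  return 3;
   case 8:  return 4;
   case 16: return 5;
   case 32: return 6;
   }
   return 0;
}

/* Decode a hardware field back into its value (table taken from the patch). */
static unsigned inv_cvt(unsigned val)
{
   switch (val) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 3: return 4;
   case 4: return 8;
   case 5: return 16;
   case 6: return 32;
   }
   return 0;
}

/* Hypothetical stand-in for the encoded region fields of a source register. */
struct region {
   unsigned vstride; /* encoded as cvt(vstride) */
   unsigned width;   /* encoded as cvt(width) - 1 (assumption) */
   unsigned hstride; /* encoded as cvt(hstride) */
};

/* Apply "if ExecSize = Width and HorzStride != 0, VertStride must be
 * Width * HorzStride" to an encoded region. */
static void fix_vstride(struct region *r, unsigned exec_size)
{
   assert(r != NULL);

   const unsigned width = inv_cvt(r->width + 1);
   const unsigned hstride = inv_cvt(r->hstride);

   if (exec_size == width && hstride != 0)
      r->vstride = cvt(width * hstride);
}
```

With these assumptions, a <0, 4, 1> region on an instruction with
exec_size 4 is rewritten to <4, 4, 1>, while a region whose width differs
from exec_size is left untouched.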
-- 
Br,

Andres

