[Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

Francisco Jerez currojerez at riseup.net
Fri Jan 20 22:25:29 UTC 2017


Matt Turner <mattst88 at gmail.com> writes:

> In commit c35fa7a, we changed the "width" of DF source registers to 2,
> which is conceptually fine. Unfortunately a VertStride of 2 is not
> allowed by align16 instructions on IVB/BYT, and the regular VertStride
> of 4 works fine in any case.
>

I'll try to throw some light onto why this works -- AFAIUI, in Align16
mode the vertical stride control doesn't behave as you would expect -- A
VertStride=0 does behave as expected and steps zero elements across rows
(modulo instruction decompression bugs), but on Gen7 any other value
simply behaves as "step by a fixed offset of 16B across rows".  On HSW
they explicitly allowed VertStride=2, but I don't think the hardware
became any smarter, it's still most likely a 16B fixed offset.  On IVB
neither VertStride=2 nor VertStride=4 is supposed to work for our
purposes (the former because it's supposedly not supported and the
latter because one would expect it to step by 4 DF elements = 32B per
16B row), but they both likely work in practice.  Anyway let's stick to
what the docs say is not illegal, a couple more comments below.

> See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
> for example:
>
> cmp.ge.f0(8)    g18<1>DF        g1<0>.xyxyDF    -g8<2>DF        { align16 1Q };
>         ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> cmp.ge.f0(8)    g19<1>DF        g1<0>.xyxyDF    -g9<2>DF        { align16 2N };
>         ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 888f95e..a01083f 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)
>  	 /* This is an oddity of the fact we're using the same
>  	  * descriptions for registers in align_16 as align_1:
>  	  */

Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
nobody gets confused and thinks that the code you added has to do with
our representation of align_16 regions?

> -	 if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> +         if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>              brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> -	 else
> +         } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +                    reg.type == BRW_REGISTER_TYPE_DF &&
> +                    reg.vstride >= BRW_VERTICAL_STRIDE_1) {

I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF Align16
regions with VertStride=4 really behave like VertStride=2.  If the
caller expected anything else it's not going to get it.

Maybe copy-paste the relevant spec text here and below to explain why we
only use BRW_VERTICAL_STRIDE_4?

> +            brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> +         } else {
>              brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
> +         }
>        }
>     }
>  }
> @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)
>  	 /* This is an oddity of the fact we're using the same
>  	  * descriptions for registers in align_16 as align_1:
>  	  */
> -	 if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> +         if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>              brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> -	 else
> +         } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +                    reg.type == BRW_REGISTER_TYPE_DF &&
> +                    reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> +            brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> +         } else {
>              brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
> +         }
>        }
>     }
>  }
> -- 
> 2.7.3
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170120/992f0335/attachment.sig>


More information about the mesa-dev mailing list