[Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Mon Jan 23 10:59:11 UTC 2017
On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
> > In commit c35fa7a, we changed the "width" of DF source registers to
> > 2,
> > which is conceptually fine. Unfortunately a VertStride of 2 is not
> > allowed by align16 instructions on IVB/BYT, and the regular
> > VertStride
> > of 4 works fine in any case.
> >
>
> I'll try to throw some light onto why this works -- AFAIUI, in
> Align16
> mode the vertical stride control doesn't behave as you would expect
> -- A
> VertStride=0 does behave as expected and steps zero elements across
> rows
> (modulo instruction decompression bugs), but on Gen7 any other value
> simply behaves as "step by a fixed offset of 16B across rows". On
> HSW
> they explicitly allowed VertStride=2, but I don't think the hardware
> became any smarter, it's still most likely a 16B fixed offset. On
> IVB
> neither VertStride=2 nor VertStride=4 is supposed to work for our
> purposes (the former because it's supposedly not supported and the
> latter because one would expect it to step by 4 DF elements = 32B per
> 16B row), but they both likely work in practice. Anyway let's stick
> to
> what the docs say is not illegal, a couple more comments below.
>
> > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-
> > functions/vs-round-double.shader_test
> > for example:
> >
> > cmp.ge.f0(8) g18<1>DF g1<0>.xyxyDF -g8<2>DF {
> > align16 1Q };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
> > allowed
> > cmp.ge.f0(8) g19<1>DF g1<0>.xyxyDF -g9<2>DF {
> > align16 2N };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
> > allowed
> > ---
> > src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++++++++++++++----
> > 1 file changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 888f95e..a01083f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> > /* This is an oddity of the fact we're using the same
> > * descriptions for registers in align_16 as align_1:
> > */
>
> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
> nobody gets confused and thinks that the code you added has to do
> with
> our representation of align_16 regions?
>
> > - if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> > brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > - else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > + reg.type == BRW_REGISTER_TYPE_DF &&
> > + reg.vstride >= BRW_VERTICAL_STRIDE_1) {
>
> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF
> Align16
> regions with VertStride=4 really behave like VertStride=2. If the
> caller expected anything else it's not going to get it.
>
> Maybe copy-paste the relevant spec text here and below to explain why
> we
> only use BRW_VERTICAL_STRIDE_4?
>
Matt, I can handle these changes... however, I have not found the
relevant spec quote. Can you provide it?
Sam
> > + brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> > brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
> > + }
> > }
> > }
> > }
> > @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> > /* This is an oddity of the fact we're using the same
> > * descriptions for registers in align_16 as align_1:
> > */
> > - if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> > brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > - else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > + reg.type == BRW_REGISTER_TYPE_DF &&
> > + reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> > + brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> > brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
> > + }
> > }
> > }
> > }
> > --
> > 2.7.3
More information about the mesa-dev
mailing list