[Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
Matt Turner
mattst88 at gmail.com
Mon Jan 23 22:58:22 UTC 2017
On Mon, Jan 23, 2017 at 2:59 AM, Samuel Iglesias Gonsálvez
<siglesias at igalia.com> wrote:
> On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>> > In commit c35fa7a, we changed the "width" of DF source registers to
>> > 2,
>> > which is conceptually fine. Unfortunately a VertStride of 2 is not
>> > allowed by align16 instructions on IVB/BYT, and the regular
>> > VertStride
>> > of 4 works fine in any case.
>> >
>>
>> I'll try to throw some light onto why this works -- AFAIUI, in
>> Align16
>> mode the vertical stride control doesn't behave as you would expect
>> -- A
>> VertStride=0 does behave as expected and steps zero elements across
>> rows
>> (modulo instruction decompression bugs), but on Gen7 any other value
>> simply behaves as "step by a fixed offset of 16B across rows". On
>> HSW
>> they explicitly allowed VertStride=2, but I don't think the hardware
>> became any smarter, it's still most likely a 16B fixed offset. On
>> IVB
>> neither VertStride=2 nor VertStride=4 is supposed to work for our
>> purposes (the former because it's supposedly not supported and the
>> latter because one would expect it to step by 4 DF elements = 32B per
>> 16B row), but they both likely work in practice. Anyway let's stick
>> to
>> what the docs say is not illegal, a couple more comments below.
>>
>> > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-
>> > functions/vs-round-double.shader_test
>> > for example:
>> >
>> > cmp.ge.f0(8) g18<1>DF g1<0>.xyxyDF -g8<2>DF {
>> > align16 1Q };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
>> > allowed
>> > cmp.ge.f0(8) g19<1>DF g1<0>.xyxyDF -g9<2>DF {
>> > align16 2N };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
>> > allowed
>> > ---
>> > src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++++++++++++++----
>> > 1 file changed, 14 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > index 888f95e..a01083f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
>> > *inst, struct brw_reg reg)
>> > /* This is an oddity of the fact we're using the same
>> > * descriptions for registers in align_16 as align_1:
>> > */
>>
>> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
>> nobody gets confused and thinks that the code you added has to do
>> with
>> our representation of align_16 regions?
>>
>> > - if (reg.vstride == BRW_VERTICAL_STRIDE_8)
>> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>> > brw_inst_set_src0_vstride(devinfo, inst,
>> > BRW_VERTICAL_STRIDE_4);
>> > - else
>> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
>> > + reg.type == BRW_REGISTER_TYPE_DF &&
>> > + reg.vstride >= BRW_VERTICAL_STRIDE_1) {
>>
>> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF
>> Align16
>> regions with VertStride=4 really behave like VertStride=2. If the
>> caller expected anything else it's not going to get it.
>>
>> Maybe copy-paste the relevant spec text here and below to explain why
>> we
>> only use BRW_VERTICAL_STRIDE_4?
>>
>
> Matt, I can handle these changes... however, I have not found the
> relevant spec quote. Can you provide it?
I could not find anything useful in the IVB PRM. The HSW PRM has the
second quote below.
In the internal documentation, it says for DevSNB "For Align16 access
mode, only encodings of 0000 and 0011 are allowed. Other codes are
reserved."
and for DevHSW "For Align16 access mode, only encodings of 0000, 0010
and 0011 are allowed. Other codes are reserved."
Presumably the DevSNB behavior applies to IVB as well.
Thanks for handling this.
More information about the mesa-dev
mailing list