[Mesa-dev] [PATCHv3 09/20] i965/fs: Get 64-bit indirect moves working on IVB.
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Mon Feb 13 06:15:11 UTC 2017
On Fri, 2017-02-10 at 10:44 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
>
> > On Thu, 2017-02-09 at 18:28 -0800, Francisco Jerez wrote:
> > > Francisco Jerez <currojerez at riseup.net> writes:
> > >
> > > > ---
> > > > This replaces "[PATCH v2 09/20] i965/fs: indirect addressing
> > > > with
> > > > doubles is not supported in IVB/BYT".
> > > >
> > >
> > > Note that some of the fp64 indirect addressing test-cases still
> > > fail
> > > on
> > > IVB even with this patch applied, but the reason doesn't seem to
> > > have
> > > anything to do with indirect addressing. Apparently the
> > > remaining
> > > failures are caused by illegal code like:
> > >
> > > > sel(8) g30<1>D g9<0,2,1>DF g8.3<0,2,1>DF
> > > > {
> > > > align1 1Q };
> > >
> > > The problem is that the SEL instruction doesn't support most
> > > datatype
> > > conversions, so this leads to data corruption on at least IVB.
> > > According to the hardware docs, the only valid destination type
> > > of
> > > SEL
> > > is DF if the execution type is DF, and F (or HF on CHV+) if the
> > > execution type is F. Integer 16 or 32-bit execution types seem
> > > to
> > > allow
> > > converting to other single-precision integer types.
> > >
> >
> > Some weeks ago I saw this issue while working in a solution for
> > IVB's
> > MOV INDIRECT. That SEL is added by opt_peephole_sel():
> >
> > https://lists.freedesktop.org/archives/mesa-dev/2017-January/142023
> > .htm
> > l
> >
> > However, I did it very restrictive by just don't allowing any
> > conversion between different data type sizes. I am going to improve
> > that patch to include all the allowed cases.
> >
>
> I wouldn't worry too much about the couple of supported type
> conversions, it doesn't seem terribly important, my concern with your
> patch is that it does seem to allow several type conversions which
> are
> invalid, like D->F, F->D, DF->Q, etc.
>
Right. I can just disallowing all the conversions (src.type !=
dst.type) in that pass... but I prefer the solutions suggested below.
> Aside from that I'm not particularly confident that your patch will
> fix
> all potential sources of invalid conversions. E.g. do we know that
> no
> other optimization pass will ever change the destination type of the
> SEL
> assuming that the initial type was correct? What about other
> instructions with double-to-single or single-to-double type
> conversion
> restrictions? (A quick look at the PRMs would likely reveal
> instructions
> with similar restrictions) -- I don't think we can answer these
> questions without auditing the rest of the compiler back-end (while
> doing this in a legalization pass would be pretty much obviously
> correct). Alternatively you could fix the EU validator to recognize
> this and other cases of invalid conversions, which would make sure we
> get an assertion failure in the likely case that we've missed
> something
> so we can find the problem quickly instead of spending hours
> debugging
> non-deterministic data corruption issues. I'm okay either way,
> validation or legalization pass, do it as you like.
>
OK, thanks.
Sam
> > > I'm not sure why we haven't seen this cause massive breakage on
> > > HSW+,
> > > but I think we need some sort of legalization pass to do the type
> > > conversion in a separate MOV for any instruction with an invalid
> > > destination conversion. You may be able to do it within the d2x
> > > pass
> > > but then it would make sense to rename it to something more
> > > appropriate
> > > like destination conversion lowering (lower_cvt? :P).
> > >
> >
> > :)
> >
> > > In addition there seem to be other issues with your d2x lowering
> > > code
> > > (I'm bringing this up here because you don't seem to have sent
> > > the
> > > d2x
> > > changes for review to the mailing list yet) -- It removes the
> > > original
> > > instruction and creates a new one with the same opcode and first
> > > few
> > > sources, which will miscompile the program if the original
> > > instruction
> > > had a larger number of sources or any other instruction controls
> > > set. I
> > > suggest you modify the original instruction in-place to point it
> > > to
> > > the
> > > temporary you've allocated, and then copy the data into the
> > > original
> > > destination by using a strided move.
> > >
> >
> > OK.
> >
> > Thanks,
> >
> > Sam
> >
> > > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27
> > > > ++++++++++++++++++++++++--
> > > > 1 file changed, 25 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > index ea4a3fe1399..0e2dbe23571 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > @@ -440,7 +440,7 @@ fs_generator::generate_mov_indirect(fs_inst
> > > > *inst,
> > > > brw_MOV(p, dst, reg);
> > > > } else {
> > > > /* Prior to Broadwell, there are only 8 address
> > > > registers.
> > > > */
> > > > - assert(inst->exec_size == 8 || devinfo->gen >= 8);
> > > > + assert(inst->exec_size <= 8 || devinfo->gen >= 8);
> > > >
> > > > /* We use VxH indirect addressing, clobbering a0.0
> > > > through
> > > > a0.7. */
> > > > struct brw_reg addr = vec8(brw_address_reg(0));
> > > > @@ -478,7 +478,30 @@
> > > > fs_generator::generate_mov_indirect(fs_inst
> > > > *inst,
> > > > * code, using it saves us 0 instructions and would
> > > > require
> > > > quite a bit
> > > > * of case-by-case work. It's just not worth it.
> > > > */
> > > > - brw_ADD(p, addr, indirect_byte_offset,
> > > > brw_imm_uw(imm_byte_offset));
> > > > + if (devinfo->gen >= 8 || devinfo->is_haswell ||
> > > > type_sz(reg.type) < 8) {
> > > > + brw_ADD(p, addr, indirect_byte_offset,
> > > > brw_imm_uw(imm_byte_offset));
> > > > + } else {
> > > > + /* IVB reads two address register components per
> > > > channel
> > > > for
> > > > + * indirectly addressed 64-bit sources, so we need to
> > > > initialize
> > > > + * adjacent address components to consecutive dwords
> > > > of
> > > > the source
> > > > + * region by emitting two separate ADD
> > > > instructions. Found
> > > > + * empirically.
> > > > + */
> > > > + assert(inst->exec_size <= 4);
> > > > + brw_push_insn_state(p);
> > > > + brw_set_default_exec_size(p, cvt(inst->exec_size) -
> > > > 1);
> > > > +
> > > > + brw_ADD(p, spread(addr, 2), indirect_byte_offset,
> > > > + brw_imm_uw(imm_byte_offset));
> > > > + brw_inst_set_no_dd_clear(devinfo, brw_last_inst,
> > > > true);
> > > > +
> > > > + brw_ADD(p, spread(suboffset(addr, 1), 2),
> > > > indirect_byte_offset,
> > > > + brw_imm_uw(imm_byte_offset + 4));
> > > > + brw_inst_set_no_dd_check(devinfo, brw_last_inst,
> > > > true);
> > > > +
> > > > + brw_pop_insn_state(p);
> > > > + }
> > > > +
> > > > struct brw_reg ind_src = brw_VxH_indirect(0, 0);
> > > >
> > > > brw_inst *mov = brw_MOV(p, dst, retype(ind_src,
> > > > dst.type));
> > > > --
> > > > 2.11.0
> > > >
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > mesa-dev at lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170213/6cf0888b/attachment-0001.sig>
More information about the mesa-dev
mailing list