[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions
Iago Toral
itoral at igalia.com
Fri Mar 1 11:53:35 UTC 2019
On Fri, 2019-03-01 at 09:39 +0100, Iago Toral wrote:
> On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
> > Iago Toral <itoral at igalia.com> writes:
> >
> > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
> > > > Iago Toral <itoral at igalia.com> writes:
> > > >
> > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez wrote:
> > > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
> > > > > >
> > > > > > > ---
> > > > > > > src/intel/compiler/brw_eu_validate.c | 64
> > > > > > > ++++++++++++-
> > > > > > > src/intel/compiler/test_eu_validate.cpp | 122
> > > > > > > ++++++++++++++++++++++++
> > > > > > > 2 files changed, 185 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
> > > > > > > b/src/intel/compiler/brw_eu_validate.c
> > > > > > > index 000a05cb6ac..203641fecb9 100644
> > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
> > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
> > > > > > > @@ -531,7 +531,69 @@
> > > > > > > general_restrictions_based_on_operand_types(const struct
> > > > > > > gen_device_info *devinf
> > > > > > > exec_type_size == 8 && dst_type_size == 4)
> > > > > > > dst_type_size = 8;
> > > > > > >
> > > > > > > - if (exec_type_size > dst_type_size) {
> > > > > > > + /* From the BDW+ PRM:
> > > > > > > + *
> > > > > > > + * "There is no direct conversion from HF to DF or
> > > > > > > DF to
> > > > > > > HF.
> > > > > > > + * There is no direct conversion from HF to Q/UQ
> > > > > > > or
> > > > > > > Q/UQ to
> > > > > > > HF."
> > > > > > > + */
> > > > > > > + enum brw_reg_type src0_type =
> > > > > > > brw_inst_src0_type(devinfo,
> > > > > > > inst);
> > > > > > > + ERROR_IF(brw_inst_opcode(devinfo, inst) ==
> > > > > > > BRW_OPCODE_MOV
> > > > > > > &&
> > > > > >
> > > > > > Why is only the MOV instruction handled here and
> > > > > > below? Aren't
> > > > > > other
> > > > > > instructions able to do implicit conversions? Probably
> > > > > > means
> > > > > > you
> > > > > > need
> > > > > > to deal with two sources rather than one.
> > > > >
> > > > > This comes from the programming notes of the MOV instruction
> > > > > (Volume
> > > > > 2a, Command Reference - Instructions - MOV), so it is
> > > > > described
> > > > > specifically for the MOV instruction. I should probably have
> > > > > made
> > > > > this
> > > > > clear in the comment.
> > > > >
> > > >
> > > > Maybe the one above is specified in the MOV page only, probably
> > > > due
> > > > to
> > > > an oversight (If these restrictions were really specific to the
> > > > MOV
> > > > instruction, what would prevent you from implementing such
> > > > conversions
> > > > through a different instruction? E.g. "ADD dst:df, src:hf, 0"
> > > > which
> > > > would be substantially more efficient than what you're doing in
> > > > PATCH
> > > > 02)
> > >
> > > Instructions that take HF can only be strictly HF or mix F and HF
> > > (mixed mode float), with MOV being the only exception. That means
> > > that
> > > any instruction like the one you use above are invalid. Maybe we
> > > should
> > > validate explicitly that instructions that use HF are strictly HF
> > > or
> > > mixed-float mode only?
> > >
> >
> > So you're acknowledging that the conversions checked above are
> > illegal
> > whether the instruction is a MOV or something else. Where are we
> > preventing instructions other than MOV with such conversions from
> > being
> > accepted by the validator?
>
> We aren't, because the validator is not checking, in general, for
> accepted type combinations for specific instructions anywhere as far
> as
> I can see. If we want to start doing this with HF conversions
> specifically, I guess it is fine, but in case it is not clear, I
> think
> it won't bring any immediate benefits with the
> VK_KHR_shader_float16_int8 implementation since this series only ever
> emits conversions involving HF operands via MOV instructions, which
> is
> why I thought that validating that no direct MOV conversions from
> DF/Q
> types ever happen was useful, since we have code in the driver to
> handle this scenario.
>
> > > > but I see other restriction checks in this patch which are
> > > > certainly specified in the generic regioning restrictions page
> > > > and
> > > > you're limiting to the MOV instruction...
> > >
> > > There are two rules below:
> > >
> > > 1. The one about conversions between integer and half-float.
> > > Again,
> > > these can only happen through MOV for the same reasons, so I
> > > think
> > > this
> > > one should be fine.
> > >
> >
> > Why do you think that can only happen through a MOV
> > instruction? The
> > hardware spec lists the following as a valid example in the
> > register
> > region restrictions page:
> >
> > > add (8) r10.0<2>:hf r11.0<8;8,1>:w r12.0<8;8,1>:w
>
> I thought these instructions were not valid. I looked again at the
> BDW
> PRM and indeed, it lists that HF destination for ADD is only allowed
> with HF source. The same for all other instructions that can take HF
> except (S)MOV. However, I have just now realized that the SKL+ PRM is
> different and it allows ADD to do this, but as far as I can see, ADD
> is
> the only exception: you cannot do this with MUL, MAD or MATH for
> example. I am not sure what to make of this, but I guess that if we
> want to validate this it doesn't hurt to be safe and assume that we
> can
> have more instructions doing implicit conversions.
>
> If course, there is the fact that we never emit these kind of
> conversions anyway, at least not with this series (we don't even use
> mixed-float mode) but I guess it is fine to add that validation if
> only
> to be safe in the future if we ever decide to start using the extra
> flexibility allowed by the hardware.
I am just now realizing that in Align16 mode, we always expect
destination horizontal stride to be exactly 1, no matter the type. This
means that Align16 mixed float instructions with HF destination
(implicit conversion from F->HF) would also require this (actually the
docs make this requirement explicit for mixed float instructions
anyway). This would make it impossible to satisfy the rule for DWord
alignment on F->HF conversions for implicit conversions in mixed float
mode instructions. I guess these conversion restrictions are formulated
specifically for Align1 and should be skipped for Align16, even if that
is not explicitly stated in the docs. Does that make sense?
>
> > > 2. The one about word destinations (of which we are only really
> > > implementing conversions from F->HF). Here the rule is more
> > > generic
> > > and
> > > I agree that expanding this to include any other mixed float mode
> > > instruction would make sense. However, validation for mixed float
> > > mode
> > > has its own set rules, some of which are incompatible with the
> > > general
> > > region restrictions being validated here, so I think it is
> > > inconvenient
> > > to try and do that validation here (see patch 36 and then patch
> > > 37).
> > > What I would propose, if you agree, is that we only implement
> > > this
> > > for
> > > MOV here, and then for mixed float mode instructions, we
> > > implement
> > > the
> > > more generic version of this check (that would then go in patch
> > > 37).
> > > How does that sound?
> > >
> >
> > I still don't understand why you want to implement the same
> > restriction
> > twice, once for MOV and once for all other mixed-mode
> > instructions. How
> > is that more convenient?
>
> The restrictions based on operand types that are checked in the
> validator are specific to Byte or cases where the execution type is
> larger than the destination stride, for which mixed float has a
> different set of restrictions.
>
> For example, for mixed float we have this rule:
>
> "In Align1, destination stride can be smaller than execution type"
>
> Which overrides this other from "General restrictions based on
> operand
> types":
>
> "Destination stride must be equal to the ratio of the sizes of the
> execution data type to the destination type"
>
> So I thought that it would make things easier to keep all
> restrictions
> for mixed float separate and make sure that we skipped mixed float
> instructions in general_restrictions_based_on_operand_types() so we
> avoid having to add code to skip individual general restrictions that
> that are overriden for mixed float mode anyway.
>
> Anyway, I'll try to rework the patches to do more generic validation
> of
> HF conversions like you ask and see what kind of code we end up with.
>
> > > > > > > + ((dst_type == BRW_REGISTER_TYPE_HF &&
> > > > > > > type_sz(src0_type) == 8) ||
> > > > > > > + (dst_type_size == 8 && src0_type ==
> > > > > > > BRW_REGISTER_TYPE_HF)),
> > > > > > > + "There are no direct conversion between 64-
> > > > > > > bit
> > > > > > > types
> > > > > > > and HF");
> > > > > > > +
> > > > > > > + /* From the BDW+ PRM:
> > > > > > > + *
> > > > > > > + * "Conversion between Integer and HF (Half Float)
> > > > > > > must
> > > > > > > be
> > > > > > > + * DWord-aligned and strided by a DWord on the
> > > > > > > destination."
> > > > > > > + *
> > > > > > > + * But this seems to be expanded on CHV and SKL+ by:
> > > > > > > + *
> > > > > > > + * "There is a relaxed alignment rule for word
> > > > > > > destinations.
> > > > > > > When
> > > > > > > + * the destination type is word (UW, W, HF),
> > > > > > > destination
> > > > > > > data types
> > > > > > > + * can be aligned to either the lowest word or the
> > > > > > > second
> > > > > > > lowest
> > > > > > > + * word of the execution channel. This means the
> > > > > > > destination
> > > > > > > data
> > > > > > > + * words can be either all in the even word
> > > > > > > locations or
> > > > > > > all
> > > > > > > in the
> > > > > > > + * odd word locations."
> > > > > > > + *
> > > > > > > + * We do not implement the second rule as is though,
> > > > > > > since
> > > > > > > empirical testing
> > > > > > > + * shows inconsistencies:
> > > > > > > + * - It suggests that packed 16-bit is not allowed,
> > > > > > > which
> > > > > > > is
> > > > > > > not true.
> > > > > > > + * - It suggests that conversions from Q/DF to W
> > > > > > > (which
> > > > > > > need
> > > > > > > to be 64-bit
> > > > > > > + * aligned on the destination) are not possible,
> > > > > > > which
> > > > > > > is
> > > > > > > not true.
> > > > > > > + * - It suggests that conversions from 16-bit
> > > > > > > executions
> > > > > > > types to W need
> > > > > > > + * to be 32-bit aligned, which doesn't seem to be
> > > > > > > necessary.
> > > > > > > + *
> > > > > > > + * So from this rule we only validate the implication
> > > > > > > that
> > > > > > > conversion from
> > > > > > > + * F to HF needs to be DWord aligned too (in BDW this
> > > > > > > is
> > > > > > > limited to
> > > > > > > + * conversions from integer types).
> > > > > > > + */
> > > > > > > + bool is_half_float_conversion =
> > > > > > > + brw_inst_opcode(devinfo, inst) == BRW_OPCODE_MOV
> > > > > > > &&
> > > > > > > + dst_type != src0_type &&
> > > > > > > + (dst_type == BRW_REGISTER_TYPE_HF || src0_type ==
> > > > > > > BRW_REGISTER_TYPE_HF);
> > > > > > > +
> > > > > > > + if (is_half_float_conversion) {
> > > > > > > + assert(devinfo->gen >= 8);
> > > > > > > +
> > > > > > > + if ((dst_type == BRW_REGISTER_TYPE_HF &&
> > > > > > > brw_reg_type_is_integer(src0_type)) ||
> > > > > > > + (brw_reg_type_is_integer(dst_type) &&
> > > > > > > src0_type
> > > > > > > ==
> > > > > > > BRW_REGISTER_TYPE_HF)) {
> > > > > > > + ERROR_IF(dst_stride * dst_type_size != 4,
> > > > > > > + "Conversions between integer and half-
> > > > > > > float
> > > > > > > must
> > > > > > > be strided "
> > > > > > > + "by a DWord on the destination");
> > > > > > > +
> > > > > > > + unsigned subreg =
> > > > > > > brw_inst_dst_da1_subreg_nr(devinfo,
> > > > > > > inst);
> > > > > > > + ERROR_IF(subreg % 4 != 0,
> > > > > > > + "Conversions between integer and half-
> > > > > > > float
> > > > > > > must
> > > > > > > be aligned "
> > > > > > > + "to a DWord on the destination");
> > > > > > > + } else if ((devinfo->is_cherryview || devinfo->gen
> > > > > > > > = 9)
> > > > > > >
> > > > > > > &&
> > > > > > > + dst_type == BRW_REGISTER_TYPE_HF) {
> > > > > > > + ERROR_IF(dst_stride != 2,
> > > > > > > + "Conversions to HF must have either
> > > > > > > all
> > > > > > > words in
> > > > > > > even word "
> > > > > > > + "locations or all words in odd word
> > > > > > > locations");
> > > > > > > + }
> > > > > > > +
> > > > > > > + } else if (exec_type_size > dst_type_size) {
> > > > > > > if (!(dst_type_is_byte &&
> > > > > > > inst_is_raw_move(devinfo,
> > > > > > > inst)))
> > > > > > > {
> > > > > > > ERROR_IF(dst_stride * dst_type_size !=
> > > > > > > exec_type_size,
> > > > > > > "Destination stride must be equal to
> > > > > > > the
> > > > > > > ratio
> > > > > > > of the sizes "
> > > > > > > diff --git a/src/intel/compiler/test_eu_validate.cpp
> > > > > > > b/src/intel/compiler/test_eu_validate.cpp
> > > > > > > index 73300b23122..1557b6d2452 100644
> > > > > > > --- a/src/intel/compiler/test_eu_validate.cpp
> > > > > > > +++ b/src/intel/compiler/test_eu_validate.cpp
> > > > > > > @@ -848,6 +848,128 @@ TEST_P(validation_test,
> > > > > > > byte_destination_relaxed_alignment)
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > > +TEST_P(validation_test, half_float_conversion)
> > > > > > > +{
> > > > > > > + static const struct {
> > > > > > > + enum brw_reg_type dst_type;
> > > > > > > + enum brw_reg_type src_type;
> > > > > > > + unsigned dst_stride;
> > > > > > > + unsigned dst_subnr;
> > > > > > > + bool expected_result;
> > > > > > > + } inst[] = {
> > > > > > > +#define INST(dst_type, src_type, dst_stride, dst_subnr,
> > > > > > > expected_result) \
> > > > > > > + {
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + BRW_REGISTER_TYPE_##dst_type,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + BRW_REGISTER_TYPE_##src_type,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + BRW_HORIZONTAL_STRIDE_##dst_stride,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + dst_subnr,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + expected_result,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > \
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* MOV to half-float destination (F source handled
> > > > > > > later) */
> > > > > > > + INST(HF, B, 1, 0, false),
> > > > > > > + INST(HF, W, 1, 0, false),
> > > > > > > + INST(HF, HF, 1, 0, true),
> > > > > > > + INST(HF, HF, 1, 2, true),
> > > > > > > + INST(HF, D, 1, 0, false),
> > > > > > > + INST(HF, Q, 1, 0, false),
> > > > > > > + INST(HF, DF, 1, 0, false),
> > > > > > > + INST(HF, B, 2, 0, true),
> > > > > > > + INST(HF, B, 2, 2, false),
> > > > > > > + INST(HF, W, 2, 0, true),
> > > > > > > + INST(HF, W, 2, 2, false),
> > > > > > > + INST(HF, HF, 2, 0, true),
> > > > > > > + INST(HF, HF, 2, 2, true),
> > > > > > > + INST(HF, D, 2, 0, true),
> > > > > > > + INST(HF, D, 2, 2, false),
> > > > > > > + INST(HF, Q, 2, 0, false),
> > > > > > > + INST(HF, DF, 2, 0, false),
> > > > > > > + INST(HF, B, 4, 0, false),
> > > > > > > + INST(HF, W, 4, 0, false),
> > > > > > > + INST(HF, HF, 4, 0, true),
> > > > > > > + INST(HF, HF, 4, 2, true),
> > > > > > > + INST(HF, D, 4, 0, false),
> > > > > > > + INST(HF, Q, 4, 0, false),
> > > > > > > + INST(HF, DF, 4, 0, false),
> > > > > > > +
> > > > > > > + /* MOV from half-float source */
> > > > > > > + INST( B, HF, 1, 0, false),
> > > > > > > + INST( W, HF, 1, 0, false),
> > > > > > > + INST( D, HF, 1, 0, true),
> > > > > > > + INST( D, HF, 1, 4, true),
> > > > > > > + INST( F, HF, 1, 0, true),
> > > > > > > + INST( F, HF, 1, 4, true),
> > > > > > > + INST( Q, HF, 1, 0, false),
> > > > > > > + INST(DF, HF, 1, 0, false),
> > > > > > > + INST( B, HF, 2, 0, false),
> > > > > > > + INST( W, HF, 2, 0, true),
> > > > > > > + INST( W, HF, 2, 2, false),
> > > > > > > + INST( D, HF, 2, 0, false),
> > > > > > > + INST( F, HF, 2, 0, true),
> > > > > > > + INST( F, HF, 2, 2, true),
> > > > > > > + INST( B, HF, 4, 0, true),
> > > > > > > + INST( B, HF, 4, 1, false),
> > > > > > > + INST( W, HF, 4, 0, false),
> > > > > > > +
> > > > > > > +#undef INST
> > > > > > > + };
> > > > > > > +
> > > > > > > + if (devinfo.gen < 8)
> > > > > > > + return;
> > > > > > > +
> > > > > > > + for (unsigned i = 0; i < sizeof(inst) /
> > > > > > > sizeof(inst[0]);
> > > > > > > i++) {
> > > > > > > + if (!devinfo.has_64bit_types &&
> > > > > > > + (type_sz(inst[i].src_type) == 8 ||
> > > > > > > type_sz(inst[i].dst_type) == 8)) {
> > > > > > > + continue;
> > > > > > > + }
> > > > > > > +
> > > > > > > + brw_MOV(p, retype(g0, inst[i].dst_type),
> > > > > > > retype(g0,
> > > > > > > inst[i].src_type));
> > > > > > > +
> > > > > > > + brw_inst_set_exec_size(&devinfo, last_inst,
> > > > > > > BRW_EXECUTE_4);
> > > > > > > +
> > > > > > > + brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > > inst[i].dst_stride);
> > > > > > > + brw_inst_set_dst_da1_subreg_nr(&devinfo,
> > > > > > > last_inst,
> > > > > > > inst[i].dst_subnr);
> > > > > > > +
> > > > > > > + if (inst[i].src_type == BRW_REGISTER_TYPE_B) {
> > > > > > > + brw_inst_set_src0_vstride(&devinfo, last_inst,
> > > > > > > BRW_VERTICAL_STRIDE_4);
> > > > > > > + brw_inst_set_src0_width(&devinfo, last_inst,
> > > > > > > BRW_WIDTH_2);
> > > > > > > + brw_inst_set_src0_hstride(&devinfo, last_inst,
> > > > > > > BRW_HORIZONTAL_STRIDE_2);
> > > > > > > + } else {
> > > > > > > + brw_inst_set_src0_vstride(&devinfo, last_inst,
> > > > > > > BRW_VERTICAL_STRIDE_4);
> > > > > > > + brw_inst_set_src0_width(&devinfo, last_inst,
> > > > > > > BRW_WIDTH_4);
> > > > > > > + brw_inst_set_src0_hstride(&devinfo, last_inst,
> > > > > > > BRW_HORIZONTAL_STRIDE_1);
> > > > > > > + }
> > > > > > > +
> > > > > > > + EXPECT_EQ(inst[i].expected_result, validate(p));
> > > > > > > +
> > > > > > > + clear_instructions(p);
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* Conversion from F to HF has Dword destination
> > > > > > > alignment
> > > > > > > restriction
> > > > > > > + * on CHV and SKL+ only
> > > > > > > + */
> > > > > > > + brw_MOV(p, retype(g0, BRW_REGISTER_TYPE_HF),
> > > > > > > + retype(g0, BRW_REGISTER_TYPE_F));
> > > > > > > +
> > > > > > > + brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > > BRW_HORIZONTAL_STRIDE_1);
> > > > > > > +
> > > > > > > + if (devinfo.gen >= 9 || devinfo.is_cherryview) {
> > > > > > > + EXPECT_FALSE(validate(p));
> > > > > > > + } else {
> > > > > > > + assert(devinfo.gen == 8);
> > > > > > > + EXPECT_TRUE(validate(p));
> > > > > > > + }
> > > > > > > + clear_instructions(p);
> > > > > > > +
> > > > > > > + brw_MOV(p, retype(g0, BRW_REGISTER_TYPE_HF),
> > > > > > > + retype(g0, BRW_REGISTER_TYPE_F));
> > > > > > > +
> > > > > > > + brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > > BRW_HORIZONTAL_STRIDE_2);
> > > > > > > +
> > > > > > > + EXPECT_TRUE(validate(p));
> > > > > > > + clear_instructions(p);
> > > > > > > +}
> > > > > > > +
> > > > > > > TEST_P(validation_test,
> > > > > > > vector_immediate_destination_alignment)
> > > > > > > {
> > > > > > > static const struct {
> > > > > > > --
> > > > > > > 2.17.1
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > mesa-dev mailing list
> > > > > > > mesa-dev at lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list