[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

Fri Mar 1 08:39:11 UTC 2019

On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
> Iago Toral <itoral at igalia.com> writes:
> 
> > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
> > > Iago Toral <itoral at igalia.com> writes:
> > > 
> > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez wrote:
> > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
> > > > > 
> > > > > > ---
> > > > > >  src/intel/compiler/brw_eu_validate.c    |  64
> > > > > > ++++++++++++-
> > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
> > > > > > ++++++++++++++++++++++++
> > > > > >  2 files changed, 185 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
> > > > > > b/src/intel/compiler/brw_eu_validate.c
> > > > > > index 000a05cb6ac..203641fecb9 100644
> > > > > > --- a/src/intel/compiler/brw_eu_validate.c
> > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
> > > > > > @@ -531,7 +531,69 @@
> > > > > > general_restrictions_based_on_operand_types(const struct
> > > > > > gen_device_info *devinf
> > > > > >         exec_type_size == 8 && dst_type_size == 4)
> > > > > >        dst_type_size = 8;
> > > > > >  
> > > > > > -   if (exec_type_size > dst_type_size) {
> > > > > > +   /* From the BDW+ PRM:
> > > > > > +    *
> > > > > > +    *    "There is no direct conversion from HF to DF or
> > > > > > DF to
> > > > > > HF.
> > > > > > +    *     There is no direct conversion from HF to Q/UQ or
> > > > > > Q/UQ to
> > > > > > HF."
> > > > > > +    */
> > > > > > +   enum brw_reg_type src0_type =
> > > > > > brw_inst_src0_type(devinfo,
> > > > > > inst);
> > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
> > > > > > BRW_OPCODE_MOV
> > > > > > &&
> > > > > 
> > > > > Why is only the MOV instruction handled here and
> > > > > below?  Aren't
> > > > > other
> > > > > instructions able to do implicit conversions?  Probably means
> > > > > you
> > > > > need
> > > > > to deal with two sources rather than one.
> > > > 
> > > > This comes from the programming notes of the MOV instruction
> > > > (Volume
> > > > 2a, Command Reference - Instructions - MOV), so it is described
> > > > specifically for the MOV instruction. I should probably have
> > > > made
> > > > this
> > > > clear in the comment.
> > > > 
> > > 
> > > Maybe the one above is specified in the MOV page only, probably
> > > due
> > > to
> > > an oversight (If these restrictions were really specific to the
> > > MOV
> > > instruction, what would prevent you from implementing such
> > > conversions
> > > through a different instruction?  E.g. "ADD dst:df, src:hf, 0"
> > > which
> > > would be substantially more efficient than what you're doing in
> > > PATCH
> > > 02)
> > 
> > Instructions that take HF can only be strictly HF or mix F and HF
> > (mixed mode float), with MOV being the only exception. That means
> > that
> > any instruction like the one you use above are invalid. Maybe we
> > should
> > validate explicitly that instructions that use HF are strictly HF
> > or
> > mixed-float mode only?
> > 
> 
> So you're acknowledging that the conversions checked above are
> illegal
> whether the instruction is a MOV or something else.  Where are we
> preventing instructions other than MOV with such conversions from
> being
> accepted by the validator?

We aren't, because the validator is not checking, in general, for
accepted type combinations for specific instructions anywhere as far as
I can see. If we want to start doing this with HF conversions
specifically, I guess it is fine, but in case it is not clear, I think
it won't bring any immediate benefits with the
VK_KHR_shader_float16_int8 implementation since this series only ever
emits conversions involving HF operands via MOV instructions, which is
why I thought that validating that no direct MOV conversions from DF/Q
types ever happen was useful, since we have code in the driver to
handle this scenario.

> > >  but I see other restriction checks in this patch which are
> > > certainly specified in the generic regioning restrictions page
> > > and
> > > you're limiting to the MOV instruction...
> > 
> > There are two rules below:
> > 
> > 1. The one about conversions between integer and half-float. Again,
> > these can only happen through MOV for the same reasons, so I think
> > this
> > one should be fine.
> > 
> 
> Why do you think that can only happen through a MOV instruction?  The
> hardware spec lists the following as a valid example in the register
> region restrictions page:
> 
> > add (8) r10.0<2>:hf r11.0<8;8,1>:w r12.0<8;8,1>:w

I thought these instructions were not valid. I looked again at the BDW
PRM and indeed, it lists that HF destination for ADD is only allowed
with HF source. The same for all other instructions that can take HF
except (S)MOV. However, I have just now realized that the SKL+ PRM is
different and it allows ADD to do this, but as far as I can see, ADD is
the only exception: you cannot do this with MUL, MAD or MATH for
example. I am not sure what to make of this, but I guess that if we
want to validate this it doesn't hurt to be safe and assume that we can
have more instructions doing implicit conversions.

If course, there is the fact that we never emit these kind of
conversions anyway, at least not with this series (we don't even use
mixed-float mode) but I guess it is fine to add that validation if only
to be safe in the future if we ever decide to start using the extra
flexibility allowed by the hardware.

> > 2. The one about word destinations (of which we are only really
> > implementing conversions from F->HF). Here the rule is more generic
> > and
> > I agree that expanding this to include any other mixed float mode
> > instruction would make sense. However, validation for mixed float
> > mode
> > has its own set rules, some of which are incompatible with the
> > general
> > region restrictions being validated here, so I think it is
> > inconvenient
> > to try and do that validation  here (see patch 36 and then patch
> > 37).
> > What I would propose, if you agree, is that we only implement this
> > for
> > MOV here, and then for mixed float mode instructions, we implement
> > the
> > more generic version of this check (that would then go in patch
> > 37).
> > How does that sound?
> > 
> 
> I still don't understand why you want to implement the same
> restriction
> twice, once for MOV and once for all other mixed-mode
> instructions.  How
> is that more convenient?

The restrictions based on operand types that are checked in the
validator are specific to Byte or cases where the execution type is
larger than the destination stride, for which mixed float has a
different set of restrictions.

For example, for mixed float we have this rule:

"In Align1, destination stride can be smaller than execution type"

Which overrides this other from "General restrictions based on operand
types":

"Destination stride must be equal to the ratio of the sizes of the
execution data type to the destination type"

So I thought that it would make things easier to keep all restrictions
for mixed float separate and make sure that we skipped mixed float
instructions in general_restrictions_based_on_operand_types() so we
avoid having to add code to skip individual general restrictions that that are overriden for mixed float mode anyway.

Anyway, I'll try to rework the patches to do more generic validation of
HF conversions like you ask and see what kind of code we end up with.

> > > > > > +            ((dst_type == BRW_REGISTER_TYPE_HF &&
> > > > > > type_sz(src0_type) == 8) ||
> > > > > > +             (dst_type_size == 8 && src0_type ==
> > > > > > BRW_REGISTER_TYPE_HF)),
> > > > > > +            "There are no direct conversion between 64-bit
> > > > > > types
> > > > > > and HF");
> > > > > > +
> > > > > > +   /* From the BDW+ PRM:
> > > > > > +    *
> > > > > > +    *   "Conversion between Integer and HF (Half Float)
> > > > > > must
> > > > > > be
> > > > > > +    *    DWord-aligned and strided by a DWord on the
> > > > > > destination."
> > > > > > +    *
> > > > > > +    * But this seems to be expanded on CHV and SKL+ by:
> > > > > > +    *
> > > > > > +    *   "There is a relaxed alignment rule for word
> > > > > > destinations.
> > > > > > When
> > > > > > +    *    the destination type is word (UW, W, HF),
> > > > > > destination
> > > > > > data types
> > > > > > +    *    can be aligned to either the lowest word or the
> > > > > > second
> > > > > > lowest
> > > > > > +    *    word of the execution channel. This means the
> > > > > > destination
> > > > > > data
> > > > > > +    *    words can be either all in the even word
> > > > > > locations or
> > > > > > all
> > > > > > in the
> > > > > > +    *    odd word locations."
> > > > > > +    *
> > > > > > +    * We do not implement the second rule as is though,
> > > > > > since
> > > > > > empirical testing
> > > > > > +    * shows inconsistencies:
> > > > > > +    *   - It suggests that packed 16-bit is not allowed,
> > > > > > which
> > > > > > is
> > > > > > not true.
> > > > > > +    *   - It suggests that conversions from Q/DF to W
> > > > > > (which
> > > > > > need
> > > > > > to be 64-bit
> > > > > > +    *     aligned on the destination) are not possible,
> > > > > > which
> > > > > > is
> > > > > > not true.
> > > > > > +    *   - It suggests that conversions from 16-bit
> > > > > > executions
> > > > > > types to W need
> > > > > > +    *     to be 32-bit aligned, which doesn't seem to be
> > > > > > necessary.
> > > > > > +    *
> > > > > > +    * So from this rule we only validate the implication
> > > > > > that
> > > > > > conversion from
> > > > > > +    * F to HF needs to be DWord aligned too (in BDW this
> > > > > > is
> > > > > > limited to
> > > > > > +    * conversions from integer types).
> > > > > > +    */
> > > > > > +   bool is_half_float_conversion =
> > > > > > +       brw_inst_opcode(devinfo, inst) == BRW_OPCODE_MOV &&
> > > > > > +       dst_type != src0_type &&
> > > > > > +       (dst_type == BRW_REGISTER_TYPE_HF || src0_type ==
> > > > > > BRW_REGISTER_TYPE_HF);
> > > > > > +
> > > > > > +   if (is_half_float_conversion) {
> > > > > > +      assert(devinfo->gen >= 8);
> > > > > > +
> > > > > > +      if ((dst_type == BRW_REGISTER_TYPE_HF &&
> > > > > > brw_reg_type_is_integer(src0_type)) ||
> > > > > > +          (brw_reg_type_is_integer(dst_type) && src0_type
> > > > > > ==
> > > > > > BRW_REGISTER_TYPE_HF)) {
> > > > > > +         ERROR_IF(dst_stride * dst_type_size != 4,
> > > > > > +                  "Conversions between integer and half-
> > > > > > float
> > > > > > must
> > > > > > be strided "
> > > > > > +                  "by a DWord on the destination");
> > > > > > +
> > > > > > +         unsigned subreg =
> > > > > > brw_inst_dst_da1_subreg_nr(devinfo,
> > > > > > inst);
> > > > > > +         ERROR_IF(subreg % 4 != 0,
> > > > > > +                  "Conversions between integer and half-
> > > > > > float
> > > > > > must
> > > > > > be aligned "
> > > > > > +                  "to a DWord on the destination");
> > > > > > +      } else if ((devinfo->is_cherryview || devinfo->gen
> > > > > > >= 9)
> > > > > > &&
> > > > > > +                 dst_type == BRW_REGISTER_TYPE_HF) {
> > > > > > +         ERROR_IF(dst_stride != 2,
> > > > > > +                  "Conversions to HF must have either all
> > > > > > words in
> > > > > > even word "
> > > > > > +                  "locations or all words in odd word
> > > > > > locations");
> > > > > > +      }
> > > > > > +
> > > > > > +   } else if (exec_type_size > dst_type_size) {
> > > > > >        if (!(dst_type_is_byte && inst_is_raw_move(devinfo,
> > > > > > inst)))
> > > > > > {
> > > > > >           ERROR_IF(dst_stride * dst_type_size !=
> > > > > > exec_type_size,
> > > > > >                    "Destination stride must be equal to the
> > > > > > ratio
> > > > > > of the sizes "
> > > > > > diff --git a/src/intel/compiler/test_eu_validate.cpp
> > > > > > b/src/intel/compiler/test_eu_validate.cpp
> > > > > > index 73300b23122..1557b6d2452 100644
> > > > > > --- a/src/intel/compiler/test_eu_validate.cpp
> > > > > > +++ b/src/intel/compiler/test_eu_validate.cpp
> > > > > > @@ -848,6 +848,128 @@ TEST_P(validation_test,
> > > > > > byte_destination_relaxed_alignment)
> > > > > >     }
> > > > > >  }
> > > > > >  
> > > > > > +TEST_P(validation_test, half_float_conversion)
> > > > > > +{
> > > > > > +   static const struct {
> > > > > > +      enum brw_reg_type dst_type;
> > > > > > +      enum brw_reg_type src_type;
> > > > > > +      unsigned dst_stride;
> > > > > > +      unsigned dst_subnr;
> > > > > > +      bool expected_result;
> > > > > > +   } inst[] = {
> > > > > > +#define INST(dst_type, src_type, dst_stride, dst_subnr,
> > > > > > expected_result)  \
> > > > > > +      {                                                   
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +         BRW_REGISTER_TYPE_##dst_type,                    
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +         BRW_REGISTER_TYPE_##src_type,                    
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +         BRW_HORIZONTAL_STRIDE_##dst_stride,              
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +         dst_subnr,                                       
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +         expected_result,                                 
> > > > > >     
> > > > > >     
> > > > > >         \
> > > > > > +      }
> > > > > > +
> > > > > > +      /* MOV to half-float destination (F source handled
> > > > > > later) */
> > > > > > +      INST(HF,  B, 1, 0, false),
> > > > > > +      INST(HF,  W, 1, 0, false),
> > > > > > +      INST(HF, HF, 1, 0, true),
> > > > > > +      INST(HF, HF, 1, 2, true),
> > > > > > +      INST(HF,  D, 1, 0, false),
> > > > > > +      INST(HF,  Q, 1, 0, false),
> > > > > > +      INST(HF, DF, 1, 0, false),
> > > > > > +      INST(HF,  B, 2, 0, true),
> > > > > > +      INST(HF,  B, 2, 2, false),
> > > > > > +      INST(HF,  W, 2, 0, true),
> > > > > > +      INST(HF,  W, 2, 2, false),
> > > > > > +      INST(HF, HF, 2, 0, true),
> > > > > > +      INST(HF, HF, 2, 2, true),
> > > > > > +      INST(HF,  D, 2, 0, true),
> > > > > > +      INST(HF,  D, 2, 2, false),
> > > > > > +      INST(HF,  Q, 2, 0, false),
> > > > > > +      INST(HF, DF, 2, 0, false),
> > > > > > +      INST(HF,  B, 4, 0, false),
> > > > > > +      INST(HF,  W, 4, 0, false),
> > > > > > +      INST(HF, HF, 4, 0, true),
> > > > > > +      INST(HF, HF, 4, 2, true),
> > > > > > +      INST(HF,  D, 4, 0, false),
> > > > > > +      INST(HF,  Q, 4, 0, false),
> > > > > > +      INST(HF, DF, 4, 0, false),
> > > > > > +
> > > > > > +      /* MOV from half-float source */
> > > > > > +      INST( B, HF, 1, 0, false),
> > > > > > +      INST( W, HF, 1, 0, false),
> > > > > > +      INST( D, HF, 1, 0, true),
> > > > > > +      INST( D, HF, 1, 4, true),
> > > > > > +      INST( F, HF, 1, 0, true),
> > > > > > +      INST( F, HF, 1, 4, true),
> > > > > > +      INST( Q, HF, 1, 0, false),
> > > > > > +      INST(DF, HF, 1, 0, false),
> > > > > > +      INST( B, HF, 2, 0, false),
> > > > > > +      INST( W, HF, 2, 0, true),
> > > > > > +      INST( W, HF, 2, 2, false),
> > > > > > +      INST( D, HF, 2, 0, false),
> > > > > > +      INST( F, HF, 2, 0, true),
> > > > > > +      INST( F, HF, 2, 2, true),
> > > > > > +      INST( B, HF, 4, 0, true),
> > > > > > +      INST( B, HF, 4, 1, false),
> > > > > > +      INST( W, HF, 4, 0, false),
> > > > > > +
> > > > > > +#undef INST
> > > > > > +   };
> > > > > > +
> > > > > > +   if (devinfo.gen < 8)
> > > > > > +      return;
> > > > > > +
> > > > > > +   for (unsigned i = 0; i < sizeof(inst) /
> > > > > > sizeof(inst[0]);
> > > > > > i++) {
> > > > > > +      if (!devinfo.has_64bit_types &&
> > > > > > +          (type_sz(inst[i].src_type) == 8 ||
> > > > > > type_sz(inst[i].dst_type) == 8)) {
> > > > > > +         continue;
> > > > > > +      }
> > > > > > +
> > > > > > +      brw_MOV(p, retype(g0, inst[i].dst_type), retype(g0,
> > > > > > inst[i].src_type));
> > > > > > +
> > > > > > +      brw_inst_set_exec_size(&devinfo, last_inst,
> > > > > > BRW_EXECUTE_4);
> > > > > > +
> > > > > > +      brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > inst[i].dst_stride);
> > > > > > +      brw_inst_set_dst_da1_subreg_nr(&devinfo, last_inst,
> > > > > > inst[i].dst_subnr);
> > > > > > +
> > > > > > +      if (inst[i].src_type == BRW_REGISTER_TYPE_B) {
> > > > > > +         brw_inst_set_src0_vstride(&devinfo, last_inst,
> > > > > > BRW_VERTICAL_STRIDE_4);
> > > > > > +         brw_inst_set_src0_width(&devinfo, last_inst,
> > > > > > BRW_WIDTH_2);
> > > > > > +         brw_inst_set_src0_hstride(&devinfo, last_inst,
> > > > > > BRW_HORIZONTAL_STRIDE_2);
> > > > > > +      } else {
> > > > > > +         brw_inst_set_src0_vstride(&devinfo, last_inst,
> > > > > > BRW_VERTICAL_STRIDE_4);
> > > > > > +         brw_inst_set_src0_width(&devinfo, last_inst,
> > > > > > BRW_WIDTH_4);
> > > > > > +         brw_inst_set_src0_hstride(&devinfo, last_inst,
> > > > > > BRW_HORIZONTAL_STRIDE_1);
> > > > > > +      }
> > > > > > +
> > > > > > +      EXPECT_EQ(inst[i].expected_result, validate(p));
> > > > > > +
> > > > > > +      clear_instructions(p);
> > > > > > +   }
> > > > > > +
> > > > > > +   /* Conversion from F to HF has Dword destination
> > > > > > alignment
> > > > > > restriction
> > > > > > +    * on CHV and SKL+ only
> > > > > > +    */
> > > > > > +   brw_MOV(p, retype(g0, BRW_REGISTER_TYPE_HF),
> > > > > > +              retype(g0, BRW_REGISTER_TYPE_F));
> > > > > > +
> > > > > > +   brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > BRW_HORIZONTAL_STRIDE_1);
> > > > > > +
> > > > > > +   if (devinfo.gen >= 9 || devinfo.is_cherryview) {
> > > > > > +      EXPECT_FALSE(validate(p));
> > > > > > +   } else {
> > > > > > +      assert(devinfo.gen == 8);
> > > > > > +      EXPECT_TRUE(validate(p));
> > > > > > +   }
> > > > > > +   clear_instructions(p);
> > > > > > +
> > > > > > +   brw_MOV(p, retype(g0, BRW_REGISTER_TYPE_HF),
> > > > > > +              retype(g0, BRW_REGISTER_TYPE_F));
> > > > > > +
> > > > > > +   brw_inst_set_dst_hstride(&devinfo, last_inst,
> > > > > > BRW_HORIZONTAL_STRIDE_2);
> > > > > > +
> > > > > > +   EXPECT_TRUE(validate(p));
> > > > > > +   clear_instructions(p);
> > > > > > +}
> > > > > > +
> > > > > >  TEST_P(validation_test,
> > > > > > vector_immediate_destination_alignment)
> > > > > >  {
> > > > > >     static const struct {
> > > > > > -- 
> > > > > > 2.17.1
> > > > > > 
> > > > > > _______________________________________________
> > > > > > mesa-dev mailing list
> > > > > > mesa-dev at lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev