[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

Wed Mar 6 08:21:32 UTC 2019

On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
> > Iago Toral <itoral at igalia.com> writes:
> > 
> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
> > > > Iago Toral <itoral at igalia.com> writes:
> > > > 
> > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
> > > > > > Iago Toral <itoral at igalia.com> writes:
> > > > > > 
> > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
> > > > > > > > Iago Toral <itoral at igalia.com> writes:
> > > > > > > > 
> > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
> > > > > > > > > wrote:
> > > > > > > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
> > > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c    |  64
> > > > > > > > > > > ++++++++++++-
> > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
> > > > > > > > > > > ++++++++++++++++++++++++
> > > > > > > > > > >  2 files changed, 185 insertions(+), 1 deletion(-
> > > > > > > > > > > )
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
> > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
> > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
> > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
> > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
> > > > > > > > > > > @@ -531,7 +531,69 @@
> > > > > > > > > > > general_restrictions_based_on_operand_types(const
> > > > > > > > > > > struct
> > > > > > > > > > > gen_device_info *devinf
> > > > > > > > > > >         exec_type_size == 8 && dst_type_size ==
> > > > > > > > > > > 4)
> > > > > > > > > > >        dst_type_size = 8;
> > > > > > > > > > >  
> > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
> > > > > > > > > > > +   /* From the BDW+ PRM:
> > > > > > > > > > > +    *
> > > > > > > > > > > +    *    "There is no direct conversion from HF
> > > > > > > > > > > to
> > > > > > > > > > > DF
> > > > > > > > > > > or
> > > > > > > > > > > DF to
> > > > > > > > > > > HF.
> > > > > > > > > > > +    *     There is no direct conversion from HF
> > > > > > > > > > > to
> > > > > > > > > > > Q/UQ or
> > > > > > > > > > > Q/UQ to
> > > > > > > > > > > HF."
> > > > > > > > > > > +    */
> > > > > > > > > > > +   enum brw_reg_type src0_type =
> > > > > > > > > > > brw_inst_src0_type(devinfo,
> > > > > > > > > > > inst);
> > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
> > > > > > > > > > > BRW_OPCODE_MOV
> > > > > > > > > > > &&
> > > > > > > > > > 
> > > > > > > > > > Why is only the MOV instruction handled here and
> > > > > > > > > > below?  Aren't
> > > > > > > > > > other
> > > > > > > > > > instructions able to do implicit
> > > > > > > > > > conversions?  Probably
> > > > > > > > > > means
> > > > > > > > > > you
> > > > > > > > > > need
> > > > > > > > > > to deal with two sources rather than one.
> > > > > > > > > 
> > > > > > > > > This comes from the programming notes of the MOV
> > > > > > > > > instruction
> > > > > > > > > (Volume
> > > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
> > > > > > > > > described
> > > > > > > > > specifically for the MOV instruction. I should
> > > > > > > > > probably
> > > > > > > > > have
> > > > > > > > > made
> > > > > > > > > this
> > > > > > > > > clear in the comment.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Maybe the one above is specified in the MOV page only,
> > > > > > > > probably
> > > > > > > > due
> > > > > > > > to
> > > > > > > > an oversight (If these restrictions were really
> > > > > > > > specific
> > > > > > > > to
> > > > > > > > the
> > > > > > > > MOV
> > > > > > > > instruction, what would prevent you from implementing
> > > > > > > > such
> > > > > > > > conversions
> > > > > > > > through a different instruction?  E.g. "ADD dst:df,
> > > > > > > > src:hf,
> > > > > > > > 0"
> > > > > > > > which
> > > > > > > > would be substantially more efficient than what you're
> > > > > > > > doing
> > > > > > > > in
> > > > > > > > PATCH
> > > > > > > > 02)
> > > > > > > 
> > > > > > > Instructions that take HF can only be strictly HF or mix
> > > > > > > F
> > > > > > > and
> > > > > > > HF
> > > > > > > (mixed mode float), with MOV being the only exception.
> > > > > > > That
> > > > > > > means
> > > > > > > that
> > > > > > > any instruction like the one you use above are invalid.
> > > > > > > Maybe
> > > > > > > we
> > > > > > > should
> > > > > > > validate explicitly that instructions that use HF are
> > > > > > > strictly
> > > > > > > HF
> > > > > > > or
> > > > > > > mixed-float mode only?
> > > > > > > 
> > > > > > 
> > > > > > So you're acknowledging that the conversions checked above
> > > > > > are
> > > > > > illegal
> > > > > > whether the instruction is a MOV or something else.  Where
> > > > > > are we
> > > > > > preventing instructions other than MOV with such
> > > > > > conversions
> > > > > > from
> > > > > > being
> > > > > > accepted by the validator?
> > > > > 
> > > > > We aren't, because the validator is not checking, in general,
> > > > > for
> > > > > accepted type combinations for specific instructions anywhere
> > > > > as
> > > > > far as
> > > > > I can see.
> > > > 
> > > > Luckily these type conversion restrictions aren't really
> > > > specific
> > > > to
> > > > any
> > > > instruction AFAICT, even though they only seem to be mentioned
> > > > explicitly for the MOV instruction...
> > > > 
> > > > > If we want to start doing this with HF conversions
> > > > > specifically, I
> > > > > guess it is fine, but in case it is not clear, I think it
> > > > > won't
> > > > > bring
> > > > > any immediate benefits with the VK_KHR_shader_float16_int8
> > > > > implementation since this series only ever emits conversions
> > > > > involving
> > > > > HF operands via MOV instructions,
> > > > 
> > > > Yes, I can see that's the intention of this series, but how can
> > > > you
> > > > make
> > > > sure that nothing in the optimizer is breaking your assumption
> > > > if
> > > > you
> > > > don't add some validator code to verify the claim of your last
> > > > paragraph?
> > > > 
> > > > > which is why I thought that validating that no direct MOV
> > > > > conversions
> > > > > from DF/Q types ever happen was useful, since we have code in
> > > > > the
> > > > > driver to handle this scenario.
> > > > > 
> > > > > [...]
> > > > > > 
> > > > > > I still don't understand why you want to implement the same
> > > > > > restriction
> > > > > > twice, once for MOV and once for all other mixed-mode
> > > > > > instructions.  How
> > > > > > is that more convenient?
> > > > > 
> > > > > The restrictions based on operand types that are checked in
> > > > > the
> > > > > validator are specific to Byte or cases where the execution
> > > > > type is
> > > > > larger than the destination stride, for which mixed float has
> > > > > a
> > > > > different set of restrictions.
> > > > > 
> > > > > For example, for mixed float we have this rule:
> > > > > 
> > > > > "In Align1, destination stride can be smaller than execution
> > > > > type"
> > > > > 
> > > > > Which overrides this other from "General restrictions based
> > > > > on
> > > > > operand
> > > > > types":
> > > > > 
> > > > > "Destination stride must be equal to the ratio of the sizes
> > > > > of
> > > > > the
> > > > > execution data type to the destination type"
> > > > > 
> > > > > So I thought that it would make things easier to keep all
> > > > > restrictions
> > > > > for mixed float separate and make sure that we skipped mixed
> > > > > float
> > > > > instructions in general_restrictions_based_on_operand_types()
> > > > > so we
> > > > > avoid having to add code to skip individual general
> > > > > restrictions
> > > > > that that are overriden for mixed float mode anyway.
> > > > > 
> > > > 
> > > > I'm fine with that, but it doesn't seem like what this patch is
> > > > doing...
> > > > Isn't it adding code to validate mixed-mode float MOV
> > > > instructions in
> > > > general_restrictions_based_on_operand_types()?
> > > 
> > > I guess this could be arguable, but I was not considering
> > > conversion
> > > MOVs to be mixed-float instructions. There are two reasons for
> > > this:
> > > 
> > > A conversion MOV involving F/HF doesn't seem to be fundamentally
> > > different from any other conversion instruction involving other
> > > types,
> > > other than the requirement of aligning the destination to a
> > > Dword,
> > > which is not a resriction explictly made for mixed-float mode.
> > > 
> > > Then, for mixed-float mode, there is this other rule:
> > > 
> > > "In Align1, destination stride can be smaller than execution
> > > type.
> > > When
> > > destination is stride of 1, 16 bit packed data
> > > is updated on the destination. However, output packed f16 data
> > > must
> > > be
> > > oword aligned, no oword crossing in
> > > packed f16."
> > > 
> > > Which contradicts the other rule that conversions from F to HF
> > > need
> > > to
> > > be DWord aligned on the destination.
> > > 
> > > So it seems to me that conversion MOVs are not following the same
> > > principles of mixed-float instructions and we should skip
> > > validating
> > > mixed-float restrictions for them. What do you think?
> > > 
> > 
> > That all seems fairly ambiguous...  And the restriction on DWORD
> > alignment for conversions includes a mixed-mode ADD instruction
> > among
> > the examples, so I'm skeptical that MOV could be truly
> > fundamentally
> > different.
> 
> Ok... in that case what do we do about the mixed-float restriction I
> quoted above? Since it is incompatible with the mixed-float MOV
> conversion I guess we only have two options: ether we don't validate
> it
> at all or we only validate it for mixed-float instructions that are
> not
> MOV. I guess we can do the latter?

Also, related to this, if we consider implicit conversions in 2-src
instructions to also be a target of the generic rules for conversions
to HF described for CHV and SKL+, then doesn't that imply that all
mixed-float instructions are conversions and subject to those rules?
For example:

ADD dst:hf  src0:f  src1:f
ADD dst:hf  src0:hf src1:f
ADD dst:hf  src0:f  src1:hf

In all 3 instructions above the execution type is F. Should we consider
all of them implicit conversions? That would mean that any mixed-float
instruction with HF destination is an implicit conversion from F to HF
and therefore would be subject to the generic rules for those
conversions, which at least in the case of CHV and SKL+ involves DWord
alignment on the destination, or in other words: no mixed-float
instruction can have packed fp16 destinations. But that seems to
contradict what various mixed-float restrictions present in the docs
that suggest that packed fp16 destinations are possible with mixed-
float mode. Here are some examples:

"No SIMD16 in mixed mode when destination is packed f16 for both Align1
and Align16"

"In Align1, destination stride can be smaller than execution type. When
destination is stride of 1, 16 bit packed data is updated on the
destination. However, output packed f16 data must be oword aligned, no
oword crossing in packed f16."

"When source is float or half float from accumulator register and
destination is half float with a stride of 1, the source must register
aligned. i.e., source must have offset zero."

Since I was only considering that conversion rules only applied to MOV
instructions this was not an issue in my series, but if we consider
that those rules also apply to implicit conversions from other
instructions then it looks like all these restrictions refer to
impossible situations, so I am not sure what to do about them... should
we just ignore them in the validator?

> > > > > Anyway, I'll try to rework the patches to do more generic
> > > > > validation of
> > > > > HF conversions like you ask and see what kind of code we end
> > > > > up
> > > > > with.
> > > > > 
> > > > 
> > > > Thanks.
> > > > 
> > > > > [...]