[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

Wed Mar 13 16:58:05 UTC 2019

Iago Toral <itoral at igalia.com> writes:

> On Tue, 2019-03-12 at 15:44 -0700, Francisco Jerez wrote:
>> Iago Toral <itoral at igalia.com> writes:
>> 
>> > On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>> > > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>> > > > Iago Toral <itoral at igalia.com> writes:
>> > > > 
>> > > > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>> > > > > > Iago Toral <itoral at igalia.com> writes:
>> > > > > > 
>> > > > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>> > > > > > > > Iago Toral <itoral at igalia.com> writes:
>> > > > > > > > 
>> > > > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez
>> > > > > > > > > wrote:
>> > > > > > > > > > Iago Toral <itoral at igalia.com> writes:
>> > > > > > > > > > 
>> > > > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco
>> > > > > > > > > > > Jerez
>> > > > > > > > > > > wrote:
>> > > > > > > > > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
>> > > > > > > > > > > > 
>> > > > > > > > > > > > > ---
>> > > > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c    |  6
>> > > > > > > > > > > > > 4
>> > > > > > > > > > > > > ++++++++++++-
>> > > > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp |
>> > > > > > > > > > > > > 122
>> > > > > > > > > > > > > ++++++++++++++++++++++++
>> > > > > > > > > > > > >  2 files changed, 185 insertions(+), 1
>> > > > > > > > > > > > > deletion(-
>> > > > > > > > > > > > > )
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > diff --git
>> > > > > > > > > > > > > a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>> > > > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > > @@ -531,7 +531,69 @@
>> > > > > > > > > > > > > general_restrictions_based_on_operand_types(c
>> > > > > > > > > > > > > onst
>> > > > > > > > > > > > > struct
>> > > > > > > > > > > > > gen_device_info *devinf
>> > > > > > > > > > > > >         exec_type_size == 8 && dst_type_size
>> > > > > > > > > > > > > ==
>> > > > > > > > > > > > > 4)
>> > > > > > > > > > > > >        dst_type_size = 8;
>> > > > > > > > > > > > >  
>> > > > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>> > > > > > > > > > > > > +   /* From the BDW+ PRM:
>> > > > > > > > > > > > > +    *
>> > > > > > > > > > > > > +    *    "There is no direct conversion from
>> > > > > > > > > > > > > HF
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > DF
>> > > > > > > > > > > > > or
>> > > > > > > > > > > > > DF to
>> > > > > > > > > > > > > HF.
>> > > > > > > > > > > > > +    *     There is no direct conversion from
>> > > > > > > > > > > > > HF
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > Q/UQ or
>> > > > > > > > > > > > > Q/UQ to
>> > > > > > > > > > > > > HF."
>> > > > > > > > > > > > > +    */
>> > > > > > > > > > > > > +   enum brw_reg_type src0_type =
>> > > > > > > > > > > > > brw_inst_src0_type(devinfo,
>> > > > > > > > > > > > > inst);
>> > > > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst)
>> > > > > > > > > > > > > ==
>> > > > > > > > > > > > > BRW_OPCODE_MOV
>> > > > > > > > > > > > > &&
>> > > > > > > > > > > > 
>> > > > > > > > > > > > Why is only the MOV instruction handled here
>> > > > > > > > > > > > and
>> > > > > > > > > > > > below?  Aren't
>> > > > > > > > > > > > other
>> > > > > > > > > > > > instructions able to do implicit
>> > > > > > > > > > > > conversions?  Probably
>> > > > > > > > > > > > means
>> > > > > > > > > > > > you
>> > > > > > > > > > > > need
>> > > > > > > > > > > > to deal with two sources rather than one.
>> > > > > > > > > > > 
>> > > > > > > > > > > This comes from the programming notes of the MOV
>> > > > > > > > > > > instruction
>> > > > > > > > > > > (Volume
>> > > > > > > > > > > 2a, Command Reference - Instructions - MOV), so
>> > > > > > > > > > > it is
>> > > > > > > > > > > described
>> > > > > > > > > > > specifically for the MOV instruction. I should
>> > > > > > > > > > > probably
>> > > > > > > > > > > have
>> > > > > > > > > > > made
>> > > > > > > > > > > this
>> > > > > > > > > > > clear in the comment.
>> > > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > Maybe the one above is specified in the MOV page
>> > > > > > > > > > only,
>> > > > > > > > > > probably
>> > > > > > > > > > due
>> > > > > > > > > > to
>> > > > > > > > > > an oversight (If these restrictions were really
>> > > > > > > > > > specific
>> > > > > > > > > > to
>> > > > > > > > > > the
>> > > > > > > > > > MOV
>> > > > > > > > > > instruction, what would prevent you from
>> > > > > > > > > > implementing
>> > > > > > > > > > such
>> > > > > > > > > > conversions
>> > > > > > > > > > through a different instruction?  E.g. "ADD dst:df,
>> > > > > > > > > > src:hf,
>> > > > > > > > > > 0"
>> > > > > > > > > > which
>> > > > > > > > > > would be substantially more efficient than what
>> > > > > > > > > > you're
>> > > > > > > > > > doing
>> > > > > > > > > > in
>> > > > > > > > > > PATCH
>> > > > > > > > > > 02)
>> > > > > > > > > 
>> > > > > > > > > Instructions that take HF can only be strictly HF or
>> > > > > > > > > mix
>> > > > > > > > > F
>> > > > > > > > > and
>> > > > > > > > > HF
>> > > > > > > > > (mixed mode float), with MOV being the only
>> > > > > > > > > exception.
>> > > > > > > > > That
>> > > > > > > > > means
>> > > > > > > > > that
>> > > > > > > > > any instruction like the one you use above are
>> > > > > > > > > invalid.
>> > > > > > > > > Maybe
>> > > > > > > > > we
>> > > > > > > > > should
>> > > > > > > > > validate explicitly that instructions that use HF are
>> > > > > > > > > strictly
>> > > > > > > > > HF
>> > > > > > > > > or
>> > > > > > > > > mixed-float mode only?
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > > So you're acknowledging that the conversions checked
>> > > > > > > > above
>> > > > > > > > are
>> > > > > > > > illegal
>> > > > > > > > whether the instruction is a MOV or something
>> > > > > > > > else.  Where
>> > > > > > > > are we
>> > > > > > > > preventing instructions other than MOV with such
>> > > > > > > > conversions
>> > > > > > > > from
>> > > > > > > > being
>> > > > > > > > accepted by the validator?
>> > > > > > > 
>> > > > > > > We aren't, because the validator is not checking, in
>> > > > > > > general,
>> > > > > > > for
>> > > > > > > accepted type combinations for specific instructions
>> > > > > > > anywhere
>> > > > > > > as
>> > > > > > > far as
>> > > > > > > I can see.
>> > > > > > 
>> > > > > > Luckily these type conversion restrictions aren't really
>> > > > > > specific
>> > > > > > to
>> > > > > > any
>> > > > > > instruction AFAICT, even though they only seem to be
>> > > > > > mentioned
>> > > > > > explicitly for the MOV instruction...
>> > > > > > 
>> > > > > > > If we want to start doing this with HF conversions
>> > > > > > > specifically, I
>> > > > > > > guess it is fine, but in case it is not clear, I think it
>> > > > > > > won't
>> > > > > > > bring
>> > > > > > > any immediate benefits with the
>> > > > > > > VK_KHR_shader_float16_int8
>> > > > > > > implementation since this series only ever emits
>> > > > > > > conversions
>> > > > > > > involving
>> > > > > > > HF operands via MOV instructions,
>> > > > > > 
>> > > > > > Yes, I can see that's the intention of this series, but how
>> > > > > > can
>> > > > > > you
>> > > > > > make
>> > > > > > sure that nothing in the optimizer is breaking your
>> > > > > > assumption
>> > > > > > if
>> > > > > > you
>> > > > > > don't add some validator code to verify the claim of your
>> > > > > > last
>> > > > > > paragraph?
>> > > > > > 
>> > > > > > > which is why I thought that validating that no direct MOV
>> > > > > > > conversions
>> > > > > > > from DF/Q types ever happen was useful, since we have
>> > > > > > > code in
>> > > > > > > the
>> > > > > > > driver to handle this scenario.
>> > > > > > > 
>> > > > > > > [...]
>> > > > > > > > 
>> > > > > > > > I still don't understand why you want to implement the
>> > > > > > > > same
>> > > > > > > > restriction
>> > > > > > > > twice, once for MOV and once for all other mixed-mode
>> > > > > > > > instructions.  How
>> > > > > > > > is that more convenient?
>> > > > > > > 
>> > > > > > > The restrictions based on operand types that are checked
>> > > > > > > in
>> > > > > > > the
>> > > > > > > validator are specific to Byte or cases where the
>> > > > > > > execution
>> > > > > > > type is
>> > > > > > > larger than the destination stride, for which mixed float
>> > > > > > > has
>> > > > > > > a
>> > > > > > > different set of restrictions.
>> > > > > > > 
>> > > > > > > For example, for mixed float we have this rule:
>> > > > > > > 
>> > > > > > > "In Align1, destination stride can be smaller than
>> > > > > > > execution
>> > > > > > > type"
>> > > > > > > 
>> > > > > > > Which overrides this other from "General restrictions
>> > > > > > > based
>> > > > > > > on
>> > > > > > > operand
>> > > > > > > types":
>> > > > > > > 
>> > > > > > > "Destination stride must be equal to the ratio of the
>> > > > > > > sizes
>> > > > > > > of
>> > > > > > > the
>> > > > > > > execution data type to the destination type"
>> > > > > > > 
>> > > > > > > So I thought that it would make things easier to keep all
>> > > > > > > restrictions
>> > > > > > > for mixed float separate and make sure that we skipped
>> > > > > > > mixed
>> > > > > > > float
>> > > > > > > instructions in
>> > > > > > > general_restrictions_based_on_operand_types()
>> > > > > > > so we
>> > > > > > > avoid having to add code to skip individual general
>> > > > > > > restrictions
>> > > > > > > that that are overriden for mixed float mode anyway.
>> > > > > > > 
>> > > > > > 
>> > > > > > I'm fine with that, but it doesn't seem like what this
>> > > > > > patch is
>> > > > > > doing...
>> > > > > > Isn't it adding code to validate mixed-mode float MOV
>> > > > > > instructions in
>> > > > > > general_restrictions_based_on_operand_types()?
>> > > > > 
>> > > > > I guess this could be arguable, but I was not considering
>> > > > > conversion
>> > > > > MOVs to be mixed-float instructions. There are two reasons
>> > > > > for
>> > > > > this:
>> > > > > 
>> > > > > A conversion MOV involving F/HF doesn't seem to be
>> > > > > fundamentally
>> > > > > different from any other conversion instruction involving
>> > > > > other
>> > > > > types,
>> > > > > other than the requirement of aligning the destination to a
>> > > > > Dword,
>> > > > > which is not a resriction explictly made for mixed-float
>> > > > > mode.
>> > > > > 
>> > > > > Then, for mixed-float mode, there is this other rule:
>> > > > > 
>> > > > > "In Align1, destination stride can be smaller than execution
>> > > > > type.
>> > > > > When
>> > > > > destination is stride of 1, 16 bit packed data
>> > > > > is updated on the destination. However, output packed f16
>> > > > > data
>> > > > > must
>> > > > > be
>> > > > > oword aligned, no oword crossing in
>> > > > > packed f16."
>> > > > > 
>> > > > > Which contradicts the other rule that conversions from F to
>> > > > > HF
>> > > > > need
>> > > > > to
>> > > > > be DWord aligned on the destination.
>> > > > > 
>> > > > > So it seems to me that conversion MOVs are not following the
>> > > > > same
>> > > > > principles of mixed-float instructions and we should skip
>> > > > > validating
>> > > > > mixed-float restrictions for them. What do you think?
>> > > > > 
>> > > > 
>> > > > That all seems fairly ambiguous...  And the restriction on
>> > > > DWORD
>> > > > alignment for conversions includes a mixed-mode ADD instruction
>> > > > among
>> > > > the examples, so I'm skeptical that MOV could be truly
>> > > > fundamentally
>> > > > different.
>> > > 
>> > > Ok... in that case what do we do about the mixed-float
>> > > restriction I
>> > > quoted above? Since it is incompatible with the mixed-float MOV
>> > > conversion I guess we only have two options: ether we don't
>> > > validate
>> > > it
>> > > at all or we only validate it for mixed-float instructions that
>> > > are
>> > > not
>> > > MOV. I guess we can do the latter?
>> > 
>> > Also, related to this, if we consider implicit conversions in 2-src
>> > instructions to also be a target of the generic rules for
>> > conversions
>> > to HF described for CHV and SKL+, then doesn't that imply that all
>> > mixed-float instructions are conversions and subject to those
>> > rules?
>> > For example:
>> > 
>> > ADD dst:hf  src0:f  src1:f
>> > ADD dst:hf  src0:hf src1:f
>> > ADD dst:hf  src0:f  src1:hf
>> > 
>> > In all 3 instructions above the execution type is F. Should we
>> > consider
>> > all of them implicit conversions?
>> 
>> Yes, I just verified that the simulator considers all of the above to
>> fall under the category "Format conversion to or from HF", and
>> requires
>> their destination components to be DWORD-aligned, but *only* on BDW.
>
> Thanks for testing checking this in the validator! I didn't find
> anything in the BDW docs that documents this behavior explicitly, at
> least in my copy of the BDW PRM, but it sounds like this behavior would
> be in line with the general rule that conversions to smaller types need
> to be aligned to the size of the largr type so that's proably why they
> didn't include anything specific for these conversions...
>
>> > That would mean that any mixed-float instruction with HF
>> > destination
>> > is an implicit conversion from F to HF and therefore would be
>> > subject
>> > to the generic rules for those conversions, which at least in the
>> > case
>> > of CHV and SKL+ involves DWord alignment on the destination, or in
>> > other words: no mixed-float instruction can have packed fp16
>> > destinations.
>> 
>> AFAIUI, your reasoning in the paragraph above is correct for BDW
>> *only*,
>> not for CHV and SKL+ which have an exception to the DWORD channel
>> alignment rule (whether the instruction is doing conversions or not)
>> in
>> the case where the destination is HF, has stride equal to 1, and has
>> sub-register offset aligned to 16B.
>
> You mean this rule listed for mixed float mode, right?:
>
> "In Align1, destination stride can be smaller than execution type. When
> destination is stride of 1, 16 bit packed data is updated on the
> destination. However, output packed f16 data must be oword aligned, no
> oword crossing in packed f16."
>
> That brings up another question: my copy of the BDW PRM doesn't list
> any restrictions for mixed float mode at all. It only has this:
>
> "Special Requirements for Handling Mixed Mode Float Operations
>
> There are some special requirements for handling mixed mode float
> operations for specific projects, described in the following table. If
> you are viewing a version of the BSpec limited to other particular
> projects, the table may appear with no data rows."
>
> But then there is no table and no restrictions listed... I have been
> assuming that this was just bogus since there are a lot of restrictions
> listed in the CHV and SKL+ documentation, but from what you say above,
> it looks like my guess was incorrect, since you say that the exception
> to the Dword channel alignment rule that I quote above doesn't apply to
> BDW, so I guess my question now is:
>
> Is my copy of the BDW PRM correct in that there are no restrictions for
> mixed-float mode in that platform?
>

My copy of that table looks empty as well...  Probably not because there
are no restrictions but because most of the restrictions listed on that
table only affect instructions with some non-trivial form of destination
packing, which arguably isn't supported on BDW.

>> > But that seems to contradict what various mixed-float restrictions
>> > present in the docs that suggest that packed fp16 destinations are
>> > possible with mixed- float mode. Here are some examples:
>> > 
>> 
>> AFAICT the reason for the apparent contradiction is that packed fp16
>> destinations are allowed for such instruction on CHV/SKL+ as an
>> exception to the more strict BDW rule.
>> 
>> > "No SIMD16 in mixed mode when destination is packed f16 for both
>> > Align1
>> > and Align16"
>> > 
>> > "In Align1, destination stride can be smaller than execution type.
>> > When
>> > destination is stride of 1, 16 bit packed data is updated on the
>> > destination. However, output packed f16 data must be oword aligned,
>> > no
>> > oword crossing in packed f16."
>> > 
>> > "When source is float or half float from accumulator register and
>> > destination is half float with a stride of 1, the source must
>> > register
>> > aligned. i.e., source must have offset zero."
>> > 
>> > Since I was only considering that conversion rules only applied to
>> > MOV
>> > instructions this was not an issue in my series, but if we consider
>> > that those rules also apply to implicit conversions from other
>> > instructions then it looks like all these restrictions refer to
>> > impossible situations, so I am not sure what to do about them...
>> > should
>> > we just ignore them in the validator?
>> > 
>> 
>> I don't see any reason to ignore them.  The last three restrictions
>> you've quoted above probably apply to MOV conversions in the same way
>> that they apply to other instructions.  Unfortunately only the last
>> one
>> you quoted seems to be enforced by the simulator, and it certainly
>> kicks
>> in for any instructions with an accumulator source and a packed
>> half-float destination, including MOV.
>
> Yes, I agree, all these would be compatible with the exception to the
> DWord alignment rule you mentioned for SKL+ and CHV, so I think this is
> clear now. Thanks!
>
>> > > > > > > Anyway, I'll try to rework the patches to do more generic
>> > > > > > > validation of
>> > > > > > > HF conversions like you ask and see what kind of code we
>> > > > > > > end
>> > > > > > > up
>> > > > > > > with.
>> > > > > > > 
>> > > > > > 
>> > > > > > Thanks.
>> > > > > > 
>> > > > > > > [...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190313/d5d59262/attachment-0001.sig>