[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

Tue Mar 12 22:44:45 UTC 2019

Iago Toral <itoral at igalia.com> writes:

> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>> > Iago Toral <itoral at igalia.com> writes:
>> > 
>> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>> > > > Iago Toral <itoral at igalia.com> writes:
>> > > > 
>> > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>> > > > > > Iago Toral <itoral at igalia.com> writes:
>> > > > > > 
>> > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
>> > > > > > > > Iago Toral <itoral at igalia.com> writes:
>> > > > > > > > 
>> > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>> > > > > > > > > wrote:
>> > > > > > > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
>> > > > > > > > > > 
>> > > > > > > > > > > ---
>> > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c    |  64
>> > > > > > > > > > > ++++++++++++-
>> > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
>> > > > > > > > > > > ++++++++++++++++++++++++
>> > > > > > > > > > >  2 files changed, 185 insertions(+), 1 deletion(-
>> > > > > > > > > > > )
>> > > > > > > > > > > 
>> > > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>> > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > @@ -531,7 +531,69 @@
>> > > > > > > > > > > general_restrictions_based_on_operand_types(const
>> > > > > > > > > > > struct
>> > > > > > > > > > > gen_device_info *devinf
>> > > > > > > > > > >         exec_type_size == 8 && dst_type_size ==
>> > > > > > > > > > > 4)
>> > > > > > > > > > >        dst_type_size = 8;
>> > > > > > > > > > >  
>> > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>> > > > > > > > > > > +   /* From the BDW+ PRM:
>> > > > > > > > > > > +    *
>> > > > > > > > > > > +    *    "There is no direct conversion from HF
>> > > > > > > > > > > to
>> > > > > > > > > > > DF
>> > > > > > > > > > > or
>> > > > > > > > > > > DF to
>> > > > > > > > > > > HF.
>> > > > > > > > > > > +    *     There is no direct conversion from HF
>> > > > > > > > > > > to
>> > > > > > > > > > > Q/UQ or
>> > > > > > > > > > > Q/UQ to
>> > > > > > > > > > > HF."
>> > > > > > > > > > > +    */
>> > > > > > > > > > > +   enum brw_reg_type src0_type =
>> > > > > > > > > > > brw_inst_src0_type(devinfo,
>> > > > > > > > > > > inst);
>> > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>> > > > > > > > > > > BRW_OPCODE_MOV
>> > > > > > > > > > > &&
>> > > > > > > > > > 
>> > > > > > > > > > Why is only the MOV instruction handled here and
>> > > > > > > > > > below?  Aren't
>> > > > > > > > > > other
>> > > > > > > > > > instructions able to do implicit
>> > > > > > > > > > conversions?  Probably
>> > > > > > > > > > means
>> > > > > > > > > > you
>> > > > > > > > > > need
>> > > > > > > > > > to deal with two sources rather than one.
>> > > > > > > > > 
>> > > > > > > > > This comes from the programming notes of the MOV
>> > > > > > > > > instruction
>> > > > > > > > > (Volume
>> > > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
>> > > > > > > > > described
>> > > > > > > > > specifically for the MOV instruction. I should
>> > > > > > > > > probably
>> > > > > > > > > have
>> > > > > > > > > made
>> > > > > > > > > this
>> > > > > > > > > clear in the comment.
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > > Maybe the one above is specified in the MOV page only,
>> > > > > > > > probably
>> > > > > > > > due
>> > > > > > > > to
>> > > > > > > > an oversight (If these restrictions were really
>> > > > > > > > specific
>> > > > > > > > to
>> > > > > > > > the
>> > > > > > > > MOV
>> > > > > > > > instruction, what would prevent you from implementing
>> > > > > > > > such
>> > > > > > > > conversions
>> > > > > > > > through a different instruction?  E.g. "ADD dst:df,
>> > > > > > > > src:hf,
>> > > > > > > > 0"
>> > > > > > > > which
>> > > > > > > > would be substantially more efficient than what you're
>> > > > > > > > doing
>> > > > > > > > in
>> > > > > > > > PATCH
>> > > > > > > > 02)
>> > > > > > > 
>> > > > > > > Instructions that take HF can only be strictly HF or mix
>> > > > > > > F
>> > > > > > > and
>> > > > > > > HF
>> > > > > > > (mixed mode float), with MOV being the only exception.
>> > > > > > > That
>> > > > > > > means
>> > > > > > > that
>> > > > > > > any instruction like the one you use above are invalid.
>> > > > > > > Maybe
>> > > > > > > we
>> > > > > > > should
>> > > > > > > validate explicitly that instructions that use HF are
>> > > > > > > strictly
>> > > > > > > HF
>> > > > > > > or
>> > > > > > > mixed-float mode only?
>> > > > > > > 
>> > > > > > 
>> > > > > > So you're acknowledging that the conversions checked above
>> > > > > > are
>> > > > > > illegal
>> > > > > > whether the instruction is a MOV or something else.  Where
>> > > > > > are we
>> > > > > > preventing instructions other than MOV with such
>> > > > > > conversions
>> > > > > > from
>> > > > > > being
>> > > > > > accepted by the validator?
>> > > > > 
>> > > > > We aren't, because the validator is not checking, in general,
>> > > > > for
>> > > > > accepted type combinations for specific instructions anywhere
>> > > > > as
>> > > > > far as
>> > > > > I can see.
>> > > > 
>> > > > Luckily these type conversion restrictions aren't really
>> > > > specific
>> > > > to
>> > > > any
>> > > > instruction AFAICT, even though they only seem to be mentioned
>> > > > explicitly for the MOV instruction...
>> > > > 
>> > > > > If we want to start doing this with HF conversions
>> > > > > specifically, I
>> > > > > guess it is fine, but in case it is not clear, I think it
>> > > > > won't
>> > > > > bring
>> > > > > any immediate benefits with the VK_KHR_shader_float16_int8
>> > > > > implementation since this series only ever emits conversions
>> > > > > involving
>> > > > > HF operands via MOV instructions,
>> > > > 
>> > > > Yes, I can see that's the intention of this series, but how can
>> > > > you
>> > > > make
>> > > > sure that nothing in the optimizer is breaking your assumption
>> > > > if
>> > > > you
>> > > > don't add some validator code to verify the claim of your last
>> > > > paragraph?
>> > > > 
>> > > > > which is why I thought that validating that no direct MOV
>> > > > > conversions
>> > > > > from DF/Q types ever happen was useful, since we have code in
>> > > > > the
>> > > > > driver to handle this scenario.
>> > > > > 
>> > > > > [...]
>> > > > > > 
>> > > > > > I still don't understand why you want to implement the same
>> > > > > > restriction
>> > > > > > twice, once for MOV and once for all other mixed-mode
>> > > > > > instructions.  How
>> > > > > > is that more convenient?
>> > > > > 
>> > > > > The restrictions based on operand types that are checked in
>> > > > > the
>> > > > > validator are specific to Byte or cases where the execution
>> > > > > type is
>> > > > > larger than the destination stride, for which mixed float has
>> > > > > a
>> > > > > different set of restrictions.
>> > > > > 
>> > > > > For example, for mixed float we have this rule:
>> > > > > 
>> > > > > "In Align1, destination stride can be smaller than execution
>> > > > > type"
>> > > > > 
>> > > > > Which overrides this other from "General restrictions based
>> > > > > on
>> > > > > operand
>> > > > > types":
>> > > > > 
>> > > > > "Destination stride must be equal to the ratio of the sizes
>> > > > > of
>> > > > > the
>> > > > > execution data type to the destination type"
>> > > > > 
>> > > > > So I thought that it would make things easier to keep all
>> > > > > restrictions
>> > > > > for mixed float separate and make sure that we skipped mixed
>> > > > > float
>> > > > > instructions in general_restrictions_based_on_operand_types()
>> > > > > so we
>> > > > > avoid having to add code to skip individual general
>> > > > > restrictions
>> > > > > that that are overriden for mixed float mode anyway.
>> > > > > 
>> > > > 
>> > > > I'm fine with that, but it doesn't seem like what this patch is
>> > > > doing...
>> > > > Isn't it adding code to validate mixed-mode float MOV
>> > > > instructions in
>> > > > general_restrictions_based_on_operand_types()?
>> > > 
>> > > I guess this could be arguable, but I was not considering
>> > > conversion
>> > > MOVs to be mixed-float instructions. There are two reasons for
>> > > this:
>> > > 
>> > > A conversion MOV involving F/HF doesn't seem to be fundamentally
>> > > different from any other conversion instruction involving other
>> > > types,
>> > > other than the requirement of aligning the destination to a
>> > > Dword,
>> > > which is not a resriction explictly made for mixed-float mode.
>> > > 
>> > > Then, for mixed-float mode, there is this other rule:
>> > > 
>> > > "In Align1, destination stride can be smaller than execution
>> > > type.
>> > > When
>> > > destination is stride of 1, 16 bit packed data
>> > > is updated on the destination. However, output packed f16 data
>> > > must
>> > > be
>> > > oword aligned, no oword crossing in
>> > > packed f16."
>> > > 
>> > > Which contradicts the other rule that conversions from F to HF
>> > > need
>> > > to
>> > > be DWord aligned on the destination.
>> > > 
>> > > So it seems to me that conversion MOVs are not following the same
>> > > principles of mixed-float instructions and we should skip
>> > > validating
>> > > mixed-float restrictions for them. What do you think?
>> > > 
>> > 
>> > That all seems fairly ambiguous...  And the restriction on DWORD
>> > alignment for conversions includes a mixed-mode ADD instruction
>> > among
>> > the examples, so I'm skeptical that MOV could be truly
>> > fundamentally
>> > different.
>> 
>> Ok... in that case what do we do about the mixed-float restriction I
>> quoted above? Since it is incompatible with the mixed-float MOV
>> conversion I guess we only have two options: ether we don't validate
>> it
>> at all or we only validate it for mixed-float instructions that are
>> not
>> MOV. I guess we can do the latter?
>
> Also, related to this, if we consider implicit conversions in 2-src
> instructions to also be a target of the generic rules for conversions
> to HF described for CHV and SKL+, then doesn't that imply that all
> mixed-float instructions are conversions and subject to those rules?
> For example:
>
> ADD dst:hf  src0:f  src1:f
> ADD dst:hf  src0:hf src1:f
> ADD dst:hf  src0:f  src1:hf
>
> In all 3 instructions above the execution type is F. Should we consider
> all of them implicit conversions?

Yes, I just verified that the simulator considers all of the above to
fall under the category "Format conversion to or from HF", and requires
their destination components to be DWORD-aligned, but *only* on BDW.

> That would mean that any mixed-float instruction with HF destination
> is an implicit conversion from F to HF and therefore would be subject
> to the generic rules for those conversions, which at least in the case
> of CHV and SKL+ involves DWord alignment on the destination, or in
> other words: no mixed-float instruction can have packed fp16
> destinations.

AFAIUI, your reasoning in the paragraph above is correct for BDW *only*,
not for CHV and SKL+ which have an exception to the DWORD channel
alignment rule (whether the instruction is doing conversions or not) in
the case where the destination is HF, has stride equal to 1, and has
sub-register offset aligned to 16B.

> But that seems to contradict what various mixed-float restrictions
> present in the docs that suggest that packed fp16 destinations are
> possible with mixed- float mode. Here are some examples:
>

AFAICT the reason for the apparent contradiction is that packed fp16
destinations are allowed for such instruction on CHV/SKL+ as an
exception to the more strict BDW rule.

> "No SIMD16 in mixed mode when destination is packed f16 for both Align1
> and Align16"
>
> "In Align1, destination stride can be smaller than execution type. When
> destination is stride of 1, 16 bit packed data is updated on the
> destination. However, output packed f16 data must be oword aligned, no
> oword crossing in packed f16."
>
> "When source is float or half float from accumulator register and
> destination is half float with a stride of 1, the source must register
> aligned. i.e., source must have offset zero."
>
> Since I was only considering that conversion rules only applied to MOV
> instructions this was not an issue in my series, but if we consider
> that those rules also apply to implicit conversions from other
> instructions then it looks like all these restrictions refer to
> impossible situations, so I am not sure what to do about them... should
> we just ignore them in the validator?
>

I don't see any reason to ignore them.  The last three restrictions
you've quoted above probably apply to MOV conversions in the same way
that they apply to other instructions.  Unfortunately only the last one
you quoted seems to be enforced by the simulator, and it certainly kicks
in for any instructions with an accumulator source and a packed
half-float destination, including MOV.

>> > > > > Anyway, I'll try to rework the patches to do more generic
>> > > > > validation of
>> > > > > HF conversions like you ask and see what kind of code we end
>> > > > > up
>> > > > > with.
>> > > > > 
>> > > > 
>> > > > Thanks.
>> > > > 
>> > > > > [...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190312/f71abaf3/attachment.sig>