[Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions
Francisco Jerez
currojerez at riseup.net
Tue Mar 12 22:57:56 UTC 2019
Francisco Jerez <currojerez at riseup.net> writes:
> Iago Toral <itoral at igalia.com> writes:
>
>> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>>> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>>> > Iago Toral <itoral at igalia.com> writes:
>>> >
>>> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>>> > > > Iago Toral <itoral at igalia.com> writes:
>>> > > >
>>> > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>>> > > > > > Iago Toral <itoral at igalia.com> writes:
>>> > > > > >
>>> > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
>>> > > > > > > > Iago Toral <itoral at igalia.com> writes:
>>> > > > > > > >
>>> > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>>> > > > > > > > > wrote:
>>> > > > > > > > > > Iago Toral Quiroga <itoral at igalia.com> writes:
>>> > > > > > > > > >
>>> > > > > > > > > > > ---
>>> > > > > > > > > > > src/intel/compiler/brw_eu_validate.c | 64
>>> > > > > > > > > > > ++++++++++++-
>>> > > > > > > > > > > src/intel/compiler/test_eu_validate.cpp | 122
>>> > > > > > > > > > > ++++++++++++++++++++++++
>>> > > > > > > > > > > 2 files changed, 185 insertions(+), 1 deletion(-
>>> > > > > > > > > > > )
>>> > > > > > > > > > >
>>> > > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>>> > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > @@ -531,7 +531,69 @@
>>> > > > > > > > > > > general_restrictions_based_on_operand_types(const
>>> > > > > > > > > > > struct
>>> > > > > > > > > > > gen_device_info *devinf
>>> > > > > > > > > > > exec_type_size == 8 && dst_type_size ==
>>> > > > > > > > > > > 4)
>>> > > > > > > > > > > dst_type_size = 8;
>>> > > > > > > > > > >
>>> > > > > > > > > > > - if (exec_type_size > dst_type_size) {
>>> > > > > > > > > > > + /* From the BDW+ PRM:
>>> > > > > > > > > > > + *
>>> > > > > > > > > > > + * "There is no direct conversion from HF
>>> > > > > > > > > > > to
>>> > > > > > > > > > > DF
>>> > > > > > > > > > > or
>>> > > > > > > > > > > DF to
>>> > > > > > > > > > > HF.
>>> > > > > > > > > > > + * There is no direct conversion from HF
>>> > > > > > > > > > > to
>>> > > > > > > > > > > Q/UQ or
>>> > > > > > > > > > > Q/UQ to
>>> > > > > > > > > > > HF."
>>> > > > > > > > > > > + */
>>> > > > > > > > > > > + enum brw_reg_type src0_type =
>>> > > > > > > > > > > brw_inst_src0_type(devinfo,
>>> > > > > > > > > > > inst);
>>> > > > > > > > > > > + ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>>> > > > > > > > > > > BRW_OPCODE_MOV
>>> > > > > > > > > > > &&
>>> > > > > > > > > >
>>> > > > > > > > > > Why is only the MOV instruction handled here and
>>> > > > > > > > > > below? Aren't
>>> > > > > > > > > > other
>>> > > > > > > > > > instructions able to do implicit
>>> > > > > > > > > > conversions? Probably
>>> > > > > > > > > > means
>>> > > > > > > > > > you
>>> > > > > > > > > > need
>>> > > > > > > > > > to deal with two sources rather than one.
>>> > > > > > > > >
>>> > > > > > > > > This comes from the programming notes of the MOV
>>> > > > > > > > > instruction
>>> > > > > > > > > (Volume
>>> > > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
>>> > > > > > > > > described
>>> > > > > > > > > specifically for the MOV instruction. I should
>>> > > > > > > > > probably
>>> > > > > > > > > have
>>> > > > > > > > > made
>>> > > > > > > > > this
>>> > > > > > > > > clear in the comment.
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > > > Maybe the one above is specified in the MOV page only,
>>> > > > > > > > probably
>>> > > > > > > > due
>>> > > > > > > > to
>>> > > > > > > > an oversight (If these restrictions were really
>>> > > > > > > > specific
>>> > > > > > > > to
>>> > > > > > > > the
>>> > > > > > > > MOV
>>> > > > > > > > instruction, what would prevent you from implementing
>>> > > > > > > > such
>>> > > > > > > > conversions
>>> > > > > > > > through a different instruction? E.g. "ADD dst:df,
>>> > > > > > > > src:hf,
>>> > > > > > > > 0"
>>> > > > > > > > which
>>> > > > > > > > would be substantially more efficient than what you're
>>> > > > > > > > doing
>>> > > > > > > > in
>>> > > > > > > > PATCH
>>> > > > > > > > 02)
>>> > > > > > >
>>> > > > > > > Instructions that take HF can only be strictly HF or mix
>>> > > > > > > F
>>> > > > > > > and
>>> > > > > > > HF
>>> > > > > > > (mixed mode float), with MOV being the only exception.
>>> > > > > > > That
>>> > > > > > > means
>>> > > > > > > that
>>> > > > > > > any instruction like the one you use above are invalid.
>>> > > > > > > Maybe
>>> > > > > > > we
>>> > > > > > > should
>>> > > > > > > validate explicitly that instructions that use HF are
>>> > > > > > > strictly
>>> > > > > > > HF
>>> > > > > > > or
>>> > > > > > > mixed-float mode only?
>>> > > > > > >
>>> > > > > >
>>> > > > > > So you're acknowledging that the conversions checked above
>>> > > > > > are
>>> > > > > > illegal
>>> > > > > > whether the instruction is a MOV or something else. Where
>>> > > > > > are we
>>> > > > > > preventing instructions other than MOV with such
>>> > > > > > conversions
>>> > > > > > from
>>> > > > > > being
>>> > > > > > accepted by the validator?
>>> > > > >
>>> > > > > We aren't, because the validator is not checking, in general,
>>> > > > > for
>>> > > > > accepted type combinations for specific instructions anywhere
>>> > > > > as
>>> > > > > far as
>>> > > > > I can see.
>>> > > >
>>> > > > Luckily these type conversion restrictions aren't really
>>> > > > specific
>>> > > > to
>>> > > > any
>>> > > > instruction AFAICT, even though they only seem to be mentioned
>>> > > > explicitly for the MOV instruction...
>>> > > >
>>> > > > > If we want to start doing this with HF conversions
>>> > > > > specifically, I
>>> > > > > guess it is fine, but in case it is not clear, I think it
>>> > > > > won't
>>> > > > > bring
>>> > > > > any immediate benefits with the VK_KHR_shader_float16_int8
>>> > > > > implementation since this series only ever emits conversions
>>> > > > > involving
>>> > > > > HF operands via MOV instructions,
>>> > > >
>>> > > > Yes, I can see that's the intention of this series, but how can
>>> > > > you
>>> > > > make
>>> > > > sure that nothing in the optimizer is breaking your assumption
>>> > > > if
>>> > > > you
>>> > > > don't add some validator code to verify the claim of your last
>>> > > > paragraph?
>>> > > >
>>> > > > > which is why I thought that validating that no direct MOV
>>> > > > > conversions
>>> > > > > from DF/Q types ever happen was useful, since we have code in
>>> > > > > the
>>> > > > > driver to handle this scenario.
>>> > > > >
>>> > > > > [...]
>>> > > > > >
>>> > > > > > I still don't understand why you want to implement the same
>>> > > > > > restriction
>>> > > > > > twice, once for MOV and once for all other mixed-mode
>>> > > > > > instructions. How
>>> > > > > > is that more convenient?
>>> > > > >
>>> > > > > The restrictions based on operand types that are checked in
>>> > > > > the
>>> > > > > validator are specific to Byte or cases where the execution
>>> > > > > type is
>>> > > > > larger than the destination stride, for which mixed float has
>>> > > > > a
>>> > > > > different set of restrictions.
>>> > > > >
>>> > > > > For example, for mixed float we have this rule:
>>> > > > >
>>> > > > > "In Align1, destination stride can be smaller than execution
>>> > > > > type"
>>> > > > >
>>> > > > > Which overrides this other from "General restrictions based
>>> > > > > on
>>> > > > > operand
>>> > > > > types":
>>> > > > >
>>> > > > > "Destination stride must be equal to the ratio of the sizes
>>> > > > > of
>>> > > > > the
>>> > > > > execution data type to the destination type"
>>> > > > >
>>> > > > > So I thought that it would make things easier to keep all
>>> > > > > restrictions
>>> > > > > for mixed float separate and make sure that we skipped mixed
>>> > > > > float
>>> > > > > instructions in general_restrictions_based_on_operand_types()
>>> > > > > so we
>>> > > > > avoid having to add code to skip individual general
>>> > > > > restrictions
>>> > > > > that that are overriden for mixed float mode anyway.
>>> > > > >
>>> > > >
>>> > > > I'm fine with that, but it doesn't seem like what this patch is
>>> > > > doing...
>>> > > > Isn't it adding code to validate mixed-mode float MOV
>>> > > > instructions in
>>> > > > general_restrictions_based_on_operand_types()?
>>> > >
>>> > > I guess this could be arguable, but I was not considering
>>> > > conversion
>>> > > MOVs to be mixed-float instructions. There are two reasons for
>>> > > this:
>>> > >
>>> > > A conversion MOV involving F/HF doesn't seem to be fundamentally
>>> > > different from any other conversion instruction involving other
>>> > > types,
>>> > > other than the requirement of aligning the destination to a
>>> > > Dword,
>>> > > which is not a resriction explictly made for mixed-float mode.
>>> > >
>>> > > Then, for mixed-float mode, there is this other rule:
>>> > >
>>> > > "In Align1, destination stride can be smaller than execution
>>> > > type.
>>> > > When
>>> > > destination is stride of 1, 16 bit packed data
>>> > > is updated on the destination. However, output packed f16 data
>>> > > must
>>> > > be
>>> > > oword aligned, no oword crossing in
>>> > > packed f16."
>>> > >
>>> > > Which contradicts the other rule that conversions from F to HF
>>> > > need
>>> > > to
>>> > > be DWord aligned on the destination.
>>> > >
>>> > > So it seems to me that conversion MOVs are not following the same
>>> > > principles of mixed-float instructions and we should skip
>>> > > validating
>>> > > mixed-float restrictions for them. What do you think?
>>> > >
>>> >
>>> > That all seems fairly ambiguous... And the restriction on DWORD
>>> > alignment for conversions includes a mixed-mode ADD instruction
>>> > among
>>> > the examples, so I'm skeptical that MOV could be truly
>>> > fundamentally
>>> > different.
>>>
>>> Ok... in that case what do we do about the mixed-float restriction I
>>> quoted above? Since it is incompatible with the mixed-float MOV
>>> conversion I guess we only have two options: ether we don't validate
>>> it
>>> at all or we only validate it for mixed-float instructions that are
>>> not
>>> MOV. I guess we can do the latter?
>>
>> Also, related to this, if we consider implicit conversions in 2-src
>> instructions to also be a target of the generic rules for conversions
>> to HF described for CHV and SKL+, then doesn't that imply that all
>> mixed-float instructions are conversions and subject to those rules?
>> For example:
>>
>> ADD dst:hf src0:f src1:f
>> ADD dst:hf src0:hf src1:f
>> ADD dst:hf src0:f src1:hf
>>
>> In all 3 instructions above the execution type is F. Should we consider
>> all of them implicit conversions?
>
> Yes, I just verified that the simulator considers all of the above to
> fall under the category "Format conversion to or from HF", and requires
> their destination components to be DWORD-aligned, but *only* on BDW.
>
>> That would mean that any mixed-float instruction with HF destination
>> is an implicit conversion from F to HF and therefore would be subject
>> to the generic rules for those conversions, which at least in the case
>> of CHV and SKL+ involves DWord alignment on the destination, or in
>> other words: no mixed-float instruction can have packed fp16
>> destinations.
>
> AFAIUI, your reasoning in the paragraph above is correct for BDW *only*,
> not for CHV and SKL+ which have an exception to the DWORD channel
> alignment rule (whether the instruction is doing conversions or not) in
And by "whether the instruction is doing conversions or not" I mean
conversions beyond the one done for the destination operand, in case it
wasn't clear from my explanation above.
> the case where the destination is HF, has stride equal to 1, and has
> sub-register offset aligned to 16B.
>
>> But that seems to contradict what various mixed-float restrictions
>> present in the docs that suggest that packed fp16 destinations are
>> possible with mixed- float mode. Here are some examples:
>>
>
> AFAICT the reason for the apparent contradiction is that packed fp16
> destinations are allowed for such instruction on CHV/SKL+ as an
> exception to the more strict BDW rule.
>
>> "No SIMD16 in mixed mode when destination is packed f16 for both Align1
>> and Align16"
>>
>> "In Align1, destination stride can be smaller than execution type. When
>> destination is stride of 1, 16 bit packed data is updated on the
>> destination. However, output packed f16 data must be oword aligned, no
>> oword crossing in packed f16."
>>
>> "When source is float or half float from accumulator register and
>> destination is half float with a stride of 1, the source must register
>> aligned. i.e., source must have offset zero."
>>
>> Since I was only considering that conversion rules only applied to MOV
>> instructions this was not an issue in my series, but if we consider
>> that those rules also apply to implicit conversions from other
>> instructions then it looks like all these restrictions refer to
>> impossible situations, so I am not sure what to do about them... should
>> we just ignore them in the validator?
>>
>
> I don't see any reason to ignore them. The last three restrictions
> you've quoted above probably apply to MOV conversions in the same way
> that they apply to other instructions. Unfortunately only the last one
> you quoted seems to be enforced by the simulator, and it certainly kicks
> in for any instructions with an accumulator source and a packed
> half-float destination, including MOV.
>
>>> > > > > Anyway, I'll try to rework the patches to do more generic
>>> > > > > validation of
>>> > > > > HF conversions like you ask and see what kind of code we end
>>> > > > > up
>>> > > > > with.
>>> > > > >
>>> > > >
>>> > > > Thanks.
>>> > > >
>>> > > > > [...]
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190312/419c2e89/attachment.sig>
More information about the mesa-dev
mailing list