[Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

Iago Toral itoral at igalia.com
Mon Jan 28 09:03:30 UTC 2019


On Fri, 2019-01-25 at 12:54 -0800, Francisco Jerez wrote:
> Iago Toral <itoral at igalia.com> writes:
> 
> > On Thu, 2019-01-24 at 11:45 -0800, Francisco Jerez wrote:
> > > Iago Toral <itoral at igalia.com> writes:
> > > 
> > > > On Wed, 2019-01-23 at 06:03 -0800, Francisco Jerez wrote:
> > > > > Iago Toral Quiroga <itoral at igalia.com> writes:

(...)

> > > I'd like to see the actual regioning restrictions for HF types
> > > implemented in the EU validator as part of your series.
> > 
> > Ok, let's see if we can agree on what restrictions should we
> > implement
> > then. I can implement this restriction as documented:
> > 
> > "Conversion between Integer and HF (Half Float) must be DWord-
> > aligned
> > and strided by a DWord on the destination"
> > 
> > Instead of trying to apply the general one that is currently
> > breaking.
> > That will fix the assertion issue. I guess my issue with it was
> > that
> > going by your analysis this restriction is not telling the full
> > picture, this is what you had to say about this restriction:
> > 
> > "I have a feeling that the reason for this may be that the 16-bit
> > pipeline lacks the ability to handle conversions from or to half-
> > float, 
> > so the execution type is implicitly promoted to the matching
> > (integer
> > or floating-point) 32-bit type where any HF conversion would be
> > needed.  And on those the usual alignment restriction of the
> > destination to a larger execution type applies."
> > 
> > But I guess your point is that we can ignore these details at the
> > validator level and just focus on validating strictly what the PRM
> > says. Fair enough.
> > 
> > Unfortunatley, you also mentioned that this restriction *seems* to
> > be
> > overriden by this one on all platforms but BDW (even though the
> > both
> > are listed, at least I see both in my SKL docs):
> > 
> > "There is a relaxed alignment rule for word destinations. When the
> > destination type is word (UW, W, HF), destination data types can be
> > aligned to either the lowest word or the second lowest word of the
> > execution channel. This means the destination data words can be
> > either
> > all in the even word locations or all in the odd word locations."
> > 
> 
> I'd implement this one since it seems like the most specific, except
> on
> BDW where only the previous (almost equivalent) restriction seems to
> apply.

So we apply the one about relaxed alignment on all platforms but BDW,
and on BDW we apply the other one. Ok.

>   There are shitloads of other restrictions we're missing for HF
> types BTW, check out the section "Special Requirements for Handling
> Mixed Mode Float Operations" of the hardware spec.

Yes, I'll go through those in my series too, but I want to have a clear
understanding about the others since they are the ones that triggered
this discussion.

> > Which you discussed didn't make sense to you and I agreed, if only
> > because my own experiments suggested that the implications that one
> > would derive from a strict interpretation of this (that packed 16-
> > bit
> > data is not supported) are not quite true going by the fact that we
> > are
> > emitting packed instructions on all platforms without any
> > indication
> > that this doesn't work. So what should we do about this one? If we
> > decide that we want to implement this then we have to re-think the
> > implementation we have.
> > 
> 
> The implication on packed HF instructions is likely accidental, I
> don't
> think we should enforce the rule except for conversion operations.

Ok, I'll implement this only for conversions then.

> > Should we just validate the one about the integer/float
> > conversions?
> > Should we do that only on BDW or on all platforms?
> > 
> > >   I don't see the
> > > need to introduce any empirical assumptions into the EU
> > > validator's
> > > execution_type() in order to achieve that, even if that means
> > > that
> > > get_exec_type() and execution_type() don't do the exact same
> > > calculation
> > > -- What you call an inconsistency is the consequence of
> > > execution_type()
> > > being the hardware spec's opinion on what the execution type is,
> > > which
> > > we assume is what we need to use while enforcing a regioning
> > > restriction
> > > that refers to the execution type of the instruction.
> > 
> > Mmm... I guess my question is: is the hardware really assuming that
> > execution type in all cases that involve HF instructions? What I
> > got
> > from your analysis of the whole situation is that the hardware is
> > actually promoting the execution size to 32-bit in some cases, and
> > if
> > that is the case, then does it make sense to implement code in the
> > validator that ignores that fact?
> 
> If the hardware spec chose to ignore that fact and instead gave us a
> different set of regioning restrictions for us to use on HF
> restrictions
> that don't rely on the definition of execution type, we should
> probably
> have the validator match the hardware spec literally.
> 

Ok, my initial understanding was that the hardware docs were not
exactly reliable when it came to the way in which they documented these
restrictions, that's why I was expecting that we would not be exactly
following the PRM to validate assembly code and that we would instead
rely on your findings about the execution type and validate code using
that instead. Anyway, I think I understand your point now.

> > I understand that you think this doesn't have any significance once
> > we
> > make sure that we only apply the restrictions specifically
> > documented
> > for HF from the PRM, and you are probably right, however I have to
> > admit that it feels a bit odd to me that we do that when we know
> > that
> > under the hood there is much more going on and we are producing FS
> > IR
> > code following different rules after all. But maybe it is just me
> > being too paranoid about this...
> > 
> 
> Wouldn't it be even odder to have the hardware restriction validator
> diverge from the documentation it's supposedly validating?

If that's the consequence of empirical testing suggesting that the
hardware doesn't exactly do what the documentation states, then I don't
think that would be that odd, particularly if that is clearly explained
with a comment in the code that adds that validation. But I understand
that we want to keep te validator strictly validating the restrictions
as closely as possible to the way in which they are formulated in the
PRM, even if we know that they are not telling the complete picture, I
get that now.

(...)

> >  IMHO you want to enforce the
> > > regioning
> > > restrictions that the hardware spec describes for HF types rather
> > > than
> > > the current (incorrect and incomplete) generalization of the
> > > standard
> > > restrictions.  With those in place the apparent inconsistency
> > > between
> > > get_exec_type() and the hardware spec's execution type would be
> > > harmless.
> > 
> > I agree with that. I still feel uneasy about having two different 
> > implementations of the execution type in FS IR and the validator,
> > even
> > if that doesn't lead to any issues when we do what you say here...
> > 
> 
> I feel uneasy about the validator not matching the hardware spec.
> Luckily its only purpose is validating the rest of the compiler, so
> if
> there is an actual inconsistency as you seem to think, it will come
> up.
> 
> > At the very least I think that deserves a comment in the validator
> > explaining why we think that is okay.
> > 
> 
> A quote of the hardware spec should be enough of a
> comment.  Right?  And
> there is already a comment in get_exec_type() explaining why its HF
> type
> handling logic is consistent with the hardware spec restrictions for
> HF
> types.

It explains why get_exec_type() is consistent with the spec
restrictions, but I don't think it explains why, for the same
instruction, get_exec_type() and execution_type() can return different
results (and why that is fine). Anyway, if you think a comment
explaining this is not necessary I won't add it.

> > > > >  if anything because currently
> > > > > it will reject working code that uses HF types.
> > > > 
> > > > Just for the sake of clarity, since that sentence above could
> > > > be
> > > > interpreted as if all HF code would be rejected: we have been
> > > > using
> > > > HF
> > > > types since we landed VK_KHR_16bit_storage, which includes
> > > > conversions
> > > > between HF and F and the EU validator is not complaining about
> > > > any
> > > > of
> > > > that. It is currently a problem only with conversions to
> > > > smaller
> > > > types
> > > > (so only conversions to Byte) because that's where we check for
> > > > that
> > > > restriction on the stride, which is based on the execution type
> > > > of
> > > > the
> > > > instruction.
> > > > 
> > > > > 
> > > > > > ---
> > > > > >  src/intel/compiler/brw_eu_validate.c | 27 ++++++++++++++
> > > > > > ----
> > > > > > ----
> > > > > > -----
> > > > > >  1 file changed, 14 insertions(+), 13 deletions(-)
> > > > > > 
> > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
> > > > > > b/src/intel/compiler/brw_eu_validate.c
> > > > > > index a25010b225c..3bb37677672 100644
> > > > > > --- a/src/intel/compiler/brw_eu_validate.c
> > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
> > > > > > @@ -325,17 +325,20 @@ execution_type(const struct
> > > > > > gen_device_info
> > > > > > *devinfo, const brw_inst *inst)
> > > > > >     unsigned num_sources = num_sources_from_inst(devinfo,
> > > > > > inst);
> > > > > >     enum brw_reg_type src0_exec_type, src1_exec_type;
> > > > > >  
> > > > > > -   /* Execution data type is independent of destination
> > > > > > data
> > > > > > type,
> > > > > > except in
> > > > > > -    * mixed F/HF instructions on CHV and SKL+.
> > > > > > +   /* Empirical testing suggests that type conversions
> > > > > > involving
> > > > > > half-float
> > > > > > +    * promote execution type to 32-bit. See
> > > > > > get_exec_type() in
> > > > > > brw_ir_fs.h.
> > > > > >      */
> > > > > >     enum brw_reg_type dst_exec_type =
> > > > > > brw_inst_dst_type(devinfo,
> > > > > > inst);
> > > > > >  
> > > > > >     src0_exec_type =
> > > > > > execution_type_for_type(brw_inst_src0_type(devinfo, inst));
> > > > > >     if (num_sources == 1) {
> > > > > > -      if ((devinfo->gen >= 9 || devinfo->is_cherryview) &&
> > > > > > -          src0_exec_type == BRW_REGISTER_TYPE_HF) {
> > > > > > -         return dst_exec_type;
> > > > > > +      if (type_sz(src0_exec_type) == 2 && dst_exec_type !=
> > > > > > src0_exec_type) {
> > > > > > +         if (src0_exec_type == BRW_REGISTER_TYPE_HF)
> > > > > > +            return BRW_REGISTER_TYPE_F;
> > > > > > +         else if (dst_exec_type == BRW_REGISTER_TYPE_HF)
> > > > > > +            return BRW_REGISTER_TYPE_D;
> > > > > >        }
> > > > > > +
> > > > > >        return src0_exec_type;
> > > > > >     }
> > > > > >  
> > > > > > @@ -367,14 +370,12 @@ execution_type(const struct
> > > > > > gen_device_info
> > > > > > *devinfo, const brw_inst *inst)
> > > > > >         src1_exec_type == BRW_REGISTER_TYPE_DF)
> > > > > >        return BRW_REGISTER_TYPE_DF;
> > > > > >  
> > > > > > -   if (devinfo->gen >= 9 || devinfo->is_cherryview) {
> > > > > > -      if (dst_exec_type == BRW_REGISTER_TYPE_F ||
> > > > > > -          src0_exec_type == BRW_REGISTER_TYPE_F ||
> > > > > > -          src1_exec_type == BRW_REGISTER_TYPE_F) {
> > > > > > -         return BRW_REGISTER_TYPE_F;
> > > > > > -      } else {
> > > > > > -         return BRW_REGISTER_TYPE_HF;
> > > > > > -      }
> > > > > > +   if (dst_exec_type == BRW_REGISTER_TYPE_F ||
> > > > > > +       src0_exec_type == BRW_REGISTER_TYPE_F ||
> > > > > > +       src1_exec_type == BRW_REGISTER_TYPE_F) {
> > > > > > +      return BRW_REGISTER_TYPE_F;
> > > > > > +   } else {
> > > > > > +      return BRW_REGISTER_TYPE_HF;
> > > > > >     }
> > > > > >  
> > > > > >     assert(src0_exec_type == BRW_REGISTER_TYPE_F);
> > > > > > -- 
> > > > > > 2.17.1



More information about the mesa-dev mailing list