[Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

Thu Jan 24 19:45:18 UTC 2019

Iago Toral <itoral at igalia.com> writes:

> On Wed, 2019-01-23 at 06:03 -0800, Francisco Jerez wrote:
>> Iago Toral Quiroga <itoral at igalia.com> writes:
>> 
>> > Commit c84ec70b3a72 implemented execution type promotion to 32-bit
>> > for
>> > conversions involving half-float registers, which empirical testing
>> > suggested
>> > was required, but it did not incorporate this change into the
>> > assembly validator
>> > logic. This commits adds that, preventing validation errors like
>> > this:
>> > 
>> 
>> I don't think we should be validating empirical assumptions in the EU
>> validator.
>
> I am not sure I get your point, isn't c84ec70b3a72 also based on
> empirical testing after all?
>

To some extent, but it doesn't attempt to enforce ISA restrictions based
on information obtained empirically.

>
>> > mov(16)  g9<4>B   g3<16,8,2>HF     { align1 1H };
>> > ERROR: Destination stride must be equal to the ratio of the sizes
>> > of the
>> >        execution data type to the destination type
>> > 
>> > Fixes: c84ec70b3a72 "intel/fs: Promote execution type to 32-bit
>> > when any half-float conversion is needed."
>> 
>> I don't think this "fixes" anything that ever worked.
>
> It is true that the code in that trace above is not something we can
> produce right now, because it is a conversion from HF to B and that
> should only happen within the context of VK_KHR_shader_float16_int8,
> however, this is a consequence of the fact that since c84ec70b3a72
> there is an inconsistency between what we do at the IR level regarding
> execution size of HF conversions and what the EU validator is doing,
> and from that perspective this is really fixing an inconsistency that
> didn't exist before, and I thought we would want to address that sooner
> rather than later and track it down to the original change that
> introduced that inconsistency so we know where this is coming from.
>

The "inconsistency" between the IR's get_exec_type() and the EU
validator's execution_type() has existed ever since a05b6f25bf4bfad7
removed the HF assert from get_exec_type() without actually implementing
the code required to handle HF operands (which is what my commit
c84ec70b3a72 did).

> Anyway, that was my rationale for the Fixes tag, but if you think this
> is not useful I am happy to drop this patch and just include it as part
> of my series without the tag.
>

I'd like to see the actual regioning restrictions for HF types
implemented in the EU validator as part of your series.  I don't see the
need to introduce any empirical assumptions into the EU validator's
execution_type() in order to achieve that, even if that means that
get_exec_type() and execution_type() don't do the exact same calculation
-- What you call an inconsistency is the consequence of execution_type()
being the hardware spec's opinion on what the execution type is, which
we assume is what we need to use while enforcing a regioning restriction
that refers to the execution type of the instruction.

>>   The validator is
>> still missing an implementation of the quirky HF restrictions, and it
>> wasn't the purpose of c84ec70b3a72 to do such a thing.
>
> While this is true in general, the EU validator does consider the
> execution type of the instruction to validate general rules such as the
> one I mentioned in the commit message in this patch. And that part of
> the validator is inconsistent with c84ec70b3a72.

That part of the validator was also inconsistent with the code generated
by your original VK_KHR_shader_float16_int8 series even before I
committed c84ec70b3a72.  The reason is that it is trying to validate a
restriction that rejects working code, because the "general" rule it's
trying to enforce isn't supposed to apply to instructions with HF
operands, which is the real bug.

> In fact, the EU validator is accounting for execution size promotion
> of HF instructions to 32-bit in SKL+ and CHV only, for conversions
> from HF->F and mixed float mode instructions... which is part of what
> c84ec70b3a72 addresses at the IR level, which it actually does for all
> hardware platforms and in more cases.
>

I'm fine with fixing execution_type() to do the right thing in more
cases and platforms, but I don't think that should involve making
empirical assumptions in the validator.

>>   You *should*
>> definitely implement those restrictions (as they're stated in the
>> hardware spec, without empirical assumptions) in the validator as
>> part
>> of your VK_KHR_shader_float16_int8 series,
>
> Again, I am not sure what you mean by "without empirical assumptions".

I was just paraphrasing your comment.  If you feel the need to write a
comment justifying the existence of some code in the validator based on
empirical testing, it probably doesn't belong in the validator.

> According to the commit message in c84ec70b3a72 "the docs are fairly
> imcomplete and inconsistent" and you explained here that you had to do
> some experimentation of your own using the simulator (where you found
> its results to also be inconsistent with the hardware docs) to try and
> guess what is happening. That's why I was using the term "empirical"
> here, since the hardware docs alone aren't clear or correct enough to
> understand what is really happening, when and in what platforms.
>

If some hardware restriction is described in a contradictory or
ambiguous way by the hardware spec we should probably avoid enforcing it
altogether in the EU validator.

> Anyway, if you don't like the term "empirical" to refer to all this,
> that's fine by me, but what I need to know is if we agree that we need
> to implement the same type promotion rules in the validator that we are
> implementing in the IR, indepedently of whether refer to them as "based
> on empirical testing" or not.
>

I don't think we agree on that.  IMHO you want to enforce the regioning
restrictions that the hardware spec describes for HF types rather than
the current (incorrect and incomplete) generalization of the standard
restrictions.  With those in place the apparent inconsistency between
get_exec_type() and the hardware spec's execution type would be
harmless.

>>  if anything because currently
>> it will reject working code that uses HF types.
>
> Just for the sake of clarity, since that sentence above could be
> interpreted as if all HF code would be rejected: we have been using HF
> types since we landed VK_KHR_16bit_storage, which includes conversions
> between HF and F and the EU validator is not complaining about any of
> that. It is currently a problem only with conversions to smaller types
> (so only conversions to Byte) because that's where we check for that
> restriction on the stride, which is based on the execution type of the
> instruction.
>
>> 
>> > ---
>> >  src/intel/compiler/brw_eu_validate.c | 27 ++++++++++++++--------
>> > -----
>> >  1 file changed, 14 insertions(+), 13 deletions(-)
>> > 
>> > diff --git a/src/intel/compiler/brw_eu_validate.c
>> > b/src/intel/compiler/brw_eu_validate.c
>> > index a25010b225c..3bb37677672 100644
>> > --- a/src/intel/compiler/brw_eu_validate.c
>> > +++ b/src/intel/compiler/brw_eu_validate.c
>> > @@ -325,17 +325,20 @@ execution_type(const struct gen_device_info
>> > *devinfo, const brw_inst *inst)
>> >     unsigned num_sources = num_sources_from_inst(devinfo, inst);
>> >     enum brw_reg_type src0_exec_type, src1_exec_type;
>> >  
>> > -   /* Execution data type is independent of destination data type,
>> > except in
>> > -    * mixed F/HF instructions on CHV and SKL+.
>> > +   /* Empirical testing suggests that type conversions involving
>> > half-float
>> > +    * promote execution type to 32-bit. See get_exec_type() in
>> > brw_ir_fs.h.
>> >      */
>> >     enum brw_reg_type dst_exec_type = brw_inst_dst_type(devinfo,
>> > inst);
>> >  
>> >     src0_exec_type =
>> > execution_type_for_type(brw_inst_src0_type(devinfo, inst));
>> >     if (num_sources == 1) {
>> > -      if ((devinfo->gen >= 9 || devinfo->is_cherryview) &&
>> > -          src0_exec_type == BRW_REGISTER_TYPE_HF) {
>> > -         return dst_exec_type;
>> > +      if (type_sz(src0_exec_type) == 2 && dst_exec_type !=
>> > src0_exec_type) {
>> > +         if (src0_exec_type == BRW_REGISTER_TYPE_HF)
>> > +            return BRW_REGISTER_TYPE_F;
>> > +         else if (dst_exec_type == BRW_REGISTER_TYPE_HF)
>> > +            return BRW_REGISTER_TYPE_D;
>> >        }
>> > +
>> >        return src0_exec_type;
>> >     }
>> >  
>> > @@ -367,14 +370,12 @@ execution_type(const struct gen_device_info
>> > *devinfo, const brw_inst *inst)
>> >         src1_exec_type == BRW_REGISTER_TYPE_DF)
>> >        return BRW_REGISTER_TYPE_DF;
>> >  
>> > -   if (devinfo->gen >= 9 || devinfo->is_cherryview) {
>> > -      if (dst_exec_type == BRW_REGISTER_TYPE_F ||
>> > -          src0_exec_type == BRW_REGISTER_TYPE_F ||
>> > -          src1_exec_type == BRW_REGISTER_TYPE_F) {
>> > -         return BRW_REGISTER_TYPE_F;
>> > -      } else {
>> > -         return BRW_REGISTER_TYPE_HF;
>> > -      }
>> > +   if (dst_exec_type == BRW_REGISTER_TYPE_F ||
>> > +       src0_exec_type == BRW_REGISTER_TYPE_F ||
>> > +       src1_exec_type == BRW_REGISTER_TYPE_F) {
>> > +      return BRW_REGISTER_TYPE_F;
>> > +   } else {
>> > +      return BRW_REGISTER_TYPE_HF;
>> >     }
>> >  
>> >     assert(src0_exec_type == BRW_REGISTER_TYPE_F);
>> > -- 
>> > 2.17.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190124/9effb0de/attachment-0001.sig>