[Mesa-dev] [PATCH v7 28/35] intel/compiler: implement SIMD16 restrictions for mixed-float instructions
Francisco Jerez
currojerez at riseup.net
Wed Apr 10 21:20:20 UTC 2019
Iago Toral <itoral at igalia.com> writes:
> On Mon, 2019-04-08 at 12:00 -0700, Francisco Jerez wrote:
>> "Juan A. Suarez Romero" <jasuarez at igalia.com> writes:
>>
>> > From: Iago Toral Quiroga <itoral at igalia.com>
>> >
>> > v2: f32to16/f16to32 can use a :W destination (Curro)
>> > ---
>> > src/intel/compiler/brw_fs.cpp | 71
>> > +++++++++++++++++++++++++++++++++++
>> > 1 file changed, 71 insertions(+)
>> >
>> > diff --git a/src/intel/compiler/brw_fs.cpp
>> > b/src/intel/compiler/brw_fs.cpp
>> > index d4803c63b93..48b5cc6c403 100644
>> > --- a/src/intel/compiler/brw_fs.cpp
>> > +++ b/src/intel/compiler/brw_fs.cpp
>> > @@ -5606,6 +5606,48 @@ fs_visitor::lower_logical_sends()
>> > return progress;
>> > }
>> >
>> > +static bool
>> > +is_mixed_float_with_fp32_dst(const fs_inst *inst)
>> > +{
>> > + /* This opcode sometimes uses :W type on the source even if the
>> > operand is
>> > + * a :HF, because in gen7 there is no support for :HF, and thus
>> > it uses :W.
>> > + */
>> > + if (inst->opcode == BRW_OPCODE_F16TO32)
>> > + return true;
>> > +
>> > + if (inst->dst.type != BRW_REGISTER_TYPE_F)
>> > + return false;
>> > +
>> > + for (int i = 0; i < inst->sources; i++) {
>> > + if (inst->src[i].type == BRW_REGISTER_TYPE_HF)
>> > + return true;
>> > + }
>> > +
>> > + return false;
>> > +}
>> > +
>> > +static bool
>> > +is_mixed_float_with_packed_fp16_dst(const fs_inst *inst)
>> > +{
>> > + /* This opcode sometimes uses :W type on the destination even
>> > if the
>> > + * destination is a :HF, because in gen7 there is no support
>> > for :HF, and
>> > + * thus it uses :W.
>> > + */
>> > + if (inst->opcode == BRW_OPCODE_F32TO16)
>>
>> Don't you need to check whether the destination is packed here?
>
> Yes, we also need that, like in the code below.
>
Patch would be Reviewed-by: Francisco Jerez <currojerez at riseup.net> with
that change.
>> > + return true;
>> > +
>> > + if (inst->dst.type != BRW_REGISTER_TYPE_HF ||
>> > + inst->dst.stride != 1)
>> > + return false;
>> > +
>> > + for (int i = 0; i < inst->sources; i++) {
>> > + if (inst->src[i].type == BRW_REGISTER_TYPE_F)
>> > + return true;
>> > + }
>> > +
>> > + return false;
>> > +}
>> > +
>> > /**
>> > * Get the closest allowed SIMD width for instruction \p inst
>> > accounting for
>> > * some common regioning and execution control restrictions that
>> > apply to FPU
>> > @@ -5768,6 +5810,35 @@ get_fpu_lowered_simd_width(const struct
>> > gen_device_info *devinfo,
>> > max_width = MIN2(max_width, 4);
>> > }
>> >
>> > + /* From the SKL PRM, Special Restrictions for Handling Mixed
>> > Mode
>> > + * Float Operations:
>> > + *
>> > + * "No SIMD16 in mixed mode when destination is f32.
>> > Instruction
>> > + * execution size must be no more than 8."
>> > + *
>> > + * FIXME: the simulator doesn't seem to complain if we don't do
>> > this and
>> > + * empirical testing with existing CTS tests show that they
>> > pass just fine
>> > + * without implementing this, however, since our interpretation
>> > of the PRM
>> > + * is that conversion MOVs between HF and F are still mixed-
>> > float
>> > + * instructions (and therefore subject to this restriction) we
>> > decided to
>> > + * split them to be safe. Might be useful to do additional
>> > investigation to
>> > + * lift the restriction if we can ensure that it is safe
>> > though, since these
>> > + * conversions are common when half-float types are involved
>> > since many
>> > + * instructions do not support HF types and conversions from/to
>> > F are
>> > + * required.
>> > + */
>> > + if (is_mixed_float_with_fp32_dst(inst))
>> > + max_width = MIN2(max_width, 8);
>> > +
>> > + /* From the SKL PRM, Special Restrictions for Handling Mixed
>> > Mode
>> > + * Float Operations:
>> > + *
>> > + * "No SIMD16 in mixed mode when destination is packed f16
>> > for both
>> > + * Align1 and Align16."
>> > + */
>> > + if (is_mixed_float_with_packed_fp16_dst(inst))
>> > + max_width = MIN2(max_width, 8);
>> > +
>> > /* Only power-of-two execution sizes are representable in the
>> > instruction
>> > * control fields.
>> > */
>> > --
>> > 2.20.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190410/2953cd4a/attachment.sig>
More information about the mesa-dev
mailing list