[Mesa-dev] [PATCH 6/9] i965/fs: add stride restrictions for copy propagation

Matt Turner mattst88 at gmail.com
Thu Nov 19 03:54:01 PST 2015


On Thu, Nov 19, 2015 at 2:05 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> From: Connor Abbott <connor.w.abbott at intel.com>
>
> There are various restrictions on what the hstride can be that depend on
> the Gen, and now that we're using hstride == 2 for packing/unpacking
> doubles, we're going to run into these restrictions a lot more often.
> Pull them out into a separate function, and move the one restriction we
> checked previously into it.
> ---
>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 58 +++++++++++++++++++++-
>  1 file changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> index 426ea57..3ae5ead 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> @@ -275,6 +275,61 @@ is_logic_op(enum opcode opcode)
>             opcode == BRW_OPCODE_NOT);
>  }
>
> +static bool
> +can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,
> +                const brw_device_info *devinfo)
> +{
> +   if (stride > 4)
> +      return false;
> +
> +   /* 3-source instructions can only be Align16, which restricts what strides
> +    * they can take. They can only take a stride of 1 (the usual case), or 0
> +    * with a special "repctrl" bit. But the repctrl bit doesn't work for
> +    * 64-bit datatypes, so if the source type is 64-bit then only a stride of
> +    * 1 is allowed. From the Broadwell PRM, Volume 7 "3D Media GPGPU", page
> +    * 944:
> +    *
> +    *    This is applicable to 32b datatypes and 16b datatype. 64b datatypes
> +    *    cannot use the replicate control.
> +    */
> +

Remove this blank line.

> +   if (inst->is_3src()) {
> +      if (type_sz(inst->src[arg].type) > 4)
> +         return stride == 1;
> +      else
> +         return stride == 1 || stride == 0;
> +   }
> +
> +   /* From the Broadwell PRM, Volume 2a "Command Reference - Instructions",
> +    * page 391 ("Extended Math Function"):
> +    *
> +    *     The following restrictions apply for align1 mode: Scalar source is
> +    *     supported. Source and destination horizontal stride must be the
> +    *     same.
> +    *
> +    * From the Haswell PRM Volume 2b "Command Reference - Instructions", page
> +    * 134 ("Extended Math Function"):
> +    *
> +    *    Scalar source is supported. Source and destination horizontal stride
> +    *    must be 1.
> +    *
> +    * and similar language exists for IVB and SNB. Pre-SNB, math instructions
> +    * are sends, so the sources are moved to MRF's and there are no
> +    * restrictions.
> +    */
> +

Remove this blank line.

> +   if (inst->is_math()) {
> +      if (devinfo->gen == 6 || devinfo->gen == 7) {
> +         assert(inst->dst.stride == 1);
> +         return stride == 1 || stride == 0;
> +      } else if (devinfo->gen >= 8) {
> +         return stride == inst->dst.stride || stride == 0;
> +      }
> +   }
> +
> +   return true;
> +}
> +
>  bool
>  fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>  {
> @@ -326,7 +381,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>     /* Bail if the result of composing both strides would exceed the
>      * hardware limit.
>      */
> -   if (entry->src.stride * inst->src[arg].stride > 4)
> +   if (!can_take_stride(inst, arg, entry->src.stride * inst->src[arg].stride,
> +                        devinfo))
>        return false;
>
>     /* Bail if the instruction type is larger than the execution type of the
> --

Reviewed-by: Matt Turner <mattst88 at gmail.com>

That all looks right to me and splitting it out into a separate
function is a good idea.


More information about the mesa-dev mailing list