[Mesa-dev] [PATCH v2 03/20] i965/fs: double regioning parameters and execsize for DF in IVB/BYT

Francisco Jerez currojerez at riseup.net
Tue Jan 17 22:04:04 UTC 2017


Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:

> From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
>
> In IVB and BYT, both regioning parameters and execution sizes are measured as
> floats.
>
> So when we have something like:
>
> mov(8) g2<1>DF g3<4,4,1>DF
>
> We are not actually moving 8 doubles (our intention), but 4 doubles.
>
> We need to double the parameters to cope with this issue.
>
> v2:
> - Use devinfo directly (Matt).
> - Use Baytrail instead of Valleview (Matt).
> - Use IvyBridge instead of Ivy (Matt)
> - Double the exec_size in code emission (Curro)
>
> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 52 ++++++++++++++++++++++----
>  1 file changed, 45 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 0710be932a5..45881e3ec95 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -54,13 +54,14 @@ brw_file_from_reg(fs_reg *reg)
>  }
>  
>  static struct brw_reg
> -brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
> +brw_reg_from_fs_reg(const struct gen_device_info *devinfo, fs_inst *inst,
> +                    fs_reg *reg, bool compressed)
>  {
>     struct brw_reg brw_reg;
>  
>     switch (reg->file) {
>     case MRF:
> -      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(gen));
> +      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(devinfo->gen));
>        /* Fallthrough */
>     case VGRF:
>        if (reg->stride == 0) {
> @@ -93,6 +94,36 @@ brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
>           const unsigned width = MIN2(reg_width, phys_width);
>           brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr, 0);
>           brw_reg = stride(brw_reg, width * reg->stride, width, reg->stride);
> +         /* From the IvyBridge PRM (EU Changes by Processor Generation, page 13):
> +          *  "Each DF (Double Float) operand uses an element size of 4 rather
> +          *   than 8 and all regioning parameters are twice what the values
> +          *   would be based on the true element size: ExecSize, Width,
> +          *   HorzStride, and VertStride. Each DF operand uses a pair of
> +          *   channels and all masking and swizzing should be adjusted
> +          *   appropriately."
> +          *
> +          * From the IvyBridge PRM (Special Requirements for Handling Double
> +          * Precision Data Types, page 71):
> +          *  "In Align1 mode, all regioning parameters like stride, execution
> +          *   size, and width must use the syntax of a pair of packed
> +          *   floats. The offsets for these data types must be 64-bit
> +          *   aligned. The execution size and regioning parameters are in terms
> +          *   of floats."
> +          *
> +          * Summarized: when handling DF-typed arguments, ExecSize,
> +          * VertStride, and Width must be doubled, and HorzStride must be
> +          * doubled when the region is not scalar.
> +          *
> +          * It applies to BayTrail too.
> +          */
> +         if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +             type_sz(reg->type) == 8) {
> +            brw_reg.width++;
> +            if (brw_reg.vstride > 0)
> +               brw_reg.vstride++;
> +            if (brw_reg.hstride > 1)
> +               brw_reg.hstride++;

AFAIUI, horizontal strides don't behave as they're supposed to on IVB
for DF regions, they will cause each 32-bit half of DF sources to be
strided individually, and doubling the value won't fix anything.  This
should probably be 'assert(brw_reg.hstride == BRW_HORIZONTAL_STRIDE_1)'.
(Also please fix the comment above accordingly)

> +         }
>        }
>  
>        brw_reg = retype(brw_reg, reg->type);
> @@ -1586,9 +1617,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>        brw_set_default_group(p, inst->group);
>  
>        for (unsigned int i = 0; i < inst->sources; i++) {
> -         src[i] = brw_reg_from_fs_reg(inst, &inst->src[i], devinfo->gen,
> -                                      compressed);
> -
> +         src[i] = brw_reg_from_fs_reg(compiler->devinfo, inst,

s/compiler->devinfo/devinfo/

> +                                      &inst->src[i], compressed);
>  	 /* The accumulator result appears to get used for the
>  	  * conditional modifier generation.  When negating a UD
>  	  * value, there is a 33rd bit generated for the sign in the
> @@ -1599,7 +1629,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>  		inst->src[i].type != BRW_REGISTER_TYPE_UD ||
>  		!inst->src[i].negate);
>        }
> -      dst = brw_reg_from_fs_reg(inst, &inst->dst, devinfo->gen, compressed);
> +      dst = brw_reg_from_fs_reg(compiler->devinfo, inst,
> +                                &inst->dst, compressed);
>  
>        brw_set_default_access_mode(p, BRW_ALIGN_1);
>        brw_set_default_predicate_control(p, inst->predicate);
> @@ -1608,7 +1639,14 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>        brw_set_default_saturate(p, inst->saturate);
>        brw_set_default_mask_control(p, inst->force_writemask_all);
>        brw_set_default_acc_write_control(p, inst->writes_accumulator);
> -      brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
> +
> +      unsigned exec_size = inst->exec_size;
> +      if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +          (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) == 8)) {
> +         exec_size *= 2;
> +      }
> +
> +      brw_set_default_exec_size(p, cvt(exec_size) - 1);
>  
>        assert(inst->force_writemask_all || inst->exec_size >= 4);
>        assert(inst->force_writemask_all || inst->group % inst->exec_size == 0);
> -- 
> 2.11.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170117/597dc5c3/attachment.sig>


More information about the mesa-dev mailing list