[Mesa-dev] [PATCH 09/12] i965/fs: make SIMD-splitting respect the original stride/offset

Pohjolainen, Topi topi.pohjolainen at intel.com
Mon Aug 17 00:09:21 PDT 2015


On Fri, Aug 14, 2015 at 03:30:18PM -0700, Connor Abbott wrote:
> In some cases, we need to emit ALU instructions with a certain stride
> due to a HW limitation. When splitting that instruction, we need to
> respect the original stride when creating the temporaries we load from
> and store into. Otherwise, we'll reintroduce the problem we were trying
> to work around.
> 
> Signed-off-by: Connor Abbott <connor.w.abbott at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 812648f..386e9a2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2370,12 +2370,14 @@ fs_visitor::opt_register_renaming()
>  
>        if (depth == 0 &&
>            inst->dst.file == GRF &&
> -          alloc.sizes[inst->dst.reg] == inst->exec_size / 8 &&
> +          alloc.sizes[inst->dst.reg] ==
> +            inst->dst.component_size(inst->exec_size) &&
>            !inst->is_partial_write()) {
>           if (remap[dst] == -1) {
>              remap[dst] = dst;
>           } else {
> -            remap[dst] = alloc.allocate(inst->exec_size / 8);
> +            remap[dst] =
> +               alloc.allocate(inst->dst.component_size(inst->exec_size));
>              inst->dst.reg = remap[dst];
>              progress = true;
>           }
> @@ -4334,6 +4336,8 @@ fs_visitor::lower_simd_width()
>                     * temporary passed as source to the lowered instruction.
>                     */
>                    split_inst.src[j] = lbld.vgrf(inst->src[j].type, src_size);
> +                  split_inst.src[j].subreg_offset = inst->src[j].subreg_offset;

The fixes for the wider component size (64-bits) and the stride are clearly
needed for doubles. I'm wondering though why the sub-register offset hasn't
caused us any problems before. That change is not needed just for doubles,
is it?

> +                  split_inst.src[j].stride = inst->src[j].stride;
>                    emit_transpose(lbld.group(copy_width, 0),
>                                   split_inst.src[j], &src, 1, src_size, n);
>                 }
> @@ -4343,8 +4347,10 @@ fs_visitor::lower_simd_width()
>                 /* Allocate enough space to hold the result of the lowered
>                  * instruction and fix up the number of registers written.
>                  */
> -               split_inst.dst = dsts[i] =
> -                  lbld.vgrf(inst->dst.type, dst_size);
> +               fs_reg dst = lbld.vgrf(inst->dst.type, dst_size);
> +               dst.stride = inst->dst.stride;
> +               dst.subreg_offset = inst->dst.subreg_offset;
> +               split_inst.dst = dsts[i] = dst;
>                 split_inst.regs_written =
>                    DIV_ROUND_UP(inst->regs_written * lower_width,
>                                 inst->exec_size);
> -- 
> 2.4.3
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list