[Mesa-dev] [PATCH 18/22] i965/fs: Add infrastructure for generating CSEL instructions.

Matt Turner mattst88 at gmail.com
Tue Mar 6 00:10:24 UTC 2018


On Fri, Feb 23, 2018 at 3:56 PM, Ian Romanick <idr at freedesktop.org> wrote:
> From: Kenneth Graunke <kenneth at whitecape.org>
>
> v2 (idr): Don't allow CSEL with a non-float src2.
>
> v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> ---
>  src/intel/compiler/brw_eu.h             |  1 +
>  src/intel/compiler/brw_eu_emit.c        |  1 +
>  src/intel/compiler/brw_fs.cpp           |  1 +
>  src/intel/compiler/brw_fs_builder.h     | 22 +++++++++++++++++++++-
>  src/intel/compiler/brw_fs_generator.cpp |  5 +++++
>  5 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
> index 2d0f56f..3201624 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -171,6 +171,7 @@ ALU2(SHR)
>  ALU2(SHL)
>  ALU1(DIM)
>  ALU2(ASR)
> +ALU3(CSEL)
>  ALU1(F32TO16)
>  ALU1(F16TO32)
>  ALU2(ADD)
> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
> index c25d8d6..305a759 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -953,6 +953,7 @@ ALU2(SHR)
>  ALU2(SHL)
>  ALU1(DIM)
>  ALU2(ASR)
> +ALU3(CSEL)
>  ALU1(FRC)
>  ALU1(RNDD)
>  ALU2(MAC)
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index bed632d..accae1b 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -929,6 +929,7 @@ unsigned
>  fs_inst::flags_written() const
>  {
>     if ((conditional_mod && (opcode != BRW_OPCODE_SEL &&
> +                            opcode != BRW_OPCODE_CSEL &&
>                              opcode != BRW_OPCODE_IF &&
>                              opcode != BRW_OPCODE_WHILE)) ||
>         opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
> diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h
> index 87394bc..47eac42 100644
> --- a/src/intel/compiler/brw_fs_builder.h
> +++ b/src/intel/compiler/brw_fs_builder.h
> @@ -458,7 +458,6 @@ namespace brw {
>        ALU1(BFREV)
>        ALU1(CBIT)
>        ALU2(CMPN)
> -      ALU3(CSEL)
>        ALU1(DIM)
>        ALU2(DP2)
>        ALU2(DP3)
> @@ -534,6 +533,27 @@ namespace brw {
>        }
>
>        /**
> +       * CSEL: dst = src2 <op> 0.0f ? src0 : src1
> +       */
> +      instruction *
> +      CSEL(const dst_reg &dst, const src_reg &src0, const src_reg &src1,
> +           const src_reg &src2, brw_conditional_mod condition) const
> +      {
> +         /* CSEL only operates on floats, so we can't do integer </<=/>=/>
> +          * comparisons.  Zero/non-zero (== and !=) comparisons almost work.
> +          * 0x80000000 fails because it is -0.0, and -0.0 == 0.0.
> +          */
> +         assert(src2.type == BRW_REGISTER_TYPE_F);
> +
> +         return set_condmod(condition,
> +                            emit(BRW_OPCODE_CSEL,
> +                                 retype(dst, BRW_REGISTER_TYPE_F),
> +                                 retype(src0, BRW_REGISTER_TYPE_F),
> +                                 retype(src1, BRW_REGISTER_TYPE_F),
> +                                 src2));
> +      }
> +
> +      /**
>         * Emit a linear interpolation instruction.
>         */
>        instruction *
> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> index cd5be05..9157367 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1826,6 +1826,11 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>        case BRW_OPCODE_SEL:
>          brw_SEL(p, dst, src[0], src[1]);
>          break;
> +      case BRW_OPCODE_CSEL:
> +         brw_set_default_access_mode(p, BRW_ALIGN_16);
> +         brw_CSEL(p, dst, src[0], src[1], src[2]);
> +         brw_set_default_access_mode(p, BRW_ALIGN_1);
> +         break;

This is not okay, actually. Gen11 gets rid of Align16 mode, and even
on Gen10 I've switched all 3-src instructions to use align1 mode. Just
make it work like MAD/LRP/etc and it'll be fine.


More information about the mesa-dev mailing list