[Mesa-dev] [PATCH v2 48/52] nir, intel/compiler: Use a fixed subgroup size
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Fri Oct 13 10:55:32 UTC 2017
Acked-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
On 13/10/17 06:48, Jason Ekstrand wrote:
> The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
> as a uniform. This means that it cannot change across an invocation
> such as a draw call or a compute dispatch. For compute shaders, we're
> OK because we only ever use one dispatch size. For fragment shaders,
> however, the hardware dynamically chooses between SIMD8 and SIMD16, which
> violates the spec. Instead, let's just pick a subgroup size based on the
> shader stage. The fixed size we choose for compute shaders is a bit higher
> than strictly needed, but there's no real harm in that. The advantage is
> that, if they do anything interesting with the value, NIR will see it as
> an immediate and can optimize better.
> ---
> src/compiler/nir/nir.h | 1 +
> src/compiler/nir/nir_lower_subgroups.c | 5 +++++
> src/intel/compiler/brw_fs_nir.cpp | 4 ----
> src/intel/compiler/brw_nir.c | 2 ++
> 4 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 47c3f21..1a87d66 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2465,6 +2465,7 @@ bool nir_lower_samplers_as_deref(nir_shader *shader,
> const struct gl_shader_program *shader_program);
>
> typedef struct nir_lower_subgroups_options {
> + uint8_t subgroup_size;
> uint8_t ballot_bit_size;
> bool lower_to_scalar:1;
> bool lower_vote_trivial:1;
> diff --git a/src/compiler/nir/nir_lower_subgroups.c b/src/compiler/nir/nir_lower_subgroups.c
> index 1969740..f9424c9 100644
> --- a/src/compiler/nir/nir_lower_subgroups.c
> +++ b/src/compiler/nir/nir_lower_subgroups.c
> @@ -109,6 +109,11 @@ lower_subgroups_intrin(nir_builder *b, nir_intrinsic_instr *intrin,
> return nir_imm_int(b, NIR_TRUE);
> break;
>
> + case nir_intrinsic_load_subgroup_size:
> + if (options->subgroup_size)
> + return nir_imm_int(b, options->subgroup_size);
> + break;
> +
> case nir_intrinsic_read_invocation:
> case nir_intrinsic_read_first_invocation:
> if (options->lower_to_scalar)
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index b0dacb1..58f2698 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4183,10 +4183,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
> break;
> }
>
> - case nir_intrinsic_load_subgroup_size:
> - bld.MOV(retype(dest, BRW_REGISTER_TYPE_D), brw_imm_d(dispatch_width));
> - break;
> -
> case nir_intrinsic_load_subgroup_invocation:
> bld.MOV(retype(dest, BRW_REGISTER_TYPE_D),
> nir_system_values[SYSTEM_VALUE_SUBGROUP_INVOCATION]);
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 57f8de7..560b2f2 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -637,6 +637,8 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
> OPT(nir_lower_system_values);
>
> const nir_lower_subgroups_options subgroups_options = {
> + .subgroup_size = nir->stage == MESA_SHADER_COMPUTE ? 32 :
> + nir->stage == MESA_SHADER_FRAGMENT ? 16 : 8,
> .ballot_bit_size = 32,
> .lower_to_scalar = true,
> .lower_subgroup_masks = true,