[Mesa-dev] [PATCH 2/7] radeonsi: compute shaders w/out LDS/barriers don't have lower register limit
Samuel Pitoiset
samuel.pitoiset at gmail.com
Thu May 25 21:11:10 UTC 2017
On 05/25/2017 07:04 PM, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> Or do they? This doesn't hang, so it seems right, but I'm not 100% sure.
> Setting VGPRS=256 (i.e. above the limit) with big threadgroups works fine.
>
> shader-db: Spilled VGPRs: 107 -> 50 (-53.27 %)
>
> DiRT Showdown and GRID Autosport have 100% reduction in VGPR spilling.
> There are no other changes for shader-db.
That would be awesome, maybe this will fix the performance issue with
advanced lighting and DiRT Showdown? Did you check?
Either way, patches 1, 3-5 are:
Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> ---
> src/gallium/drivers/radeonsi/si_shader.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index 61f1384..0ffe402 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4040,20 +4040,29 @@ static unsigned si_get_max_workgroup_size(const struct si_shader *shader)
> properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH] *
> properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT] *
> properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH];
>
> if (!max_work_group_size) {
> /* This is a variable group size compute shader,
> * compile it for the maximum possible group size.
> */
> max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
> }
> +
> + /* Compute shader threadgroups without LDS usage and barriers don't
> + * have to be stuck on the same compute unit, and so register usage
> + * doesn't have to be limited.
> + */
> + if (!shader->selector->local_size &&
> + !shader->selector->info.uses_barrier)
> + return MIN2(64, max_work_group_size);
> +
> return max_work_group_size;
> }
>
> static void declare_per_stage_desc_pointers(struct si_shader_context *ctx,
> LLVMTypeRef *params,
> unsigned *num_params,
> bool assign_params)
> {
> params[(*num_params)++] = si_const_array(ctx->v4i32,
> SI_NUM_SHADER_BUFFERS + SI_NUM_CONST_BUFFERS);
>
More information about the mesa-dev
mailing list