[Mesa-dev] [PATCH 2/7] radeonsi: compute shaders w/out LDS/barriers don't have lower register limit

Samuel Pitoiset samuel.pitoiset at gmail.com
Thu May 25 21:11:10 UTC 2017



On 05/25/2017 07:04 PM, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak at amd.com>
> 
> Or do they? This doesn't hang, so it seems right, but I'm not 100% sure.
> Setting VGPRS=256 (i.e. above the limit) with big threadgroups works fine.
> 
> shader-db: Spilled VGPRs: 107 -> 50 (-53.27 %)
> 
> DiRT Showdown and GRID Autosport have 100% reduction in VGPR spilling.
> There are no other changes for shader-db.

That would be awesome. Maybe this will fix the performance issue with
advanced lighting in DiRT Showdown? Did you check?

Either way, patches 1, 3-5 are:

Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>

> ---
>   src/gallium/drivers/radeonsi/si_shader.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index 61f1384..0ffe402 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4040,20 +4040,29 @@ static unsigned si_get_max_workgroup_size(const struct si_shader *shader)
>   	               properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH] *
>   	               properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT] *
>   	               properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH];
>   
>   	if (!max_work_group_size) {
>   		/* This is a variable group size compute shader,
>   		 * compile it for the maximum possible group size.
>   		 */
>   		max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
>   	}
> +
> +	/* Compute shader threadgroups without LDS usage and barriers don't
> +	 * have to be stuck on the same compute unit, and so register usage
> +	 * doesn't have to be limited.
> +	 */
> +	if (!shader->selector->local_size &&
> +	    !shader->selector->info.uses_barrier)
> +		return MIN2(64, max_work_group_size);
> +
>   	return max_work_group_size;
>   }
>   
>   static void declare_per_stage_desc_pointers(struct si_shader_context *ctx,
>   					    LLVMTypeRef *params,
>   					    unsigned *num_params,
>   					    bool assign_params)
>   {
>   	params[(*num_params)++] = si_const_array(ctx->v4i32,
>   						 SI_NUM_SHADER_BUFFERS + SI_NUM_CONST_BUFFERS);
> 
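
For anyone following the hunk above, the sizing logic can be sketched as a
standalone C function. This is only an illustration, not the driver code:
the struct, the function name and the SKETCH_* macros are hypothetical, the
value 1024 for the variable-group-size maximum is an assumption, and 64 is
taken from the patch (it matches the GCN wavefront size).

```c
#include <stdbool.h>

/* Assumed stand-in for SI_MAX_VARIABLE_THREADS_PER_BLOCK; the real value
 * is defined in the radeonsi headers. */
#define SKETCH_MAX_VARIABLE_THREADS_PER_BLOCK 1024
/* The cap used by the patch when no LDS and no barriers are involved. */
#define SKETCH_UNCONSTRAINED_GROUP_SIZE 64
#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical, flattened view of the fields the patch reads. */
struct sketch_cs_info {
	unsigned block_width;   /* 0 for a variable group size shader */
	unsigned block_height;
	unsigned block_depth;
	unsigned local_size;    /* bytes of LDS used, 0 if none */
	bool uses_barrier;
};

/* Returns the workgroup size the register limit should be derived from. */
static unsigned sketch_max_workgroup_size(const struct sketch_cs_info *cs)
{
	unsigned size = cs->block_width * cs->block_height * cs->block_depth;

	/* Variable group size: assume the largest possible group. */
	if (!size)
		size = SKETCH_MAX_VARIABLE_THREADS_PER_BLOCK;

	/* Without LDS and barriers, the group is treated as if it were no
	 * larger than 64 threads, so the register budget is not lowered. */
	if (!cs->local_size && !cs->uses_barrier)
		return SKETCH_MIN2(SKETCH_UNCONSTRAINED_GROUP_SIZE, size);

	return size;
}
```

With this sketch, a 16x16x1 shader with no LDS and no barrier reports 64
instead of 256, while the same shader with a barrier still reports 256.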

