[Freedreno] [PATCH] freedreno/ir3: Make imageStore use num components from image format
Rob Clark
robdclark at gmail.com
Tue Dec 18 13:31:09 UTC 2018
On Mon, Dec 17, 2018 at 3:41 PM Eduardo Lima Mitev <elima at igalia.com> wrote:
>
> emit_intrinsic_store_image() is always using 4 components when
> collecting registers for the value. When image has less than
> 4 components (e.g, r32f, r32i, r32ui) this results in extra mov
> instructions.
>
> This patch uses the actual number of components from the image format.
>
> For example, in a shader like:
>
> layout (r32f, binding=0) writeonly uniform imageBuffer u_image;
> ...
> void main(void) {
> ...
> imageStore (u_image, some_offset, vec4(1.0));
> ...
> }
>
> instruction count is reduced in at least 3 instructions (note image
> format is r32f, 1 component only).
>
> This obviously reduces register pressure as well.
> ---
> src/freedreno/ir3/ir3_compiler_nir.c | 34 ++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/src/freedreno/ir3/ir3_compiler_nir.c b/src/freedreno/ir3/ir3_compiler_nir.c
> index 85f14f354d2..cc00602c249 100644
> --- a/src/freedreno/ir3/ir3_compiler_nir.c
> +++ b/src/freedreno/ir3/ir3_compiler_nir.c
> @@ -1251,6 +1251,35 @@ emit_intrinsic_load_image(struct ir3_context *ctx, nir_intrinsic_instr *intr,
> ir3_split_dest(b, dst, sam, 0, 4);
> }
>
> +/* Get the number of components of the different image formats supported
> + * by the GLES 3.1 spec.
> + */
> +static unsigned
> +get_num_components_for_glformat(GLuint format)
> +{
> + switch (format) {
> + case GL_R32F:
> + case GL_R32I:
> + case GL_R32UI:
> + return 1;
> +
> + case GL_RGBA32F:
> + case GL_RGBA16F:
> + case GL_RGBA8:
> + case GL_RGBA8_SNORM:
> + case GL_RGBA32I:
> + case GL_RGBA16I:
> + case GL_RGBA8I:
> + case GL_RGBA32UI:
> + case GL_RGBA16UI:
> + case GL_RGBA8UI:
> + return 4;
> +
> + default:
> + assert(!"Unsupported GL format for image");
One thought, default could still return 4, at least for release
builds.. that should always be a safe fallback. But I suppose
addition of new formats maybe doesn't happen so much. (There is no
glsl_types helper for this?)
If we do have an assert, compile_assert() or ir3_context_error() would
be nice, since (at least for debug builds) should print out the nir
annotated w/ location of error, which I find convenient for debugging.
BR,
-R
> + }
> +}
> +
> /* src[] = { deref, coord, sample_index, value }. const_index[] = {} */
> static void
> emit_intrinsic_store_image(struct ir3_context *ctx, nir_intrinsic_instr *intr)
> @@ -1262,6 +1291,7 @@ emit_intrinsic_store_image(struct ir3_context *ctx, nir_intrinsic_instr *intr)
> struct ir3_instruction * const *coords = ir3_get_src(ctx, &intr->src[1]);
> unsigned ncoords = get_image_coords(var, NULL);
> unsigned tex_idx = get_image_slot(ctx, nir_src_as_deref(intr->src[0]));
> + unsigned ncomp = get_num_components_for_glformat(var->data.image.format);
>
> /* src0 is value
> * src1 is coords
> @@ -1276,10 +1306,10 @@ emit_intrinsic_store_image(struct ir3_context *ctx, nir_intrinsic_instr *intr)
> */
>
> stib = ir3_STIB(b, create_immed(b, tex_idx), 0,
> - ir3_create_collect(ctx, value, 4), 0,
> + ir3_create_collect(ctx, value, ncomp), 0,
> ir3_create_collect(ctx, coords, ncoords), 0,
> offset, 0);
> - stib->cat6.iim_val = 4;
> + stib->cat6.iim_val = ncomp;
> stib->cat6.d = ncoords;
> stib->cat6.type = get_image_type(var);
> stib->cat6.typed = true;
> --
> 2.19.2
>
> _______________________________________________
> Freedreno mailing list
> Freedreno at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno
More information about the Freedreno
mailing list