[Mesa-dev] [PATCH V2 5/7] i965: add support for image AoA
Francisco Jerez
currojerez at riseup.net
Mon Nov 2 02:43:04 PST 2015
Timothy Arceri <timothy.arceri at collabora.com> writes:
> From: Timothy Arceri <t_arceri at yahoo.com.au>
>
> V2: avoid useless zero-initialization and addition for the first AoA level,
> avoid redundant temporary, make use of type_size_scalar(), rename aoa_size
> to element_size, assign the indirect indexing temporary directly to
> image.reladdr, and replace while loop with a for loop. All suggested
> by Francisco Jerez.
>
> Cc: Francisco Jerez <currojerez at riseup.net>
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 30 ++++++++++++++------------
> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 2 ++
> 2 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 24ff5af..5254c2d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1061,18 +1061,17 @@ fs_visitor::get_nir_image_deref(const nir_deref_var *deref)
> fs_reg image(UNIFORM, deref->var->data.driver_location,
> BRW_REGISTER_TYPE_UD);
>
> - if (deref->deref.child) {
> - const nir_deref_array *deref_array =
> - nir_deref_as_array(deref->deref.child);
> - assert(deref->deref.child->deref_type == nir_deref_type_array &&
> - deref_array->deref.child == NULL);
> - const unsigned size = glsl_get_length(deref->var->type);
> + for (const nir_deref *tail = deref->deref.child; tail;
> + tail = tail->child) {
> + const nir_deref_array *deref_array = nir_deref_as_array(tail);
> + assert(tail->deref_type == nir_deref_type_array);
> + const unsigned size = glsl_get_length(tail->type);
IIUC tail->type is going to be the type of the *inner*, contained array,
which isn't the one you need to clamp against. You probably want to
iterate over the containing array derefs instead, so that the size
bound is available, like:
| for (const nir_deref *tail = &deref->deref; tail->child;
| tail = tail->child) {
| const nir_deref_array *deref_array = nir_deref_as_array(tail->child);
| assert(tail->child->deref_type == nir_deref_type_array);
(I suggested otherwise in my reply to your v1 because you were only
using tail->child from within the loop, but that was only due to this
mistake.)
> + const unsigned element_size = type_size_scalar(tail->type);
Then change this to use "deref_array->deref.type".
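To illustrate the point outside of the driver, here is a minimal standalone
sketch of the constant-offset part of the calculation. It does not use the
real NIR or i965 types: ArrayLevel, constant_offset() and kParamSize are
hypothetical names made up for this example, and the value of kParamSize is
an arbitrary stand-in for BRW_IMAGE_PARAM_SIZE. Each level carries the length
of the *containing* array (the clamp bound) and the size of one element of
that array (the stride).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for BRW_IMAGE_PARAM_SIZE; the value is illustrative.
static const unsigned kParamSize = 24;

// Hypothetical model of one array level in a deref chain.
struct ArrayLevel {
    unsigned length;        // length of the containing array (clamp bound)
    unsigned element_size;  // size of one element of that array (stride)
    unsigned base_offset;   // constant index requested at this level
};

// Accumulate the clamped constant offset, outermost level first.
unsigned constant_offset(const std::vector<ArrayLevel> &levels)
{
    unsigned offset = 0;
    for (const ArrayLevel &level : levels) {
        assert(level.length > 0);
        // Clamp against the containing array, not the contained element.
        const unsigned base = std::min(level.base_offset, level.length - 1);
        offset += base * level.element_size;
    }
    return offset;
}

// Build the two levels for a declaration shaped like "image2D img[3][4]"
// indexed as img[i][j].
std::vector<ArrayLevel> levels_3x4(unsigned i, unsigned j)
{
    return {
        {3, 4 * kParamSize, i},  // outer array: 3 elements, each an image[4]
        {4, kParamSize, j},      // inner array: 4 elements, each one image
    };
}
```

With this model img[2][1] lands on element 2 * 4 + 1 = 9, and an
out-of-bounds img[5][7] is clamped to img[2][3], the last element. If the
outer level were clamped against the inner length (4) instead of its own
length (3), an index of 3 would pass the clamp and point past the end of the
array.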
> const unsigned base = MIN2(deref_array->base_offset, size - 1);
> -
> - image = offset(image, bld, base * BRW_IMAGE_PARAM_SIZE);
> + image = offset(image, bld, base * element_size);
>
> if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
> - fs_reg *tmp = new(mem_ctx) fs_reg(vgrf(glsl_type::int_type));
> + fs_reg tmp = vgrf(glsl_type::int_type);
>
> if (devinfo->gen == 7 && !devinfo->is_haswell) {
> /* IVB hangs when trying to access an invalid surface index with
> @@ -1083,15 +1082,18 @@ fs_visitor::get_nir_image_deref(const nir_deref_var *deref)
> * of the possible outcomes of the hang. Clamp the index to
> * prevent access outside of the array bounds.
> */
> - bld.emit_minmax(*tmp, retype(get_nir_src(deref_array->indirect),
> - BRW_REGISTER_TYPE_UD),
> + bld.emit_minmax(tmp, retype(get_nir_src(deref_array->indirect),
> + BRW_REGISTER_TYPE_UD),
> fs_reg(size - base - 1), BRW_CONDITIONAL_L);
> } else {
> - bld.MOV(*tmp, get_nir_src(deref_array->indirect));
> + bld.MOV(tmp, get_nir_src(deref_array->indirect));
> }
>
> - bld.MUL(*tmp, *tmp, fs_reg(BRW_IMAGE_PARAM_SIZE));
> - image.reladdr = tmp;
> + bld.MUL(tmp, tmp, fs_reg(element_size));
> + if (image.reladdr)
> + bld.ADD(*image.reladdr, *image.reladdr, tmp);
> + else
> + image.reladdr = new(mem_ctx) fs_reg(tmp);
> }
> }
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> index d3326e9..87b3839 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> @@ -98,6 +98,8 @@ brw_nir_setup_glsl_uniform(gl_shader_stage stage, nir_variable *var,
> if (storage->type->is_image()) {
> brw_setup_image_uniform_values(stage, stage_prog_data,
> uniform_index, storage);
> + uniform_index +=
> + BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
> } else {
> gl_constant_value *components = storage->storage;
> unsigned vector_count = (MAX2(storage->array_elements, 1) *
> --
> 2.4.3