[Mesa-dev] [PATCH] i965: Fix output register sizes when variable ranges are interleaved
Caio Marcelo de Oliveira Filho
caio.oliveira at intel.com
Tue Jul 3 21:37:45 UTC 2018
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Also consider including a fixes tag.
Fixes: 6f5abf31466 "i965: Fix output register sizes when multiple variables share a slot."
On Fri, May 18, 2018 at 03:31:00PM +0200, Neil Roberts wrote:
> In 6f5abf31466aed this code was fixed to calculate the maximum size of
> an attribute in a seperate pass and then allocate the registers to
> that size. However this wasn’t taking into account ranges that overlap
> but don’t have the same starting location. For example:
>
> layout(location = 0, component = 0) out float a[4];
> layout(location = 2, component = 1) out float b[4];
>
> Previously, if ‘a’ was processed first then it would allocate a
> register of size 4 for location 0 and it wouldn’t allocate another
> register for location 2 because it would already be covered by the
> range of 0. Then if something tries to write to b[2] it would try to
> write past the end of the register allocated for ‘a’ and it would hit
> an assert.
>
> This patch changes it to scan for any overlapping ranges that start
> within each range to calculate the maximum extent and allocate that
> instead.
>
> Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
> vs-fs-array-interleave-range.shader_test
Verified this particular test is failing in master and the patch makes
it pass.
> ---
>
> For what it’s worth I also made a simplified test for this for
> VkRunner here:
>
> https://github.com/Igalia/vkrunner/blob/tests/examples/overlap-outputs.shader_test
>
> src/intel/compiler/brw_fs_nir.cpp | 25 ++++++++++++++++++-------
> 1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index 1ce89520bf1..b179018e5e8 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -67,14 +67,25 @@ fs_visitor::nir_setup_outputs()
> vec4s[loc] = MAX2(vec4s[loc], var_vec4s);
> }
>
> - nir_foreach_variable(var, &nir->outputs) {
> - const int loc = var->data.driver_location;
> - if (outputs[loc].file == BAD_FILE) {
> - fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * vec4s[loc]);
> - for (unsigned i = 0; i < vec4s[loc]; i++) {
> - outputs[loc + i] = offset(reg, bld, 4 * i);
> - }
> + for (unsigned loc = 0; loc < ARRAY_SIZE(vec4s);) {
> + if (vec4s[loc] == 0) {
> + loc++;
> + continue;
> }
> +
> + unsigned reg_size = vec4s[loc];
> +
> + /* Check if there are any ranges that start within this range and extend
> + * past it. If so, include them in this allocation.
> + */
> + for (unsigned i = 1; i < reg_size; i++)
> + reg_size = MAX2(vec4s[i + loc] + i, reg_size);
> +
> + fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * reg_size);
> + for (unsigned i = 0; i < reg_size; i++)
> + outputs[loc + i] = offset(reg, bld, 4 * i);
> +
> + loc += reg_size;
> }
> }
>
> --
> 2.14.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list