Mesa (master): intel/compiler: Properly consider UBO loads that cross 32B boundaries.
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Thu Jun 14 21:59:53 UTC 2018
Module: Mesa
Branch: master
Commit: f6898f2b554e88255909bdd6bd0f7a91d99c446b
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=f6898f2b554e88255909bdd6bd0f7a91d99c446b
Author: Kenneth Graunke <kenneth at whitecape.org>
Date: Fri Jun 8 14:24:16 2018 -0700
intel/compiler: Properly consider UBO loads that cross 32B boundaries.
The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.
For example, if a UBO contained the following, tightly packed:
vec4 a; // [0, 16)
float b; // [16, 20)
vec4 c; // [20, 36)
then c would start in 32B chunk 20 / 32 = 0 and end in chunk 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.
Similarly, dvec4s would suffer from the same problem.
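
The chunk arithmetic in the example above can be checked with a small
standalone sketch. This is only an illustration, not code from the patch:
the helper names first_chunk and chunks_spanned and the CHUNK_SIZE macro are
hypothetical, and the only assumption carried over from the message is the
32-byte chunk size.

```c
#include <assert.h>

#define CHUNK_SIZE 32u /* 32B granularity described in the commit message */

/* Hypothetical helper: index of the 32B chunk containing the first byte. */
static unsigned first_chunk(unsigned offset)
{
   return offset / CHUNK_SIZE;
}

/* Hypothetical helper: number of 32B chunks touched by the byte range
 * [offset, offset + bytes).
 */
static unsigned chunks_spanned(unsigned offset, unsigned bytes)
{
   assert(bytes > 0);
   /* Chunk index of the last byte actually read. */
   const unsigned last = (offset + bytes - 1) / CHUNK_SIZE;
   return last - first_chunk(offset) + 1;
}
```

With these helpers, chunks_spanned(20, 16) is 2 for "vec4 c", while "vec4 a"
and "float b" each give 1. A tightly packed 32-byte dvec4 at a hypothetical
offset of 8 would also give 2.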
v2: Rewrite the accounting; my calculations were wrong.
v3: Write a comment about partial values (requested by Jason).
Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> [v3]
---
src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
index d58fe3dd2e..cd5137da06 100644
--- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
+++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
@@ -137,14 +137,26 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)
const int block = block_const->u32[0];
const int offset = offset_const->u32[0] / 32;
- /* Won't fit in our bitfield */
+ /* Avoid shifting by larger than the width of our bitfield, as this
+ * is undefined in C. Even if we require multiple bits to represent
+ * the entire value, it's OK to record a partial value - the backend
+ * is capable of falling back to pull loads for later components of
+ * vectors, as it has to shrink ranges for other reasons anyway.
+ */
if (offset >= 64)
continue;
+ /* The value might span multiple 32-byte chunks. */
+ const int bytes = nir_intrinsic_dest_components(intrin) *
+ (nir_dest_bit_size(intrin->dest) / 8);
+ const int start = ROUND_DOWN_TO(offset_const->u32[0], 32);
+ const int end = ALIGN(offset_const->u32[0] + bytes, 32);
+ const int chunks = (end - start) / 32;
+
/* TODO: should we count uses in loops as higher benefit? */
struct ubo_block_info *info = get_block_info(state, block);
- info->offsets |= 1ull << offset;
+ info->offsets |= ((1ull << chunks) - 1) << offset;
info->uses[offset]++;
}
}
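
For readers who want to experiment with the mask computation outside of NIR,
the following is a minimal standalone sketch that mirrors the arithmetic in
the diff above. The function name chunk_mask is hypothetical, and the
rounding macros are replaced with plain bit operations; it assumes loads
small enough that the chunk count stays well below 64, which holds for the
vector sizes discussed in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone version of the bitfield update: returns the bits
 * that would be OR'd into a 64-bit "offsets" field for a load of `bytes`
 * bytes starting at `byte_offset`.
 */
static uint64_t chunk_mask(unsigned byte_offset, unsigned bytes)
{
   assert(bytes > 0);

   /* Index of the 32B chunk containing the starting offset. */
   const unsigned first = byte_offset / 32;

   /* Shifting a 64-bit value by 64 or more is undefined in C, so skip
    * loads that start beyond the bitfield.
    */
   if (first >= 64)
      return 0;

   const unsigned start = byte_offset & ~31u;               /* round down */
   const unsigned end = (byte_offset + bytes + 31u) & ~31u; /* round up */
   const unsigned chunks = (end - start) / 32;
   assert(chunks < 64);

   /* Bits shifted past bit 63 are dropped, which records a partial value
    * for loads that straddle the end of the bitfield.
    */
   return ((UINT64_C(1) << chunks) - 1) << first;
}
```

For the "vec4 c" example, chunk_mask(20, 16) yields 0x3, whereas the old
code would have recorded only bit 0. A load starting in chunk 63 that spills
into chunk 64 yields only bit 63, matching the partial-value behaviour
described in the new comment.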