<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jun 13, 2018 at 6:14 PM, Kenneth Graunke <span dir="ltr"><<a href="mailto:kenneth@whitecape.org" target="_blank">kenneth@whitecape.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The UBO push analysis pass incorrectly assumed that all values would fit<br> within a 32B chunk, and only recorded a bit for the 32B chunk containing<br> the starting offset.<br> <br> For example, if a UBO contained the following, tightly packed:<br> <br> vec4 a; // [0, 16)<br> float b; // [16, 20)<br> vec4 c; // [20, 36)<br> <br> then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,<br> which means that we ought to record two 32B chunks in the bitfield.<br> <br> Similarly, dvec4s would suffer from the same problem.<br> <br> v2: Rewrite the accounting, my calculations were wrong.<br> <br> Reviewed-by: Rafael Antognolli <<a href="mailto:rafael.antognolli@intel.com">rafael.antognolli@intel.com</a>> [v1]<br> ---<br> src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 9 ++++++++-<br> 1 file changed, 8 insertions(+), 1 deletion(-)<br> <br> diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c<br> index d58fe3dd2e3..31026ca65ba 100644<br> --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c<br> +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c<br> @@ -141,10 +141,17 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)<br> if (offset >= 64)<br> continue;<br></blockquote><div><br></div><div>This should take end into account, not just offset.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> + /* The value might span multiple 32-byte chunks. 
*/<br> + const int bytes = nir_intrinsic_dest_components(intrin) *<br> + (nir_dest_bit_size(intrin->dest) / 8);<br> + const int start = ROUND_DOWN_TO(offset_const->u32[0], 32);<br> + const int end = ALIGN(offset_const->u32[0] + bytes, 32);<br></blockquote><div><br></div><div>You could probably use / and DIV_ROUND_UP here too. I'm not sure which is better TBH.</div><div><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> + const int chunks = (end - start) / 32;<br> +<br> /* TODO: should we count uses in loops as higher benefit? */<br> <br> struct ubo_block_info *info = get_block_info(state, block);<br> - info->offsets |= 1ull << offset;<br> + info->offsets |= ((1ull << chunks) - 1) << offset;<br> info->uses[offset]++;<br> }<br> }<br> <span class="HOEnZb"><font color="#888888">-- <br> 2.17.0<br> </font></span></blockquote></div><br></div></div>
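For comparison, here is a rough standalone sketch of the two ways of computing the chunk mask discussed above: the ROUND_DOWN_TO/ALIGN form from the patch and the "/ and DIV_ROUND_UP" form. None of this is Mesa code. CHUNK_SIZE, low_bits(), chunk_mask_align() and chunk_mask_div() are hypothetical names made up for illustration, and the rounding is written out by hand rather than using Mesa's macros. The guards against shifting by 64 or more are also an assumption, not something the patch does.

```c
#include <stdint.h>

/* Hypothetical constant: size in bytes of one chunk tracked by the bitfield. */
#define CHUNK_SIZE 32u

/* Return a mask with the low n bits set, avoiding an undefined shift by 64. */
static uint64_t low_bits(uint32_t n)
{
   return n >= 64 ? ~0ull : (1ull << n) - 1;
}

/* Variant A: round the byte range out to chunk boundaries, then divide. */
static uint64_t chunk_mask_align(uint32_t byte_offset, uint32_t bytes)
{
   const uint32_t start = byte_offset / CHUNK_SIZE * CHUNK_SIZE;
   const uint32_t end =
      (byte_offset + bytes + CHUNK_SIZE - 1) / CHUNK_SIZE * CHUNK_SIZE;
   const uint32_t chunks = (end - start) / CHUNK_SIZE;
   const uint32_t first = start / CHUNK_SIZE;

   /* Mirrors the "offset >= 64" early-out: nothing representable to record. */
   if (first >= 64)
      return 0;
   return low_bits(chunks) << first;
}

/* Variant B: work in chunk units directly, rounding the end index up. */
static uint64_t chunk_mask_div(uint32_t byte_offset, uint32_t bytes)
{
   const uint32_t first = byte_offset / CHUNK_SIZE;
   const uint32_t last =
      (byte_offset + bytes + CHUNK_SIZE - 1) / CHUNK_SIZE; /* exclusive */

   if (first >= 64)
      return 0;
   return low_bits(last - first) << first;
}
```

With the layout from the commit message, vec4 c at [20, 36) sets bits 0 and 1 in both variants, while vec4 a at [0, 16) sets only bit 0. The two forms should agree for any input; the second just skips the round trip through byte units.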