<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 16, 2016 at 9:23 PM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Currently the spilling code attempts to guess the scratch message<br>
block size from the dispatch width of the shader, which is plain wrong<br>
for SIMD-lowered instructions (frequently but not exclusively<br>
encountered in SIMD32 shaders) or for instructions with register<br>
region data types of size other than 32 bit.<br>
<br>
Instead try to use the SIMD component size of the instruction, which in<br>
some cases will allow the dataport to apply the correct channel mask<br>
to the scratch data read or written. In the spill case the block size<br>
needs to be clamped to the number of MRF registers reserved for<br>
spilling. In the unspill case I didn't even bother, because we<br>
currently have no 100% accurate way to determine whether a source<br>
region is per-channel or whether it contains things like headers that<br>
don't respect channel boundaries. That's fine: because the unspill<br>
is marked force_writemask_all, we can just use the largest allowable<br>
scratch message size.<br>
---<br>
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 20 +++++++++++++++++---<br>
1 file changed, 17 insertions(+), 3 deletions(-)<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br>
index 5cb2013..ff40c42 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br>
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br>
@@ -942,8 +942,14 @@ fs_visitor::spill_reg(int spill_reg)<br>
inst->src[i].nr = unspill_dst.nr;<br>
inst->src[i].reg_offset = 0;<br>
<br>
+ /* We read the largest power-of-two divisor of the register count<br>
+ * (because only POT scratch read blocks are allowed by the<br>
+ * hardware) up to the maximum supported block size.<br>
+ * XXX - Bump the limit when the generator code is ready for<br>
+ * 32-wide spills.<br>
+ */<br>
const unsigned width =<br>
- dispatch_width == 16 && regs_read % 2 == 0 ? 16 : 8;<br>
+ MIN2(16, 1u << (ffs(MAX2(1, regs_read) * 8) - 1));<br></blockquote><div><br></div><div>Can regs_read ever be zero? Seems kind of odd.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
/* Set exec_all() on unspill messages under the (rather<br>
* pessimistic) assumption that there is no one-to-one<br>
@@ -974,8 +980,16 @@ fs_visitor::spill_reg(int spill_reg)<br>
inst->no_dd_clear = false;<br>
inst->no_dd_check = false;<br>
<br>
- const unsigned width =<br>
- dispatch_width == 16 && inst->regs_written % 2 == 0 ? 16 : 8;<br>
+ /* Calculate the execution width of the scratch messages (which work<br>
+ * in terms of 32 bit components so we have a fixed number of eight<br>
+ * channels per spilled register). We attempt to write one<br>
+ * exec_size-wide component of the variable at a time without<br>
+ * exceeding the maximum number of (fake) MRF registers reserved for<br>
+ * spills.<br>
+ */<br>
+ const unsigned width = 8 * MIN2(<br>
+ DIV_ROUND_UP(inst->dst.component_size(inst->exec_size), REG_SIZE),<br>
+ spill_max_size(this));<br>
<br>
/* Spills should only write data initialized by the instruction for<br>
* whichever channels are enabled in the execution mask. If that's<br>
<span class="HOEnZb"><font color="#888888">--<br>
2.7.3<br>
<br>
</font></span></blockquote></div><br></div></div>
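<div><br></div>
<div>For reference, the arithmetic behind the two width computations in the patch can be sketched as standalone C++. This is only an illustration under stated assumptions, not the driver code: the helper names (lowest_set_bit, unspill_width, spill_width) are hypothetical, lowest_set_bit stands in for ffs(), the register size is assumed to be 32 bytes, and the spill limit is passed in as a plain parameter instead of calling spill_max_size().</div>
<pre>
```cpp
// Hypothetical helpers; only the arithmetic mirrors the quoted patch.

// 1-based index of the least significant set bit, or 0 when no bit is
// set. Same contract as POSIX ffs().
unsigned lowest_set_bit(unsigned v)
{
   unsigned n = 0;
   while (v != 0) {
      ++n;
      if (v & 1u)
         return n;
      v >>= 1;
   }
   return 0;
}

// Unspill case: largest power-of-two divisor of the channel count
// (eight 32-bit channels per register), capped at 16 channels.
unsigned unspill_width(unsigned regs_read)
{
   // MAX2(1, regs_read): keeps the bit scan below from seeing zero.
   const unsigned regs = regs_read == 0 ? 1 : regs_read;
   const unsigned pot = 1u << (lowest_set_bit(regs * 8) - 1);
   return pot < 16 ? pot : 16;   // MIN2(16, pot)
}

// Spill case: one exec_size-wide component at a time, clamped to the
// number of registers reserved for spilling (passed in here as an
// assumption instead of being queried from the visitor).
unsigned spill_width(unsigned component_bytes, unsigned max_spill_regs)
{
   const unsigned reg_size = 32;   // assumed bytes per register
   // DIV_ROUND_UP(component_bytes, reg_size)
   const unsigned regs = (component_bytes + reg_size - 1) / reg_size;
   return 8 * (regs < max_spill_regs ? regs : max_spill_regs);
}
```
</pre>
<div>Under these assumptions a regs_read of zero takes the same path as a regs_read of one: the MAX2(1, ...) keeps ffs() from returning 0, which would otherwise turn the shift count into -1.</div>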