<p dir="ltr">Another comment: Please fix Vulkan while you're at it. I don't want to have to debug this twice.</p>
<div class="gmail_quote">On Jun 9, 2016 1:10 AM, "Kenneth Graunke" <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Skylake changes the representation of shared local memory size:<br>
<br>
Size | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |<br>
-------------------------------------------------------------------<br>
Gen7-8 | 0 | none | none | 1 | 2 | 4 | 8 | 16 |<br>
-------------------------------------------------------------------<br>
Gen9+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |<br>
<br>
The old formula would substantially underallocate the amount of space.<br>
This fixes GPU hangs on Skylake when running with full thread counts.<br>
<br>
Cc: "12.0" <<a href="mailto:mesa-stable@lists.freedesktop.org">mesa-stable@lists.freedesktop.org</a>><br>
Signed-off-by: Kenneth Graunke <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>><br>
---<br>
src/mesa/drivers/dri/i965/gen7_cs_state.c | 15 ++++++++++-----<br>
1 file changed, 10 insertions(+), 5 deletions(-)<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c<br>
index 750aa2c..aff1f4e 100644<br>
--- a/src/mesa/drivers/dri/i965/gen7_cs_state.c<br>
+++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c<br>
@@ -150,11 +150,16 @@ brw_upload_cs_state(struct brw_context *brw)<br>
assert(prog_data->total_shared <= 64 * 1024);<br>
uint32_t slm_size = 0;<br>
if (prog_data->total_shared > 0) {<br>
- /* slm_size is in 4k increments, but must be a power of 2. */<br>
- slm_size = 4 * 1024;<br>
- while (slm_size < prog_data->total_shared)<br>
- slm_size <<= 1;<br>
- slm_size /= 4 * 1024;<br>
+ /* Shared Local Memory Size is specified as powers of two. */<br>
+ slm_size = util_next_power_of_two(prog_data->total_shared);<br>
+<br>
+ if (brw->gen >= 9) {<br>
+ /* Use a minimum of 1kB; turn an exponent of 10 (1024 bytes) into 1. */<br>
+ slm_size = ffs(MAX2(slm_size, 1024)) - 10;<br>
+ } else {<br>
+ /* Use a minimum of 4kB; convert to the pre-Gen9 representation. */<br>
+ slm_size = MAX2(slm_size, 4096) / 4096;<br>
+ }<br>
}<br>
<br>
desc[dw++] =<br>
--<br>
2.8.3<br>
</blockquote></div>
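<p dir="ltr">For reference while porting this to Vulkan, here is a minimal standalone sketch of the encoding the patch computes, following its two branches (4 kB increments before Gen9, a log2-style code with a 1 kB minimum on Gen9+). The names <code>next_pow2</code>, <code>lowest_set_bit</code> and <code>encode_slm_size</code> are hypothetical stand-ins for <code>util_next_power_of_two()</code>, <code>ffs()</code> and the inline logic in <code>brw_upload_cs_state()</code>; this is not the driver code itself.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v > 0).
 * Hypothetical stand-in for Mesa's util_next_power_of_two(). */
static uint32_t next_pow2(uint32_t v)
{
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* 1-based index of the lowest set bit, 0 if v == 0.
 * Hypothetical stand-in for ffs(). */
static int lowest_set_bit(uint32_t v)
{
   int i = 1;
   if (v == 0)
      return 0;
   while (!(v & 1)) {
      v >>= 1;
      i++;
   }
   return i;
}

/* Encode a shared local memory size in bytes into the descriptor field.
 * Assumes total_shared <= 64 * 1024, as the driver asserts. */
static uint32_t encode_slm_size(int gen, uint32_t total_shared)
{
   if (total_shared == 0)
      return 0;

   uint32_t size = next_pow2(total_shared);

   if (gen >= 9) {
      /* Minimum of 1 kB; 1024 is bit 11 (1-based), so subtract 10. */
      if (size < 1024)
         size = 1024;
      return (uint32_t)lowest_set_bit(size) - 10;
   }

   /* Pre-Gen9: minimum of 4 kB, expressed in 4 kB units. */
   if (size < 4096)
      size = 4096;
   return size / 4096;
}
```

<p dir="ltr">With this sketch, <code>encode_slm_size(9, 64 * 1024)</code> yields 7 and <code>encode_slm_size(8, 64 * 1024)</code> yields 16, so the same input produces different field values on the two generations.</p>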