<div dir="ltr"><div>I don't think "anv" is the right prefix for this patch. Really, it's adding local_id_index to cs_prog_data.<br><br></div><div>By and large, I think the code at the end of the series is what we want. I made a few comments but not many. However, I think we could make it a bit more bisectable if we wanted. If you don't want to go through the effort, feel free to just land the series as is (with the couple of suggestions taken into account) with<br><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br><br></div><div>If you want to be a pedant, I think we could do something like the following:<br><br> 1) Patch 7 with the anv changes as well<br></div><div> 2) This patch without the anv bits and with local_id_index always initialized to -1 by brw_compile_cs<br></div><div> 3) Add the NIR pass<br></div><div> 4) Patch 8 only with a hack to set cross_thread_supported to false always<br></div><div> 5) Hook things up in i965<br></div><div> 6) Hook thnigs up in anv<br></div><div> 7) A single patch that properly sets local_id_index, runs the NIR pass, and sets cross_thread_supported based on gen.<br></div><div><br></div>That would make the 'flip the switch" patch very small, probably a half-dozen lines. If you're going to be a pedant don't be a pedant on your holiday... :-)<br><div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, May 29, 2016 at 3:38 PM, Jordan Justen <span dir="ltr"><<a href="mailto:jordan.l.justen@intel.com" target="_blank">jordan.l.justen@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Signed-off-by: Jordan Justen <<a href="mailto:jordan.l.justen@intel.com">jordan.l.justen@intel.com</a>><br>
---<br>
src/intel/vulkan/anv_pipeline.c | 4 ++++<br>
src/mesa/drivers/dri/i965/brw_compiler.h | 1 +<br>
src/mesa/drivers/dri/i965/brw_cs.c | 3 +++<br>
src/mesa/drivers/dri/i965/brw_fs.cpp | 8 ++++++++<br>
4 files changed, 16 insertions(+)<br>
<br>
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c<br>
index 789bc1a..504f0be 100644<br>
--- a/src/intel/vulkan/anv_pipeline.c<br>
+++ b/src/intel/vulkan/anv_pipeline.c<br>
@@ -338,6 +338,10 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,<br>
pipeline->needs_data_cache = true;<br>
}<br>
<br>
+ if (stage == MESA_SHADER_COMPUTE)<br>
+ ((struct brw_cs_prog_data *)prog_data)->thread_local_id_index =<br>
+ prog_data->nr_params++; /* The CS Thread ID uniform */<br>
+<br>
if (nir->info.num_ssbos > 0)<br>
pipeline->needs_data_cache = true;<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h<br>
index 0844694..bed969c 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_compiler.h<br>
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h<br>
@@ -433,6 +433,7 @@ struct brw_cs_prog_data {<br>
bool uses_barrier;<br>
bool uses_num_work_groups;<br>
unsigned local_invocation_id_regs;<br>
+ int thread_local_id_index;<br>
<br>
struct {<br>
/** @{<br>
diff --git a/src/mesa/drivers/dri/i965/brw_cs.c b/src/mesa/drivers/dri/i965/brw_cs.c<br>
index a9cbde9..2a25584 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_cs.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_cs.c<br>
@@ -93,6 +93,9 @@ brw_codegen_cs_prog(struct brw_context *brw,<br>
*/<br>
int param_count = cp->program.Base.nir->num_uniforms / 4;<br>
<br>
+ /* The backend also sometimes add a param for the thread local id. */<br>
+ prog_data.thread_local_id_index = param_count++;<br>
+<br>
/* The backend also sometimes adds params for texture size. */<br>
param_count += 2 * ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;<br>
prog_data.base.param =<br>
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
index f5add6e..f7753db 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
@@ -6469,6 +6469,14 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data,<br>
true);<br>
brw_nir_lower_cs_shared(shader);<br>
prog_data->base.total_shared += shader->num_shared;<br>
+<br>
+ /* Now that we cloned the nir_shader, we can update num_uniforms based on<br>
+ * the thread_local_id_index.<br>
+ */<br>
+ shader->num_uniforms =<br>
+ MAX2(shader->num_uniforms,<br>
+ (unsigned)4 * (prog_data->thread_local_id_index + 1));<br>
+<br>
shader = brw_postprocess_nir(shader, compiler->devinfo, true);<br>
<br>
prog_data->local_size[0] = shader->info.cs.local_size[0];<br>
<span class=""><font color="#888888">--<br>
2.8.1<br>
<br>
_______________________________________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</font></span></blockquote></div><br></div></div></div></div></div>