<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<pre>Hello,
Thanks for your feedback.
I made changes which you requested.
I hope that I understood you right.
The solution which you suggested "VUE allocation unconditionally" fixes this bug
and we have avoided the endif in the program end.
<blockquote type="cite"><div>Ok, then we are in agreement.</div><div>
</div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><pre><pre><pre><tt> Section </tt>1.6.5.4 VUE Allocation:
" The following description is applicable only to the GS stage.
The threads are not passed an initial handle.
In stead, they request a first handle (if any) via the URB
shared function’s FF_SYNC message (see Shared Functions).
If additional handles are required,
the URB_WRITE allocate mechanism (mentioned above) is used."<tt>
If GS doesn't allocate/request VUEs then GS shouldn't use
the Dereference (COMPLETE + UNUSED) message.
So when GS produces no output GS doesn't allocate VUEs at all
and GS shouldn't use Dereference message.
</tt></pre></pre></pre></blockquote><div>
</div><div>
</div><div>Agreed as well. But do notice that none of this is pre-ILK as far as the
documentation goes, it is the same across all supported platforms up to
SNB.</div></blockquote>
But do notice that according to
Section "1.6.5.2 VUE Allocation (GS, CLIP) [Pre-DevIL] " (vol2, part1)
" The following description is applicable only to the GS, CLIP stages.
The GS and CLIP threads are passed a single, initial destination VUE handle.
These threads may be required to output more than one destination VUE, and
therefore they are provided with a mechanism to
request additional handles.
......"
So if we want to support Pre-ILK we have to implement the different solutions for ILK, SNB (1) and for Pre-ILK (2):
1. The "make the FF_SYNC happen unconditionally" solution will work correctly
for ILK, SNB according to "1.6.5.3 VUE Allocation (GS, CLIP) [DevIL]"
and "1.6.5.4 VUE Allocation (GS) [DevSNB+]" (vol2, part1).
2. The current GS implementation will work correctly
for Pre-ILK according to "1.6.5.2 VUE Allocation (GS, CLIP) [Pre-DevIL]"
</pre>
<pre>Must I implement something additional for Pre-ILK?
Regards,
Andrii.
</pre>
<div class="moz-cite-prefix">
<pre>On 21.06.18 16:40, Andrii Simiklit wrote:</pre>
</div>
<blockquote type="cite"
cite="mid:1529588435-29287-1-git-send-email-andrii.simiklit@globallogic.com">
<pre wrap="">We can not use the VUE Dereference flags combination for EOT
message under ILK and SNB because the threads are not initialized
there with initial VUE handle unlike Pre-IL.
So to avoid GPU hangs on SNB and ILK we need
to avoid usage of the VUE Dereference flags combination.
(Was tested only on SNB but according to the specification
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6
the ILK must behave itself in the similar way)
v2: Approach to fix this issue was changed.
Instead of different EOT flags in the program end
we will create VUE every time even if GS produces no output.
Signed-off-by: Andrii Simiklit <a class="moz-txt-link-rfc2396E" href="mailto:andrii.simiklit@globallogic.com"><andrii.simiklit@globallogic.com></a>
---
src/intel/compiler/gen6_gs_visitor.cpp | 88 +++++++++-------------------------
1 file changed, 23 insertions(+), 65 deletions(-)
diff --git a/src/intel/compiler/gen6_gs_visitor.cpp b/src/intel/compiler/gen6_gs_visitor.cpp
index ac3ba55..b831d33 100644
--- a/src/intel/compiler/gen6_gs_visitor.cpp
+++ b/src/intel/compiler/gen6_gs_visitor.cpp
@@ -300,11 +300,10 @@ gen6_gs_visitor::emit_urb_write_opcode(bool complete, int base_mrf,
/* Otherwise we always request to allocate a new VUE handle. If this is
* the last write before the EOT message and the new handle never gets
* used it will be dereferenced when we send the EOT message. This is
- * necessary to avoid different setups (under Pre-IL only) for the EOT message (one for the
+ * necessary to avoid different setups for the EOT message (one for the
* case when there is no output and another for the case when there is)
* which would require to end the program with an IF/ELSE/ENDIF block,
- * something we do not want.
- * But for ILK and SNB we can not avoid the end the program with an IF/ELSE/ENDIF block.
+ * something we do not want.
*/
inst = emit(GS_OPCODE_URB_WRITE_ALLOCATE);
inst->urb_write_flags = BRW_URB_WRITE_COMPLETE;
@@ -351,27 +350,27 @@ gen6_gs_visitor::emit_thread_end()
int max_usable_mrf = FIRST_SPILL_MRF(devinfo->gen);
/* Issue the FF_SYNC message and obtain the initial VUE handle. */
- emit(CMP(dst_null_ud(), this->vertex_count, brw_imm_ud(0u), BRW_CONDITIONAL_G));
- emit(IF(BRW_PREDICATE_NORMAL));
- {
- this->current_annotation = "gen6 thread end: ff_sync";
+ this->current_annotation = "gen6 thread end: ff_sync";
- vec4_instruction *inst;
- if (prog->info.has_transform_feedback_varyings) {
+ vec4_instruction *inst = NULL;
+ if (prog->info.has_transform_feedback_varyings) {
src_reg sol_temp(this, glsl_type::uvec4_type);
emit(GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
- dst_reg(this->svbi),
- this->vertex_count,
- this->prim_count,
- sol_temp);
+ dst_reg(this->svbi),
+ this->vertex_count,
+ this->prim_count,
+ sol_temp);
inst = emit(GS_OPCODE_FF_SYNC,
dst_reg(this->temp), this->prim_count, this->svbi);
- } else {
+ } else {
inst = emit(GS_OPCODE_FF_SYNC,
dst_reg(this->temp), this->prim_count, brw_imm_ud(0u));
- }
- inst->base_mrf = base_mrf;
+ }
+ inst->base_mrf = base_mrf;
+ emit(CMP(dst_null_ud(), this->vertex_count, brw_imm_ud(0u), BRW_CONDITIONAL_G));
+ emit(IF(BRW_PREDICATE_NORMAL));
+ {
/* Loop over all buffered vertices and emit URB write messages */
this->current_annotation = "gen6 thread end: urb writes init";
src_reg vertex(this, glsl_type::uint_type);
@@ -415,7 +414,7 @@ gen6_gs_visitor::emit_thread_end()
dst_reg reg = dst_reg(MRF, mrf);
reg.type = output_reg[varying][0].type;
data.type = reg.type;
- vec4_instruction *inst = emit(MOV(reg, data));
+ inst = emit(MOV(reg, data));
inst->force_writemask_all = true;
mrf++;
@@ -450,11 +449,8 @@ gen6_gs_visitor::emit_thread_end()
if (prog->info.has_transform_feedback_varyings)
xfb_write();
}
- const bool common_eot_approach_can_be_used = (devinfo->gen < 5);
- if(common_eot_approach_can_be_used)
- {
- emit(BRW_OPCODE_ENDIF);
- }
+ emit(BRW_OPCODE_ENDIF);
+
/* Finally, emit EOT message.
*
* In gen6 we need to end the thread differently depending on whether we have
@@ -464,35 +460,11 @@ gen6_gs_visitor::emit_thread_end()
*
* However, this would lead us to end the program with an ENDIF opcode,
* which we want to avoid, so what we do is that we always request a new
- * VUE handle every time we do a URB WRITE, even for the last vertex we emit.
+ * VUE handle every time, even if GS produces no output.
* With this we make sure that whether we have emitted at least one vertex
* or none at all, we have to finish the thread without writing to the URB,
- * which works for both cases (but only under Pre-IL) by setting
- * the COMPLETE and UNUSED flags in the EOT message.
- *
- * But under ILK or SNB we must not use combination COMPLETE and UNUSED
- * because this combination could be used only for already allocated VUE.
- * But unlike Pre-IL in the ILK and SNB
- * the initial VUE is not passed to threads.
- * This behaver mentioned in specification:
- * SNB Volume 2 Part 1:
- * "1.6.5.3 VUE Allocation (GS, CLIP) [DevIL]"
- * "1.6.5.4 VUE Allocation (GS) [DevSNB+]"
- * "The threads are not passed an initial handle.
- * Instead, they request a first handle (if any)
- * via the URB shared function’s FF_SYNC message (see Shared Functions).
- * If additional handles are required,
- * the URB_WRITE allocate mechanism (mentioned above) is used."
- *
- * So for ILK and for SNB we must use only UNUSED flag.
- * This is accepteble combination according to:
- * SNB Volume 4 Part 2:
- * "2.4.2 Message Descriptor"
- * "Table lists the valid and invalid combinations of
- * the Complete, Used, Allocate and EOT bits"
- * "Thread terminate non-write of URB"
- * SNB Volume 2 Part 1:
- * "1.6.5.6 Thread Termination"
+ * which works for both cases by setting the COMPLETE and UNUSED flags in
+ * the EOT message.
*/
this->current_annotation = "gen6 thread end: EOT";
@@ -504,26 +476,12 @@ gen6_gs_visitor::emit_thread_end()
emit(GS_OPCODE_SET_DWORD_2, dst_reg(MRF, base_mrf), data);
}
- vec4_instruction *inst = emit(GS_OPCODE_THREAD_END);
+ inst = emit(GS_OPCODE_THREAD_END);
inst->urb_write_flags = BRW_URB_WRITE_COMPLETE | BRW_URB_WRITE_UNUSED;
inst->base_mrf = base_mrf;
inst->mlen = 1;
-
- if(!common_eot_approach_can_be_used)
- {
- emit(BRW_OPCODE_ELSE);
-
- this->current_annotation = "gen6 thread end: EOT";
-
- vec4_instruction *unused_urb_inst = emit(GS_OPCODE_THREAD_END);
- unused_urb_inst->urb_write_flags = BRW_URB_WRITE_UNUSED;
- unused_urb_inst->base_mrf = base_mrf;
- unused_urb_inst->mlen = 1;
-
- emit(BRW_OPCODE_ENDIF);
- }
}
-
+
void
gen6_gs_visitor::setup_payload()
{
</pre>
</blockquote>
<br>
</body>
</html>