<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Jul 23, 2017, at 1:42 PM, Rowley, Timothy O <<a href="mailto:timothy.o.rowley@intel.com" class="">timothy.o.rowley@intel.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">
<br class="Apple-interchange-newline">
On Jul 23, 2017, at 11:08 AM, George Kyriazis &lt;<a href="mailto:george.kyriazis@intel.com" class="">george.kyriazis@intel.com</a>&gt; wrote:<br class="">
<br class="">
The shader that is used to copy vertex data out of the vs/gs shaders to<br class="">
the user-specified buffer (streamout or SO shader) was not using the<br class="">
correct offsets.<br class="">
<br class="">
Adjust the offsets that are used just for the SO shader:<br class="">
- Make sure that position is handled in the same special way<br class="">
as in the vs/gs shaders<br class="">
- Use the correct offset to be passed in the core<br class="">
- consolidate register slot mapping logic into one function, since it's<br class="">
been calculated in 2 different places (one for calculating the slot mask,<br class="">
and one for the register offsets themselves)<br class="">
<br class="">
Also make room for all attributes in the backend vertex area.<br class="">
</blockquote>
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Add
 a comment to the commit indicating that as Ilia states, this is not a complete solution.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
</div>
</blockquote>
Ok</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">
<br class="">
Fixes:<br class="">
- all vtk GL2PS tests<br class="">
- 18 piglit tests (16 ext_transform_feedback tests,<br class="">
arb-quads-follow-provoking-vertex and primitive-type gl_points)<br class="">
---<br class="">
src/gallium/drivers/swr/swr_draw.cpp  | 11 ++++++++---<br class="">
src/gallium/drivers/swr/swr_state.cpp | 31 +++++++++++++++++++++++++++++--<br class="">
src/gallium/drivers/swr/swr_state.h   |  3 +++<br class="">
3 files changed, 40 insertions(+), 5 deletions(-)<br class="">
<br class="">
diff --git a/src/gallium/drivers/swr/swr_draw.cpp b/src/gallium/drivers/swr/swr_draw.cpp<br class="">
index 62ad3f7..218de0f 100644<br class="">
--- a/src/gallium/drivers/swr/swr_draw.cpp<br class="">
+++ b/src/gallium/drivers/swr/swr_draw.cpp<br class="">
@@ -26,6 +26,7 @@<br class="">
#include "swr_resource.h"<br class="">
#include "swr_fence.h"<br class="">
#include "swr_query.h"<br class="">
+#include "swr_state.h"<br class="">
#include "jit_api.h"<br class="">
<br class="">
#include "util/u_draw.h"<br class="">
@@ -81,8 +82,11 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)<br class="">
              offsets[output_buffer] = so->output[i].dst_offset;<br class="">
           }<br class="">
<br class="">
+            unsigned attrib_slot = so->output[i].register_index;<br class="">
+            attrib_slot = swr_so_adjust_attrib(attrib_slot, ctx->vs);<br class="">
+<br class="">
           state.stream.decl[num].bufferIndex = output_buffer;<br class="">
-            state.stream.decl[num].attribSlot = so->output[i].register_index - 1;<br class="">
+            state.stream.decl[num].attribSlot = attrib_slot;<br class="">
           state.stream.decl[num].componentMask =<br class="">
              ((1 << so->output[i].num_components) - 1)<br class="">
              << so->output[i].start_component;<br class="">
@@ -130,9 +134,10 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)<br class="">
  SWR_FRONTEND_STATE feState = {0};<br class="">
<br class="">
  feState.vsVertexSize =<br class="">
-      VERTEX_ATTRIB_START_SLOT +<br class="">
+      VERTEX_ATTRIB_START_SLOT<br class="">
     + ctx->vs->info.base.num_outputs<br class="">
-      - (ctx->vs->info.base.writes_position ? 1 : 0);<br class="">
+      - (ctx->vs->info.base.writes_position ? 1 : 0)<br class="">
+      + ctx->fs->info.base.num_outputs;<br class="">
</blockquote>
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Sizing
 vsVertexSize to essentially vs->num_outputs + fs->num_outputs seems odd, as the fe shouldn’t care about the number of outputs of the fs (inputs, maybe).</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
</div>
</blockquote>
<div>The clipper/binner code uses the stride (which originates from vsVertexSize) to create the intermediate vertices for the backend, so the FE and BE vertex sizes are tied to the same value.  Sizing the vertices to the absolute minimum
 would require more intrusive changes in the core to decouple the two.</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">
  if (ctx->rasterizer->flatshade_first) {<br class="">
     feState.provokingVertex = {1, 0, 0};<br class="">
diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp<br class="">
index 501fdea..3e07929 100644<br class="">
--- a/src/gallium/drivers/swr/swr_state.cpp<br class="">
+++ b/src/gallium/drivers/swr/swr_state.cpp<br class="">
@@ -345,13 +345,15 @@ swr_create_vs_state(struct pipe_context *pipe,<br class="">
     // soState.streamToRasterizer not used<br class="">
<br class="">
     for (uint32_t i = 0; i < stream_output->num_outputs; i++) {<br class="">
+         unsigned attrib_slot = stream_output->output[i].register_index;<br class="">
+         attrib_slot = swr_so_adjust_attrib(attrib_slot, swr_vs);<br class="">
        swr_vs->soState.streamMasks[stream_output->output[i].stream] |=<br class="">
-            1 << (stream_output->output[i].register_index - 1);<br class="">
+            (1 << attrib_slot);<br class="">
     }<br class="">
     for (uint32_t i = 0; i < MAX_SO_STREAMS; i++) {<br class="">
       swr_vs->soState.streamNumEntries[i] =<br class="">
            _mm_popcnt_u32(swr_vs->soState.streamMasks[i]);<br class="">
-        swr_vs->soState.vertexAttribOffset[i] = VERTEX_ATTRIB_START_SLOT; // TODO: optimize<br class="">
+        swr_vs->soState.vertexAttribOffset[i] = 0;<br class="">
      }<br class="">
  }<br class="">
<br class="">
@@ -1777,6 +1779,31 @@ swr_update_derived(struct pipe_context *pipe,<br class="">
  ctx->dirty = post_update_dirty_flags;<br class="">
}<br class="">
<br class="">
+unsigned<br class="">
+swr_so_adjust_attrib(unsigned in_attrib,<br class="">
+                     swr_vertex_shader *swr_vs)<br class="">
+{<br class="">
+   ubyte semantic_name;<br class="">
+   unsigned attrib;<br class="">
+<br class="">
+   attrib = in_attrib + VERTEX_ATTRIB_START_SLOT;<br class="">
+<br class="">
+   if (swr_vs) {<br class="">
+      semantic_name = swr_vs->info.base.output_semantic_name[in_attrib];<br class="">
+      if (semantic_name == TGSI_SEMANTIC_POSITION) {<br class="">
+         attrib = VERTEX_POSITION_SLOT;<br class="">
+      } else {<br class="">
+         for (int i = 0; i < PIPE_MAX_SHADER_OUTPUTS; i++) {<br class="">
+            if (swr_vs->info.base.output_semantic_name[i] == TGSI_SEMANTIC_POSITION) {<br class="">
+               attrib--;<br class="">
+               break;<br class="">
+            }<br class="">
+         }<br class="">
</blockquote>
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Couldn’t
 this for loop be replaced with a “if (swr_vs->info.base.writes_position) attrib--;”?</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
</div>
</blockquote>
You are right.  Fixed.</div>
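<div><br class="">
</div>
<div>For reference, a minimal sketch of how the slot mapping might look with the loop replaced by the writes_position check. This is not the committed code: the struct, the sketch_ names, and the constant values below are placeholders chosen so the example compiles on its own, and the real definitions come from the SWR core and TGSI headers.</div>
<pre>
```cpp
// Placeholder values for illustration only; the real constants are
// defined in the SWR core headers and may differ.
const unsigned VERTEX_POSITION_SLOT = 0;      // assumed value
const unsigned VERTEX_ATTRIB_START_SLOT = 1;  // assumed value

// Placeholder semantic names standing in for the TGSI_SEMANTIC_* values.
const unsigned char SKETCH_SEMANTIC_POSITION = 0;
const unsigned char SKETCH_SEMANTIC_GENERIC = 1;

// Hypothetical, trimmed-down stand-in for the fields the patch reads
// from swr_vs->info.base.
struct sketch_vs_info {
   unsigned char output_semantic_name[8];
   bool writes_position;
};

// Map a stream-output register index to a vertex attribute slot.
unsigned
sketch_so_adjust_attrib(unsigned in_attrib, const sketch_vs_info *vs)
{
   unsigned attrib = in_attrib + VERTEX_ATTRIB_START_SLOT;

   // Without shader info there is nothing to adjust.
   if (!vs)
      return attrib;

   // Position always goes to its dedicated slot.
   if (vs->output_semantic_name[in_attrib] == SKETCH_SEMANTIC_POSITION)
      return VERTEX_POSITION_SLOT;

   // Position does not occupy a generic attribute slot, so every other
   // output moves down by one when the shader writes position.
   if (vs->writes_position)
      attrib--;

   return attrib;
}
```
</pre>
<div>With these placeholder values, a shader whose output 0 is position maps register 0 to the position slot and register 1 to slot 1. A shader that does not write position maps register 0 to slot 1.</div>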
<div><br class="">
</div>
<div>Should I send out a v2, or OK to check in with those changes?</div>
<div><br class="">
</div>
<div>Thanks,</div>
<div><br class="">
</div>
<div>George</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">
+      }<br class="">
+   }<br class="">
+<br class="">
+   return attrib;<br class="">
+}<br class="">
<br class="">
static struct pipe_stream_output_target *<br class="">
swr_create_so_target(struct pipe_context *pipe,<br class="">
diff --git a/src/gallium/drivers/swr/swr_state.h b/src/gallium/drivers/swr/swr_state.h<br class="">
index 7940a96..8cbd463 100644<br class="">
--- a/src/gallium/drivers/swr/swr_state.h<br class="">
+++ b/src/gallium/drivers/swr/swr_state.h<br class="">
@@ -110,6 +110,9 @@ struct swr_derived_state {<br class="">
void swr_update_derived(struct pipe_context *,<br class="">
                       const struct pipe_draw_info * = nullptr);<br class="">
<br class="">
+unsigned swr_so_adjust_attrib(unsigned in_attrib,<br class="">
+                              swr_vertex_shader *swr_vs);<br class="">
+<br class="">
/*<br class="">
* Conversion functions: Convert mesa state defines to SWR.<br class="">
*/<br class="">
--<span class="Apple-converted-space"> </span><br class="">
2.7.4<br class="">
<br class="">
_______________________________________________<br class="">
mesa-dev mailing list<br class="">
<a href="mailto:mesa-dev@lists.freedesktop.org" class="">mesa-dev@lists.freedesktop.org</a><br class="">
https://lists.freedesktop.org/mailman/listinfo/mesa-dev</blockquote>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>