[Mesa-dev] [PATCH] swr: fix transform feedback logic
Kyriazis, George
george.kyriazis at intel.com
Sun Jul 23 18:56:36 UTC 2017
On Jul 23, 2017, at 1:42 PM, Rowley, Timothy O <timothy.o.rowley at intel.com<mailto:timothy.o.rowley at intel.com>> wrote:
On Jul 23, 2017, at 11:08 AM, George Kyriazis <george.kyriazis at intel.com<mailto:george.kyriazis at intel.com>> wrote:
The shader that is used to copy vertex data out of the vs/gs shaders to
the user-specified buffer (streamout os SO shader) was not using the
correct offsets.
Adjust the offsets that are used just for the SO shader:
- Make sure that position is handled in the same special way
as in the vs/gs shaders
- Use the correct offset to be passed in the core
- consolidate register slot mapping logic into one function, since it's
been calculated in 2 different places (one for calcuating the slot mask,
and one for the register offsets themselves
Also make room for all attibutes in the backend vertex area.
Add a comment to the commit indicating that as Ilia states, this is not a complete solution.
Ok
Fixes:
- all vtk GL2PS tests
- 18 piglit tests (16 ext_transform_feedback tests,
arb-quads-follow-provoking-vertex and primitive-type gl_points
---
src/gallium/drivers/swr/swr_draw.cpp | 11 ++++++++---
src/gallium/drivers/swr/swr_state.cpp | 31 +++++++++++++++++++++++++++++--
src/gallium/drivers/swr/swr_state.h | 3 +++
3 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/swr/swr_draw.cpp b/src/gallium/drivers/swr/swr_draw.cpp
index 62ad3f7..218de0f 100644
--- a/src/gallium/drivers/swr/swr_draw.cpp
+++ b/src/gallium/drivers/swr/swr_draw.cpp
@@ -26,6 +26,7 @@
#include "swr_resource.h"
#include "swr_fence.h"
#include "swr_query.h"
+#include "swr_state.h"
#include "jit_api.h"
#include "util/u_draw.h"
@@ -81,8 +82,11 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
offsets[output_buffer] = so->output[i].dst_offset;
}
+ unsigned attrib_slot = so->output[i].register_index;
+ attrib_slot = swr_so_adjust_attrib(attrib_slot, ctx->vs);
+
state.stream.decl[num].bufferIndex = output_buffer;
- state.stream.decl[num].attribSlot = so->output[i].register_index - 1;
+ state.stream.decl[num].attribSlot = attrib_slot;
state.stream.decl[num].componentMask =
((1 << so->output[i].num_components) - 1)
<< so->output[i].start_component;
@@ -130,9 +134,10 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
SWR_FRONTEND_STATE feState = {0};
feState.vsVertexSize =
- VERTEX_ATTRIB_START_SLOT +
+ VERTEX_ATTRIB_START_SLOT
+ ctx->vs->info.base.num_outputs
- - (ctx->vs->info.base.writes_position ? 1 : 0);
+ - (ctx->vs->info.base.writes_position ? 1 : 0)
+ + ctx->fs->info.base.num_outputs;
Sizing vsVertexSize to essentially vs->num_outputs + fs->num_outputs seems odd, as the fe shouldn’t care about the number of outputs of the fs (inputs, maybe).
The clipper/binner code uses the stride (which originates from vsVertexSize) to create the intermediate vertices for the backend. The sizes for the FE and BE are bound by the same size because of that. In order to size the vertices to the absolute minimum size, you’ll have to do more intrusive changes in the core to decouple the two.
if (ctx->rasterizer->flatshade_first) {
feState.provokingVertex = {1, 0, 0};
diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index 501fdea..3e07929 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -345,13 +345,15 @@ swr_create_vs_state(struct pipe_context *pipe,
// soState.streamToRasterizer not used
for (uint32_t i = 0; i < stream_output->num_outputs; i++) {
+ unsigned attrib_slot = stream_output->output[i].register_index;
+ attrib_slot = swr_so_adjust_attrib(attrib_slot, swr_vs);
swr_vs->soState.streamMasks[stream_output->output[i].stream] |=
- 1 << (stream_output->output[i].register_index - 1);
+ (1 << attrib_slot);
}
for (uint32_t i = 0; i < MAX_SO_STREAMS; i++) {
swr_vs->soState.streamNumEntries[i] =
_mm_popcnt_u32(swr_vs->soState.streamMasks[i]);
- swr_vs->soState.vertexAttribOffset[i] = VERTEX_ATTRIB_START_SLOT; // TODO: optimize
+ swr_vs->soState.vertexAttribOffset[i] = 0;
}
}
@@ -1777,6 +1779,31 @@ swr_update_derived(struct pipe_context *pipe,
ctx->dirty = post_update_dirty_flags;
}
+unsigned
+swr_so_adjust_attrib(unsigned in_attrib,
+ swr_vertex_shader *swr_vs)
+{
+ ubyte semantic_name;
+ unsigned attrib;
+
+ attrib = in_attrib + VERTEX_ATTRIB_START_SLOT;
+
+ if (swr_vs) {
+ semantic_name = swr_vs->info.base.output_semantic_name[in_attrib];
+ if (semantic_name == TGSI_SEMANTIC_POSITION) {
+ attrib = VERTEX_POSITION_SLOT;
+ } else {
+ for (int i = 0; i < PIPE_MAX_SHADER_OUTPUTS; i++) {
+ if (swr_vs->info.base.output_semantic_name[i] == TGSI_SEMANTIC_POSITION) {
+ attrib--;
+ break;
+ }
+ }
Couldn’t this for loop be replaced with a “if (swr_vs->info.base.writes_position) attrib—;”?
You are right. Fixed.
Should I send out a v2, or OK to check in with those changes?
Thanks,
George
+ }
+ }
+
+ return attrib;
+}
static struct pipe_stream_output_target *
swr_create_so_target(struct pipe_context *pipe,
diff --git a/src/gallium/drivers/swr/swr_state.h b/src/gallium/drivers/swr/swr_state.h
index 7940a96..8cbd463 100644
--- a/src/gallium/drivers/swr/swr_state.h
+++ b/src/gallium/drivers/swr/swr_state.h
@@ -110,6 +110,9 @@ struct swr_derived_state {
void swr_update_derived(struct pipe_context *,
const struct pipe_draw_info * = nullptr);
+unsigned swr_so_adjust_attrib(unsigned in_attrib,
+ swr_vertex_shader *swr_vs);
+
/*
* Conversion functions: Convert mesa state defines to SWR.
*/
--
2.7.4
_______________________________________________
mesa-dev mailing list
mesa-dev at lists.freedesktop.org<mailto:mesa-dev at lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170723/8863f663/attachment-0001.html>
More information about the mesa-dev
mailing list