[Mesa-dev] [PATCH 0/8] EXT_transform_feedback & ARB_transform_feedback2 core support

Mon Dec 5 11:02:40 PST 2011


----- Original Message -----
> On 11/19/2011 09:44 PM, Marek Olšák wrote:
> > Hi everyone,
> > 
> > this patch series implements all the core Mesa and Gallium support
> > for EXT_transform_feedback and ARB_transform_feedback2. It's been
> > tested by me on Radeons and by Christoph Bumiller on Nouveau. I
> > have verified that all transform feedback piglit tests (except the
> > one that requires GLSL 1.3) pass with this code.
> > 
> > Since we were discussing this last time, the Gallium interface has
> > been slightly modified to make it easier to implement by drivers.
> > The Gallium docs have been updated too.
> > 
> > Some streamout-related softpipe and llvmpipe code for the old
> > interface has been disabled. I didn't look at what needs to be
> > done to support the new one.
> > 
> > Please review.
> > 
> 
> 
> No one cares ? Could we just push these now ?

I'm not the ideal person to review (I was not involved in the original gallium implementation nor previous discussions about, and I don't know much about transform feedback in general), but I went through the changes FWIW.


>    gallium: interface changes necessary to implement transform feedback (v4)
[...]

>    The most notable change is the stream-out info must be linked
>    with a vertex or geometry shader and cannot be set independently.
>    This is due to limitations of existing hardware (special shader
>    instructions must be used to write into stream-out buffers),
>    and it's also how OpenGL works (stream outputs must be specified
>    prior to linking shaders).

Is this true for all current gallium hardware?


@@ -164,9 +170,28 @@ struct pipe_clip_state
 };
 
 
+/**
+ * Stream output for vertex transform feedback.
+ */
+struct pipe_stream_output_info
+{
+   /** number of the output buffer to insert each element into */
+   unsigned output_buffer[PIPE_MAX_SHADER_OUTPUTS];
+   /** which register OUT[i] to grab each output from */
+   unsigned register_index[PIPE_MAX_SHADER_OUTPUTS];
+   /** TGSI_WRITEMASK signifying which components to output */
+   ubyte register_mask[PIPE_MAX_SHADER_OUTPUTS];
+   /** number of outputs */
+   unsigned num_outputs;


+   /** stride for an entire vertex, only used if all output_buffers are 0 */
+   unsigned stride;
+};
+

This structure layout could be more space efficient. May be:

    unsigned stride;    
    unsigned num_outputs; 
    struct foo {
        unsigned output_buffer:???;
        unsigned register_index:???;
        unsigned register_mask:4;
    } [PIPE_MAX_SHADER_OUTPUTS];

Especially if less than 30bit ints are sufficient for register_index and output_bffer. This would allow to easily copy/hash only a portion of the state.

+
 struct pipe_shader_state
 {
    const struct tgsi_token *tokens;
+   struct pipe_stream_output_info stream_output;
 };

Is this really necessary for all shader types? Would it be enough to mandate geometry shader, and add a new parameter to create_gs_state? Or maybe create a separate pipe_shader_state structure for this.  Either wat looks cleaner and more light-weight than taxing all current pipe_shader_state users with this stage specific info.

BTW, there was at one point the plan to remove "struct pipe_shader_state", and have shader state to be simply "struct tgsi_token *". Even if we drop that plan now, I think that were several places that make this assumption and dropped the contents of pipe_shader_state, so reference to pipe_shader_state needs to be audited.

Jose