[Mesa-dev] draw_stream_output seems to be broken by design

Mon Sep 20 08:10:59 PDT 2010

> The difference between this and the D3D semantics is that in D3D you bind the
> buffer explicitly and in GL implicitly, i.e. the buffer associated with the
> stream output object id is bound for you. So for GL the state tracker would
> have to bind the appropriate buffers on DrawTransformFeedback.

So the idea would be to store the stream output count in the stream
output buffer, and then, after the user binds it as a vertex buffer,
get it from there in draw_stream_output?

I'm not sure this works, since it seems that in OpenGL you can have
multiple transform feedback objects, each with its own count, all with
the same vertex buffer bound.

So as far as I can tell, you need to store the stream output count
either in the stream output CSO, or in another kind of object.
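
To make the problem concrete, here is a minimal sketch of keeping the
count in the object rather than in the buffer. All names below
(vertex_buffer, so_object, so_object_record, so_object_draw_count) are
hypothetical and are not existing Gallium interfaces; the only point is
that two objects targeting the same buffer keep independent counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vertex buffer; not a Gallium type. */
struct vertex_buffer {
    unsigned char data[256];
};

/* Hypothetical stream output object: the count lives here, not in the buffer. */
struct so_object {
    struct vertex_buffer *target;  /* buffer this object writes into */
    unsigned vertex_count;         /* vertices written through this object */
};

static void so_object_init(struct so_object *so, struct vertex_buffer *buf)
{
    assert(so != NULL && buf != NULL);
    so->target = buf;
    so->vertex_count = 0;
}

/* Record vertices written while this object was the active one. */
static void so_object_record(struct so_object *so, unsigned num_vertices)
{
    assert(so != NULL);
    so->vertex_count += num_vertices;
}

/* The count a draw_stream_output-style call would use for this object. */
static unsigned so_object_draw_count(const struct so_object *so)
{
    assert(so != NULL);
    return so->vertex_count;
}
```

With a single count stored in the buffer, the second object would
overwrite the first one's value; with the count in the object, each draw
can look up the count belonging to the object it was asked to draw from.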

As an alternative to the CSO, you could add a query object
parameter to draw_stream_output, to which you would pass an
SO_STATISTICS query.
However, this won't work if the transform feedback object is paused,
unless a mechanism to pause/resume queries is also added.

Furthermore, if the count is stored in the CSO, a
reset_stream_output_count() seems to be necessary, since OpenGL
actually seems to count across multiple draw calls from
BeginTransformFeedback to EndTransformFeedback, except when paused.
Alternatively, a mechanism for pausing the bound stream output CSO,
and a reset on unbind would work too.
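
The counting behaviour I have in mind could be sketched roughly as
follows. This is only an illustration under my reading of the GL
semantics; so_counter and the so_* functions are made-up names, not
proposed interface, and the reset is folded into so_begin here instead of
being a separate reset_stream_output_count() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-object counter state; not an existing Gallium struct. */
struct so_counter {
    bool active;     /* between begin and end */
    bool paused;     /* paused while active */
    unsigned count;  /* vertices captured since the last begin */
};

/* Begin capture: this is where the count gets reset. */
static void so_begin(struct so_counter *c)
{
    c->active = true;
    c->paused = false;
    c->count = 0;
}

static void so_pause(struct so_counter *c)
{
    assert(c->active);
    c->paused = true;
}

static void so_resume(struct so_counter *c)
{
    assert(c->active);
    c->paused = false;
}

/* End capture: the count is kept so a later draw can still use it. */
static void so_end(struct so_counter *c)
{
    c->active = false;
    c->paused = false;
}

/* Called per draw call; only counts while active and not paused. */
static void so_draw(struct so_counter *c, unsigned num_vertices)
{
    if (c->active && !c->paused)
        c->count += num_vertices;
}
```

Draws issued while paused or after the end leave the count untouched,
and the count accumulates over every other draw since the last begin.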

In other words, getting it right is not as trivial as it seems at
first sight (and D3D11/ARB_transform_feedback3 adds more things to
worry about).
Finding out how i965, r600 and nv50 do this will probably help too.

BTW, in GL you still have to bind the vertex buffers, as they
explicitly say in issue #9.

> draw_stream_output is the reason for this. If we assumed that we always need
> to collect the number of verts written to a buffer we wouldn't need

Note that there is a huge difference between the GPU writing the count
in a buffer, and the CPU doing so.
Having the GPU write it allows you to do a draw call using it without
stalling the pipeline, either using a hardware-specific method or
DrawIndirect from D3D11.
ARB_transform_feedback2 clearly requires the existence of such a
hardware-specific method.