[Mesa-dev] [PATCH 6/9] mesa: Add a new GetTransformFeedbackVertexCount() driver hook.

Marek Olšák maraeo at gmail.com
Sat Oct 26 00:26:16 CEST 2013


On Fri, Oct 25, 2013 at 10:28 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On 10/22/2013 04:30 AM, Marek Olšák wrote:
>> On Fri, Oct 18, 2013 at 8:09 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>>> DrawTransformFeedback() needs to obtain the number of vertices written
>>> to a particular stream during the last Begin/EndTransformFeedback block.
>>> The new driver hook returns exactly that information.
>>>
>>> Gallium drivers already implement this functionality by passing the
>>> transform feedback object to the drawing function.  I prefer to avoid
>>> this for two reasons:
>>>
>>> 1. Complexity:
>>>
>>> Normally, the drawing function takes an array of _mesa_prim objects,
>>> each of which specifies a vertex count.  If tfb_vertcount != NULL,
>>> however, there will only be one _mesa_prim object with an invalid
>>> vertex count (of 1), so it needs to be ignored.
>>>
>>> Since the _mesa_prim pointers are const, you can't even override it to
>>> the proper value; you need to pass around extra "ignore that, here's
>>> the real count" parameters.
>>>
>>> The drawing function is already terribly complicated, so I don't want to
>>> make it even more complicated.
>>
>> I don't understand this. Are you saying that the software emulation of
>> the feature is always better because of complexity the real
>> hardware-accelerated solution would have?
>
> On Ivybridge hardware, I think that a GPU-only implementation of
> DrawTransformFeedback would be very complicated, and probably less
> efficient than this (extremely simple) software solution.  It might be
> possible to do a reasonable GPU-only implementation on Haswell, but I
> haven't looked into the details yet.  (See my reply to Eric.)
>
> At least for Ivybridge, I think I want this software path 100% of the
> time.  We may want to remove the stall on Haswell as a later optimization.

I'd like to have a dedicated flag for this fallback like we have
Const.PrimitiveRestartInSoftware, in case we need to implement the
query for something else.
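
To make the suggestion concrete, here is a minimal sketch of what I mean by gating the fallback on a flag. Everything in it is a stand-in: `ctx_model`, `tfb_count_in_software`, `example_hook` and `resolve_count_on_cpu` are hypothetical names for illustration, not existing Mesa fields or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the relevant context state. */
struct ctx_model {
   bool tfb_count_in_software;                    /* hypothetical flag */
   int (*get_tfb_vertex_count)(unsigned stream);  /* stand-in for the hook */
};

/* Example hook used only for illustration: pretends 6 vertices were
 * written to stream 0 and nothing to any other stream. */
static int example_hook(unsigned stream)
{
   return stream == 0 ? 6 : 0;
}

/* Returns true and fills *count_out only when the driver has asked for
 * the software path AND provides the query.  Otherwise the caller keeps
 * using the GPU-side count. */
static bool resolve_count_on_cpu(const struct ctx_model *ctx,
                                 unsigned stream, int *count_out)
{
   if (!ctx->tfb_count_in_software || ctx->get_tfb_vertex_count == NULL)
      return false;

   *count_out = ctx->get_tfb_vertex_count(stream);
   return true;
}
```

With that split, a driver could implement the query for other purposes without automatically opting in to the CPU-GPU synchronization on every DrawTransformFeedback call.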

>
> It sounds like for Gallium, you already have a decent GPU-only solution.
>  I tried to follow that code to understand how it works, and got lost
> after jumping through around 5 files...which is probably just my poor
> understanding of the Gallium architecture.

Gallium itself doesn't do anything here; its interface is pretty much
the same as the vbo one.

On the hardware side, there are 4 counters containing the number of
bytes written to each TFB buffer. When TFB is started, the counters are
set to 0. Every time TFB is ended or paused, the counters are stored
for each buffer in memory. When resuming TFB, the counters are simply
loaded from memory.

When we have to do DrawTransformFeedback, we copy the value of the
counter from memory to a special draw register. Since the value is in
bytes, we also have to write the TFB buffer stride to another special
draw register. That's all. The hardware then calculates count =
bytes/stride before drawing.
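
As a rough CPU-side model of the sequence above (an illustration only, not driver code): the struct and function names below are hypothetical, and the real counters and draw registers live in hardware.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TFB_BUFFERS 4

/* Hypothetical model of the state described above. */
struct tfb_hw_model {
   uint32_t counter[NUM_TFB_BUFFERS];      /* live byte counters */
   uint32_t saved_bytes[NUM_TFB_BUFFERS];  /* values stored in memory */
};

/* Begin: the counters are set to 0. */
static void tfb_begin(struct tfb_hw_model *hw)
{
   for (unsigned i = 0; i < NUM_TFB_BUFFERS; i++)
      hw->counter[i] = 0;
}

/* Models the hardware appending 'bytes' of output to buffer 'buf'. */
static void tfb_write(struct tfb_hw_model *hw, unsigned buf, uint32_t bytes)
{
   hw->counter[buf] += bytes;
}

/* End or pause: the counters are stored to memory for each buffer. */
static void tfb_end_or_pause(struct tfb_hw_model *hw)
{
   for (unsigned i = 0; i < NUM_TFB_BUFFERS; i++)
      hw->saved_bytes[i] = hw->counter[i];
}

/* Resume: the counters are loaded back from memory. */
static void tfb_resume(struct tfb_hw_model *hw)
{
   for (unsigned i = 0; i < NUM_TFB_BUFFERS; i++)
      hw->counter[i] = hw->saved_bytes[i];
}

/* DrawTransformFeedback: count = bytes / stride, taken from the saved
 * byte count and the buffer stride. */
static uint32_t tfb_draw_vertex_count(const struct tfb_hw_model *hw,
                                      unsigned buf, uint32_t stride)
{
   assert(stride != 0);
   return hw->saved_bytes[buf] / stride;
}
```

For example, writing 48 bytes, pausing, resuming, and writing another 16 bytes to buffer 0 with a 16-byte stride gives a count of 64 / 16 = 4 vertices, and the CPU never has to read the counter back.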

>
> [snip]
>>> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
>>> index 1670409..11bb76a 100644
>>> --- a/src/mesa/vbo/vbo_exec_array.c
>>> +++ b/src/mesa/vbo/vbo_exec_array.c
>>> @@ -1464,6 +1464,12 @@ vbo_draw_transform_feedback(struct gl_context *ctx, GLenum mode,
>>>        return;
>>>     }
>>>
>>> +   if (ctx->Driver.GetTransformFeedbackVertexCount) {
>>> +      GLsizei n = ctx->Driver.GetTransformFeedbackVertexCount(ctx, obj, stream);
>>> +      vbo_draw_arrays(ctx, mode, 0, n, numInstances, 0);
>>> +      return;
>>> +   }
>>
>> As you mentioned, the only issue is with primitive restart, so why is
>> this done even if primitive restart is disabled? Drivers which will
>> have to implement this just to make e.g. non-VBO vertex uploads work
>> will suffer from the CPU-GPU synchronization this code forces.
>>
>> Marek
>
> I hadn't thought about non-VBO vertex uploads.  What does Gallium do in
> that case?  Has it just been broken this whole time?

Yes, it has, I completely forgot about it. :(

>
> I guess I figured drivers would either implement this hook, or do the
> tfb_vertcount approach, but not both.  Maybe that's a bad assumption.

For vertex uploads and vertex fetch fallbacks (where we translate and
align vertex buffers to what a Gallium driver supports, see
util/u_vbuf.c), we can use a query like the one you want to add.
However, gallium drivers should use the tfb_vertcount approach (AKA
pipe_draw_info::count_from_stream_output) whenever they see it's not
NULL. Since most Gallium hardware drivers will never see non-VBO
vertex data or an unsupported vertex format, it's the only approach
they have to implement.
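
Roughly, the decision I have in mind looks like the sketch below. It is self-contained and does not use the real Gallium headers: `draw_request`, `tfb_count_source` and `choose_count_source` are hypothetical names, and the two booleans stand for "some vertex data is not in a VBO" and "some vertex format needs the translate fallback".

```c
#include <stdbool.h>
#include <stddef.h>

/* Where the vertex count for a draw comes from (hypothetical enum). */
enum tfb_count_source {
   COUNT_FROM_PRIMS,  /* ordinary draw, count given by the application */
   COUNT_FROM_HW,     /* GPU computes bytes/stride, no CPU readback */
   COUNT_FROM_QUERY   /* CPU reads the count back, forces a sync */
};

/* Hypothetical, simplified description of one draw call. */
struct draw_request {
   const void *count_from_stream_output;  /* non-NULL for DrawTransformFeedback */
   bool has_user_vertex_arrays;           /* non-VBO vertex data */
   bool has_unsupported_vertex_format;    /* needs the translate fallback */
};

static enum tfb_count_source
choose_count_source(const struct draw_request *d)
{
   if (d->count_from_stream_output == NULL)
      return COUNT_FROM_PRIMS;

   /* Assumption: the upload/translate fallbacks run on the CPU, so they
    * are the cases that would use the query. */
   if (d->has_user_vertex_arrays || d->has_unsupported_vertex_format)
      return COUNT_FROM_QUERY;

   return COUNT_FROM_HW;
}
```

Only the two fallback cases pay for the synchronization; a driver that never sees them stays on the count_from_stream_output path.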

Marek

