[Mesa-dev] Path to optimize (moving from create/bind/delete paradigm to set only?)

Tue Nov 16 17:26:03 PST 2010

On Tue, Nov 16, 2010 at 6:06 PM, Corbin Simpson
<mostawesomedude at gmail.com> wrote:
>> On Tue, Nov 16, 2010 at 9:17 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>> On Tue, Nov 16, 2010 at 3:51 PM, Jakob Bornecrantz <wallbraker at gmail.com> wrote:
>>>> On Tue, Nov 16, 2010 at 7:21 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> So i looked a bit more at which paths we should try to optimize in the
>>>>> mesa/gallium/pipe infrastructure. Here are some numbers gathered from
>>>>> games:
>>>>> drawcall /   ps constant   vs constant   ps sampler   vs sampler
>>>>> doom3               1.45          1.39         9.24         9.86
>>>>> nexuiz              6.27          5.98         6.84         7.30
>>>>> openarena        2805.64          1.38         1.51         1.54
>>>>>
>>>>> (a value of 1 means there is a call to this function for every draw
>>>>> call, while a value of 10 means there is a call to this function every
>>>>> 10 draw calls, on average)
>>>>>
>>>>> Note that the openarena ps constant number is understandable, as it's
>>>>> the fixed GL pipeline which is in use here and the pixel shader
>>>>> constants don't need to change much in that case.
>>>>>
>>>>> So i think the clear trend is that there is a lot of constant upload and
>>>>> sampler changing (almost at each draw call for some games)
>>>>
>>>> Can you look into what actually changes between the sampler states?
>>>> Also that vs sampler state change number for OpenArena looks a bit
>>>> fishy to me.
>>>>
>>>> Cheers Jakob.
>>>>
>>>
>>> I haven't looked at what changes yet; i assume something small. I think
>>> a bugle trace of the engine is maybe easier to use than looking at the
>>> quake3 source code. For the vs sampler i was surprised too, but it's
>>> just the fact that q3 changes the vertex buffer a lot and this triggers
>>> the vs sampler.
>
> Could we get some problematic Bugle traces posted that we could all
> examine, rather than guessing at this? It'd be very nice to know
> whether or not the problems are in the GL state tracker layer before
> we move on to optimizing Gallium's interface, mostly because Dx
> appears to not suffer these same problems.
>

I haven't looked closely at the sampler issue, but the shader constant
problem is obvious on r600g: it's the pipe buffer allocation at each
constant update that kills us. Even with pb* somehow fixed, there is too
much overhead in the pb layer. It's only a few % of the whole cpu time, but
again things pile up, and any cut in cpu usage, no matter how small,
directly shows up in the framerate. That's why my feeling is that we
should keep the cpu overhead for state changes as low as possible, and i
fear the fastest way is to drop the create/bind paradigm.
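
To make the difference concrete, here is a rough sketch of the two update
styles. Everything in it is made up for illustration: toy_context and the
toy_* functions are hypothetical names, not the Gallium pipe_context
interface, and malloc/free only stand in for the pipe buffer allocation
that hurts on r600g.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_CONST_FLOATS 16

/* Hypothetical context, for illustration only (not pipe_context). */
struct toy_context {
    float *bound;                          /* create/bind: currently bound object */
    float inline_consts[TOY_CONST_FLOATS]; /* set-only: storage owned by the context */
    unsigned long allocs;                  /* allocations done by the create/bind path */
};

/* create/bind/delete: each update creates a new object, binds it and
 * deletes the previous one, so there is one allocation per update. */
static void toy_update_create_bind(struct toy_context *ctx, const float *data)
{
    float *obj = malloc(sizeof(float) * TOY_CONST_FLOATS);
    if (!obj)
        abort();
    ctx->allocs++;
    memcpy(obj, data, sizeof(float) * TOY_CONST_FLOATS);
    free(ctx->bound); /* delete the previously bound object (free(NULL) is a no-op) */
    ctx->bound = obj; /* bind the new one */
}

/* set-only: copy the new values into storage the context already owns;
 * no object is created or deleted. */
static void toy_update_set_only(struct toy_context *ctx, const float *data)
{
    memcpy(ctx->inline_consts, data, sizeof(float) * TOY_CONST_FLOATS);
}

/* Release whatever the create/bind path left bound. */
static void toy_context_fini(struct toy_context *ctx)
{
    free(ctx->bound);
    ctx->bound = NULL;
}
```

With a constant update at nearly every draw call, the first path pays one
allocation and one free per draw, while the second pays only the copy. A
real driver still has to get the data to the GPU in both cases, so this
only shows the cpu side bookkeeping.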

I pretty much use the dri benchmark wiki page for running games in
timedemo. Lately i mostly used nexuiz because it's easy to install and
its rendering is somewhat more complex than quake3's, and thus a little
closer to what i would like to target for the r600g driver.
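
For reference, the ratios in the table quoted above are just the number of
draw calls divided by the number of calls to each state function over a
run. A minimal sketch of that bookkeeping follows; toy_call_stats and
toy_draws_per_call are hypothetical names, and this is not how the numbers
were actually gathered.

```c
/* Hypothetical counters, incremented at the matching entry points
 * during a timedemo run. */
struct toy_call_stats {
    unsigned long draw_calls;
    unsigned long ps_constant_calls;
    unsigned long vs_constant_calls;
    unsigned long ps_sampler_calls;
    unsigned long vs_sampler_calls;
};

/* Average number of draw calls per call to a state function.
 * 1.0 means the function is called for every draw call, 10.0 means it is
 * called once every 10 draw calls. Returns 0.0 if the function was never
 * called, to avoid dividing by zero. */
static double toy_draws_per_call(unsigned long draws, unsigned long calls)
{
    return calls ? (double)draws / (double)calls : 0.0;
}
```

For example, toy_draws_per_call(stats.draw_calls, stats.ps_constant_calls)
gives the "ps constant" column for one game.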

Anyway, my point is that the gl state tracker is not to blame here;
it's only the fact that real applications lead to a lot of cso
activity, and i am not convinced that what we might possibly win with
cso is more important than what we lose when considering APIs such as
GL.

Cheers,
Jerome Glisse