[Mesa-dev] Mesa (shader-work): glsl: introduce ir_binop_all_equal and ir_binop_any_equal, allow vector cmps

Thu Sep 9 02:35:07 PDT 2010

On Wed, 2010-09-08 at 21:30 -0700, Marek Olšák wrote:
> On Thu, Sep 9, 2010 at 2:35 AM, Eric Anholt <eric at anholt.net> wrote:
>         However, the fact that when I ask about performance nobody
>         says "OMG the texture upload/download/copy/etc. support in
>         gallium is so great and is way faster than anybody managed in
>         classic because it catches all the corner cases" makes me
>         really scared about it.
> 
> OK so there you have it: The texture upload/download/copy is way
> faster in r300g than anyone managed in r300c. For example, the ETQW
> main menu uses some texture streaming and I get 1 fps in r300c and
> about 30 fps in r300g. That's not something one can ignore.
> 
> The transfers in r300g were really simple to implement, it's just
> transfer, map, unmap, transfer (transfer=blit). The code is clean and
> isolated (about 250 LoC). This is just the driver code. There is also
> some additional transfer code in st/mesa, but I didn't have to care
> about that.
> 
> The overall speed of r300g is either at the same level as r300c or
> higher, based on what application you look at. Various users have
> reported that, unlike r300c, all compiz effects just work, kwin works,
> a lot more games work, and the driver is faster in some apps. We used
> to have some performance issues in Tremulous not so long ago, but
> that's been fixed since then. Of course, one can find synthetic tests
> where one driver is always faster than another. I am talking about
> real world applications here. For example, I no longer have to kill
> Strogg in ETQW with the lowest details on my R580 for it to be smooth.
> 
> r300g is quite optimized (I say "quite", because you're never sure),
> so the overhead of other mesa components is larger than other Gallium
> drivers might be able to see. In Tremulous, the overhead of
> r300g+libdrm is under 50% of the whole time spent in Mesa, and that's
> without using Draw, so we should start caring about the speed of
> st/mesa and mesa/main. The only obvious way to get some speed there
> seems to be merging Mesa core with st/mesa, dropping the classic
> driver interface, and simplifying the whole thing. I guess this won't
> happen until everybody moves to Gallium.

There's a lot that can be improved in st/mesa with regard to assembling
the vertex buffers & vertex elements -- very little of this work is
reused between subsequent primitives, but with a bit of analysis and
dirty state tracking I think a good improvement is possible.
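
To make the dirty-tracking idea concrete, here is a rough sketch in plain C.
It is not st/mesa code: every name in it (attrib_desc, vertex_state_cache,
cache_set_attribs, cache_draw) is made up for illustration, and the
"rebuild" is just a counter standing in for the real work of assembling
vertex buffers and vertex elements. The only point is that the expensive
path runs when the layout actually changed, not on every primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRIBS 16

/* Hypothetical per-attribute layout; not a real Mesa or Gallium struct. */
struct attrib_desc {
    uint32_t buffer_id;
    uint32_t offset;
    uint32_t stride;
    uint32_t format;
};

/* Hypothetical cache of the last vertex layout that was built. */
struct vertex_state_cache {
    struct attrib_desc attribs[MAX_ATTRIBS];
    unsigned count;
    bool dirty;        /* layout changed since the last rebuild */
    unsigned rebuilds; /* how often the expensive path ran */
};

static void cache_init(struct vertex_state_cache *c)
{
    memset(c, 0, sizeof(*c));
    c->dirty = true; /* nothing has been built yet */
}

/* Record the layout for the next draw; flag dirty only on a real change. */
static void cache_set_attribs(struct vertex_state_cache *c,
                              const struct attrib_desc *a, unsigned count)
{
    assert(count <= MAX_ATTRIBS);
    if (count == c->count &&
        memcmp(c->attribs, a, count * sizeof(*a)) == 0)
        return; /* same layout as before: keep the built state */
    memcpy(c->attribs, a, count * sizeof(*a));
    c->count = count;
    c->dirty = true;
}

/* Rebuild only when dirty, then "draw". */
static void cache_draw(struct vertex_state_cache *c)
{
    if (c->dirty) {
        c->rebuilds++; /* stand-in for rebuilding buffers and elements */
        c->dirty = false;
    }
    /* ... the draw call itself would be issued here ... */
}
```

Setting the same layout twice and drawing twice rebuilds once; changing
any offset, stride or format triggers exactly one more rebuild.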

st/mesa just hasn't had a lot of love - it got to a working state fairly
early, but has been largely neglected since.  

The fact it's a bit ugly is one factor that puts people off, but I guess
it's also the one part of gallium which doesn't benefit from gallium --
ie. you still have to deal with all the overlapping & conflicting GL
semantics that made things so confusing in the first place.

> It's a sure thing Gallium is the future, and it doesn't seem to be a
> good idea to implement e.g. LLVM-powered vertex shaders for classic,
> considering the same thing has already been implemented and now stable
> in Gallium.
> 
> The only disadvantage of Gallium seems to be TGSI. It's not better
> than Mesa IR, because all shaders must pass through Mesa IR anyway. I
> suppose using GLSL IR or something similar would help some drivers
> produce more optimized shaders (I am getting at temporary arrays here,
> which r300 hardware partially supports).

I'm really open to improving (or replacing) TGSI & I agree that passing
some sort of semantically rich IR would be a big win - especially now
there is a source for such a thing.

There's a reasonably painless path to getting there, I think, where we
could start off passing the new IR but have drivers initially slot in a
helper to drop it down to TGSI - ie push the IR->TGSI translation down
into each driver & then eliminate it piecewise.
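
The shape of that transition could look something like the sketch below.
This is only an illustration in plain C with invented names (rich_ir,
legacy_tokens, helper_lower_ir and the two create_shader functions are all
hypothetical, and legacy_tokens merely stands in for TGSI); none of it is
an existing Gallium interface. Both drivers receive the new IR. One calls
a shared helper to drop down to the old token form, the other has been
converted and consumes the IR directly.

```c
#include <assert.h>

/* Hypothetical stand-ins for the new IR and for TGSI. */
struct rich_ir       { int num_instructions; };
struct legacy_tokens { int num_tokens; };

enum compile_path { PATH_VIA_LEGACY, PATH_NATIVE_IR };

struct compiled_shader {
    enum compile_path path; /* which route the driver took */
    int num_ops;            /* placeholder for real compiled output */
};

/* Shared helper an unconverted driver can slot in. Placeholder lowering:
 * one token per instruction. */
static struct legacy_tokens helper_lower_ir(const struct rich_ir *ir)
{
    struct legacy_tokens t;
    t.num_tokens = ir->num_instructions;
    return t;
}

/* Unconverted driver: takes the new IR, lowers it with the helper, then
 * runs its existing token-based backend. */
static struct compiled_shader old_driver_create_shader(const struct rich_ir *ir)
{
    struct legacy_tokens t = helper_lower_ir(ir);
    struct compiled_shader s = { PATH_VIA_LEGACY, t.num_tokens };
    return s;
}

/* Converted driver: same entry point, no lowering step. */
static struct compiled_shader new_driver_create_shader(const struct rich_ir *ir)
{
    struct compiled_shader s = { PATH_NATIVE_IR, ir->num_instructions };
    return s;
}
```

Since the entry point has the same signature in both cases, the state
tracker would not need to know which drivers have dropped the helper.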

I've just been incredibly swamped the last month or so, so I haven't had
much chance to follow up with a plan to get there, but the steps to do
this seem fairly clear.

Probably the first thing would be to define an IR which is a derivative
of the new Mesa IR, without any dependencies on Mesa concepts - ie.
which talks about constant buffers, etc, instead of GL parameters.  I
haven't had even the beginning of an opportunity to see how hard this
would be.

Keith