Thought on cworth's latest EXA blog entry
otaylor at redhat.com
Fri Jul 13 16:05:25 PDT 2007
[ Re: http://cworth.org/exa/i965/emulating_speedups/ ]
The overriding question to me, looking at your blog entry, is
"how do we make these tests more bound by drawing?" That is,
with your simulation, you have a time division like:
NoAccel: 3 units rendering, 7 units overhead
EXA: 2 units rendering, 10 units overhead
(The numbers are hypothetical, exaggerated a bit from the real ones.)
The discussion in your blog is how to:
A) Reduce the 2 units rendering to 0
B) Reduce the 10 units overhead to 7
But you are basically fighting Amdahl's law there ... even if you
manage both, EXA ends up at 0 + 7 = 7 units against NoAccel's
3 + 7 = 10, so the fastest you can get is going to be 30% better
than the NoAccel case.
It seems interesting to me to look at how you can make the
*NoAccel* case something like:
3 units rendering, 1 unit overhead
Then the ability to accelerate the EXA case is much more significant -
with rendering reduced to 0 you go from 4 units to 1, so you can get
a 75% win, not a 30% win.
(Obviously, this neglects the other Amdahl's law problems of cairo
vs. Xserver... and Gecko vs cairo)
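
To make the arithmetic above concrete, here is a small sketch. It
assumes the idealized case where acceleration drives rendering time to
zero and leaves the overhead at the NoAccel level; the function name
is made up for illustration and the unit counts are the hypothetical
ones from this mail, not measurements.

```python
def best_case_win(rendering, overhead):
    """Fraction of NoAccel time saved if rendering drops to zero
    while the overhead stays exactly where it is (idealized)."""
    total = rendering + overhead
    return rendering / total

# Hypothetical unit counts from the discussion above.
print(best_case_win(3, 7))  # 0.3
print(best_case_win(3, 1))  # 0.75
```

The bound depends only on the share of time spent rendering, which is
why shrinking the overhead raises the ceiling on what acceleration can
buy.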
Reducing the overhead may not be easy (*), but one thing
to look at might be how many glyphs at a time Gecko is calling cairo
with ... if the Gecko gfx code is rendering, say, lots of tiny little 2-3
character strings, that might cause significant inefficiency
throughout the whole stack and limit the ability to take advantage
of hardware optimization.
(To preserve the cairo rendering model, you basically have to coalesce
the strings within Gecko, but I think it's legitimate to say "this
is inherently inefficient, here's how to make use of cairo better")
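
As a rough illustration of why many tiny strings could hurt, here is a
toy cost model. It assumes every text-drawing call pays a fixed cost
on its way down the stack plus a cost per glyph; the function name and
both cost constants are invented for the sketch and do not come from
any profile of Gecko, cairo, or the X server.

```python
def total_cost(glyphs_per_call, per_call=5.0, per_glyph=1.0):
    """Total cost of a series of text-drawing calls under a
    fixed-cost-per-call plus cost-per-glyph model (made-up units)."""
    return sum(per_call + per_glyph * n for n in glyphs_per_call)

many_small = [3] * 100   # 100 calls of 3 glyphs each
one_big = [300]          # the same 300 glyphs in a single call

print(total_cost(many_small))  # 800.0
print(total_cost(one_big))     # 305.0
```

Under these assumed constants the per-call cost dominates when the
strings are short, and that is the part coalescing would remove.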
(*) I can't begin to understand how vmlinux is eating up so much time
for NoAccel unless something is going horribly wrong at the X<=>network
layer, or at the framebuffer rendering layer.