Optimising xserver (Xft text rendering improvements)
keithp at keithp.com
Sun Mar 27 09:17:43 PST 2005
Around 9 o'clock on Mar 27, Lars Knoll wrote:
> I had the same thought, but I think working on a whole line of the
> destination might give you better cache performance in most cases (ie. all
> cases where you don't have a transformation on the destination).
Right, but I'm thinking that the general case code is much more likely to
be hit when a filter or transformation is involved, both of which require
non-linear access to the pixel data. For that, and because I suspect
using fixed size patches (8x8, or perhaps 16x16) will reduce register
pressure, I'd like to try this approach.
Using relatively small patches of 8 lines should ensure that the
consecutive lines of source needed all fit in the second level cache
while the patches themselves remain in first level cache.
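
To make the patch idea concrete, here is a minimal sketch in C of walking
the destination in fixed 8x8 patches instead of whole scanlines. It is not
code from the xserver tree: struct image, fetch_fn, fetch_transposed and
composite_patched are all hypothetical names, and the transposing fetch is
just a stand-in for any transformation that reads the source non-linearly.

```c
#include <assert.h>
#include <stdint.h>

#define PATCH_SIZE 8   /* assumed patch edge; 16 would work the same way */

/* Hypothetical image description: 32-bit pixels, stride in pixels. */
struct image {
    const uint32_t *pixels;
    int width, height, stride;
};

/* Hypothetical general-case fetch: returns the source pixel that maps
 * to destination coordinate (x, y). */
typedef uint32_t (*fetch_fn)(const struct image *src, int x, int y);

/* Example transformation: a transpose, so source access is column-wise. */
static uint32_t fetch_transposed(const struct image *src, int x, int y)
{
    assert(x >= 0 && x < src->height && y >= 0 && y < src->width);
    return src->pixels[x * src->stride + y];
}

/* Fill dst one patch at a time. Each patch touches at most PATCH_SIZE
 * rows or columns of the source, and the patch buffer itself is a small
 * fixed-size array. */
static void composite_patched(uint32_t *dst, int width, int height,
                              int stride, fetch_fn fetch,
                              const struct image *src)
{
    uint32_t patch[PATCH_SIZE][PATCH_SIZE];

    assert(stride >= width);
    for (int ty = 0; ty < height; ty += PATCH_SIZE) {
        /* Clip partial patches at the bottom and right edges. */
        int ph = height - ty < PATCH_SIZE ? height - ty : PATCH_SIZE;
        for (int tx = 0; tx < width; tx += PATCH_SIZE) {
            int pw = width - tx < PATCH_SIZE ? width - tx : PATCH_SIZE;

            /* Stage 1: fetch (and transform) into the patch buffer. */
            for (int y = 0; y < ph; y++)
                for (int x = 0; x < pw; x++)
                    patch[y][x] = fetch(src, tx + x, ty + y);

            /* Stage 2: store the finished patch to the destination. */
            for (int y = 0; y < ph; y++)
                for (int x = 0; x < pw; x++)
                    dst[(ty + y) * stride + tx + x] = patch[y][x];
        }
    }
}
```

Splitting each patch into a fetch stage and a store stage is only one way
to arrange it; the point is that the inner loops have constant bounds in
the common case and a working set of PATCH_SIZE * PATCH_SIZE pixels.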
Furthermore, I envision building a patch cache mechanism so that filters
and scaling algorithms can re-use old upstream patches without needing to
recompute them. I think a simple LRU replacement strategy with a
pre-computed cache size will eliminate thrashing entirely (you can compute
up-front the maximum number of patches needed in the cache).
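
A patch cache with LRU replacement could look roughly like the sketch
below. Again this is only an illustration under assumptions: patch_cache,
patch_cache_get, fill_patch_example and the fixed CACHE_SLOTS value are
hypothetical, and a real version would size the cache from the
pre-computed maximum number of patches rather than a compile-time constant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_SIZE  8
#define CACHE_SLOTS 4   /* assumed; would be computed up-front in practice */

typedef struct {
    int      valid;
    int      tx, ty;        /* patch coordinates used as the cache key */
    uint64_t last_used;     /* logical timestamp for LRU ordering */
    uint32_t pixels[PATCH_SIZE * PATCH_SIZE];
} patch_slot;

typedef struct {
    patch_slot slots[CACHE_SLOTS];
    uint64_t   clock;
    unsigned   hits, misses;
} patch_cache;

/* Hypothetical callback that computes one upstream patch on a miss. */
typedef void (*patch_compute_fn)(int tx, int ty, uint32_t *pixels);

/* Example compute function: fills the patch with a value derived from
 * its coordinates so results are easy to check. */
static void fill_patch_example(int tx, int ty, uint32_t *pixels)
{
    for (int i = 0; i < PATCH_SIZE * PATCH_SIZE; i++)
        pixels[i] = (uint32_t)(tx * 256 + ty);
}

static void patch_cache_init(patch_cache *c)
{
    memset(c, 0, sizeof *c);
}

/* Return the pixels for patch (tx, ty), computing them only on a miss.
 * The returned pointer is valid until the next call, which may evict. */
static const uint32_t *patch_cache_get(patch_cache *c, int tx, int ty,
                                       patch_compute_fn compute)
{
    patch_slot *victim = &c->slots[0];

    for (int i = 0; i < CACHE_SLOTS; i++) {
        patch_slot *s = &c->slots[i];

        if (s->valid && s->tx == tx && s->ty == ty) {
            s->last_used = ++c->clock;      /* hit: refresh LRU stamp */
            c->hits++;
            return s->pixels;
        }
        /* Track the replacement candidate: an empty slot if there is
         * one, otherwise the least recently used slot. */
        if (victim->valid &&
            (!s->valid || s->last_used < victim->last_used))
            victim = s;
    }

    c->misses++;
    compute(tx, ty, victim->pixels);
    victim->valid = 1;
    victim->tx = tx;
    victim->ty = ty;
    victim->last_used = ++c->clock;
    return victim->pixels;
}
```

The linear scan is fine for a handful of slots; with a larger cache a
hash on (tx, ty) plus a linked list for the LRU order would be the usual
replacement.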