LibXft : xftglyphcore woes
chl at clerew.man.ac.uk
Wed Nov 26 09:59:21 PST 2008
XftGlyphCore can waste a lot of time if asked to write glyphs outside of
the Drawable.
There is one major application (Opera/QT) that hits this problem in Spades.
I propose a patch to fix it.
These remarks relate to the version of xftcore.c identified by
* $Id: xftcore.c,v 1.4 2005/07/03 07:00:57 daniels Exp $
My system is (via uname -a)
SunOS clerew 5.10 Generic_118822-25 sun4u sparc SUNW,Ultra-2
Essentially, XftGlyphCore is provided with a Drawable (via XftDraw), a
bunch of glyphs, and a point (x,y) at which they are to be drawn. When
used in anti-aliasing mode (when the XftDraw specifies TrueColor), it can
consume a lot of resources (including at least one round-trip to the
Xserver from a call of XGetImage). Even if the area where the glyphs are
to be drawn does not intersect with any area within the drawable, it still
goes through the whole process of drawing the glyphs, right through to
obtaining the (supposed) existing background, calculating how the glyphs
are to be merged with it, and writing the results to a (supposed) XImage
which is never used.
In particular, the call of XGetImage fails, and it reverts to using its
'use_pixmap' mode for the next few calls. This involves a call of
XCreatePixmap to create a pixmap the size of the bunch of glyphs, a call
of XCreateGC and of XCopyArea to populate it with the (supposed)
background, and a further call of XGetImage to get that background in
XImage form (that's three Xserver round-trips).
It then writes those glyphs to the XImage, and uses XPutImage to write
them back to the Drawable (which then discovers there is no intersection,
and so ignores them). No harm is done; nothing breaks; but if you do it
often it consumes huge resources.
But why, you may ask, should anybody in his Right Mind call XftGlyphCore
to write glyphs that are not even inside the Drawable? A Good Question,
indeed, but sadly there is one major application that does it all the
time :-( .
The Opera Web Browser is written on top of the QT Toolkit, which in turn
is written on top of LibXft. It includes a feature for reading and
composing emails, and hence contains a text editor (also used when filling
in Web Forms). I had long been aware that it had started to consume vast
resources when composing large emails (or replying to large emails), and a
long moan on opera.os.solaris had produced zilch response. So in
desperation I set out to discover what was happening.
The first observation was that it only happened on one of the two screens
on my machine (the one behind the fancy Creator Graphics card). After much
poking around with truss and mdb, I discovered where the machine cycles
were going to and, after downloading the source code of LibXft, I saw that
the problem was related to the use of 24bit color plus TrueColor (my other
screen uses 8bit color plus PseudoColor). Note that, up until that time, I
had never even heard of LibXft, or of the Render extensions, or of
anti-aliasing (thanks to Wikipedia for explaining that). Though I must
confess that, to those of us whose accommodation is long gone and who have
to sit at a very precise distance from the screen to see it all in focus,
anti-aliasing does indeed give quite an improvement. So I had a very steep
learning curve to follow :-( .
Anyway, I eventually pieced together what Opera plus QT was actually
doing, so here it is (it is not a pretty story, and I have yet to discover
whether it is an Opera problem or a QT problem).
Opera keeps a record of all the "word"s written to the editing window (a
"word" is essentially a sequence of alphanumeric characters - any other
character seems to be treated as a word of its own). Such words are used
in calls of XftDrawString16, which duly calls XftGlyphCore. Each time you
type a character (or use an arrow key, or delete a character) it discovers
which bit of the window it needs to redraw, and constructs a brand new
Pixmap of that size and prefills it with the supposed background of the
window at that place (which, in practice, is always just pure white
pixels). So now it needs to copy the required glyphs to that Pixmap
(XftDrawString16), and when that is done it copies the Pixmap back to the
original Window using XCopyArea, and then it throws the Pixmap away. A bit
long-winded you might think, but You Ain't Seen Nothin' Yet.
For, to do this, it needs to know which glyphs are to be written into this
(usually small) Pixmap. You might think that was a straightforward task,
but No! It systematically goes through the WHOLE WINDOW, rewriting All the
"words" known to be in it to that small Pixmap, whether they belong there
or not. Most of them don't, of course! So, if your window is full of text,
and you type some characters in at a reasonable typing speed, you can then
sit and watch for several seconds while they all gradually appear (cursor
movements and backspaces included) one-by-one. Not a pleasant way to
construct your emails :-( .
But there is worse to come! Being an editing window, it naturally contains
a cursor (this is the point-of-insertion cursor, not the mouse cursor).
And this cursor blinks - 1/2 second on, 1/2 second off. Now it has the
good sense not to use XftDrawString16 to draw the cursor, BUT it does regard
the cursor as part of the background, and so whatever glyph there might be
at that point has to be re-anti-aliased. You can see what is coming ...
Twice every second, it has to redraw every "word" in the window, on the
off chance that it overlaps the 2x15 Pixmap where the cursor is ........
OK, time for some numbers. The worst case is when the window contains
"words" of 1 character each, so I wrote a window containing alternate 'x'
and SP - that's 1700 'x's altogether, and observed the CPU load involved
just to keep that cursor blinking.
Now my machine has two processors of 300MHz each (there are faster machines
around, but that is still quite some computing power), and of those two
XSun was using 32.7% - call it 65% of one processor
Opera was using 26.0% - call it 52% of the other processor
just to keep the cursor blinking. After applying the Patch which I shall
describe, that reduced to
XSun was using 0.3%
Opera was using 3.5%
and now I can compose my emails in peace again.
But what an incredibly Stupid way to program an application! Yes, I shall
be moaning again to the Opera (or QT) people, but in the meantime I think
LibXft needs to be made proof against such stupidities, because stupid
applications are still going to happen.
I have attached my Patch. It essentially does three things:
The macro XftIntMult is modified to optimize the case where the background
is pure white or the glyph color is opaque. This was an early mod I made,
and though possibly useful is not essential.
_XftSmoothGlyphGray8888 is modified so that it only draws the part of the
glyph(s) that intersect with the XImage of the Drawable (which is always a
Pixmap in the Opera case). Without this, there is now a danger of writing
over unallocated storage.
XftGlyphCore now uses XGetGeometry to discover the size of the Drawable
(caching it in a static variable to save Xserver round-trips). Then it
determines the intersection with the glyphs to be drawn, bailing out if
the intersection is empty. Finally, it draws whatever portion of the
glyphs lies within the intersection. It also, for good measure, checks the
intersection and bails out in the same way when sharp glyphs are used.
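For readers without the attachment, the clipping idea in the second and
third items can be sketched roughly as follows. This is NOT the attached
patch and uses no LibXft code: the Rect type and the functions
rect_intersect and draw_clipped are hypothetical names made up for
illustration, and the "glyph" is just an 8-bit coverage bitmap copied into
a plain byte buffer standing in for the XImage.

```c
/* Hypothetical rectangle type (illustration only, not a LibXft type). */
typedef struct { int x, y, width, height; } Rect;

/* Intersect a and b. Returns 1 and fills *out if the overlap is
 * non-empty, otherwise returns 0 and leaves *out untouched. */
static int rect_intersect(const Rect *a, const Rect *b, Rect *out)
{
    int x1 = a->x > b->x ? a->x : b->x;
    int y1 = a->y > b->y ? a->y : b->y;
    int ax2 = a->x + a->width,  bx2 = b->x + b->width;
    int ay2 = a->y + a->height, by2 = b->y + b->height;
    int x2 = ax2 < bx2 ? ax2 : bx2;
    int y2 = ay2 < by2 ? ay2 : by2;

    if (x2 <= x1 || y2 <= y1)
        return 0;
    out->x = x1;
    out->y = y1;
    out->width = x2 - x1;
    out->height = y2 - y1;
    return 1;
}

/* Copy a gw x gh coverage bitmap, positioned at (gx, gy), into a
 * dw x dh destination buffer, touching only the overlapping pixels.
 * Returns the number of pixels written (0 means "bailed out early"). */
static int draw_clipped(unsigned char *dst, int dw, int dh,
                        const unsigned char *glyph,
                        int gx, int gy, int gw, int gh)
{
    Rect d = { 0, 0, dw, dh };
    Rect g = { gx, gy, gw, gh };
    Rect c;
    int x, y, n = 0;

    if (!rect_intersect(&d, &g, &c))
        return 0;               /* nothing visible: skip all the work */

    for (y = c.y; y < c.y + c.height; y++)
        for (x = c.x; x < c.x + c.width; x++) {
            /* Index the glyph relative to its own origin, so no read
             * or write ever falls outside either buffer. */
            dst[y * dw + x] = glyph[(y - gy) * gw + (x - gx)];
            n++;
        }
    return n;
}
```

With a 4x4 destination and a 3x3 glyph, placing the glyph at (10,10)
returns 0 without touching the buffer, while placing it at (-1,-1) writes
only the four pixels that actually overlap. In the real code the early
return would come before any XGetImage or XCreatePixmap call.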
Of course, this all causes some extra overhead in cases where all the
glyphs do lie within the Drawable, but not too much of it AFAICS.
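As for the XftIntMult change (the first of the three items), the kind of
shortcut I mean can be sketched like this. The macro below is quoted from
memory and should be checked against your own copy of xftcore.c; the
functions int_mult_fast and count_mismatches are hypothetical names for
illustration and are not what the patch contains. The assumption is that
the macro computes a rounded (a * b) / 255 for 8-bit operands, so that an
operand of 0xff (pure white, or fully opaque) leaves the other operand
unchanged.

```c
/* Rounded (a * b) / 255 for 8-bit values, as I recall the macro. */
#define XftIntMult(a,b,t) \
    ((t) = (a) * (b) + 0x80, ((((t) >> 8) + (t)) >> 8))

/* Hypothetical wrapper: skip the arithmetic when one operand is 0xff
 * or 0, since the result is then known in advance. */
static unsigned int int_mult_fast(unsigned int a, unsigned int b)
{
    unsigned int t;

    if (a == 0xff)
        return b;
    if (b == 0xff)
        return a;
    if (a == 0 || b == 0)
        return 0;
    return XftIntMult(a, b, t);
}

/* Compare the shortcut against the plain macro for every pair of
 * 8-bit operands; returns the number of disagreements found. */
static int count_mismatches(void)
{
    unsigned int a, b, t;
    int bad = 0;

    for (a = 0; a <= 0xff; a++)
        for (b = 0; b <= 0xff; b++)
            if (int_mult_fast(a, b) != XftIntMult(a, b, t))
                bad++;
    return bad;
}
```

Whether the extra tests pay for themselves depends on how often the
operands really are 0xff or 0, which is why I regard this part as
optional.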
Note that, if this patch gets adopted, it will probably be necessary to
apply similar treatment to XftGlyphSpecCore and to the other
_XftSmoothGlyph*, and I would be happy to work on that if needed (though I
am not sure I could test them). But what I have done so far is sufficient
for my present need, and for proof of concept.
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131
Email: chl at clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 8008 bytes
Desc: not available