client-side font rendering very very slow in X.org xserver 1.5.3 w/r200: massive fetches from VRAM, why?
Nix
nix at esperi.org.uk
Fri Jan 30 13:59:45 PST 2009
On 30 Jan 2009, Michel Dänzer stated:
>Trying current xf86-video-ati Git might be good, but my main suggestion
>would be to try xserver Git server-1.6-branch with EXA.
OK. Do I need to upgrade Mesa or anything related at the same time?
(I'm currently on libdrm 2.4.1, Mesa a few commits past 7.2.0).
>> Both Konsole 3.5.9 (without a background pixmap) and a recent xterm are
>> equally sluggish;
>
> xterm uses core fonts by default, did you configure it differently?
I just used -fa/-fs to force client-side fonts.
>> X: fbFetch_a1 10.92
>> dixLookupPrivate 7.04
>> fbStore_a1 3.70
>> mmxCombineAddU 3.06
>> pixman_image_composite 2.58
>
> [...]
>
>> So it looks like we *are* doing huge numbers of fetches from VRAM,
>> judging by the massive time spent in calls upon pixman's fbFetch().
>
> No, at least with EXA, fb*_a1 can't access video RAM directly, as the
> EXA core currently never migrates pixmaps of bpp < 8 to video RAM.
This was a profile with XAA, not EXA. Here's a more comprehensive set of
results (using xterm at all times, 'non-AA' forced by using my preferred
terminal font, the bitmap font Neep Alt), started and stopped by hand so
it's all quite crude, roughly 60s per benchmark run (so I can't explain
sysprof's saying that some runs spent 90s in X: on a single core that
seems quite unlikely):
XAA, 16, AA: fbCompositeSolidMask_nx8888x0565Cmmx 59.25 (time in X, 90.12)
XAA, 24, AA: fbCompositeSolidMask_nx8888x8888Cmmx 57.12 (time in X, 85.03)
EXA, 16, AA: dixLookupPrivate 23.17
    visually much faster than XAA, occasionally degrades to
    XAA speed
EXA, 24, AA: dixLookupPrivate 26.83
XAA, 16, non-AA: fbFetch_a1 12.43 (time in X, 79.96)
XAA, 24, non-AA: fbFetch_a1 14.51 (time in X, 87.60) (v. slow in konsole,
    very slightly faster-seeming in xterm but profile results
    identical so this is just an artifact of differing repaint
    strategies)
EXA, 16, non-AA: fbFetch_r5g6b5 53.40, fbFetch_a1 5.75 (time in X, 95.88)
    horrendously, impossibly slow, >10s for a single screen repaint
EXA, 24, non-AA: fbFetch_a1 12.40 (89.34s in X)
    much better than the abominable depth 16 results, back to
    XAA speed
XAA, 16, core: cat, bash, xterm; CPU load nearly nil; screen a blur far too
    fast to read; highest consumer in X, at <1s,
    DrawTETextScanlineWidth7()
XAA, 24, core: cat, bash, xterm; CPU load nearly nil; screen a blur far
    too fast to read; highest consumer in X, at <1s,
    DrawTETextScanlineWidth7()
EXA, 16, core: pixman_fill_mmx 22.17, fbGlyph16 15.83, CPU still pegged by X,
    in sharp contrast to non-EXA; (time in X, 62.58)
EXA, 24, core: pixman_fill_mmx 37.51, fbGlyph32 13.63 (time in X, 69.48)
    better than anything else bar core XAA
In general, core fonts much faster than client-side fonts, 24-bit as
fast or faster than 16-bit (this has changed in the eight or so years
since I last paid any attention to it, and DRI no longer stops working
in 24-bit mode: maybe I'll switch), XAA faster than EXA with the single
exception of anti-aliased fonts, which I don't use in terminals and
text editors because I like my text small enough that antialiasing is
uglier than not.
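
For anyone who wants to compare the runs side by side, the figures quoted above can be collected into a small script. This is only a rough sketch: the tuple layout and the function names (`ranked_by_time_in_x`, `format_row`) are hypothetical, the numbers are copied from the list above with whatever unit sysprof reported, and runs for which no "time in X" figure was given are stored as `None` and left out of the ranking.

```python
# Figures transcribed from the benchmark list in this message.
# Fields: (acceleration, depth, font mode, top symbol, its figure, time in X).
# None means the message gives no exact figure for that run.
RESULTS = [
    ("XAA", 16, "AA", "fbCompositeSolidMask_nx8888x0565Cmmx", 59.25, 90.12),
    ("XAA", 24, "AA", "fbCompositeSolidMask_nx8888x8888Cmmx", 57.12, 85.03),
    ("EXA", 16, "AA", "dixLookupPrivate", 23.17, None),
    ("EXA", 24, "AA", "dixLookupPrivate", 26.83, None),
    ("XAA", 16, "non-AA", "fbFetch_a1", 12.43, 79.96),
    ("XAA", 24, "non-AA", "fbFetch_a1", 14.51, 87.60),
    ("EXA", 16, "non-AA", "fbFetch_r5g6b5", 53.40, 95.88),
    ("EXA", 24, "non-AA", "fbFetch_a1", 12.40, 89.34),
    # Core fonts under XAA: only "<1s" is reported, so no exact number here.
    ("XAA", 16, "core", "DrawTETextScanlineWidth7", None, None),
    ("XAA", 24, "core", "DrawTETextScanlineWidth7", None, None),
    ("EXA", 16, "core", "pixman_fill_mmx", 22.17, 62.58),
    ("EXA", 24, "core", "pixman_fill_mmx", 37.51, 69.48),
]


def ranked_by_time_in_x(results):
    """Return the runs that report a time-in-X figure, heaviest first."""
    reported = [run for run in results if run[5] is not None]
    return sorted(reported, key=lambda run: run[5], reverse=True)


def format_row(run):
    """Format one run as a fixed-width line for eyeballing."""
    accel, depth, mode, symbol, figure, in_x = run
    return f"{accel} {depth:2d} {mode:<6} {in_x:6.2f}  {symbol} ({figure})"


if __name__ == "__main__":
    for run in ranked_by_time_in_x(RESULTS):
        print(format_row(run))
```

Sorting on the "time in X" column puts EXA at depth 16 with non-AA fonts at the top and the EXA core-font runs at the bottom of the runs that have a figure at all; the XAA core-font runs do not appear because only "<1s" was recorded for them.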
I must say, looking at these crude benchmark results I'm wondering if
this client-side font thing wasn't an appealing diversion. Yes, they're
pretty, and more flexible than core fonts: but all of a sudden simply
redrawing the screen has become so CPU-intensive that a screen
scroller can peg the CPU without any real effort :( isn't X supposed to
use *less* CPU time than the apps that call on it? :(((
> To avoid a1 pictures, you could try using anti-aliasing everywhere, i.e.
> don't choose any bitmap fonts and don't disable anti-aliasing for small
> font sizes.
The benchmarks show that this would indeed speed things up. It would
also eliminate every font I use day-to-day and give me piercing
headaches. No thanks, let's find another way. :)
>> (++) RADEON(0): Depth 16, (--) framebuffer bpp 16
>> (II) RADEON(0): Pixel depth = 16 bits stored in 2 bytes (16 bpp
>> pixmaps)
>
> Is it any better in depth 24, or even worse?
(See above.)
Better under EXA: sometimes better, sometimes worse under XAA (better
for antialiased fonts only, all others worse, or too fast to tell in
the case of core fonts).