[xorg-bugzilla-noise] [Bug 839] New: Speeding up render with gcc
3.4 and MMX intrinsics
bugzilla-daemon at pdx.freedesktop.org
Thu Jul 8 09:45:22 PDT 2004
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.
http://freedesktop.org/bugzilla/show_bug.cgi?id=839
Summary: Speeding up render with gcc 3.4 and MMX intrinsics
Product: xorg
Version: CVS_head
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Server/general
AssignedTo: xorg-bugzilla-noise at freedesktop.org
ReportedBy: sandmann at daimi.au.dk
Attached is a patch that speeds up some operations in the Render
extension by using gcc 3.4 MMX intrinsics. A benchmark rendering a
paragraph of component alpha text to a pixmap gave these results on a
1200 MHz laptop with an i830 chip running Fedora Core 1:
Unmodified X server and the pixmap in system RAM:
[ssp at localhost x]$ ./a.out
total time: 41.394618
average rect time: 0.683200
worst rect: 9
average glyph time: 3.550500
With the MMX optimizations:
[ssp at localhost x]$ ./a.out
total time: 22.972553
average rect time: 0.677900
worst rect: 9
average glyph time: 1.692000
I.e., text rendering is more than twice as fast. The 'average glyph
time' here is the time it takes to render the entire paragraph of
text.
With the pixmap in video RAM, the speedup is not quite as
spectacular:
Unmodified X server:
[ssp at localhost x]$ ./a.out
total time: 95.900768
average rect time: 0.003300
worst rect: 1
average glyph time: 9.693500
With MMXified compositing:
total time: 66.559287
average rect time: 0.015100
worst rect: 6
average glyph time: 6.720500
This is still a nice improvement. The patch includes improved code for
these cases:
Subpixel text:
- (constant color) in (component alpha mask) over 565 destination
- (constant color) in (component alpha mask) over 32bit destination
- (32 bit component alpha) Saturate (32 bit destination)
Regular antialiased text:
- (8 bit alpha) Saturate (8 bit destination)
- (constant color) in (8 bit alpha mask) over 565 destination
- (constant color) in (8 bit alpha mask) over 32bit destination
GdkPixbuf:
- (reversed, non-premultiplied source) over 32bit destination
- (reversed, non-premultiplied source) over 565 destination
Alpha rectangle (e.g., Nautilus selection rectangle):
- (constant color) over 32bit destination
- (constant color) over 565 destination
Solid fill:
- solid fill of 32 bit drawable
- solid fill of 16 bit drawable
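For reference, the per-pixel arithmetic behind the "(constant color) in
(8 bit alpha mask) over 32bit destination" case can be sketched in plain
scalar C as below. This is only an illustration, not code from the
patch: the helper names (mul_un8, scale_argb, in_over, rgb565_to_argb,
argb_to_rgb565) are hypothetical, pixels are assumed to be premultiplied
ARGB32, and the MMX version would do the same work on all four channels
at once.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding.
 * Computes round(a * b / 255) without a division. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Scale each of the four channels of a packed ARGB32 pixel by m / 255. */
static uint32_t scale_argb(uint32_t p, uint8_t m)
{
    uint32_t a = mul_un8((uint8_t)(p >> 24), m);
    uint32_t r = mul_un8((uint8_t)(p >> 16), m);
    uint32_t g = mul_un8((uint8_t)(p >> 8), m);
    uint32_t b = mul_un8((uint8_t)p, m);
    return (a << 24) | (r << 16) | (g << 8) | b;
}

/* (src IN mask) OVER dst. Assumes src and dst are premultiplied, so the
 * channel-wise sum below cannot exceed 255. */
static uint32_t in_over(uint32_t src, uint8_t mask, uint32_t dst)
{
    uint32_t s = scale_argb(src, mask);
    uint8_t inv_alpha = (uint8_t)(255 - (s >> 24));
    return s + scale_argb(dst, inv_alpha);
}

/* Expand a 565 pixel to opaque ARGB32, replicating the high bits into
 * the low bits so that full intensity maps to 255. */
static uint32_t rgb565_to_argb(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f, g = (p >> 5) & 0x3f, b = p & 0x1f;
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* Pack ARGB32 to 565 by keeping the top bits of each color channel. */
static uint16_t argb_to_rgb565(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) | ((p >> 5) & 0x07e0) |
                      ((p >> 3) & 0x001f));
}
```

A 565 destination can be handled in this sketch by converting with
rgb565_to_argb, compositing with in_over, and packing the result again
with argb_to_rgb565.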
The code can optionally be compiled to use the pshufw instruction, which
is only available on Pentium III and later processors.
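As an illustration of where pshufw helps, here is a minimal sketch of
broadcasting the alpha value of a pixel into all four 16-bit lanes,
which is a common step before a per-channel multiply. It is not taken
from the patch. The lane layout (alpha in the top 16-bit lane of an
unpacked pixel) and the names expand_alpha_ref, expand_alpha and
HAVE_PSHUFW are assumptions made for the example. The intrinsic
_mm_shuffle_pi16 from xmmintrin.h maps to pshufw; when SSE is not
enabled at compile time the sketch falls back to the portable version.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) && (defined(__x86_64__) || defined(__i386__))
#include <xmmintrin.h>          /* _mm_shuffle_pi16, _MM_SHUFFLE, _mm_empty */
#define HAVE_PSHUFW 1           /* hypothetical switch for this sketch */
#endif

/* Portable reference: copy the top 16-bit lane into all four lanes. */
static uint64_t expand_alpha_ref(uint64_t px)
{
    uint64_t a = px >> 48;
    return a | (a << 16) | (a << 32) | (a << 48);
}

/* Same operation, using a single pshufw when it is available. */
static uint64_t expand_alpha(uint64_t px)
{
#ifdef HAVE_PSHUFW
    __m64 v, r;
    uint64_t out;

    memcpy(&v, &px, sizeof v);
    /* Select lane 3 as the source for every destination lane. */
    r = _mm_shuffle_pi16(v, _MM_SHUFFLE(3, 3, 3, 3));
    memcpy(&out, &r, sizeof out);
    _mm_empty();                /* leave MMX state so x87 code keeps working */
    return out;
#else
    return expand_alpha_ref(px);
#endif
}
```

Without pshufw the same broadcast needs several shift, unpack or OR
instructions, which is presumably why the faster path is optional.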
One question: the patch has a bad hack where it redefines
DefaultCCOptions for all of the framebuffer code. How should this be
done properly? The problem with the existing DefaultCCOptions is that
they include -pedantic, which doesn't work with the MMX intrinsics.
Attaching the patch, two new files and the benchmark.
--
Configure bugmail: http://freedesktop.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.