Bad 2D performance with xf86-video-ati driver on RV350 AR [Radeon 9600]

Patrick Haller patrick.haller at haller-consult.com
Sun Dec 21 13:41:06 PST 2008


Hello everybody,

It's been quite a while since I last participated in F/OSS development,
but I wanted to take the plunge into xorg-server (or Clemens' great
jxrender) over the Christmas vacation.

While following how EXA performance has improved, especially for
'x11perf -aa10text', I wondered why others got great results of 600,000
chars/sec and more while my setup was stuck at ~55,000 chars/sec.

I observed that with my setup a lot of EXA migrations took place when I
ran in dual head mode (2560x1024). In single head mode (1280x1024)
virtually no migrations took place. Even an awkward 2559x1024 setup
caused no migrations and performed well.

My Setup:
- xorg-server head (1.6.99.1), xf86-video-ati head
- 01:00.0 ATI Technologies Inc RV350 AR [Radeon 9600]
- 01:00.1 ATI Technologies Inc RV350 AR [Radeon 9600] (Secondary)

I found that R300CheckComposite() in radeon_exa_render.c checks the
chipset limits:

     if (pDstPixmap->drawable.width >= max_dst_w ||
         pDstPixmap->drawable.height >= max_dst_h) {
         RADEON_FALLBACK(("Dest w/h too large (%d,%d).\n",
                          pDstPixmap->drawable.width,
                          pDstPixmap->drawable.height));
     }

Could this possibly be an off-by-one error? I changed this (and the
similar checks nearby) to

     if (pDstPixmap->drawable.width > max_dst_w ||
         pDstPixmap->drawable.height > max_dst_h) { ... }
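
To make the boundary case concrete, here is a minimal standalone sketch (not driver code) of the two comparisons. The names MAX_DST_W, MAX_DST_H, falls_back_old(), falls_back_new() and check_boundary_cases() are made up for illustration, and max_dst_w = 2560 is an assumption; only max_dst_h = 2560 is visible in the patch context. It shows only how the comparisons behave, not whether the hardware can really handle a destination exactly as large as the limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limits for illustration only: the patch context shows
 * max_dst_h = 2560; max_dst_w = 2560 is an assumption here. */
enum { MAX_DST_W = 2560, MAX_DST_H = 2560 };

/* Original check: a pixmap exactly at the limit triggers a fallback. */
static bool falls_back_old(int w, int h)
{
    return w >= MAX_DST_W || h >= MAX_DST_H;
}

/* Changed check: only a pixmap beyond the limit triggers a fallback. */
static bool falls_back_new(int w, int h)
{
    return w > MAX_DST_W || h > MAX_DST_H;
}

/* Walks through the screen widths mentioned above. */
void check_boundary_cases(void)
{
    assert(!falls_back_old(1280, 1024)); /* single head: accelerated */
    assert(!falls_back_old(2559, 1024)); /* one pixel narrower: accelerated */
    assert(falls_back_old(2560, 1024));  /* dual head: falls back */
    assert(!falls_back_new(2560, 1024)); /* dual head with the change */
    assert(falls_back_new(2561, 1024));  /* still rejected beyond the limit */
}
```

Under these assumptions, 2559x1024 passes both checks while 2560x1024 passes only the changed one, which matches the migration behaviour described above.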

So far my tests looked very good (no failures during testing):

 1280x1024 2560x1024    fixed x11perf test (rates in chars/sec)
 --------- --------- -------- ----------------------------------
    870000     56700   953000 Char in 80-char a line (Charter 10)
   1000000     57200  1110000 Char in 80-char aa line (Charter 10)
    871000     56700   954000 Char in 80-char rgb line (Charter 10)

    534000     15500   567000 Char in 30-char a line (Charter 24)
    595000     15500   630000 Char in 30-char aa line (Charter 24)
    535000     15500   566000 Char in 30-char rgb line (Charter 24)

    854000     45600   936000 Char in 80-char a line (Courier 12)
    849000     45600   936000 Char in 80-char aa line (Courier 12)
    853000     45600   937000 Char in 80-char rgb line (Courier 12)
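
The size of the change is easier to see as a ratio. The following is a small sketch that computes it from some of the rows above. It assumes the "fixed" column was measured in the same 2560x1024 dual head setup as the middle column; struct row, rows[] and speedup() are names invented for this illustration.

```c
#include <assert.h>

/* One table row: x11perf rates in chars/sec, copied from the table. */
struct row {
    const char *test;
    double dual_head; /* 2560x1024 before the change */
    double fixed;     /* after the change (assumed to be 2560x1024 too) */
};

static const struct row rows[] = {
    { "80-char a line (Charter 10)",   56700.0,  953000.0 },
    { "80-char aa line (Charter 10)",  57200.0, 1110000.0 },
    { "30-char aa line (Charter 24)",  15500.0,  630000.0 },
    { "80-char aa line (Courier 12)",  45600.0,  936000.0 },
};

/* Ratio of the rate after the change to the rate before it. */
static double speedup(const struct row *r)
{
    assert(r->dual_head > 0.0); /* guard against dividing by zero */
    return r->fixed / r->dual_head;
}
```

For the aa10text row (rows[1]) this gives 1110000 / 57200, a bit over 19 times the previous dual head rate.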

Other benchmarks such as gtkperf also showed significant improvement.

I'm neither familiar with the r300 hardware nor the Xorg software.
I tried to create a nice git patch (that's a first, too).
So could somebody kindly check and review this?


Thanks + Merry X-Mas,
Patrick

-------------- next part --------------
From d5719514dd2eed4ecef9e4c8e8efe3724728cf65 Mon Sep 17 00:00:00 2001
From: Patrick Haller <patrick.haller at haller-consult.com>
Date: Sun, 21 Dec 2008 22:30:51 +0100
Subject: [PATCH] Off-by-one when checking chipset limits caused exa migrations.

Off-by-one when checking chipset limits caused exa migrations. Changing the comparisons
improves the performance of x11perf -aa10text substantially if the chipset limits
exactly match the virtual screen size.
---
 src/radeon_exa_render.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)
 mode change 100644 => 100755 src/radeon_exa_render.c

diff --git a/src/radeon_exa_render.c b/src/radeon_exa_render.c
index 55e55be..a922915
--- a/src/radeon_exa_render.c
+++ b/src/radeon_exa_render.c
@@ -531,7 +531,7 @@ static Bool FUNC_NAME(R100PrepareComposite)(int op,
 	return FALSE;
 
     if (pDstPicture->format == PICT_a8 && RadeonBlendOp[op].dst_alpha)
-        RADEON_FALLBACK("Can't dst alpha blend A8\n");
+        RADEON_FALLBACK(("Can't dst alpha blend A8\n"));
 
     if (pMask)
 	info->accel_state->has_mask = TRUE;
@@ -831,7 +831,7 @@ static Bool FUNC_NAME(R200PrepareComposite)(int op, PicturePtr pSrcPicture,
 	return FALSE;
 
     if (pDstPicture->format == PICT_a8 && RadeonBlendOp[op].dst_alpha)
-        RADEON_FALLBACK("Can't dst alpha blend A8\n");
+        RADEON_FALLBACK(("Can't dst alpha blend A8\n"));
 
     if (pMask)
 	info->accel_state->has_mask = TRUE;
@@ -1121,8 +1121,8 @@ static Bool R300CheckComposite(int op, PicturePtr pSrcPicture, PicturePtr pMaskP
 	max_dst_h = 2560;
     }
 
-    if (pSrcPixmap->drawable.width >= max_tex_w ||
-	pSrcPixmap->drawable.height >= max_tex_h) {
+    if (pSrcPixmap->drawable.width > max_tex_w ||
+	pSrcPixmap->drawable.height > max_tex_h) {
 	RADEON_FALLBACK(("Source w/h too large (%d,%d).\n",
 			 pSrcPixmap->drawable.width,
 			 pSrcPixmap->drawable.height));
@@ -1130,8 +1130,8 @@ static Bool R300CheckComposite(int op, PicturePtr pSrcPicture, PicturePtr pMaskP
 
     pDstPixmap = RADEONGetDrawablePixmap(pDstPicture->pDrawable);
 
-    if (pDstPixmap->drawable.width >= max_dst_w ||
-	pDstPixmap->drawable.height >= max_dst_h) {
+    if (pDstPixmap->drawable.width > max_dst_w ||
+	pDstPixmap->drawable.height > max_dst_h) {
 	RADEON_FALLBACK(("Dest w/h too large (%d,%d).\n",
 			 pDstPixmap->drawable.width,
 			 pDstPixmap->drawable.height));
@@ -1140,8 +1140,8 @@ static Bool R300CheckComposite(int op, PicturePtr pSrcPicture, PicturePtr pMaskP
     if (pMaskPicture) {
 	PixmapPtr pMaskPixmap = RADEONGetDrawablePixmap(pMaskPicture->pDrawable);
 
-	if (pMaskPixmap->drawable.width >= max_tex_w ||
-	    pMaskPixmap->drawable.height >= max_tex_h) {
+	if (pMaskPixmap->drawable.width > max_tex_w ||
+	    pMaskPixmap->drawable.height > max_tex_h) {
 	    RADEON_FALLBACK(("Mask w/h too large (%d,%d).\n",
 			     pMaskPixmap->drawable.width,
 			     pMaskPixmap->drawable.height));
-- 
1.6.0.4


