michel at tungstengraphics.com
Wed Aug 8 02:19:19 PDT 2007
On Tue, 2007-08-07 at 20:45 +0200, Lukas Hejtmanek wrote:
> On Tue, Aug 07, 2007 at 11:30:46AM +0200, Michel Dänzer wrote:
> > P.S. I still doubt this is the bottleneck of your virtual desktop
> > switching, as the numbers you're getting translate to filling the screen
> > in just tens of milliseconds.
> So, I did profiling of virtual desktop switching.
Could you share the profiles?
> It has problems at two places.
> 1) exaCopyDirtyToSys calls exaMemcpyBox, which uses plain memcpy instead of
> pixman_blt_mmx. This is the result of the initial call to GetImage.
As Søren pointed out, pixman_blt_mmx would be unlikely to make a
difference here. The bottleneck is probably reading from uncacheable
memory.
Generally, if memcpy doesn't do its job as quickly as possible, that
should be fixed.
> Btw, is it possible to expose offscreen pixmap to the application so that
> PutImage and GetImage can be safely ignored?
Not sure that's what you mean, but you can try commenting out the
exaDoMigration call in exaGetImage to see if that makes any difference.
> 2) I830WaitLpRing consumes too much CPU because of too frequent calls of
> GetTimeInMillis.
> I think that an initial approach to optimization could be to call
> GetTimeInMillis only every 1000th iteration or something like that.
Then the cycles would just be burned in I830WaitLpRing instead of
GetTimeInMillis, wouldn't they?
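To illustrate the point, here is a minimal sketch. It is not the actual
I830WaitLpRing code: struct fake_ring, fake_ring_has_space, fake_now_ms and
busy_wait are hypothetical stand-ins, and the clock is simulated so the
result is deterministic. It only counts how often the loop spins versus how
often it reads the clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hardware ring: it reports free space
 * only once it has been polled `ready_after` times. */
struct fake_ring {
    unsigned long polls;        /* number of polls so far */
    unsigned long ready_after;  /* poll count at which space appears */
};

struct wait_stats {
    unsigned long iterations;   /* trips through the busy-loop */
    unsigned long clock_calls;  /* reads of the simulated clock */
    bool timed_out;
};

static bool fake_ring_has_space(struct fake_ring *r)
{
    return ++r->polls >= r->ready_after;
}

/* Simulated clock (assumption): pretend 1000 polls take one millisecond. */
static unsigned long fake_now_ms(const struct fake_ring *r)
{
    return r->polls / 1000;
}

/* Spin until the ring has space, reading the clock only on every
 * `check_every`-th iteration to test for a timeout. */
static struct wait_stats busy_wait(struct fake_ring *r,
                                   unsigned long check_every,
                                   unsigned long timeout_ms)
{
    struct wait_stats s = { 0, 0, false };
    unsigned long start;

    assert(check_every > 0);

    start = fake_now_ms(r);
    s.clock_calls++;

    while (!fake_ring_has_space(r)) {
        s.iterations++;
        if (s.iterations % check_every == 0) {
            s.clock_calls++;
            if (fake_now_ms(r) - start > timeout_ms) {
                s.timed_out = true;
                break;
            }
        }
    }
    return s;
}
```

With ready_after = 5000, busy_wait(&ring, 1, 2000) reads the clock 5000 times
and busy_wait(&ring, 1000, 2000) reads it 5 times, but both spin through 4999
iterations. The CPU stays busy for the same wall-clock time either way.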
> Is it possible to completely avoid busy-loop or chip does not support
> anything else than busy-loop?
As has been pointed out, it could be done using the IRQ. This will
probably happen via fences when using TTM; in the meantime, you could try
using DRM_IOCTL_I915_IRQ_EMIT and DRM_IOCTL_I915_IRQ_WAIT.
Earthling Michel Dänzer | http://tungstengraphics.com
Libre software enthusiast | Debian, X and DRI developer