Kernel scheduling algorithm and X.Org performance

Fri Sep 2 07:37:21 PDT 2005

Thanks all for the replies. I've spent quite some time on the analysis,
so please take a look at what I've got.

The very first thing I wanted to do was to get a complete trace of
system events. Fortunately, almost everything I proposed has already
been implemented in the Linux Trace Toolkit (http://www.opersys.com/LTT),
and its development version has a patch against the linux-2.6.9 kernel.
That is recent enough, so I used it to trace what happens.

The test I've chosen as the most demonstrative is the following:
1. Open a large enough text file in gedit
2. Open gcalctool
3. Drag gcalctool over gedit actively (but not too aggressively) for a
while

I took 10-second dumps with LTT. It shows virtually everything that
happened in the system and, most importantly, input interrupts,
reschedules and active processes.

Among all the process priority combinations I tried, I've chosen two as
the clearest and most conclusive.

The first test uses the following priorities:
Xorg: -1
Window Manager (metacity): 1
applications: 0 (i.e. untouched)

I'd also like to note that using -20 instead of -1, or 19 instead of 1,
doesn't seem to make a difference in this particular test. These
priorities are "visually" equivalent.

This test feels as follows:
- the mouse is smooth, the cursor moves just as it does with 0% system
load
- gcalctool moves over gedit window with noticeable jumps (quite laggy)
- gedit seems to do its best to redraw previously obscured regions and
almost keeps up. The flickering that is seen is evidently at the
minimum possible level: it lasts only as long as gedit needs to handle
the event (which is still noticeable but feels fine)

The time graph can be seen in the following screenshots.
Overall picture with some comments:
(22Kb) http://img5.picsplace.to/img5/3/good_scenario_overview_commented.png
The zoomed version:
(20Kb) http://img5.picsplace.to/img5/3/good_scenario_zoomed.png

I've annotated the overview screenshot with comments on what really
happens.

The second test's priority values are all zeroes:
Xorg: 0
Window Manager (metacity): 0
applications: 0

This test feels as follows:
- the mouse is not as smooth as in test 1, but still quite responsive
- gcalctool moves over gedit window much smoother than in test 1, small
jumps happen but it feels better
- gedit doesn't keep up with redrawing; we see the eraser effect

Here are the screenshots of LTT's tracevisualizer tool:
Overall picture:
(25Kb) http://img5.picsplace.to/img5/3/bad_scenario_overview.png
The zoomed version:
(21Kb) http://img5.picsplace.to/img5/3/bad_scenario_zoomed.png

Look at the overall graph of the bad case (all priorities 0, i.e. real
life). gedit still does receive control after X sends expose events to
its clients, but it doesn't get enough time to respond to them. The
good-case picture shows that it takes gedit about 25ms to process one
expose event, while in the bad case it gets only 5ms before it is
preempted by the window manager, and we already know what happens then.
And still, 25ms is less than the maximum kernel scheduling timeslice for
0-priority processes, which is 100ms by default.

As a result, scheduling is unfair. When we speak about graphics,
fairness means that all applications redraw themselves at the same
frames-per-second rate. Achieve the same fps for all apps and the user
will be more satisfied than when the WM renders at 100fps while the app
in the background renders at 4fps and loses its window's contents.

At the same time we do not want a slow application (there are some;
inkscape renders quite slowly, for example) to lag the whole system. So
if the application is slow, let it be slow; the user will notice this
anyway. This means that we want to state some minimum desired fair fps
value. If that minimum fps can't be achieved, let the WM preempt the
apps so that they lose their window contents: we are already slow
enough that responsiveness is now the priority.

What I just said means quite a simple thing in scheduler language:

- Don't let the window manager preempt an application for N
milliseconds while it is processing an expose event

or in the language of the kernel scheduler:

- Don't let a process with the same static priority preempt a process
that is handling an expose event for N milliseconds (say, 25ms)

This way the application will have a chance to process the expose event
completely and send the necessary drawing commands to the X server. The
application can still be too slow to use its chance and fall behind,
but in that case system responsiveness doesn't drop below F fps, where
F is directly related to the N ms timeslice given to the application.
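
To make the relation between N and F a bit more concrete, here is a
rough back-of-the-envelope sketch in Python. It assumes (this is a
simplification, not something measured in the traces) that in the worst
case every window manager update has to wait for one full N ms grace
period per exposing application, plus the WM's own work. The function
name and the wm_ms parameter are made up for illustration.

```python
def worst_case_fps(grace_ms, exposing_apps=1, wm_ms=0.0):
    """Rough lower bound on WM updates per second.

    Assumed model: each update waits for `exposing_apps` grace periods
    of `grace_ms` each, then the WM itself runs for `wm_ms`.
    """
    if grace_ms <= 0 or exposing_apps < 1 or wm_ms < 0:
        raise ValueError("invalid timing parameters")
    frame_ms = exposing_apps * grace_ms + wm_ms
    return 1000.0 / frame_ms


print(worst_case_fps(25))                   # 40.0
print(worst_case_fps(25, exposing_apps=2))  # 20.0
```

Under this model N = 25ms keeps the floor at 40 updates per second with
one exposing application, so N would have to be chosen with the desired
minimum fps in mind.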

This doesn't mean that scheduling becomes unfair. The application still
has a 100ms maximum timeslice after which it is forced to yield the
processor; we just let the application work for a while when we know
that it is processing an expose event.
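
The rule can be sketched as a small filter on top of whatever the
scheduler would normally decide. This is a toy model in Python, not
kernel code: the Task fields, the allow_preemption name and the
constants are all hypothetical, and the real scheduler's dynamic
priorities are reduced to a single "scheduler_says_preempt" flag.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_MS = 25.0       # N from the text; the value is only a guess
TIMESLICE_MS = 100.0  # default maximum timeslice mentioned above


@dataclass
class Task:
    name: str
    static_prio: int                     # nice value, lower runs first
    ran_ms: float = 0.0                  # CPU time used in this timeslice
    exposing_ms: Optional[float] = None  # CPU time since marked, None if unmarked


def allow_preemption(current: Task, candidate: Task,
                     scheduler_says_preempt: bool) -> bool:
    """Filter the scheduler's normal decision through the proposed rule."""
    if not scheduler_says_preempt:
        return False
    in_grace = (current.exposing_ms is not None
                and current.exposing_ms < GRACE_MS)
    same_static_prio = candidate.static_prio == current.static_prio
    slice_left = current.ran_ms < TIMESLICE_MS
    # The exception: a marked process keeps the CPU against an equal
    # static priority until its grace period or its timeslice runs out.
    return not (in_grace and same_static_prio and slice_left)


gedit = Task("gedit", 0, ran_ms=5.0, exposing_ms=5.0)
metacity = Task("metacity", 0)
print(allow_preemption(gedit, metacity, True))  # False
```

A process with a different static priority (Xorg at -1, say) is not
affected by the exception in this sketch, and neither is a process that
was never marked.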

Such a modification targets the kernel scheduling algorithm. This
doesn't mean that the current algorithm is not fair. It means that X
Window System graphics is special, and handling it better is worth
making some exceptions.

One question here is: how do we know that an application is currently
handling an expose event? It's evident that the moment should be marked
in some way, and we'll have to switch to the kernel to do this. My guess
is an entry in /proc that the process writes to in order to mark itself
as 'currently exposing'. The process is no longer considered exposing
once it has been rescheduled at least once after being marked. And I
think that writing to /proc is fast enough: look at the graphs,
redrawing takes a _lot_ of time.
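
From the application side, the marking could look roughly like the
following Python sketch. The /proc/self/exposing path is purely
hypothetical (no such entry exists; it stands in for whatever the
kernel patch would provide), and so are the function name and the "1"
payload.

```python
import os

# Hypothetical entry; it only illustrates the proposal above.
EXPOSING_ENTRY = "/proc/self/exposing"


def mark_exposing(path=EXPOSING_ENTRY):
    """Mark the calling process as 'currently exposing'.

    Returns True if the mark was written, False if the entry is
    missing or not writable (e.g. on an unpatched kernel).
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return False
    try:
        os.write(fd, b"1")
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

Returning False instead of raising lets the caller run unchanged on a
kernel without the entry. A real implementation would probably keep the
descriptor open between expose events to avoid the open/close overhead.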

The second question is: who will mark the app as 'currently exposing'?
The most trivial thing is to do it at the toolkit level. For example, in
Gtk+ it may be done on receiving an expose event from X (at the GDK
level), once before each flushing of pending X commands. The other
player who could mark the process is the X server itself. This would be
the least painful way, but I'm not sure that it can be done because of
X's network transparency. So here is the question: can this be handled
in the X server?

Please let me know if I'm not clear enough.

Dmitry