control groups

Thiago Macieira thiago at
Fri Apr 13 12:55:26 PDT 2012

On Friday, 13 April 2012 15.10.50, Neal H. Walfield wrote:
> At Wed, 11 Apr 2012 09:49:49 -0400,
> Havoc Pennington wrote:
> > In the performance paper (PDF) there's discussion of scheduler issues
> > where the bus only gets the
> > same CPU slice as each individual dependent app, so in the aggregate the
> > apps get way too much time and the bus way too little.
> This is a basic problem thoroughly examined by the real-time scheduling
> community.  Start looking at things like priority inheritance
> (globally fairer) or priority ceiling (easier to implement).

Futex priority inheritance has been implemented in Linux since 2.6.18, and 
other systems have similar functionality -- though I have no clue how they'd 
transmit the priority token to other processes.

For Linux, I'd start with:
 1) create a temp file, mmap it to memory, create a futex with PI there
 2) pass the file's FD as part of the message, with a new header field 
containing the FD index
 3) the receiver side gets the extra FD, maps it to memory and uses the futex 
PI to get the enhanced priority

The bus daemon would do that while processing that message, then pass the PI 
token to the target process and stop using it on its own.

There are a lot of problems with this initial idea. Those come readily to 
mind:
 a) the futex PI interface was created so that a thread trying to lock would 
give its priority to the thread currently holding the lock. We have a race 
condition: the sender needs to lock when the receiver has already obtained the 
lock

 b) worse: for the receiver to know it needs to lock and to know what to lock, 
it needs to have received the message and started processing, which is the 
very priority inversion problem we're trying to solve

 c) in order for the sender to give its priority to something else, it needs 
to be suspended in a futex lock. That means we can't do asynchronous 
processing nor can we pop messages off the socket while our priority is being 
given away.

 d) by the way, how can one thread give its priority *and* continue executing?

So I think that this is a nice exercise to get the brain started, but we'll 
need kernel help. I'd suggest instead:

 - a new system call that returns a priority-inheritance file descriptor

 - said FD is passed in the D-Bus message

 - whenever a thread is polling that FD, its priority is automatically 
inherited by a process holding that FD open -- this includes FDs currently 
queued in a Unix socket buffer

 - if a process has multiple priority FDs open, it gets the highest priority 
among them
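
To make the intended semantics concrete, here is a toy Python model of the 
rules above. It is not a kernel interface: the syscall does not exist, and the 
class names, the "higher number means higher priority" convention and the 
split between open and queued FDs are all assumptions made for illustration.

```python
class PriorityFD:
    """Stand-in for the proposed (hypothetical) priority-inheritance FD."""

    def __init__(self):
        # Priorities of the threads currently blocked polling this FD.
        self.pollers = []

    def start_poll(self, priority):
        self.pollers.append(priority)

    def stop_poll(self, priority):
        # Waking up (data arrived or timeout) takes the priority back.
        self.pollers.remove(priority)


class Process:
    def __init__(self, name, base_priority):
        self.name = name
        self.base_priority = base_priority
        self.open_fds = []    # priority FDs the process holds open
        self.queued_fds = []  # priority FDs still in its Unix socket buffer

    def effective_priority(self):
        # Highest priority among the process's own and every donated one.
        donated = [p for fd in self.open_fds + self.queued_fds
                   for p in fd.pollers]
        return max([self.base_priority] + donated)


if __name__ == "__main__":
    bus = Process("bus", base_priority=1)
    token = PriorityFD()
    bus.queued_fds.append(token)     # message sits in the bus's socket buffer
    token.start_poll(10)             # sender blocks in poll() at priority 10
    print(bus.effective_priority())  # bus runs with the sender's priority
    token.stop_poll(10)              # sender wakes up
    print(bus.effective_priority())  # bus is back to its base priority
```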

This solves the problems above:
 a) the priority is given by poll / select, so there's no race condition

 b) since the priority is received automatically, there's no priority 
inversion: you have the FD open, you get it

 c) since we're giving the priority by way of the very syscall we use to find 
out if there's more data on the socket, the sender can be woken up by socket 
data and process it (except if you've given an RT FIFO priority away).

 d) since the priority is given when the calling thread goes to sleep, by 
consequence it's not running; if it does wake up, the given priority is taken 
away

Additional benefits:
 e) if the call timed out, the calling thread resumes execution and takes away 
its priority

 f) the daemon code and the library actually need no modification to receive 
priorities: since they receive all file descriptors and keep them open for the 
duration of the DBusMessage, they have the priority.

  However, the daemon code should be modified so it does *not* forward the 
priority FD to eavesdropping receivers. In fact, it should *only* forward to 
the intended receiver, which also neatly solves the issue of priority FDs 
being passed in signal messages: the bus gets the priority, but doesn't pass 
it along

 g) user code often keeps the original DBusMessage around before sending its 
reply, if nothing else for the serial ID. Code that drops the message should 
be adapted to keep it around or at least keep the priority FD. The priority FD 
should be sent along the reply, so the bus daemon gets the priority needed to 
process it.

 h) since a thread gets the highest priority from the priority FDs it has 
open, the bus daemon is automatically running at the highest priority of its 
pending messages, and so are target processes
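
The forwarding rule from point f) could look roughly like the following Python 
sketch. The Message type, its field names and fds_to_forward are hypothetical 
and do not correspond to anything in libdbus or the bus daemon; it also assumes 
that signals never carry the priority FD onwards, which is how I read the rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    kind: str                         # "method_call", "method_return", "signal"
    destination: Optional[str]        # intended receiver, None for broadcasts
    priority_fd: Optional[int] = None


def fds_to_forward(msg: Message, recipient: str) -> List[int]:
    """Return the priority FDs the daemon would attach for this recipient."""
    if msg.priority_fd is None:
        return []
    if msg.kind == "signal":
        # The bus itself gets the priority, but does not pass it along.
        return []
    if recipient != msg.destination:
        # Eavesdroppers and any other non-target receivers get no priority.
        return []
    return [msg.priority_fd]


if __name__ == "__main__":
    call = Message("method_call", destination=":1.42", priority_fd=7)
    print(fds_to_forward(call, ":1.42"))   # intended receiver
    print(fds_to_forward(call, ":1.99"))   # eavesdropper
```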

One open question: priority is usually given per thread, but a file descriptor 
is a per-process thing. Which thread gets the enhanced priority? All of them?

What do you think?

Thiago Macieira - thiago (AT) - thiago (AT)
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358