Multimedia support (Re: X Developers' Summit)

Helge Bahmann hcb at chaoticmind.net
Tue Jul 17 06:57:42 PDT 2007


On Tuesday, 17 July 2007 at 14:41, Carsten Haitzler wrote:
> ok. something here doesn't gel with me. my warning instincts are beeping.
> we now have:
> 1. audio client writes to buffer
> 2. context switch to server
> 3. server writes to mixer client
> 4. context switch
> 5. mixer client mixes and writes to buffer
> 6. context switch
> 7. server writes to audio device

Not quite, it would be more like:
0. mixer client schedules mixing operations to server (say 1/10 of a second
in advance)
(0.5. context switch to audio client)
1. audio client writes to buffer
2. context switch to server
3. server mixes (performs the scheduled requests) and writes to audio device

Of course, the mixer client, audio client and server need not run in
lock-step, so in reality the number of context switches is far lower.

As you can see, the only latency introduced into the audio path is the
context switch into the server, but this is unavoidable with any audio
server design.

(Changing the mixing ratio would have higher latency, but I consider this
acceptable; it is probably far better than the 100ms guess above for a
local mixer client. The mixer client only needs to schedule enough
operations to be sure the server has enough commands to chew on until the
mixer can send its next batch, and 100ms works for me on a LAN.)

> to me the idea of adding timestamps to showing an image or playing a sound
> is nice - BUT i think to me it smells of patching a problem the wrong way.
> a problem that is created by the design. while a mixer client is nice in
> principle - in reality - how many funky uses do you really expect?

Hm, I don't know yet, but the mechanism to schedule requests serves more
purposes than just this one; it is also used to draw and display images in
sync with audio, see below...

> personally if i am writing an x app that displays an image with some sound
> synchronised to the display i want to do:
>
> for (;;) {
>     int time_to_wait;
>     XShmPutImage();
>     XAudioShmPut();
>     XSync();
>     /* calculate time to wait based on last "Frame" display */
>     usleep(time_to_wait);
> }

The goal is to enable playback across the network, so I do not want to rely on 
shm and client-side synchronisation; I know that most multimedia guys have 
given up on network transparency (and treat X just as a "blit-to-screen 
service"), but the goal of my project is to demonstrate that it is possible 
to do this sanely.
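
For contrast with the usleep() loop above, here is a purely hypothetical
sketch of what the request-scheduling approach could look like from the
client side. None of the xav_* calls exist; they are no-op placeholders
for "a request that carries a presentation timestamp", and the frame rate
and lead time are arbitrary.

/* Hypothetical sketch: instead of pacing playback with usleep() in the
 * client, each image and audio request carries a presentation time and
 * the server executes both at that time. */

#include <stdint.h>

#define FRAME_MS 40    /* 25 frames per second */
#define LEAD_MS 200    /* submit 200 ms ahead to hide network jitter */

static void xav_schedule_put_image(uint64_t when_ms)  { (void)when_ms; }
static void xav_schedule_play_audio(uint64_t when_ms) { (void)when_ms; }
static uint64_t xav_server_time(void)                 { return 0; }

int main(void)
{
    uint64_t present_at = xav_server_time() + LEAD_MS;
    int frame;

    /* pacing of the submissions themselves is omitted; it would use the
     * same "stay LEAD_MS ahead" bookkeeping as the mixer sketch above */
    for (frame = 0; frame < 250; frame++) {   /* roughly 10 s of playback */
        /* both requests carry the same timestamp, so the server shows
         * the image and starts the matching audio together, no matter
         * when the requests actually arrive over the network */
        xav_schedule_put_image(present_at);
        xav_schedule_play_audio(present_at);
        present_at += FRAME_MS;
    }
    return 0;
}

The synchronisation burden moves from the client's clock to the server's,
which is what should make the scheme hold up over a network connection
where client-side usleep() pacing would drift.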

[snip: shm audio channel]
> well you still need x proto for a signalling channel - when to switch shm
> buffers, which parts of the buffer are to be played, locked, etc. but it
> does cut out the copies. though unlike images the cost of the copy is
> almost nothing compared to XPutImage vs. XShmPutImage()

Yes, of course shared memory would not be about data volume, but entirely
about latency. As explained above, commands to perform mixing can be
scheduled largely asynchronously.

The shm areas could be structured as ring buffers (the server-side sample
buffers are already a "strange" sort of ring buffer). This way the client
app producing the data and the server loop consuming the data for mixing can 
be quite loosely coupled. Basically you "only" need to take care that 
producer and consumer do not drift "too much", and I think this could be done 
using a client-side timer and occasional status messages exchanged via the X 
protocol to avoid clock drift. As I said, shm is not implemented yet, so I am 
not sure if this really works out in practice, but I think it could.
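
As a generic illustration of the kind of ring buffer this could be: the
sketch below uses ordinary process-local memory and an int16_t mono sample
format chosen only for the example; in a real shm setup the struct would
live in the shared segment and the two indices would need proper
atomic/barrier handling, which this sketch deliberately ignores.

#include <stdint.h>

#define RING_FRAMES 4800          /* e.g. 100 ms of mono audio at 48 kHz */

struct audio_ring {
    int16_t  samples[RING_FRAMES];
    uint32_t write_pos;           /* advanced only by the client (producer) */
    uint32_t read_pos;            /* advanced only by the server (consumer) */
};

/* producer side: copy as many samples as fit without overtaking read_pos;
 * one slot is always left empty to distinguish "full" from "empty" */
uint32_t ring_write(struct audio_ring *r, const int16_t *src, uint32_t n)
{
    uint32_t written = 0;
    while (written < n && (r->write_pos + 1) % RING_FRAMES != r->read_pos) {
        r->samples[r->write_pos] = src[written++];
        r->write_pos = (r->write_pos + 1) % RING_FRAMES;
    }
    return written;   /* a short write means the consumer has fallen behind */
}

/* consumer side: the mixer pulls whatever is available, up to n samples */
uint32_t ring_read(struct audio_ring *r, int16_t *dst, uint32_t n)
{
    uint32_t read = 0;
    while (read < n && r->read_pos != r->write_pos) {
        dst[read++] = r->samples[r->read_pos];
        r->read_pos = (r->read_pos + 1) % RING_FRAMES;
    }
    return read;      /* a short read means the producer has drifted behind */
}

Because neither side ever blocks on the other, the occasional status
message over the X protocol only has to correct slow drift rather than
enforce a per-period hand-off.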

> anyway - just wishing you a lot of luck in what you are doing - the
> principal idea i think is good - the details i think can be discussed. as i
> said - i disagree a bit as i think 99.9% of the use will be either simple
> mixing (multiplied by client/channel volume) or blocking out certain client
> channels in favor of others (effectively being able to set other channel
> volumes to 0). i agree you need an "audio manager" (like a window manager
> or composite manager), but i think it needs to take more of a "control"
> role (channel sources, destinations, volumes etc.).
>
> maybe we need both? maybe we need an internal mixer and like xcomposite -
> the ability TO redirect if we need to do something bizarre/funky?

Maybe; I do not yet know how it will turn out, because so far I am the only
user :)
This is why I appreciate feedback like yours, and I am sorry we are talking 
about things you cannot yet touch...

Best regards
Helge Bahmann
--
Mathematicians stand on each other's shoulders while computer scientists
stand on each other's toes.
-- Richard Hamming


