Pushing image transport logic down the stack

Owen Taylor otaylor at redhat.com
Mon Sep 4 15:21:25 PDT 2006


On Mon, 2006-09-04 at 14:25 +0100, Alan Cox wrote:
> Ar Llu, 2006-09-04 am 08:47 -0400, ysgrifennodd Owen Taylor:
> > Well, the pipedream would be that you'd be able to allocate a bunch 
> > of pages, put data into them, pass them over the X socket to the
> 
> (I assume you mean pass by reference here)
> > server, the server picks them up, uses them, and when the graphics
> > engine is done, releases them back to the kernel.
> 
> > Are the page table manipulation costs of such an API low enough to
> > make it feasible? I have no idea.
> 
> SYS5 shm isn't the most efficient way but it isn't that inefficient
> either. The create/destroy cycle is also inefficient because you don't
> necessarily need to do it in all cases.

But how do you know? There is a small class of applications that
reliably reuses the same set of image buffers over and over again -
media players, say. I think the current SHM extension works fine for
these apps ... if they aren't using Xvideo.

For cairo or GTK+, though, retaining shared memory buffers is a
losing gamble most of the time.
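
For concreteness, here's roughly what that model looks like with
MIT-SHM - an untested sketch, with error handling and the
ShmCompletion handshake omitted, and the helper name made up (the
calls are the standard MIT-SHM ones). The point is that the segment
is created once and reused for every frame, so the create/destroy
cost Alan mentions is paid only once:

  #include <sys/ipc.h>
  #include <sys/shm.h>
  #include <X11/Xlib.h>
  #include <X11/extensions/XShm.h>

  /* Create one shared segment and wrap it in an XImage; done once
   * at startup, then reused for every frame. */
  XImage *setup_shm_image(Display *dpy, XShmSegmentInfo *info,
                          unsigned int width, unsigned int height)
  {
      int scr = DefaultScreen(dpy);
      XImage *image = XShmCreateImage(dpy, DefaultVisual(dpy, scr),
                                      DefaultDepth(dpy, scr), ZPixmap,
                                      NULL, info, width, height);

      info->shmid = shmget(IPC_PRIVATE,
                           image->bytes_per_line * image->height,
                           IPC_CREAT | 0600);
      info->shmaddr = image->data = shmat(info->shmid, NULL, 0);
      info->readOnly = False;

      XShmAttach(dpy, info);                /* server attaches too */
      XSync(dpy, False);                    /* ... before we unlink */
      shmctl(info->shmid, IPC_RMID, NULL);  /* gone on last detach */
      return image;
  }

  /* Per frame: write pixels into image->data, then
   *   XShmPutImage(dpy, win, gc, image, 0, 0, 0, 0, w, h, True);
   * and wait for the ShmCompletion event before touching the
   * buffer again. */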

> Assuming the X server didn't need to touch the pages (ie you could work
> out what can be done in advance) then the DRI layer does close to what
> is needed at the moment.
> 
> You allocate some pages of your own address space (which may be shared
> with other clients assuming MAP_SHARED is used and you've got things
> like themes file backed), you pass them to the DRI layer which
> references them, locks them if need be and issues 3D command sequences.
> It then frees its use of them and you get them back.

It's certainly a seductive idea that we might be able to point the
graphics card directly at the application's data, or at a bit of the
GTK+ mmap'ed icon cache, and avoid troubling the processor's cache
at all - a much better result. But I think that idea is really the
enemy here: getting it going would require API changes at every
layer, from the application down to the graphics card.

What interests me is getting the number of copies down as far as
possible without major changes to the model. A lot of the complexity
and inefficiency in what goes on right now is simply extraneous.

> For the existing shm type model going to posix shared memory would
> probably be more efficient and you can do more useful things with it but
> it isn't much different. You can mmap it then and have real handles and
> paths (eg you can pass the file handle to posix shm down an AF_UNIX
> socket), and you can delete it and it will go away on last close ..

Interesting.
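
Something like this, presumably - an untested sketch; the object
name is just a placeholder, error handling is omitted, and you'd
mmap() the fd on both sides:

  #include <fcntl.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <sys/socket.h>
  #include <sys/uio.h>
  #include <unistd.h>

  /* Create a POSIX shm object and shm_unlink() it right away, so
   * it disappears on last close, as Alan describes. */
  int create_shm_fd(size_t size)
  {
      int fd = shm_open("/img-buf", O_CREAT | O_EXCL | O_RDWR, 0600);
      shm_unlink("/img-buf");
      ftruncate(fd, size);
      return fd;   /* mmap(NULL, size, ..., MAP_SHARED, fd, 0) */
  }

  /* Hand the fd to the server over an AF_UNIX socket with
   * SCM_RIGHTS. */
  void send_fd(int sock, int fd)
  {
      char dummy = 0;
      struct iovec iov = { &dummy, 1 };
      char ctrl[CMSG_SPACE(sizeof fd)];
      struct msghdr msg;
      struct cmsghdr *cmsg;

      memset(&msg, 0, sizeof msg);
      msg.msg_iov = &iov;
      msg.msg_iovlen = 1;
      msg.msg_control = ctrl;
      msg.msg_controllen = sizeof ctrl;
      cmsg = CMSG_FIRSTHDR(&msg);
      cmsg->cmsg_level = SOL_SOCKET;
      cmsg->cmsg_type = SCM_RIGHTS;
      cmsg->cmsg_len = CMSG_LEN(sizeof fd);
      memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);
      sendmsg(sock, &msg, 0);
  }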

> Neither solves the network case where you really want the object
> residing in the remote server.

Well, that's again hard, because you have to guess the intent of the
application... and then worry about applications with good intentions
that hurt the overall system.
					- Owen


P.S. - this morning I said:

> > Alternatively, improving the efficiency of the standard 
> > XPutImage() / write() path makes the need for shared memory less; 
> > as mentioned in my earlier mail, vmsplice() could possibly be
> > useful to that end.

As I was out hiking today, it occurred to me that the above is pretty
much nonsense: if the goal is to reduce copies of data during
communication with the X server, we already know how to do it - use
a SHM transport for the X protocol. There is no need to get the
kernel involved in buffer management.

You still have the question of what to do about images that exceed
the size of your protocol buffer:

 - You can chunk them into multiple requests, if the request is
   chunkable. PutImage is; a hypothetical RenderCompositeImage isn't
   if there is transformation or filtering of the source. You lose
   pipelining, since you have to wait for each request to finish
   before you can free the buffer space. (The chunking itself is
   sketched after this list.)

 - You can feed the single request through the buffer in chunks
   and let the server reconstitute it on the other side, adding
   back a copy.

 - You could reference an external shared memory buffer; it's clear
   that at some image size, allocating a new shared memory buffer is
   better than copying data, but I have no idea what that point
   is - is it 100k, 1M, 10M?
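
For the first option, the chunking itself is easy enough - a
hypothetical sketch, with MAX_REQ_BYTES standing in for whatever the
real per-request budget turns out to be:

  #include <X11/Xlib.h>

  #define MAX_REQ_BYTES (64 * 1024)   /* made-up budget */

  /* Split a large image into horizontal bands, each small enough
   * to fit one request. */
  void put_image_chunked(Display *dpy, Drawable d, GC gc,
                         XImage *image, int dst_x, int dst_y)
  {
      int band = MAX_REQ_BYTES / image->bytes_per_line;
      int y, h;

      if (band < 1)
          band = 1;
      for (y = 0; y < image->height; y += band) {
          h = image->height - y;
          if (h > band)
              h = band;
          XPutImage(dpy, d, gc, image, 0, y, dst_x, dst_y + y,
                    image->width, h);
          /* with a SHM transport, this is where you'd stall
           * waiting for the server before reusing the buffer
           * space - the pipelining loss mentioned above */
      }
  }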
