[gst-devel] mmap'ing in the disksrc

Erik Walthinsen omega at temple-baptist.com
Wed Mar 21 05:55:44 CET 2001


This is an explanation of a better way to use mmap() from a guy at OGI
who's working on a big-media streaming server (not with gstreamer, since
he's using this architecture, though there's nothing to say we can't pull
it off within gstreamer).  I'd like to make use of as much of this as
possible within the [multi]filesrc plugin[s].

I have a feeling this is why some people were seeing the memory allocation
of gstreamer-based apps increase without limit: the mmap of the whole file
gets attributed to the application, which isn't too smart.  We should instead
be mmap()ing regions of the file, as explained below.

---------- Forwarded message ----------
Date: 20 Mar 2001 20:35:02 -0800
From: Charles 'Buck' Krasic <krasic at acm.org>
To: Erik Walthinsen <omega at temple-baptist.com>
Cc: krasic at acm.org
Subject: Re: mmap()ing sequence


There's a paper on the Flash webserver (link at the C10k page).

The idea is to simulate an asynchronous read that uses mmap'ed IO.
You ask for offset,length and either you get the address of the result
immediately, or EAGAIN.

The read api keeps a section of the file mapped internally.  If
file,offset is not within the current mmapped range, the file is
re-mapped appropriately. madvise is used to tell the OS to fetch all
of the mmap, and release the old pages.
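
A minimal sketch of the remapping step, assuming Linux and a read-only
mapping.  The names file_window, window_lookup and window_remap are
hypothetical and are not taken from Buck's server.  The old pages are
released here by simply unmapping the previous window, which is a
simplification of the madvise-based release described above.

```c
#define _DEFAULT_SOURCE 1
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for the single window kept mapped per file. */
struct file_window {
    void  *base;    /* start of the current mapping, or NULL if none */
    off_t  start;   /* file offset that base corresponds to (page aligned) */
    size_t length;  /* number of bytes mapped */
};

/* Return the address of [offset, offset+len) if the current window covers
 * it, or NULL if the caller has to remap first. */
static const char *window_lookup(const struct file_window *w,
                                 off_t offset, size_t len)
{
    if (w->base == NULL || offset < w->start)
        return NULL;
    size_t rel = (size_t)(offset - w->start);
    if (rel > w->length || len > w->length - rel)
        return NULL;
    return (const char *)w->base + rel;
}

/* Replace the current window with one covering [offset, offset+len),
 * at least window_size bytes long.  Assumes offset >= 0.
 * Returns 0 on success, -1 on failure (errno set by mmap). */
static int window_remap(struct file_window *w, int fd,
                        off_t offset, size_t len, size_t window_size)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = offset - (offset % page);   /* mmap needs page alignment */
    size_t need = (size_t)(offset - aligned) + len;
    size_t maplen = need > window_size ? need : window_size;

    if (w->base != NULL) {
        /* Simplification: drop the old window entirely. */
        munmap(w->base, w->length);
        w->base = NULL;
        w->length = 0;
    }

    void *p = mmap(NULL, maplen, PROT_READ, MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED)
        return -1;

    /* Hint only: ask the kernel to start reading the whole window in. */
    (void)madvise(p, maplen, MADV_WILLNEED);

    w->base = p;
    w->start = aligned;
    w->length = maplen;
    return 0;
}
```

A caller would try window_lookup first and fall back to window_remap
only when it returns NULL.  Note that touching mapped bytes past the end
of the file raises SIGBUS, so a real implementation would probably clamp
the window to the file size.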

When the offset,length are within the current mmapped region, mincore
is used to determine if the corresponding pages are actually resident.
If they are, the address of the offset,len within the current map is
returned from read.
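
The residency check could look roughly like this.  It is a sketch that
assumes the Linux mincore() prototype (the vector is unsigned char there,
plain char on some BSDs), and range_is_resident is a hypothetical name.

```c
#define _DEFAULT_SOURCE 1
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if every page overlapping [addr, addr+len) is resident,
 * 0 if at least one is not, and -1 on error (for example if the range
 * is not mapped at all). */
static int range_is_resident(const void *addr, size_t len)
{
    if (len == 0)
        return 1;

    long page = sysconf(_SC_PAGESIZE);
    /* mincore() wants a page-aligned start address. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    size_t span = (size_t)((uintptr_t)addr - start) + len;
    size_t npages = (span + (size_t)page - 1) / (size_t)page;

    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    if (mincore((void *)start, span, vec) != 0) {
        free(vec);
        return -1;
    }

    int resident = 1;
    for (size_t i = 0; i < npages; i++) {
        /* Only the least significant bit of each entry is defined. */
        if ((vec[i] & 1) == 0) {
            resident = 0;
            break;
        }
    }
    free(vec);
    return resident;
}
```

The answer is only a snapshot: a page reported as resident can be evicted
before it is used, in which case the access just takes an ordinary page
fault.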

If not, the read returns EAGAIN, and a real-time signal will be posted
later, with the result.  I use sigwaitinfo/sigtimedwait exclusively,
so there's no need for locks of any kind.  The read is actually
delegated to a helper thread, again using a realtime signal
(sigqueue).  The helper simply touches each of the pages that
offset,len falls within.  [ Another option is to mlock() the pages. ]
Once the helper has made the pages resident, it sends a signal back to
the main thread.  The main thread dequeues this signal with
sigwaitinfo/sigtimedwait and calls read_done with the siginfo to get
information about the completed read.
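
Here is a rough sketch of that hand-off, assuming two POSIX real-time
signals and one helper thread.  All names (read_req, SIG_REQ, SIG_DONE,
block_read_signals, read_submit, read_wait, helper_main) are hypothetical
and are not the ones used in Buck's server.  A NULL request is used here
as a made-up shutdown convention.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical request record: a range inside the current mapping. */
struct read_req {
    const volatile char *addr;
    size_t len;
};

#define SIG_REQ  (SIGRTMIN)       /* main -> helper: please fault this in */
#define SIG_DONE (SIGRTMIN + 1)   /* helper -> main: range is resident */

/* Block both signals.  Call this before creating any thread so that the
 * mask is inherited and the signals are only ever taken via sigwaitinfo. */
static int block_read_signals(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIG_REQ);
    sigaddset(&set, SIG_DONE);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Queue a request for the helper; the pointer rides in the signal value. */
static int read_submit(struct read_req *req)
{
    union sigval v;
    v.sival_ptr = req;
    return sigqueue(getpid(), SIG_REQ, v);
}

/* Wait for one completion; returns the finished request, or NULL on
 * timeout or error. */
static struct read_req *read_wait(const struct timespec *timeout)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, SIG_DONE);
    if (sigtimedwait(&set, &info, timeout) < 0)
        return NULL;
    return info.si_value.sival_ptr;
}

/* Helper thread: dequeue requests and touch one byte in every page. */
static void *helper_main(void *unused)
{
    (void)unused;
    sigset_t set;
    siginfo_t info;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    sigemptyset(&set);
    sigaddset(&set, SIG_REQ);
    for (;;) {
        if (sigwaitinfo(&set, &info) < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        struct read_req *req = info.si_value.sival_ptr;
        if (req == NULL)
            break;                          /* shutdown convention */
        for (size_t off = 0; off < req->len; off += page)
            (void)req->addr[off];           /* volatile read faults the page in */
        if (req->len > 0)
            (void)req->addr[req->len - 1];  /* covers an unaligned tail page */
        sigqueue(getpid(), SIG_DONE, info.si_value);
    }
    return NULL;
}
```

Since each thread waits for a different signal number and both are
blocked everywhere, the process-directed sigqueue() calls end up in the
intended thread without any locking.  The request must stay alive until
its completion has been dequeued.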

Given that the madvise tells the kernel to go ahead and page in
everything in the map, I expect the reads will mostly be zero-latency
(i.e. no need to dispatch to the helper).

There are a couple alternative approaches that might make more sense
depending on what you want to do.

SGI (and maybe others) have a true kernel implementation of the POSIX
AIO API.  If you are touching the data, then the copy avoidance of
mmap might not be a win.  (I'm not, since I'm just streaming.)

On the other hand, there are patches floating around for zero-copy
sendfile.  This is even more efficient, but they don't support an
asynchronous version yet (AFAIK).
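
For comparison, a blocking sendfile() loop might look like the sketch
below.  It assumes the Linux sendfile() from <sys/sendfile.h>, and
send_range is a hypothetical helper name.  Whether the transfer is
really zero-copy depends on the kernel and the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Send count bytes of in_fd, starting at offset, to out_fd.
 * Returns 0 if everything was sent, -1 on error or early end of file. */
static int send_range(int out_fd, int in_fd, off_t offset, size_t count)
{
    while (count > 0) {
        /* sendfile() advances offset by the number of bytes it sent. */
        ssize_t n = sendfile(out_fd, in_fd, &offset, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;   /* includes EAGAIN on a non-blocking out_fd */
        }
        if (n == 0)
            break;       /* hit end of file before count bytes were sent */
        count -= (size_t)n;
    }
    return count == 0 ? 0 : -1;
}
```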

-- Buck

Erik Walthinsen <omega at temple-baptist.com> writes:

> You mentioned a specific sequence of mmap()ing files a piece at a
> time, and doing mincore() and such.  Do you have a reference for
> that, or can you write up a quick description of it?

> TIA,
>    Omega




