[pulseaudio-discuss] optimal app/Pulse data transfers?

Sun Oct 18 16:25:48 PDT 2009

On Fri, 02.10.09 17:55, pl bossart (bossart.nospam at gmail.com) wrote:

> Hi there,

Heya.

Sorry for not responding earlier. I kinda get the idea that most
questions raised in this mail are already cleared up now, judging by
the recent discussion on the gst ML. But nonetheless:

> I am trying to understand some performance measurements I did this
> week. I have a test app that decodes audio and writes a decoded buffer
> with pa_simple_write(). Depending on the size of the buffer I pass to
> PulseAudio, I see a ~20% variation in the core activity (bigger
> buffers are better), and I can't really figure out the reason why I am
> having this overhead.
> 
> It seemed to me that setting minreq to -1 in the buf_attr fields would
> select an 'optimal' size, however having looked at the code I realize
> PulseAudio selects a fixed value of 20ms. As I understand it, as
> soon as I have minreq=20ms free in the server buffer+queues, a request
> will be made to the client and unblock the app. Before I experiment
> further, I was wondering what would happen if I set minreq to a larger
> value, say 100ms? It would force the queue to drain and my decoder to
> provide bigger buffers, which would seem to make sense power-wise. And
> why was 20ms optimal in the first place, how was this value
> determined? To avoid memory copies, shouldn't minreq be determined by
> the granularity that the application can provide, e.g. 30ms for
> G723.1, 1152/sample-freq for MP3, etc?
> Or is there another parameter that could explain my results?

The effect of minreq is actually not as important as it might appear:
it's just a threshold. When more data than this threshold is missing
from the server-side buffer, the client is asked for it. However, the
event at which this comparison is made is not influenced by minreq:
that depends solely on the sink's latency and sleep times.
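
To make the threshold idea concrete, here is a minimal sketch in
plain C. It is a toy model of the description above, not the actual
server code: request_on_wakeup() and bytes_for_ms() are hypothetical
names, and the strict "more than" comparison just follows the wording
above. The point is that the check only runs when the sink wakes up,
so minreq decides whether a request goes out, not when.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of PCM audio covering `ms` milliseconds (hypothetical helper). */
static size_t bytes_for_ms(uint32_t rate_hz, unsigned channels,
                           unsigned bytes_per_sample, unsigned ms) {
    size_t frames = (size_t) rate_hz * ms / 1000;
    return frames * channels * bytes_per_sample;
}

/* Toy model of the comparison made on a sink wakeup: returns the number
 * of bytes to request from the client, or 0 if the missing amount does
 * not exceed the minreq threshold. Not taken from the server source. */
static size_t request_on_wakeup(size_t target_fill, size_t current_fill,
                                size_t minreq) {
    size_t missing = current_fill < target_fill
                         ? target_fill - current_fill
                         : 0;
    return missing > minreq ? missing : 0;
}
```

With 44.1 kHz 16-bit stereo, 20ms works out to 3528 bytes and 100ms
to 17640 bytes. If 2640 bytes are missing on a wakeup, a 3528-byte
minreq suppresses the request; if 7640 bytes are missing, all 7640
are requested.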

I probably should never have allowed the user to configure minreq
per-stream. It's mostly a value that allows the server to suppress
certain client data requests, and it should depend on the time the
server needs to shovel data from the per-stream buffers to the device
buffer while mixing it. So it is a bit of protection: when buffers of
different sizes (coming from different clients) are mixed, the
resulting buffer has the minimal size of all mixed source buffers,
and without the threshold we would end up sending smaller and smaller
requests to the clients.
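
The "minimal size" effect can be sketched like this. Again a toy
model in plain C under my own assumptions; mixed_length() is a
hypothetical name and this is not how the mixing code is actually
structured:

```c
#include <stddef.h>

/* Toy model: when chunks from several streams are mixed, the mixed
 * chunk can only be as long as the shortest input chunk. */
static size_t mixed_length(const size_t *avail, size_t n_streams) {
    if (n_streams == 0)
        return 0;

    size_t len = avail[0];
    for (size_t i = 1; i < n_streams; i++) {
        if (avail[i] < len)
            len = avail[i];
    }
    return len;
}
```

With three streams offering 4096, 1024 and 2048 bytes, only 1024
bytes get mixed, so each stream is now short by just 1024 bytes. A
threshold above that keeps the server from asking every client for
such a small refill.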

> While I was at it, I realized there's a new routine since 0.9.16
> called pa_stream_begin_write(). The comments are not totally clear:
> - "This function may be used to optimize the number of memory copies
> when doing playback ("zero-copy")". I thought we were already using
> zero-copy? What's saved here compared to a plain pa_stream_write()?

For local clients, if you use pa_stream_write() the data you pass in
is copied into an SHM segment, which is then handed to the
server. If you use pa_stream_begin_write() you can do without that
copy and place your data directly inside the SHM segment.
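
The difference between the two paths, as a libpulse-free sketch.
Everything here is a hypothetical stand-in: shm_seg plays the role
of the shared memory segment and decode_into() the role of your
decoder. In the real API the second path corresponds to calling
pa_stream_begin_write() and then passing the pointer it returned to
pa_stream_write().

```c
#include <stddef.h>
#include <string.h>

#define SEG_SIZE 4096

/* Hypothetical stand-in for the shared memory segment. */
static unsigned char shm_seg[SEG_SIZE];
/* Counts the extra memcpy() calls made on the way to the segment. */
static size_t copies_done = 0;

/* Toy decoder: fills dst with n bytes of a ramp pattern. */
static void decode_into(unsigned char *dst, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char) (i & 0xff);
}

/* Path 1: decode into a private buffer, then copy into the segment.
 * This models what a plain write implies for a local client. */
static void write_with_copy(size_t n) {
    unsigned char private_buf[SEG_SIZE];
    if (n > SEG_SIZE)
        n = SEG_SIZE;
    decode_into(private_buf, n);
    memcpy(shm_seg, private_buf, n);
    copies_done++;
}

/* Path 2: decode straight into the segment, no intermediate copy.
 * This models the begin_write idea. */
static void write_in_place(size_t n) {
    if (n > SEG_SIZE)
        n = SEG_SIZE;
    decode_into(shm_seg, n);
}
```

Both paths leave the same bytes in the segment; only the first one
pays for the memcpy(). This only helps if your decoder can write to
a destination pointer it is handed, which is the mmap-like
constraint.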

> Looks to me this is only useful if the application can use the buffer
> internally (sort of mmap-like).

Yes, it is very similar to ALSA's mmap iface.

> - 'It is not recommended letting an unbounded amount of time pass
> after calling pa_stream_begin_write() and before calling
> pa_stream_write()'. Not recommended as in not safe, or performance
> would degrade, or we would run out of memory?

Things are not designed so that this would be a good idea, i.e. we
take the liberty of shifting memory around if we want to. That is
blocked while you have this write outstanding.

OTOH you might get away with blocking it for an indefinite time, but
really, that's not how it's supposed to be used. And maybe one day this
will actually turn out to be a problem in your app.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net
http://0pointer.net/lennart/           GnuPG 0x1A015CC4