[gst-devel] pulsesink optimizations

Wed Oct 14 23:48:47 CEST 2009

pl bossart wrote:
> Hi folks,
> I noticed performance issues due to the rewrite of pulsesink since the
> 0.10.15 release. The degradation is in the 30% range on my Atom board
> when playing MP3/AAC. There have been a couple of modifications in git
> related to buffer attributes and latency settings, but overall the
> overhead remains, and the pulsesink code could be further optimized
> for low-power playback apps that don't care about latency.

I noticed the same on the Nokia N900.

> I finally took the time to look at the code and check what was going
> on. It seems that the overhead is mainly due to the granularity of
> transfers between pulsesink and PulseAudio. What happens is that the
> sink waits for space available in the PulseAudio buffer. When PA
> requests data in a callback, the mainloop unblocks and the sink writes
> its PCM to PulseAudio. The problem is that the sink will not try to
> fill the whole buffer before handing off the data to PulseAudio. For
> example, say PulseAudio requests 100k (as defined by minreq) and you
> are doing MP3 decode, you are going to send one frame (4608 bytes) at
> a time to PulseAudio until the 100k have been filled. That's a lot of
> overhead. It would be a lot more efficient power-wise to decode and
> store as many frames as possible into the PA buffer before calling
> pa_stream_write().

Wim just committed my patch that changes pulsesink back to set the minreq to 
the value of the latency-time property, which lets applications tune the 
gst<->pa overhead again.
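
To put rough numbers on the example quoted above, here is a small sketch in
plain C (no PulseAudio calls). It shows how many writes one request takes at a
given write size, and how a latency-time in microseconds maps to a byte count
that could be used for minreq. The helper names are hypothetical, the
44.1 kHz / stereo / 16-bit format in the usage note is an assumption, and
"100k" is taken as 100 * 1024 bytes; the real pulsesink code may compute these
values differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of writes needed to satisfy one request of
 * request_bytes when each write carries chunk_bytes (rounded up). */
static size_t writes_per_request(size_t request_bytes, size_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    return (request_bytes + chunk_bytes - 1) / chunk_bytes;
}

/* Hypothetical helper: bytes of PCM covering latency_us microseconds,
 * rounded down to a whole number of sample frames. */
static size_t latency_us_to_bytes(uint64_t latency_us, unsigned rate,
                                  unsigned channels, unsigned bytes_per_sample)
{
    size_t frame_size = (size_t)channels * bytes_per_sample;
    uint64_t frames = (uint64_t)rate * latency_us / 1000000u;
    return (size_t)frames * frame_size;
}
```

With a 102400-byte request and 4608-byte MP3 frames, writes_per_request()
gives 23 writes, against a single write if the whole request is filled first.
For the assumed format, latency_us_to_bytes(10000, 44100, 2, 2) gives 1764
bytes.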

During the investigation of that regression, I found that there are further 
things to optimize in pulsesink. I will be filing more bugs and sending more 
patches as I come up with better solutions.

> I have snippets of code as a proof of concept. I don't mind releasing
> the code, but I must admit this is a hack and does not cover all the
> cases pulsesink addresses. An additional optimization could consist in
> passing the PulseAudio buffer upstream to avoid memory copies. The new
> PA release provides support for this with pa_stream_begin_write(). In
> short, I would badly need a review from more experienced developers...
> If anyone is interested let me know.
> 
> Cheers,
> - Pierre

Using that API is a step in the right direction. However, there is still a lot 
to do. GStreamer desperately needs a zero-copy mechanism for audio, so that the 
audio decoders' output buffer sizing doesn't incur arbitrary overhead.

For the time being, I think you can get almost the same performance/battery 
life gain by increasing the output buffer size of your audio decoders. Felipe 
Contreras has been trying this with the vorbis decoder, with good results.
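
The effect of larger decoder output buffers can be sketched as a small
accumulator that collects decoded frames and hands them downstream only when
it is full. This is not GStreamer or PulseAudio code: the struct, the function
names and the fixed capacity are all hypothetical, and the flush callback
stands in for whatever pushes the buffer to the sink. Note that it still
copies each frame once, so it reduces the number of handoffs but is not
zero-copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACC_CAPACITY (16 * 4608)  /* assumed: room for 16 MP3-sized frames */

typedef void (*flush_fn)(const unsigned char *data, size_t len, void *user);

typedef struct {
    unsigned char buf[ACC_CAPACITY];
    size_t used;     /* bytes currently pending in buf */
    flush_fn flush;  /* called once per handoff */
    void *user;      /* opaque pointer passed to flush */
} accumulator;

static void acc_init(accumulator *a, flush_fn flush, void *user)
{
    a->used = 0;
    a->flush = flush;
    a->user = user;
}

/* Hand any pending bytes downstream in a single call. */
static void acc_flush(accumulator *a)
{
    if (a->used > 0) {
        a->flush(a->buf, a->used, a->user);
        a->used = 0;
    }
}

/* Append one decoded frame; flush first if it would not fit. */
static void acc_push(accumulator *a, const unsigned char *frame, size_t len)
{
    assert(len <= ACC_CAPACITY);
    if (a->used + len > ACC_CAPACITY)
        acc_flush(a);
    memcpy(a->buf + a->used, frame, len);
    a->used += len;
}

/* Example flush callback: only counts how often it is called. */
static void count_flush(const unsigned char *data, size_t len, void *user)
{
    (void)data;
    (void)len;
    ++*(size_t *)user;
}
```

Pushing 32 frames of 4608 bytes through acc_push() and then calling
acc_flush() results in two handoffs instead of 32.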

-- 
Regards,
   René Stadler