[pulseaudio-discuss] GSoC 2014 call for ideas

Tanu Kaskinen tanu.kaskinen at linux.intel.com
Fri Feb 7 08:29:11 PST 2014

On Fri, 2014-02-07 at 17:44 +0600, Alexander E. Patrakov wrote:
> 2014-02-07 10:17 GMT+06:00 Arun Raghavan <arun at accosted.net>:
> > Hello,
> > This year's call for projects participation is out, and I'd like to
> > gauge interest in participation. I'm happy to run org admin duty
> > again, and if you've got ideas for a project and/or would like to
> > mentor a student, please drop your name on the wiki:
> >
> > http://www.freedesktop.org/wiki/Software/PulseAudio/Software/PulseAudio/GSoC2014/
> >
> > We should decide one way or the other by mid next week so that we can
> > get our org application in well in time if we're doing this.
> Hello.
> The following is mostly a copy-paste from the ideas that I have
> already sent to the list or privately to people, plus a direct
> translation of some features provided by hardware. The text, of
> course, needs to be improved before the final submission to Google.

Thanks, good project ideas! Note that the ideas are not sent to Google;
students will read them directly from our own wiki.

> I would be happy to review any related code.
> 1. Tool for objective automated noninteractive evaluation of the
> perceived resampler quality.

Would be nice to have, but I don't volunteer to mentor this.

> Problem statement: in commit 92bb9fb8b5aeebb87c4df7416e75db1782e2dd3a,
> the default resampler quality has been changed without any objective
> arguments about the impact on the perceived sound quality. And there
> is no tool to make such objective arguments, although there is enough
> science to create it. It should be created.
> The task, as I see it, is to:
> a) implement a well-respected published psychoacoustical model, or
> take an existing one;
> b) quantify distortions (noise from rounding errors on intermediate
> results, unwanted aliased frequency content, attenuated high
> frequencies) introduced by the existing resamplers - i.e. write a
> program that, given a sound file and the target sample rate, produces
> the dB level of the distortion introduced by a given resampler in each
> time interval at each frequency bin; bonus points for doing the same
> for Windows and Mac OS X built-in resamplers (definitely doable by
> capturing their impulse response through KVM; I did this for Windows
> before writing the Wine resampler, but did not link it to any
> psychoacoustical model);
> c) given a variety of real-world sound material (music of different
> genres, soundtracks, talks) and a psychoacoustical model, calculate
> the dB level of distortion that can be introduced in each time
> interval in each frequency bin without the average human noticing
> this;
> d) compare the results from (b) and (c), make one of the conclusions:
> "overkill", "just right", "introduces noticeable distortion in this
> frequency band, here is the problematic sample".
> <off-topic>I am quite surprised that there was no "audiophile"
> discussion on the list or elsewhere, especially since the old default
> filter length closely matched what Windows XP does by default (I can
> state that as an author of the resampler used in Wine). But I can't
> make any statements about whether the new default is good enough
> without the mentioned tool.</off-topic>
> Contacts: Alexander E. Patrakov
> Necessary background: digital sound processing, access to scientific
> papers on the topic, python with numpy and scipy, or any other
> mathematical toolbox. If I were to do this, numpy/scipy would be my
> toolbox of choice.
> 2. Rewind-friendly resampler.

I could probably mentor this.

> Problem statement: As of now, when rewinding a sink input, PulseAudio
> resets the resampler. This is wrong and leads to audible clicks, but
> this is a necessary evil because none of the resampler libraries used
> by PulseAudio has a rewind-compatible API (i.e. the existing APIs
> don't allow one to say "forget the last 1000 input samples, tell me how
> many output samples should be forgotten due to that"). A new resampler
> has to be written or an existing one improved to such a degree that
> calling pa_stream_write() with the last two parameters other than 0,0
> and overwriting the previously-written samples with themselves does
> not introduce clicks. Just as well, if a sink processes a rewind for
> internal reasons, there should be no clicks.
> Contacts: Alexander E. Patrakov
> Necessary background: digital sound processing, C
> Note: a similar problem exists with virtual sink modules:
> module-equalizer-sink and module-virtual-surround-sink. However, the
> next two proposals invalidate a "fix virtual sinks" would-be-proposal,
> as after them only essentially-realtime effects and module-ladspa-sink
> remain.
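
The "forget N input samples, how many output samples go with them" query
from proposal 2 can be sketched as pure bookkeeping. This is only an
illustration under a strong assumption: an idealized resampler with no
filter history or latency, which no real resampler satisfies. The function
names are hypothetical and are not part of any PulseAudio or resampler
library API.

```python
def output_frames(input_frames: int, in_rate: int, out_rate: int) -> int:
    """Frames an idealized, zero-latency resampler has emitted so far.

    Assumption: output count is simply floor(input * out_rate / in_rate).
    """
    return input_frames * out_rate // in_rate


def frames_to_forget(consumed: int, rewind: int,
                     in_rate: int, out_rate: int) -> int:
    """Output frames invalidated by rewinding `rewind` input frames.

    `consumed` is the total number of input frames fed in before the rewind.
    """
    if not 0 <= rewind <= consumed:
        raise ValueError("cannot rewind more input than was consumed")
    before = output_frames(consumed, in_rate, out_rate)
    after = output_frames(consumed - rewind, in_rate, out_rate)
    return before - after


# Rewinding the same 1000 input frames at two different stream positions.
late = frames_to_forget(44100, 1000, 44100, 48000)
early = frames_to_forget(1000, 1000, 44100, 48000)
print(late, early)
```

The answer depends on the stream position, not only on the rewind length,
so the resampler has to track its fractional phase to answer the query. A
real implementation would also have to restore its filter history to the
rewound position, which is the hard part this sketch leaves out.
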
> 3. Equalizer in pavucontrol (very questionable, see below)

I don't volunteer to mentor this.

> Problem statement: As of now, the only graphical frontend to
> module-equalizer-sink is qpaeq (PyQt4-based). A GTK-based
> frontend should be written and included into pavucontrol.
> Contacts: Colin Guthrie?
> Necessary background: C, GTK+, D-Bus
> And here is why I think this is questionable. First, look at
> module-equalizer-sink code. The impression is that it has been
> accepted without any review. It just pretends to "work". E.g. a buffer
> is allocated with fftwf_malloc() and freed with free() instead of
> fftwf_free(). The code is also wrong from the DSP viewpoint - e.g. it
> does nothing to ensure that the impulse response is shorter than the
> FFT size minus the window size, thus failing time invariance. If the
> sink is used at the 16000 Hz sampling rate or less, there is a buffer
> overflow due to inconsistent choice of the FFT length and the window
> size. The algorithmic latency is fixed at 15999 samples, which is way
> too much. The module does not use any benefits (e.g. the chance to
> handle rewinds properly) of being a native PulseAudio module and not a
> LADSPA plugin. Veromix (an advanced mixer application for PulseAudio)
> already uses module-ladspa-sink instead of this, maybe due to the
> unified D-Bus API provided by module-ladspa-sink that allows veromix
> to control other LADSPA plugins as well. If I were you, I would have
> deleted the module right now instead of proposing this GSoC project.
> But then, "implement an equalizer in pavucontrol, using
> module-ladspa-sink as a backend" would be valid.
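
The time-invariance complaint about FFT size versus impulse response length
can be shown with a toy example. The sketch below uses the textbook
overlap-add bound, fft_size >= block_len + ir_len - 1, and a direct
(non-FFT) circular convolution that models what an n-point FFT multiply
computes. It is not taken from module-equalizer-sink, and all names are
hypothetical.

```python
def linear_convolve(x, h):
    """Ordinary convolution: what a correct filter should output."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h, n):
    """Convolution with indices wrapped modulo n.

    This is the result an n-point FFT, multiply, inverse FFT produces.
    """
    y = [0.0] * n
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[(i + j) % n] += xi * hj
    return y


def min_fft_size(block_len, ir_len):
    """Smallest FFT size for which the wrapped result equals the linear one."""
    return block_len + ir_len - 1


block = [1.0, 2.0, 3.0, 4.0]   # one block of input samples
ir = [1.0, 1.0, 1.0]           # toy impulse response

print(linear_convolve(block, ir))        # reference output
print(circular_convolve(block, ir, 6))   # large enough: matches reference
print(circular_convolve(block, ir, 4))   # too small: tail wraps to the front
```

When the FFT is too small, the tail of each block's response is folded back
onto the start of the same block instead of overlapping into the next one,
so the output depends on where block boundaries fall.
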
> 4. Channel remixer improvements. (needs splitting, see even more ideas
> in a big comment in resampler.c)

I could mentor this. The first thing to do is to add the infrastructure
for supporting multiple remixer implementations. Then a hook should be
added to make it possible to choose the remixer implementation on a
case-by-case basis from a policy module. Then we can start adding fancy
remixer implementations.

> Problem statement: currently, PulseAudio has a remixer in its core
> that only produces instantaneous linear combinations of the input
> channels, and also module-virtual-surround-sink, that, given a wav
> file with head-related impulse responses, downmixes 5.1 to stereo
> while preserving spatial information. These two remixers have a bad
> interaction between themselves and with profile switches, see below,
> and this looks ugly-to-fix in the virtual-sink model. The goal is to
> introduce advanced upmixing and downmixing techniques into PulseAudio
> core.
> Bad interaction: suppose that one plays a 4.0 track through
> module-virtual-surround-sink. Module-virtual-surround-sink is a sink,
> so PulseAudio applies its usual remixing to all input streams using
> its core. Thus, module-virtual-surround-sink sees not the original 4.0
> content that needs to be downmixed, but fake 5.1 content corrupted by
> synthesizing the fake center and LFE channels. PA_RESAMPLER_NO_REMIX
> would help here, but introduces another problem, this time with normalization.
> Again, there is no way to distinguish this from a 5.1 stream with a
> silent center channel. Ideally, for safety, the overall filter gain in
> the HRIR-aware downmixer should be such that there is no clipping even
> if all input channels are active - and that gain is different for 5.1
> and 4.0 cases, just because of the different number of channels. The
> core remixer erases this information.
> Now consider that this listener unplugs headphones. Now the sound
> should go to his 5.1 audio system, but instead continues to play on
> this downmixer sink and gets further upmixed by the core. That's
> clearly wrong.
> The same conclusion about profile interaction can be reached by
> considering module-equalizer-sink. It does not switch its number of
> channels even if its master sink does that. As a result, 2.0 -> 5.1 ->
> 2.0 yo-yo is entirely possible (and of course unwanted).
> Writing a fancy upmixer based on reverse-engineered Dolby Pro Logic or
> on scientific papers is also within the scope of this project.
> Current status: I have a rewritten (and rewind-friendly!) virtual sink
> module sitting on my laptop that applies arbitrary IIR filters. I will
> send it after cleanup of scripts that generate the filter
> coefficients. This is already good enough to provide LFE channel
> extraction, to replace the virtual surround sink and even to provide
> virtual surround effect on my laptop speakers, but of course not good
> enough to solve the profile-related problem.
> Contact: Alexander E. Patrakov
> Necessary background: digital sound processing, C
> Possible split:
>  * integrate multichannel-to-binaural HRIR-based downmixer into core
> (possibly after I publish the IIR sink) (and maybe allow its use even
> for stereo streams, to narrow them down if the user wants it)
>  * integrate binaural-to-stereo remixer into core (when I publish the
> IIR sink, or based on the published ambiophonics research)
>  * integrate LFE extraction into core (when I publish the IIR sink, or
> independently)
>  * write and integrate a fancy stereo-to-5.1 upmixer based on published research
>  * integrate heuristics to apply and unapply the above effects appropriately
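
The normalization argument in proposal 4 (a clipping-safe gain that differs
between 4.0 and 5.1 input) can be illustrated with a worst-case bound. The
sketch assumes input samples bounded by 1.0 in magnitude, so the peak at
one ear is at most the sum of the absolute values of all taps feeding that
ear. The toy impulse responses and function names are made up for
illustration and are not real HRIR data.

```python
def safe_gain(hrirs):
    """Largest gain that cannot clip, even with all channels at full scale.

    hrirs maps a channel name to (left_ear_taps, right_ear_taps).
    """
    worst = 0.0
    for ear in (0, 1):
        # Worst-case peak at this ear: every tap contributes its magnitude.
        peak = sum(sum(abs(t) for t in pair[ear]) for pair in hrirs.values())
        worst = max(worst, peak)
    if worst == 0.0:
        raise ValueError("all impulse responses are silent")
    return 1.0 / worst


# Hypothetical toy response, reused for every channel (absolute sum 1.0 per ear).
toy = ([0.5, 0.5], [0.5, -0.5])

quad = {ch: toy for ch in ("FL", "FR", "RL", "RR")}
surround51 = {ch: toy for ch in ("FL", "FR", "FC", "LFE", "RL", "RR")}

print(safe_gain(quad))
print(safe_gain(surround51))
```

With these toy filters the 4.0 layout allows a gain of 1/4 and the 5.1
layout only 1/6. A downmixer that sees 4.0 content already padded to 5.1 by
the core remixer has to use the lower gain.
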
> 5. Per-channel delay (probably too simple)

I don't volunteer to mentor this.

> Problem statement: some high-end audio receivers (e.g. Onkyo TX-NR626)
> have an option to introduce a separately-configurable delay in each
> channel. This is needed, e.g., if due to the room geometry constraints
> the speakers are not equidistant from the listener. This happens,
> e.g., with the front-center channel if one places all three front
> speakers near the wall - in this case, the front-center signal needs
> to be slightly delayed WRT front-left and front-right in order to
> arrive at the listener at precisely the same moment. It would
> be nice to emulate this feature in PulseAudio for the benefit of users
> with cheap 5.1 analog speakers, and provide a GUI for it.
> Contact: Alexander E. Patrakov (?)
> Necessary background: C, GTK+.
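
The delay needed for each channel in proposal 5 follows from the speaker
distances. The sketch below assumes a speed of sound of 343 m/s (dry air at
about 20 C) and rounds to whole sample frames. The distances and function
name are hypothetical examples, not measurements or an existing PulseAudio
interface.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 C


def channel_delays(distances_m, sample_rate):
    """Delay, in frames, to add to each channel so all arrive together.

    The farthest speaker gets no delay; closer speakers are delayed by the
    time sound needs to cover the difference in distance.
    """
    farthest = max(distances_m.values())
    return {
        channel: round((farthest - d) / SPEED_OF_SOUND_M_S * sample_rate)
        for channel, d in distances_m.items()
    }


# Three front speakers along a wall: the center one is closer to the listener.
delays = channel_delays(
    {"front-left": 3.0, "front-right": 3.0, "front-center": 2.7},
    sample_rate=48000,
)
print(delays)
```

A 30 cm difference is under a millisecond, about 42 frames at 48 kHz, so a
short per-channel ring buffer would be enough to implement it.
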
> 6. Digital Room Correction for PulseAudio

I don't volunteer to mentor this.

> Problem statement: some high-end audio receivers (e.g. Onkyo TX-NR626)
> do not even have a graphical equalizer! Instead, they come with a
> calibrated microphone and a digital room correction feature in the
> firmware. They play a known test sound through each speaker, record
> what the microphone hears, and thus learn about the room acoustics.
> Then they apply this knowledge to equalize the played-back sound. This
> feature should be available for users of analog speakers, too, via
> PulseAudio.
> In fact, there already exists a free implementation of Digital Room
> Correction: http://drc-fir.sourceforge.net/ , one just needs to write
> a FIR convolution engine for PulseAudio and a GUI for calibration. And
> also to think how to work around the fact that a calibrated microphone
> is not always available - luckily there are some readily-available
> "calibrated" sound sources like popping bubble wrap.
> Contact: Alexander E. Patrakov
> Necessary background: C, digital sound processing, a calibrated microphone.
> 7. Intra-application sound mixing (needs discussion, may be a social
> problem after all)

I could mentor this.

> Some time ago, I added a documentation patch (with some improvements
> from Tanu) about known misuse of PulseAudio API. As a part of that
> patch, I made a far-fetched but IMHO true statement that sometimes it
> is a responsibility of the application itself to mix its own streams
> (as it is done in Wine) or to attenuate samples. However, I am afraid
> that this will be perceived as a documentation of a PulseAudio bug
> (inability to mix individual application streams without polluting the
> mixer GUIs with extra sliders) that just shifts the responsibility and
> extra work to individual developers. Also, this documentation is not
> read by developers that use PulseAudio not directly, but via wrappers
> like GStreamer and Qt, so a source of "application bugs" still exists.
> To be fair, in GStreamer the problem looks solved: "audiomixer"
> performs synchronous in-application mixing - just what is needed. But
> not everyone uses or wants to use GStreamer. So I think that there is
> some room for improvement in PulseAudio itself.
> Problem statement: add API functions to PulseAudio that would allow an
> application to request that its streams are mixed together without
> showing a separate volume slider for each of them in pavucontrol and
> similar PulseAudio mixer applications.
> Contact: ?
> Required background: C
> 8. LV2 sink (maybe too simple)

I don't volunteer to mentor this.

> Problem statement: LV2 is a successor of LADSPA. Plugin authors move
> to the new API, but PulseAudio does not have any way to load these
> plugins and use them for sound processing. A new virtual sink needs to
> be written, as well as a GUI (possibly integrated into pavucontrol).
> Contact: Alexander E. Patrakov
> Required background: C, GTK+
> 9. Dynamic range compression (maybe already solved)

I don't volunteer to mentor this.

> Problem statement: some consumer electronics (e.g. the Onkyo TX-NR626
> receiver) have a mode in which they reduce the dynamic range of the
> incoming signal. This is supposed to be used when listening to
> classical music at night, so that neighbours don't wake up and the
> quietest passages are still audible. Make this feature available to
> users of cheap analog speakers, via PulseAudio. Write a GUI for
> configuring it.
> This may be already solved by vlevel LADSPA plugin (I have not tried
> it), but needs GUI integration and heuristics to apply this only to
> high-latency music streams from players. And possibly a port is needed
> for rewind compatibility, but I am not sure here if this is possible at
> all.
> Contact: ?
> Required background: C, GTK+, digital signal processing (?)
> 10. GUI for module-combine-sink and module-remap-sink

I don't volunteer to mentor this. When the routing work is complete,
module-combine-sink should become largely obsolete. The core should
directly support routing streams to multiple outputs.
