[pulseaudio-discuss] Testing echo cancellation on an armhf OMAP phone

Wed Dec 19 23:25:42 PST 2012

On Wed, 2012-12-19 at 07:41 +0000, Neil Jerram wrote:
> Arun Raghavan <arun.raghavan at collabora.co.uk> writes:
> 
> > On Mon, 2012-12-17 at 21:49 +0000, Neil Jerram wrote:
> > [...]
> >> - load module-echo-cancel
> >> 
> >> - do "paplay -d
> >>   alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
> >>   /media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
> >>   in one terminal
> >> 
> >> - do "parecord -d
> >>   alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
> >>   --file-format=wav > record1.wav" in another terminal
> >> 
> >> - speak into the microphone.
> >
> > In general, to start with, you should pick a recording of voice rather
> > than music since that's the sort of echo that is designed to be
> > cancelled. I've noticed varying degrees of success for music with speex
> > and much better success with the webrtc canceller, but starting with the
> > basics is better.
> 
> Good point, thanks, I'll do that.  Also I realise now that I really want
> the entire process of in-call audio routing to be running at 8000 only -
> because that's all I need for voice, and because I presume that should
> take less power than involving higher rates.
> 
> Overall, for this phone, I have two audio scenarios.
> 
> -  In-call audio, which can/should all be handled at 8000.
> 
> -  Media playback outside calls, which I think should be at 44.1 kHz for
>    best quality.
> 
> Is it possible for a single instance of PulseAudio to switch between
> those scenarios?  If not, I think I can pretty easily stop and restart
> PulseAudio when the scenario changes.  (I'm guessing from your and
> Tanu's other replies to me that I might need to restart with different
> default-sample-rate settings, to get the best outcome and performance
> for my two scenarios.)

Restarting pulseaudio would be an atrocious hack. I really doubt that it
can work well.

Anyway, I recommend that you start by configuring the sound card for 48
kHz and module-echo-cancel for 8 kHz.
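
As a rough sketch of the module-echo-cancel half of that, assuming the
module is not already loaded (unload it first if it is): "rate" is a
module-echo-cancel argument, while the function name is just made up for
illustration. The 48 kHz side is left out here because it depends on how
the card gets loaded on this phone.

```shell
#!/bin/sh
# load_echo_cancel_8k: load the echo canceller with its streams at 8 kHz.
# "rate" is a module-echo-cancel argument; the sound card's own rate is
# configured separately and is not touched here.
load_echo_cancel_8k() {
    pactl load-module module-echo-cancel rate=8000
}
```

Calling load_echo_cancel_8k prints the index of the new module, which
can be passed to "pactl unload-module" later.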

The sound card appears to support both 44.1 kHz and 48 kHz (but when
using both input and output at the same time, the rates must match).
That leaves some room for optimization: for music playback 44.1 kHz
would normally be better, but during phone calls 48 kHz would probably
be better (48 kHz is an integer multiple of 8 kHz, so resampling between
them should be easier than between 44.1 kHz and 8 kHz, but I don't know
if the resamplers in pulseaudio are able to optimize the 48/8 kHz case
in practice).
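
To make the "easier" part concrete, here is a small sketch that reduces
each resampling ratio to lowest terms. It only does the arithmetic; it
says nothing about what the pulseaudio resamplers actually do with these
ratios. The function names are made up for illustration.

```shell
#!/bin/sh
# Greatest common divisor of two positive integers (Euclid's algorithm).
gcd() {
    a=$1; b=$2
    while [ "$b" -ne 0 ]; do
        t=$((a % b)); a=$b; b=$t
    done
    echo "$a"
}

# Print the ratio of two sample rates in lowest terms.
ratio() {
    g=$(gcd "$1" "$2")
    echo "$(( $1 / g ))/$(( $2 / g ))"
}

ratio 48000 8000   # prints 6/1
ratio 44100 8000   # prints 441/80
```

Going from 48 kHz to 8 kHz is a plain factor of 6, whereas 44.1 kHz to
8 kHz is the much more awkward 441/80.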

Switching between 44.1 kHz and 48 kHz would ideally be done by making
two different card profiles, which you would switch between when the
current scenario changes. It's not currently possible to specify the
sample rate in the profile configuration, however, so this is not viable
right now.

Pulseaudio supports automatic sample rate switching depending on the
connected streams (set default-sample-rate to 44100 and
alternate-sample-rate to 48000, as they are by default), and this
would be a great solution if it weren't for the fact that the input and
output must always have matching rates. The sample rate switching logic
isn't able to take that into account, so if the output is active when
you change from the music playback scenario to the phone call scenario,
the switch probably won't work. It might be possible to make this work
by forcefully suspending both the sink and the source before changing
the scenario, then tearing down the music stream and starting the phone
streams, and finally unsuspending the sink and the source.
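
The suspend/swap/unsuspend sequence could be sketched roughly like this.
It is untested on this hardware and may well not work. The sink and
source names are assumptions (the echo-cancel device names from earlier
in the thread with ".echo-cancel" removed), and switch_scenario and its
argument are hypothetical placeholders for whatever actually swaps the
streams.

```shell
#!/bin/sh
# Assumed device names; check "pactl list sinks" and "pactl list sources"
# on the phone and adjust.
SINK=alsa_output.platform-soc-audio.0.analog-stereo
SOURCE=alsa_input.platform-soc-audio.0.analog-stereo

# switch_scenario CMD [ARGS...]
# CMD is a placeholder for whatever tears down the music stream and
# starts the phone streams; it runs while both devices are suspended.
switch_scenario() {
    # 1. Forcefully suspend both directions.
    pactl suspend-sink "$SINK" 1
    pactl suspend-source "$SOURCE" 1

    # 2. Swap the streams while the devices are suspended.
    "$@"
    status=$?

    # 3. Unsuspend; the hope is that the card then reopens at the rate
    #    the new streams ask for.
    pactl suspend-sink "$SINK" 0
    pactl suspend-source "$SOURCE" 0
    return "$status"
}
```

It would be called with the stream-swapping command as its arguments,
for example "switch_scenario ./start-call-audio.sh" (a made-up script
name).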

If the sound card supports 8 kHz, then the above still applies; just
replace 48000 with 8000.

By the way, the failure to resume the source, which you reported
earlier, should be avoidable by recording at a sample rate that is
divisible by 4000 (e.g. 8 kHz or 48 kHz should work fine). You didn't
specify the sample rate in your parecord command, so it defaulted to
44100 Hz. That caused pulseaudio to try to resume the source at 44.1
kHz. If you had used e.g. 8 kHz, pulseaudio would have tried to resume
the source at 48 kHz, which would probably have worked.
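
For example, the parecord command from earlier in the thread with an
explicit rate added might look like the sketch below. Only --rate=8000
is new; the wrapper function name is made up for illustration, and
whether this really avoids the resume failure is untested.

```shell
#!/bin/sh
# record_call_test FILE
# Same parecord invocation as earlier in the thread, plus an explicit
# 8 kHz rate so the stream no longer defaults to 44100 Hz.
record_call_test() {
    parecord -d alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel \
        --rate=8000 --file-format=wav > "$1"
}
```

Run it as "record_call_test record1.wav" and stop it with Ctrl-C when
done.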

-- 
Tanu