GSoC idea: Echo cancellation for voice calls.
dev at stefankriwanek.de
Mon Mar 19 21:10:14 EDT 2012
I'm glad you decided to apply this year! I'd like to ask for your opinions on an idea I would like to work on.
My idea is to equip Pidgin (or libpurple) with a means to do echo cancellation in the audio stream. I think this could drastically improve voice call quality, or make calls possible at all in difficult settings.
Since there is actually not a single idea related to voice/video in your GSoC ideas list, I'm trying to provide a detailed rationale:
What is echo cancellation? To quote from http://blogs.gnome.org/uraeus/2010/10/07/echo-cancellation-on-linux/ :
[Echo cancellation] is a way to resolve the issue that if you record sound from your laptop microphone and at the same time output sound from your speakers, you easily end up with the sound looping, creating an irritating echo effect, which makes doing voice calls on a machine painful and sometimes impossible. Echo cancellation systems basically try to analyse the data coming out of the speakers so that it can filter it out and ignore it when it comes back through the microphone.
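To make the quoted description a bit more concrete, here is a minimal sketch of one common way to do this, a normalized LMS (NLMS) adaptive filter. It is only an illustration, not code from Pidgin or from any of the applications mentioned here. The function names, the filter length, the step size and the simulated speaker-to-microphone path are all assumptions chosen for the example.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=16, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    far_end: samples sent to the loudspeaker.
    mic:     samples recorded by the microphone (here: echo only).
    Returns the residual signal that would be sent to the remote party.
    """
    w = [0.0] * taps   # current estimate of the speaker-to-mic path
    x = [0.0] * taps   # most recent far-end samples, newest first
    out = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        predicted_echo = sum(wi * xi for wi, xi in zip(w, x))
        e = d - predicted_echo                      # what is left after removal
        norm = sum(xi * xi for xi in x) + eps       # input power, avoids /0
        g = mu * e / norm
        w = [wi + g * xi for wi, xi in zip(w, x)]   # NLMS weight update
        out.append(e)
    return out


def simulate_echo(n=4000, seed=0):
    """Build a toy far-end signal and the echo it causes at the microphone."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Hypothetical acoustic path: a few delayed, attenuated copies.
    path = [0.0, 0.0, 0.6, 0.0, -0.3, 0.1]
    mic = [
        sum(h * far_end[i - k] for k, h in enumerate(path) if i - k >= 0)
        for i in range(n)
    ]
    return far_end, mic


def power(signal):
    """Mean squared amplitude of a list of samples."""
    return sum(v * v for v in signal) / len(signal)


if __name__ == "__main__":
    far_end, mic = simulate_echo()
    residual = nlms_echo_cancel(far_end, mic)
    print("echo power at mic:     ", power(mic[-500:]))
    print("echo power after NLMS: ", power(residual[-500:]))
```

A real canceller also has to cope with the local speaker talking at the same time (double talk), with much longer echo paths and with clock drift between playback and capture, none of which this sketch attempts.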
In fact, acoustic echo cancellation is a standard part of all the VoIP solutions I know of: Skype, Ekiga, Mumble and GNOME's Empathy.
At the moment, a Pidgin user can control the quality of a voice call only to a limited extent, by manually adjusting microphone sensitivity and the silence detection threshold (and, of course, output volume).
Pidgin's lack of echo cancellation makes successful voice calls impossible in some situations, because an audio feedback loop occurs (the high-pitched squealing noise you probably know from any kind of live performance). The worst case I can think of is a (popular?) laptop-to-laptop call, which has both very strong indirect coupling (through the air) and (distorted) through-the-case coupling of loudspeaker output to microphone input.
This means that a user who wants to make such a call has to set the silence detection threshold very high in order to prevent an audio feedback loop from occurring. However, the threshold is then so high that the first and last parts of spoken words or sentences are probably lost, or, worse, all voice. The other way to prevent a feedback loop is to lower the loudspeaker volume, but obviously that cannot be reduced by an arbitrary amount either.
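The trade-off with the silence detection threshold can be illustrated with a toy model. This is not how Pidgin implements silence detection; the function name and the per-frame levels below are invented for the example, and the gate simply mutes every frame whose level is below the threshold.

```python
def gate_frames(frame_levels, threshold):
    """Mute (set to 0.0) every frame whose level is below the threshold."""
    return [lvl if lvl >= threshold else 0.0 for lvl in frame_levels]


def frames_kept(gated):
    """Count the frames that passed the gate."""
    return sum(1 for lvl in gated if lvl > 0.0)


# Hypothetical levels of one spoken word: soft onset, loud middle, soft decay.
word = [0.05, 0.15, 0.40, 0.80, 0.90, 0.70, 0.35, 0.12, 0.04]

low = gate_frames(word, threshold=0.10)    # normal setting
high = gate_frames(word, threshold=0.50)   # raised to suppress feedback

if __name__ == "__main__":
    print("frames kept, low threshold: ", frames_kept(low), "of", len(word))
    print("frames kept, high threshold:", frames_kept(high), "of", len(word))
```

With the low threshold only the very quiet first and last frames are dropped; with the high threshold only the loud middle of the word survives, which is the clipped beginning and end described above.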
My point is that there are some popular hardware setups where voice calls are simply impossible with current Pidgin. However, Skype, for example, handles the same situation just fine, which means that with the proper software adjustments a good-quality voice call is perfectly possible even in the worst case of two laptop users talking to each other.
Additionally, even if the acoustic setup is good enough that no feedback loop occurs and there is "only" audible echo, this still severely hinders our brain's ability to recognize speech (there are experimental studies on this, for which I do not have a link at hand right now). This is due to the high delay in VoIP calls. In "analog" PSTN calls the delay is much lower, which is one of the reasons why acoustic echo cancellation is usually unnecessary there. The other reason is that the sound of an earphone can be much quieter than the sound of a laptop speaker in a voice call.
Also, proper echo cancellation will probably prevent feedback loops in any setup, including the acoustically even worse "testing" case of a user speaking to herself to adjust microphone sensitivity.
If you like the idea, I'd be happy to elaborate upon the implementation part.
At the same time, I'm curious as to why there seems to be neither a VV GSoC idea on your page nor any VV-related mail on this list (during the last year). Is there simply no one maintaining that part of Pidgin, or is it a conscious decision not to develop it further, perhaps because specialized VoIP applications will probably do a better job? In my opinion, there's definitely a need for an open source voice/video call solution that "just works" (a solution which is not separated from the user's usual IM application).