If you’re around and interested, do drop by!
I have a secret to confess. I’ve spent a great deal of time over the last few months talking to myself. I can’t say I haven’t enjoyed it — it turns out my capacity to entertain myself is far greater than initially suspected. But I hear you ask … why?
Here at Collabora, I’ve been building on Wim’s previous work on adding echo cancellation to PulseAudio. Thanks go to Intel for supporting us in continuing this work. Before too long, all this work will be trickling down to your favourite Linux distribution and all your friends will stop hating you.
First, a quick recap on what acoustic echo cancellation (AEC) is. If you already know this, you might want to skip this paragraph and the next. Say you’re on your laptop, and you receive a voice call from your friend. You don’t have a pair of headphones lying around, so you’re just going to use your laptop’s built-in speakers and mic. When your friend speaks, what she says is played out the speakers, but is also captured by the microphone and she gets to hear herself speak, albeit a short while (a few hundred milliseconds or more) later. This is called acoustic echo, and can be frustrating enough to make conversation nigh impossible. There are other types of echo for phone systems, but that’s not interesting to us at the moment.
This problem is common on pretty much all devices that you use to make phone calls. Astute readers will ask why they don’t actually face this problem on their phone. That’s because your phone (or, if you have a cheap phone, your phone company) has special software hidden away that removes the echo before sending your signal along to the other end. On laptops, which are general-purpose hardware, the job of echo cancellation is left to either your operating system (Windows XP onwards, for example) or your chat client (Skype, for example) to provide.
On Linux, we implement echo cancellation as a PulseAudio module (code-ninja Wim Taymans wrote this last year). We use the Speex DSP library to perform the actual echo cancellation. The code’s quite modular, so it’s not very hard to plug in alternate echo cancellers (we even include an alternate implementation, which isn’t quite as effective as Speex).
Recently, we plugged in some more bits from the Speex library to do noise suppression and digital gain control (so you can quit twiddling with your mic volume for the other end to be able to hear you). We also added a bunch of fixes to reduce CPU consumption significantly — this should be good enough to run on a netbook and reasonably recent ARM platforms.
While all this sounds nice, I think a demo would sound (haha!) nicer …
This is a recording of a call between my laptop and N900. The laptop is playing audio out the speakers and recording with the built-in mic. What you hear is the conversation as heard on the N900.
All this echo cancelling goodness will come to a Linux distribution near you in the upcoming 1.0 release of PulseAudio. The next version of the GNOME IM client, Empathy (3.2), will actually make use of this functionality. In due time, we intend to make it so that all voice applications will end up using this functionality (so if you’re writing a VoIP application and don’t want to use this functionality, you need to set a special stream property to disable this — filter.suppress="echo-cancel").
For the impatient among you, you can try all this out by getting recent testing versions of PulseAudio (I know packages are available for Ubuntu, Debian, Gentoo and Mageia at least). To force your phone streams to use echo cancellation, just run pactl load-module module-echo-cancel, and you’re done.
There’s still some work to be done, refining quality and using other AEC implementations (in the short-term, the WebRTC one looks promising). Things don’t work at all if you’re using different devices for playback and capture (e.g. laptop speakers and webcam mic). These are things that will be addressed in coming weeks and months.
My exploits at Collabora Multimedia currently involve a brief detour into hacking on Rygel, specifically improving the DLNA profile name guessing. We wanted to use Edward‘s work on GstDiscoverer work, and Rygel is written in Vala, so the first thing to do was write Vala bindings for GstDiscoverer. This turned out to be somewhat easier and more difficult than initially thought. :)
There’s a nice tutorial for generating Vala bindings that serves as a good starting point. The process basically involves running a tool called vapigen, which examines your headers and libraries, and generates a GIR file from them (it’s an XML file describing your GObject-based API). It then converts this GIR file into a “VAPI” file which describes the API in a format that Vala can understand. Sounds simple, doesn’t it?
Now if only it were that simple :). The introspected file is not perfect, which means you need to manually annotate some bits to make sure the generated VAPI accurately represents the C API. These annotations are specified in a metadata file. You need to include things like “the string returned by this function must be freed by the caller” (that’s a transfer_ownership), or, object type Foo is derived from object type FooDaddy (specified using the base_class directive). Not all these directives are documented, so you might need to grok around the sources (specifically, vapigen/valagidlparser.vala) and ask on IRC (#vala on irc.gnome.org).
All said and done, the process really is quite straightforward. The work is in [my gst-convenience repository][arun-gst-conv-ks.git] right now (should be merged with the main repository soon). I really must thank all the folks on #vala who helped me with all the questions and some of the bugs that I discovered. Saved me a lot of frustration!
I’ve already got Rygel using these bindings, though that’s not been integrated yet. More updates in days to come.
Yesterday was my last day at NVidia. I’ve worked with the Embedded Software team there for the last 15 months, specifically on the system software for a Linux based stack that you will see some time next year. I’ve had a great time there, learning new things, and doing everything from tweaking bit-banging I²C implementations with a CRO to tracking down alleged compiler bugs (I’m looking at you -fstrict-aliasing) by wading through ARM assembly.
As some of you might already know, my next step, which has had me bouncing off the walls for the last month, is to join the great folks at Collabora Multimedia working on the PulseAudio sound server. I’ll be working from home here, in Bangalore (in your face, 1.5-hour commute!). It is incredibly exciting for me to be working with a talented bunch of folks and actively contributing to open source software as part of my work!
More updates as they happen. :)