I would like to thank the 36 contributors over the last 6 months who have made this release what it is and continue to demonstrate what a vibrant community we have!
For those of you who missed my previous updates, we recently organised a PulseAudio miniconference in Copenhagen, Denmark last week. The organisation of all this was spearheaded by ALSA and PulseAudio hacker, David Henningsson. The good folks organising the Ubuntu Developer Summit / Linaro Connect were kind enough to allow us to colocate this event. A big thanks to both of them for making this possible!
The conference was attended by the four current active PulseAudio developers: Colin Guthrie, Tanu Kaskinen, David Henningsson, and myself. We were joined by long-time contributors Janos Kovacs and Jaska Uimonen from Intel, Luke Yelavich, Conor Curran and Michał Sawicz.
We started the conference at around 9:30 am on November 2nd, and actually managed to keep to the final schedule(!), so I’m going to break this report down into sub-topics for each item which will hopefully make for easier reading than an essay. I’ve also put up some photos from the conference on the Google+ event.
Mission and Vision
We started off with a broad topic — what each of our personal visions/goals for the project are. Interestingly, two main themes emerged: having the most seamless desktop user experience possible, and making sure we are well-suited to the embedded world.
Most of us expressed interest in making sure that users of various desktops had a smooth, hassle-free audio experience. In the ideal case, they would never need to find out what PulseAudio is!
Orthogonally, a number of us are also very interested in making PulseAudio a strong contender in the embedded space (mobile phones, tablets, set top boxes, cars, and so forth). While we already find PulseAudio being used in some of these, there are areas where we can do better (more in later topics).
There was some reservation expressed about other, less-used features such as network playback being ignored because of this focus. The conclusion after some discussion was that this would not be the case, as a number of embedded use-cases do make use of these and other “fringe” features.
Increasing patch bandwidth
Contributors to PulseAudio will be aware that our patch queue has been growing for the last few months due to lack of developer time. We discussed several ways to deal with this problem, the most promising of which was a periodic triage meeting.
We will be setting up a rotating schedule where each of us will organise a meeting every 2 weeks (the period might change as we implement things) where we can go over outstanding patches and hopefully clear backlog. Colin has agreed to set up the first of these.
Next on the agenda was a presentation by Janos Kovacs about the work they’ve been doing at Intel with enhancing the PulseAudio’s routing infrastructure. These are being built from the perspective of IVI systems (i.e., cars) which typically have fairly complex use cases involving multiple concurrent devices and users. The slides for the talk will be put up here shortly (edit: slides are now available).
The talk was mingled with a Q&A type discussion with Janos and Jaska. The first item of discussion was consolidating Colin’s priority-based routing ideas into the proposed infrastructure. The broad thinking was that the ideas were broadly compatible and should be implementable in the new model.
There was also some discussion on merging the module-combine-sink functionality into PulseAudio’s core, in order to make 1:N routing easier. Some alternatives using te module-filter-* were proposed. Further discussion will likely be required before this is resolved.
The next steps for this work are for Jaska and Janos to break up the code into smaller logical bits so that we can start to review the concepts and code in detail and work towards eventually merging as much as makes sense upstream.
This session was taken up against the background of improving latency for games on the desktop (although it does have other applications). The indicated required latency for games was given as 16 ms (corresponding to a frame rate of 60 fps). A number of ideas to deal with the problem were brought up.
Firstly, it was suggested that the maxlength buffer attribute when setting up streams could be used to signal a hard limit on stream latency — the client signals that it will prefer an underrun, over a latency above maxlength.
Another long-standing item was to investigate the cause of underruns as we lower latency on the stream — David has already begun taking this up on the LKML.
Finally, another long-standing issue is the buffer attribute adjustment done during stream setup. This is not very well-suited to low-latency applications. David and I will be looking at this in coming days.
Merging per-user and system modes
Tanu led the topic of finding a way to deal with use-cases such as mpd or multi-user systems, where access to the PulseAudio daemon of the active user by another user might be desired. Multiple suggestions were put forward, though a definite conclusion was not reached, as further thought is required.
Tanu’s suggestion was a split between a per-user daemon to manage tasks such as per-user configuration, and a system-wide daemon to manage the actual audio resources. The rationale being that the hardware itself is a common resource and could be handled by a non-user-specific daemon instance. This approach has the advantage of having a single entity in charge of the hardware, which keeps a part of the implementation simpler. The disadvantage is that we will either sacrifice security (arbitrary users can “eavesdrop” using the machine’s mic), or security infrastructure will need to be added to decide what users are allowed what access.
I suggested that since these are broadly fringe use-cases, we should document how users can configure the system by hand for these purposes, the crux of the argument being that our architecture should be dictated by the main use-cases, and not the ancillary ones. The disadvantage of this approach is, of course, that configuration is harder for the minority that wishes multi-user access to the hardware.
Colin suggested a mechanism for users to be able to request access from an “active” PulseAudio daemon, which could trigger approval by the corresponding “active” user. The communication mechanism could be the D-Bus system bus between user daemons, and Ștefan Săftescu’s Google Summer of Code work to allow desktop notifications to be triggered from PulseAudio could be used to get to request authorisation.
David suggested that we could use the per-user/system-wide split, modified somewhat to introduce the concept of a “system-wide” card. This would be a device that is configured as being available to the whole system, and thus explicitly marked as not having any privacy guarantees.
In both the above cases, discussion continued about deciding how the access control would be handled, and this remains open.
We will be continuing to look at this problem until consensus emerges.
Improving (laptop) surround sound
The next topic dealt with being able to deal with laptops with a built-in 2.1 channel set up. The background of this is that there are a number of laptops with stereo speakers and a subwoofer. These are usually used as stereo devices with the subwoofer implicitly being fed data by the audio controller in some hardware-dependent way.
The possibility of exposing this hardware more accurately was discussed. Some investigation is required to see how things are currently exposed for various hardware (my MacBook Pro exposes the subwoofer as a surround control, for example). We need to deal with correctly exposing the hardware at the ALSA layer, and then using that correctly in PulseAudio profiles.
This led to a discussion of how we could handle profiles for these. Ideally, we would have a stereo profile with the hardware dealing with upmixing, and a 2.1 profile that would be automatically triggered when a stream with an LFE channel was presented. This is a general problem while dealing with surround output on HDMI as well, and needs further thought as it complicates routing.
I gave a rousing speech about writing more tests using some of the new improvements to our testing framework. Much cheering and acknowledgement ensued.
Ed.: some literary liberties might have been taken in this section
Unified cross-distribution ALSA configuration
I missed a large part of this unfortunately, but the crux if the discussion was around unifying cross-distribution sound configuration for those who wish to disable PulseAudio.
The next topic we took up was base volumes, and whether they are useful to most end users. For those unfamiliar with the concept, we sometimes see sinks/sources where which support volume controls going to > 0dB (which is the no=attenuation point). We provide the maximum allowed gain in ALSA as the maximum volume, and suggest that UIs show a marker for the base volume.
It was felt that this concept was irrelevant, and probably confusing to most end users, and that we suggest that UIs do not show this information any more.
Relatedly, it was decided that having a per-port maximum volume configuration would be useful, so as to allow users to deal with hardware where the output might get too loud.
Devices with dynamic capabilities (HDMI)
Our next topic of discussion was finding a way to deal with devices such as those HDMI ports where the capabilities of the device could change at run time (for example, when you plug out a monitor and plug in a home theater receiver).
A few ideas to deal with this were discussed, and the best one seemed to be David’s proposal to always have a separate card for each HDMI device. The addition of dynamic profiles could then be exploited to only make profiles available when an actual device is plugged in (and conversely removed when the device is plugged out).
Splitting of configuration
It was suggested that we could split our current configuration files into three categories: core, policy and hardware adaptation. This was met with approval all-around, and the pre-existing ability to read configuration from subdirectories could be reused.
Another feature that was desired was the ability to ship multiple configurations for different hardware adaptations with a single package and have the correct one selected based on the hardware being run on. We did not know of a standard, architecture-independent way to determine hardware adaptation, so it was felt that the first step toward solving this problem would be to find or create such a mechanism. This could either then be used to set up configuration correctly in early boot, or by PulseAudio for do runtime configuration selection.
Relatedly, moving all distributed configuration to /usr/share/..., with overrides in /etc/pulse/... and $HOME were suggested.
Better drain/underrun reporting
David volunteered to implement a per-sink-input timer for accurately determining when drain was completed, rather than waiting for the period of the entire buffer as we currently do. Unsurprisingly, no objections were raised to this solution to the long-standing issue.
In a similar vein, redefining the underflow event to mean a real device underflow (rather than the client-side buffer running empty) was suggested. After some discussion, we agreed that a separate event for device underruns would likely be better.
We called it a day at this point and dispersed beer-wards.
David very kindly invited us to spend a day after the conference hacking at his house in Lund, Sweden, just a short hop away from Copenhagen. We spent a short while in the morning talking about one last item on the agenda — helping to build a more seamless user experience. The idea was to figure out some tools to help users with problems quickly converge on what problem they might be facing (or help developers do the same). We looked at the Ubuntu apport audio debugging tool that David has written, and will try to adopt it for more general use across distributions.
The rest of the day was spent in more discussions on topics from the previous day, poring over code for some specific problems, and rolling out the first release candidate for the upcoming 3.0 release.
I am very happy that this conference happened, and am looking forward to being able to do it again next year. As you can see from the length of this post, there are lot of things happening in this part of the stack, and lots more yet to come. It was excellent meeting all the fellow PulseAudio hackers, and my thanks to all of them for making it.
Finally, I wouldn’t be sitting here writing this report without support from Collabora, who sponsored my travel to the conference, so it’s fitting that I end this with a shout-out to them. :)
This problem seems to bite some of our hardened users a couple of times a year, so thought I’d blog about it. If you are using grsec and PulseAudio, you must not enable CONFIG_GRKERNSEC_SYSFS_RESTRICT in your kernel, else autodetection of your cards will fail.
PulseAudio’s module-udev-detect needs to access /sys to discover what cards are available on the system, and that kernel option disallows this for anyone but root.
For the lazy, these are some of the topics we’ll be covering:
- Vision and mission — where we are and where we want to be
- Improving our patch review process
- Routing infrastructure
- Improving low latency behaviour
- Revisiting system- and user-modes
- Devices with dynamic capabilities
- Improving surround sound behaviour
- Separating configuration for hardware adaptation
- Better drain/underrun reporting behaviour
Phew — and there are more topics that we probably will not have time to deal with!
For those of you who cannot attend, the Linaro Connect folks (who are graciously hosting us) are planning on running Google+ Hangouts for their sessions. Hopefully we should be able to do the same for our proceedings. Watch this space for details!
p.s.: A big thank you to my employer Collabora for sponsoring my travel to the conference.
For those of you who missed it, your friendly neighbourhood PulseAudio hackers are converging on Copenhagen in a month to discuss, plan and hack on the future of PulseAudio.
We’re doing this for the first time, so I’m super-excited! David has posted details so if this is of interest to you, you should definitely join us!
That’s right, it’s finally out! Thanks go out to all our contributors for the great work (there’s too many — see the shortlog!). The highlights of the release follow. Head over to the announcement or release notes for more details.
Dynamic sample rate switching by Pierre-Louis Bossart: This makes PulseAudio even more power efficient.
Jack detection by David Henningsson: Separate volumes for your laptop speakers and headphones, and more stuff coming soon.
Major echo canceller improvements by me: Based on the WebRTC.org audio processing library, we now do better echo cancellation, remove the need to fiddle with the mic volume knob and have fixed AEC between laptop speakers and a USB webcam mic.
A virtual surround module by Niels Ole Salscheider: Try it out for some virtual surround sound shininess!
Support for Xen guests by Giorgos Boutsiouki: Should make audio virtualisation in guests more efficient.
Special thanks from me to Collabora for giving me some time for upstream work.
Packages are available on Gentoo, Arch, and probably soon on other distributions if they’re not already there.
If you’ve got an autotools-based project that you’d like to build on Android, whether on the NDK or system-wide this is really useful.
Some of you might’ve noticed that there has been a bunch of work happening here at Collabora on making cool open source technologies such as GStreamer, Telepathy, Wayland and of course, PulseAudio available on Android.
Since my last blog post on this subject, I got some time to start looking at replacing AudioFlinger (recap: that’s Android’s native audio subsystem) with PulseAudio (recap: that’s the awesome Linux audio subsystem). This work can be broken up into 3 parts: playback, capture, and policy. The roles of playback and capture are obvious. For those who aren’t aware of system internals, the policy bits take care of audio routing, volumes, and other such things. For example, audio should play out of your headphones if they’re plugged in, off Bluetooth if you’ve got a headset paired, or the speakers if nothing’s plugged in. Also, depending on the device, the output volume might change based on the current output path.
I started by looking at solving the playback problem first. I’ve got the first 80% of this done (as we all know, the second 80% takes at least as long ;) ). This is done by replacing the native AudioTrack playback API with a simple wrapper that translates into the libpulse PulseAudio client API. There’s bits of the API that seem to be rarely used(loops and markers, primarily), and I’ve not gotten around to those yet. Basic playback works quite well, and here’s a video showing this. (Note: this and the next video will be served with yummy HTML5 goodness if you enabled the YouTube HTML5 beta).
(if the video doesn’t appear, you can watch it on YouTube)
Users of PulseAudio might have spotted that this now frees us up to do some fairly nifty things. One such thing is getting remote playback for free. For a long time now, there has been support for streaming audio between devices running PulseAudio. I wrote up a quick app to show this working on the Galaxy Nexus as well. Again, seeing this working is a lot more impressive than me describing it here, so here’s another video:
(if the video doesn’t appear, you can watch it on YouTube)
This is all clearly work in progress, but you can find the code for the AudioTrack wrapper as a patch for now. This will be a properly integrated tree that you can just pull and easily integrate into your Android build when it’s done. The PA Output Switcher app code is also available in a git repository.
I’m hoping to be able to continue hacking on the capture and policy bits. The latter, especially, promises to be involved, since there isn’t always a 1:1 mapping between AudioFlinger and PulseAudio concepts. Nothing insurmountable, though. :) Watch this space for more updates as I wade through the next bit.
If you’re a student participating in this year’s edition of Google Summer of Code and want to get your hands dirty with some fun low-level hacking, here’s a quick reminder that PulseAudio is a participating organisation for the first time, and we have some nice ideas for you to hack on.
The deadline for applications is 2 days away, so get those applications in soon! If you’ve got questions, feel free to drop by #pulseaudio on the Freenode IRC network and ping us. (I’m Ford_Prefect there for those who don’t know)
Most of you have no doubt already seen that Mozilla will be changing their position on H.264 support for HTML5 video in future releases. This is an extremely important decision that I’ve been hoping to see for a while now, and I am really glad this is being done.
There is no doubt that we need patent-unencumbered standards for web codecs (or as much as is possible given the dismal patent ecology today), and while much giddy anticipation followed Google/On2’s release of VP8 into the open, I don’t believe it ever made sense to expect the codec landscape to change drastically in the short timespan everyone expected. There’s a lot of the hardware and software out there that needs to change (see any SoCs with VP8 support yet?), not to mention the interests of the MPEG-LA
I love Firefox, both as a product and what it means for an open web (for those of you that know me, this might be hard to believe given all my ranting, but it’s true!). I’m glad Mozilla chose to live to fight another day rather than go out in a blaze of glory and (or a flicker of irrelevance).
p.s.: these are my views and do not necessarily represent those of my employer
p.p.s.: Alessandro’s been doing some great work to get the GStreamer multimedia backend going again (this makes so much more sense than going the NIH route!)