The third week of October was quite action-packed, with a whole bunch of conferences happening in Düsseldorf. The Linux audio developer community as well as the PulseAudio developers each had a whole day of discussions related to a wide range of topics. I’ll be summarising the events of the PulseAudio mini summit day here. The discussion was split into two parts, the first half of the day with just the current core developers and the latter half with members of the community participating as well.
I’d like to thank the Linux Foundation for sparing us a room to carry out these discussions — it’s fantastic that we are able to colocate such meetings with a bunch of other conferences, making it much easier than it would otherwise be for all of us to converge to a single place, hash out ideas, and generally have a good time in real life as well!
With a whole day of discussions, this is clearly going to be a long post, so you might want to grab a coffee now. :)
Release plan
We have a few blockers for 6.0, and some pending patches to merge (mainly HSP support). Once this is done, we can proceed to our standard freeze → release candidate → stable process.
Build simplification for BlueZ HFP/HSP backends
For simplifying packaging, it would be nice to be able to build all the available BlueZ module backends in one shot. There wasn’t much opposition to this idea, and David (Henningsson) said he might look at this. (as I update this before posting, he already has)
srbchannel plans
We briefly discussed plans around the recently introduced shared ringbuffer channel code for communication between PulseAudio clients and the server. We talked about the performance benefits, and future plans such as direct communication between the client and server-side I/O threads.
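To make the idea of a shared ringbuffer channel a little more concrete, here is a minimal sketch of a single-producer, single-consumer ring buffer in C11. This is not the srbchannel code: the `ringbuf` type, the function names and the 4096-byte capacity are all hypothetical, and a real implementation would place the structure in memory shared between client and server and add a wakeup mechanism on top.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RB_CAPACITY 4096u /* hypothetical size; must be a power of two */
static_assert((RB_CAPACITY & (RB_CAPACITY - 1)) == 0, "capacity must be 2^n");

/* Hypothetical layout. In a real setup this struct would live in memory
 * mapped by both sides; here it is just an ordinary object. */
typedef struct {
    _Atomic uint32_t read_pos;  /* free-running, advanced only by the consumer */
    _Atomic uint32_t write_pos; /* free-running, advanced only by the producer */
    uint8_t data[RB_CAPACITY];
} ringbuf;

static void ringbuf_init(ringbuf *rb) {
    atomic_init(&rb->read_pos, 0);
    atomic_init(&rb->write_pos, 0);
}

/* Producer side: copy in up to len bytes, return how many were accepted. */
static size_t ringbuf_write(ringbuf *rb, const void *src, size_t len) {
    uint32_t w = atomic_load_explicit(&rb->write_pos, memory_order_relaxed);
    uint32_t r = atomic_load_explicit(&rb->read_pos, memory_order_acquire);
    size_t space = RB_CAPACITY - (size_t)(uint32_t)(w - r);
    if (len > space)
        len = space;
    for (size_t i = 0; i < len; i++)
        rb->data[(w + i) & (RB_CAPACITY - 1)] = ((const uint8_t *)src)[i];
    /* Publish the data only after it has been written. */
    atomic_store_explicit(&rb->write_pos, w + (uint32_t)len, memory_order_release);
    return len;
}

/* Consumer side: copy out up to len bytes, return how many were read. */
static size_t ringbuf_read(ringbuf *rb, void *dst, size_t len) {
    uint32_t r = atomic_load_explicit(&rb->read_pos, memory_order_relaxed);
    uint32_t w = atomic_load_explicit(&rb->write_pos, memory_order_acquire);
    size_t avail = (size_t)(uint32_t)(w - r);
    if (len > avail)
        len = avail;
    for (size_t i = 0; i < len; i++)
        ((uint8_t *)dst)[i] = rb->data[(r + i) & (RB_CAPACITY - 1)];
    /* Hand the space back to the producer only after the copy is done. */
    atomic_store_explicit(&rb->read_pos, r + (uint32_t)len, memory_order_release);
    return len;
}
```

Because each position is written by exactly one side, no lock is needed; the acquire/release pairs are what keep the data and the positions consistent between the two threads or processes.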
Routing framework patches
Tanu (Kaskinen) has a long-standing set of patches to add a generic routing framework to PulseAudio, developed notably by Jaska Uimonen, Janos Kovacs, and other members of the Tizen IVI team. This work adds a set of new concepts that we’ve not been entirely comfortable merging into the core. To unblock these patches, it was agreed that doing this work in a module and using a protocol extension API would be more beneficial. (Tanu later did a demo of the CLI extensions that have been made for the new routing concepts.)
module-device-manager
As a consequence of the discussion around the routing framework, David mentioned that he’d like to take forward Colin’s priority list work in the meantime. Based on our discussions, it looked like it would be possible to extend module-device-manager to make it port-aware and get the kind of functionality we want (the ability to have a priority-ordered list of devices). David was to look into this.
Module writing infrastructure
Relatedly, we discussed the need to export the PA internal headers to allow externally built modules. We agreed that this would be okay to have if it was made abundantly clear that this API would have absolutely no stability guarantees, and is mostly meant to simplify packaging for specialised distributions.
Which led us to the other bit of infrastructure required to write modules more easily — making our protocol extension mechanism more generic. Currently, we have a static list of protocol extensions in our core. Changing this requires exposing our pa_tagstruct structure as public API, which we haven’t done. If we don’t want to do that, then we would expose a generic “throw this blob across the protocol” mechanism and leave it to the module/library to take care of marshalling/unmarshalling.
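As an illustration of the “throw this blob across the protocol” option, the sketch below frames an opaque payload with the name of the extension it belongs to, so the core only routes the blob and the module or library does its own marshalling. The wire format and the `ext_blob_pack`/`ext_blob_unpack` names are assumptions made up for this example; nothing here is PulseAudio's actual protocol or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format (not PulseAudio's):
 *   u32 name_len | name bytes | u32 payload_len | payload bytes
 * with all integers in big-endian order. */

static void put_u32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint32_t get_u32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Sender side: returns the number of bytes written, or 0 if buf is too small. */
static size_t ext_blob_pack(uint8_t *buf, size_t cap, const char *name,
                            const void *payload, uint32_t payload_len) {
    assert(buf && name && (payload || payload_len == 0));
    size_t name_len = strlen(name);
    size_t total = 4 + name_len + 4 + (size_t)payload_len;
    if (name_len > UINT32_MAX || total > cap)
        return 0;
    put_u32(buf, (uint32_t)name_len);
    memcpy(buf + 4, name, name_len);
    put_u32(buf + 4 + name_len, payload_len);
    if (payload_len)
        memcpy(buf + 8 + name_len, payload, payload_len);
    return total;
}

/* Receiver side: locate the extension name and the opaque payload inside buf.
 * Returns 0 on success, -1 if the blob is malformed. The payload itself is
 * never interpreted here; that is left to the extension. */
static int ext_blob_unpack(const uint8_t *buf, size_t len,
                           const uint8_t **name, uint32_t *name_len,
                           const uint8_t **payload, uint32_t *payload_len) {
    if (len < 4)
        return -1;
    *name_len = get_u32(buf);
    if (*name_len > len - 4 || len - 4 - *name_len < 4)
        return -1;
    *name = buf + 4;
    *payload_len = get_u32(buf + 4 + *name_len);
    if (*payload_len != len - 8 - *name_len)
        return -1;
    *payload = buf + 8 + *name_len;
    return 0;
}
```

The trade-off is the one described above: the core stays free of any public tagstruct-like API, at the cost of every extension having to define and validate its own payload encoding.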
Resampler quality evaluation
Alexander shared a number of his findings about resampler quality on PulseAudio, vs. those found on Windows and Mac OS. Some questions were asked about other parameters, such as relative CPU consumption, etc. There was also some discussion on how to try to carry this work to a conclusion, but no clear answer emerged.
It was also agreed on the basis of this work that support for libsamplerate and ffmpeg could be phased out after deprecation.
Addition of a “hi-fi” mode
The discussion came around to the possibility of having a mode where, if the hardware supports it, PulseAudio just plays out samples without resampling, conversion, etc. This has been brought up in the past for “audiophile” use cases where the card supports 88.2/96 kHz and higher sample rates.
No objections were raised to having such a mode, and I’d like to take this up at some point.
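The core decision in such a mode could look something like the sketch below: samples are passed through untouched only when the hardware natively supports the stream's format, rate and channel count. The `sample_spec` and `hw_caps` types and the `hifi_can_bypass` function are hypothetical names invented for this example, and real hardware negotiation (via ALSA, for instance) involves more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not PulseAudio's own definitions. */
typedef enum { FMT_S16LE, FMT_S24LE, FMT_S32LE } sample_format;

typedef struct {
    sample_format format;
    uint32_t rate;    /* in Hz */
    uint8_t channels;
} sample_spec;

typedef struct {
    const sample_format *formats; /* formats the device accepts natively */
    size_t n_formats;
    const uint32_t *rates;        /* rates the device accepts natively */
    size_t n_rates;
    uint8_t channels;
} hw_caps;

/* True if the stream could be played without resampling, format
 * conversion or remixing on this (hypothetical) device. */
static bool hifi_can_bypass(const sample_spec *s, const hw_caps *hw) {
    assert(s && hw);
    bool rate_ok = false, format_ok = false;

    for (size_t i = 0; i < hw->n_rates; i++)
        if (hw->rates[i] == s->rate)
            rate_ok = true;

    for (size_t i = 0; i < hw->n_formats; i++)
        if (hw->formats[i] == s->format)
            format_ok = true;

    return rate_ok && format_ok && s->channels == hw->channels;
}
```

When the check fails, the natural fallback would be the normal path with conversion, so the mode degrades gracefully rather than refusing to play.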
LFE channel module
Alexander has some code for filtering low frequencies for the LFE channel, currently as a virtual sink, that could eventually be integrated into the core.
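For readers unfamiliar with what filtering for an LFE channel involves, here is a deliberately simple sketch: a first-order (one-pole) low-pass filter applied to a mono mix of a stereo stream. This is not Alexander's code, which may well use a different and better filter design; the `lfe_lowpass` type, the function names and the 120 Hz cutoff used in the usage note are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-pole low-pass filter state. */
typedef struct {
    float alpha; /* smoothing factor derived from cutoff and sample rate */
    float state; /* previous output sample */
} lfe_lowpass;

/* Standard RC discretisation: alpha = dt / (RC + dt), RC = 1 / (2*pi*fc). */
static void lfe_lowpass_init(lfe_lowpass *f, float cutoff_hz, float rate_hz) {
    assert(f && cutoff_hz > 0.0f && rate_hz > 0.0f);
    const float pi = 3.14159265f;
    float rc = 1.0f / (2.0f * pi * cutoff_hz);
    float dt = 1.0f / rate_hz;
    f->alpha = dt / (rc + dt);
    f->state = 0.0f;
}

/* Mix interleaved stereo frames down to mono and low-pass the result
 * into a separate LFE buffer of `frames` samples. */
static void lfe_from_stereo(lfe_lowpass *f, const float *interleaved,
                            float *lfe, size_t frames) {
    for (size_t i = 0; i < frames; i++) {
        float mono = 0.5f * (interleaved[2 * i] + interleaved[2 * i + 1]);
        f->state += f->alpha * (mono - f->state);
        lfe[i] = f->state;
    }
}
```

Calling `lfe_lowpass_init(&f, 120.0f, 48000.0f)` and then `lfe_from_stereo()` on each block lets slowly varying content through while strongly attenuating high frequencies. A first-order filter rolls off gently, so a production crossover would likely want a steeper design.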
rtkit
David raised a question about the current status of rtkit and whether it needs to exist, and if so, where. Lennart brought up the fact that rtkit currently does not work on systemd+cgroups based setups (I don’t seem to have why in my notes, and I don’t recall off the top of my head).
The conclusion of the discussion was that some alternate policy method for deciding RT privileges, possibly within systemd, would be needed, but for now rtkit should be used (and fixed!)
kdbus/memfd
Discussions came up about the possibility of using kdbus and/or memfd for the PulseAudio transport. This is interesting to me, but there doesn’t seem to be an immediately clear benefit over our SHM mechanism in terms of performance. Some work needs to be done to evaluate how this could be used and what the benefit would be.
ALSA controls spanning multiple outputs
David has now submitted patches for controls that affect multiple outputs (such as “Headphone+LO”). These are currently being discussed.
Audio groups
Tanu would like to add code to support collecting audio streams into “audio groups” to apply collective policy to them. I am supposed to help review this, and Colin mentioned that module-stream-restore already uses similar concepts.
Stream and device objects
Tanu proposed the addition of new objects to represent streams and devices. There didn’t seem to be consensus on adding these, but there was agreement on a clear need to consolidate common code from the sink-input/source-output and sink/source implementations. The idea was that having a common parent object for each pair might be one way to do this. I volunteered to help with this if someone’s taking it up.
Filter sinks
Alexander brought up the need for a filter API in PulseAudio, and this is something I really would like to have. I am supposed to sketch out an API (though implementing this is non-trivial and will likely take time).
Dynamic PCM for HDMI
David plans to see if we can use profile availability to help determine when an HDMI device is actually available.
Browser volumes
The usability of flat-volumes for browser use cases (where the volume of streams can be controlled programmatically) was discussed, and my patch to allow optional opt-out by a stream from participating in flat volumes came up. Tanu and I are to continue the discussion already on the mailing list to come up with a solution for this.
Handling bad rewinding code
Alexander raised concerns about the quality of rewinding code in some of our filter modules. The agreement was that we needed better documentation on handling rewinds, including how to explicitly not allow rewinds in a sink. The example virtual sink/source code also needs to be adjusted accordingly.
BlueZ native backend
Wim Taymans’ work on adding back HSP support to PulseAudio came up. Since the meeting, I’ve reviewed and merged this code with the change we want. Speaking to Luiz Augusto von Dentz from the BlueZ side, something we should also be able to add back is for PulseAudio to act as an HSP headset (using the same approach as for HSP gateway support).
Containers and PA
Takashi Iwai raised a question about what a good way to run PA in a container was. The suggestion was that a tunnel sink would likely be the best approach.
Common ALSA configuration
Based on discussion from the previous day at the Linux Audio mini-summit, I’m supposed to look at the possibility of consolidating the various mixer configuration formats we currently have to deal with (primarily UCM and its implementations, and Android’s XML format).
(thanks to Tanu, David and Peter for reviewing this)
 
					
Alexander E. Patrakov
November 12, 2014 — 1:56 am
I can also state that the conclusions are reflected correctly in this post.
Arun
November 12, 2014 — 11:42 am
Thanks for reviewing this, Alexander!
Carlos Silva
November 12, 2014 — 4:01 am
Just out of curiosity, wouldn’t it be beneficial to PulseAudio (latency-wise) to be integrated into the kernel? It may be a stupid question, but I’d like to know the answer to it :) The main reason behind the question is that D-Bus was “pulled” into the kernel to get, among other things, a “speed boost” when passing messages around. Can’t the same be done for PulseAudio? Or does the sound system simply live better in userland?
Arun
November 12, 2014 — 11:47 am
We used to have the audio mixer in the kernel. There are a number of drawbacks to this approach, but to my mind, the two biggest ones are the complexity of pushing all this into the kernel, and that you can be a lot more flexible in userspace (our modules let you do a very wide range of things from controlling routing to implementing new types of audio outputs, such as sending audio over the network).
Note that D-Bus is slightly different in this case: IPC is a basic underlying mechanism that many userspace programs need, and doing D-Bus in the kernel helps because now you don’t need an extra context switch between a sender, the D-Bus daemon, and the receiver.
Some day, we hope to use this or related mechanisms to decrease the number of context switches in PulseAudio client-server communication as well.
liam
November 12, 2014 — 5:21 am
Hi Arun,
Can you tell us what the results were regarding the resampling findings? Aiui, the only advantage of memfd over shm is that it allows for the possibility of immutable buffers (in your case). The cost of that, however, is two context switches, so it seems it would only make sense if PA has issues with shm segments being unexpectedly overwritten.
Best/Liam
Arun
November 12, 2014 — 11:51 am
For resampling, you can find some of Alexander’s work here: http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-October/021953.html
For memfd — one reason to use this was to make sure that misbehaving clients can’t cause problems on the server side (which is possible with standard SHM). The other was to try to use this infrastructure to rework communication so that clients directly talk to the I/O thread (right now, it’s client -> PA mainloop thread -> PA sink I/O thread). The last part doesn’t require memfd per se.
Nathael
November 12, 2014 — 7:10 am
Hi all :) Just a comment about the rtkit and kdbus/memfd discussions: I’m kind of a “simple user” but also an embedded system builder. Choosing a kernel is not always an easy thing on an embedded system (sometimes we don’t even have a choice), and init systems on embedded targets often have to be specific. I think it is usually better not to depend on specific kernel features when these are not related to the core functionality a piece of software provides (audio for PulseAudio), nor to depend on a specific init system (like systemd) (I do not see what audio has to do with a system’s init steps …).
Lennart brought up the fact that rtkit currently does not work on systemd+cgroups based setups : sounds like “systemd+cgroups broke something, but you should change your code to adapt to what we broke”. Sorry if I’m wrong, but this is what it looks like from an outside point of view when reading your report.
Thanks for your work on PulseAudio (all of you) :) +++
Arun
November 12, 2014 — 11:54 am
On an embedded system, you probably would have ways to not be dependent on this mechanism, so hopefully that eases your concern. rtkit just made it easier for us to control who gets RT privs, etc. which is nice to have.
That said, even in that space, there is potential value in looking at systemd/cgroups to manage resource allocation.
John
November 12, 2014 — 9:24 am
Hello,
one thing that would be great for 6.0 is to work with the Wine devs and see why PulseAudio has so many issues with Wine. The Wine devs don’t seem to be interested in having a PA output, thinking Wine is enough, but with the Wine emulation it doesn’t work that well… Without things like PULSE_LATENCY_MSEC=30 or similar in config files, the sound either crackles or plays too fast, etc…
I have no idea who is wrong, or who to blame… just that this has been annoying many gamers using Wine and PA…
Thanks!
Arun
November 12, 2014 — 11:56 am
Yes, this situation is a bit annoying. I don’t know if we’ll get to this in 6.0, and I think Alexander’s findings were that there’s a pretty big disconnect between PA’s model and DirectSound’s. Something to put on the list of things to fix!
Alexander E. Patrakov
November 12, 2014 — 12:42 pm
Well, I’d rather formulate this in a different way. Nobody except PulseAudio expects clients to listen to the “stream moved to a different device” events. Listening to such events is required in order to get notified about possibly-updated buffer metrics, and no clients handle such events meaningfully. To add insult to injury, the pulse ALSA plugin silently swallows the situation when PulseAudio (due to hardware limitations) cannot provide the latency or wakeup period requested by the application, so Wine NEVER has an accurate idea of what is going on: it is basically being unconditionally told “yes, yes, you can get 40 ms latency and 10 ms period time”.
This is tracked as https://bugs.freedesktop.org/show_bug.cgi?id=66962
Alexander E. Patrakov
November 12, 2014 — 12:51 pm
Well, in short: the situation is thoroughly broken on multiple layers (pulseaudio itself, ALSA plugin, programmer expectations). Not something that can be correctly fixed for 6.0.
John
November 12, 2014 — 4:33 pm
Thanks for the replies guys! Too bad no-go for 6.0. Hopefully before 7.0 though :)
Felipe
November 12, 2014 — 10:22 am
Hi, thanks for the writeup. The problem with rtkit and systemd is documented at various links linked from a debian bug report.
I have tried to investigate this but so far have come up empty-handed. The systemd page for this problem documents solutions that are no longer available, and moreover the manual solution seems to imply a need for a kernel built with CONFIG_RT_GROUP_SCHED enabled (which Debian, at least, doesn’t provide).
I really wish that someone (cough Lennart cough) that really understood this could come up with a way to resolve this issue.
Arun
November 12, 2014 — 11:56 am
Yes, it’s not clear at all how we can proceed. :-/
Seb
November 12, 2014 — 3:42 pm
The hi-fi mode is what I need right now! Please include this any time soon. It also does not seem to be so hard to simply not do something like resampling :)
josephk
December 2, 2014 — 3:13 am
hi Arun,
would it be possible for PA to resample a new stream to the rate of the stream that is playing at that moment by default (with no fixed default of 16/44)?
Example: I play a file @24/88, I open YouTube and PA upsamples that stream to that rate. Then I stop both streams and play a file @16/48 and PA plays it with no resampling.
thanks