Notes from the PipeWire Hackfest 2026: Part 2

(these notes are being posted in two parts to make the length more manageable, part 1 is here)

Continuing from where we left off, about topics discussed at the PipeWire hackfest in Nice…

DSP features

We discussed a number of features related to digital signal processing blocks which are typically realised on specialised hardware (often a DSP core that can directly interface with physical audio inputs and outputs on your laptop/phone/…).

There is currently no standard way for the firmware running on these DSPs to signal what features can be realised directly on DSP. We also would want to allow such features, if exposed from PipeWire, to be realisable on CPU.

Now we do have a way to hide away signal processing in a specific node, which is the filter-graph parameter on the audioconvert node that wraps all audio nodes.

We could extend this mechanism to allow the internal node (say the ALSA node implementation), to expose what filtering it can perform “in hardware” (i.e. the software running on DSP). This would allow the audioconvert to delegate some or all processing to the internal node, with fallbacks available on the CPU.

We would need a number of pieces to do this, including:

  • Some standard definition of filters and associated parameters, so different implementations could have a standard “API” to express any given filter.

  • The DSP block would need to expose what features it has and how they might be used. We could imagine extending the ALSA UCM configuration to do that.

  • The audioconvert node would need to have a way to push down filter-graph params to the internal node, and negotiate what work it is doing vs. what is being delegated

This is a non-trivial effort, but gives us some sketch of what might be possible.

More DSP features

In addition to standard filters, we spoke about two topics that have come up commonly in the past.

The first is some way to expose the processing graph in the DSP, so PipeWire and other userspace daemons have a better view of what is happening on the DSP. With the ability to push dynamic topologies to DSP, there was some renewed interest in exposing and using the ASoC DAPM widget graph. As always, the devil is in the details.

The second thing that came up is speaker calibration. There is a lot of processing and tuning that goes into driving speakers on modern devices as much as possible without destroying them. Some of these are one-time parameters decided at product design time, and some of these translate to runtime parameters based on voltage and current feedback from the speaker amplifier.

For some systems (like Qualcomm platforms), speaker calibration might be run on each system start to perform dynamic tuning. We had some discussion of how this might tie in with the rest of the system for both determining the parameters (separate startup daemon vs. in-process initialisation), as well as uploading parameters to the speaker (some ALSA UCM extensions to load parameters on PCM open but before start, or preloading parameters into ALSA kernel controls and having the driver feed them in at the right point).

Volume limits

A way to set a limit on the maximum volume for a given device has been a common user request ([1] [2]). We discussed the possibility of creating a per-route property (with a fallback to the node, if there are no routes), which WirePlumber could manage to provide users a simple interface to control.

Since the hackfest, Wim has already done some work on this, and we need to bubble this up as a more user-accessible setting.

Performance

A number of performance-related topics were discussed.

The first was an option of a combined DSP mode, where instead of one port per channel, a node would expose one port for all the channels of the stream (but continue to run in the configured “DSP” format/rate). This would improve stream performance for non-JACK-like use-cases, especially in resource-constrained environments.

On the WirePlumber side, there was a discussion about using LuaJIT instead of standard Lua. There are some compatibility issues to be determined there (such as language version supported, etc.), but there might be some quick performance wins to be made if this is feasible.

There is a plan to move some of the WirePlumber core to Rust, and that might be a good time to also port over some of the more standard functionality that tends not to change from Lua to Rust (though that could happen in a Lua->C transition and does not really need to wait on a Rust port).

Declarative Session Management

Another interesting, and broader, thread is the imperative nature of WirePlumber scripts – that is, policy decisions and associated action are often interwoven. It might be helpful to be able to make a clearer split where all policy decisions are first run, and then decisions are translated into actions at one go.

There are some historical choices that make this hard – for example, changing the profile of a device might create and destroy nodes, which makes it hard to be able to make decisions that are independent of the action. There were some ideas around redoing the profile concept such that all nodes are always exposed, but nodes could get a new state to signal availability (and profiles that would allow availability to change). That might make a declarative system possible to implement.

We also discussed the possibility of a “transaction” system. Something that would allow a client to submit a set of objects (think links between nodes), and then “commit” that transaction. This would also help reduce the number of roundtrips between PipeWire and WirePlumber, and generally help performance.

Bluetooth

Being colocated with the BlueZ face-to-face meeting, we had representation from the BlueZ community, so we were able to dive into a number of topics related to Bluetooth, primarily LE Audio.

The first topic was Auracast, the LE Audio system for broadcast audio, allowing listeners to tune into public broadcasts in a space, or to have a device stream audio to multiple headsets concurrently for shared listening. George had a demo system showing an implementation of Auracast with PipeWire, WirePlumber and BlueZ.

We had some discussion of where this feature should live, and the consensus was that we would probably want a separate daemon to manage Auracast settings and loading up the appropriate nodes (either for receiving or sending) based on users’ preferences.

This led to a more general discussion about the current split of the Bluetooth implementation in PipeWire being SPA modules, which include streaming and some policy, and a lot more policy living inside WirePlumber. We could, and likely should, move all of this into higher level PipeWire modules instead, which could make these easier to work with overall.

There was also a discussion about the complexities of LE Audio, and the state of the current user experience with actual devices:

  • Device interop is not always great, as the spec is new, the BlueZ implementation is still being completed, and device implementations seem of variable quality
  • Reliable pairing/feature detection is hard, partly due to how BlueZ exposes the ability to talk to devices in Bluetooth Classic or Bluetooth LE modes
  • Pairing left/right pairs currently needs individual pairing, which does not seem to be needed by other implementations (Android for example)
  • Inter-device synchronisation might need some work as well

While there is much work to be done here, the pieces are coming together for first-class LE Audio support on Linux-based systems.

Audio analytics

We also spoke about “analytics” – using local neural networks to implement things like text-to-speech, speech-to-text, language translation, or other forms of processing.

These pose an interesting problem, because they look like a standard-ish audio stream on one side, but are effectively a sparse stream on the other side if we are talking about text. Even conversion between languages does not look like a standard filter, because the underlying model might consume a varying amount of data before generating an output, and the input and output lengths are not tightly correlated.

While it should be possible to implement such a system with PipeWire, it is not quite clear whether we should. As the application space in this area becomes more mature, it may become clearer what the right place in the stack is for these features.

Click detection and elimination

We spoke about detecting and eliminating clicks at the stop or start of a stream.

If an application is playing back audio, and suddenly stops (i.e. feeds silence, or just nothing), then the sudden drop in the signal might cause a click to be output. If you think of the corresponding waveform as representing the physical displacement of the speaker, then the drop to zero is like a sudden brake to a halt, which isn’t possible, and manifests as a jolt that you hear as a clicky noise. The same analogy holds for resuming from a pause, but in the opposite direction.

The solution is usually to smooth out the end of the sound by fading out, but most applications do not do this, so this problem manifests quite clearly for most browser or application streams if you listen closely.

Wim described a number of experiments he has done for detecting such abrupt changes in audioconvert, but he was not happy with the results. We discussed some of these approaches, and what might work as acceptable tradeoffs to capture the most common cases while still trying to respect the integrity of the signal being sent by the application.

(sorry about the vagueness here, I missed taking more detailed notes)

Miscellanea

The rest of the discussion covered disparate topics that I don’t have long form notes on:

  • Hardware profiles: Shipping hardware-specific configuration for PipeWire and WirePlumber is hard. We discussed some approaches using context properties and conditions, but this is an area that needs more work.

  • Data loop management: PipeWire allows splitting work across data loops so different nodes in a graph can be assigned to different threads. This is currently an all-or-nothing system, where either all nodes go to a single data loop, or every node must be manually assigned a specific data loop. There was some desire to have the ability for there to be a default data loop to make the manual management less cumbersome.

  • ACP -> UCM: PipeWire inherits the ALSA card profile configuration from PulseAudio, which has been helpful in making the migration path smoother on most hardware. There was always some desire to have a single configuration system (probably ALSA UCM) for all hardware, but this likely needs some work on what we can express in UCM configuration, but we also need to clean up how we translate our UCM handling code (George has an RFC for this).

Thanks

That’s it, thank you for reading if you made it this far, and a shout out to George, Mark, and others organising the event!

It was great to see continued interest and so much exciting work that is yet to come. I hope to see more of the community in the next edition of the hackfest.

Notes from the PipeWire Hackfest 2026: Part 1

(these notes are being posted in two parts to make the length more manageable, part 2 is here)

The PipeWire community organised a hackfest in Nice, France, colocated with Embedded Recipes, the GStreamer hackfest, and a number of other events.

In attendance were members of the upstream community, as well as folks interested in PipeWire from Collabora, Red Hat, Qualcomm, Stream Unlimited, Texas Instruments, and Valve. In some cases these were the same person wearing upstream and professional hats, as some of us often do! :)

It was two days of fruitful and deep technical discussions, and lovely evenings hanging out in the Côte d’Azur. Shout out to George Kiagiadakis and Mark Filion for putting this together!

A photo of the waters in Nice from a rooftop
Beautiful view of the Côte d’Azur

The topics were disparate and can be somewhat esoteric for folks who are not familiar with the Linux audio space. I will try to strike a balance between providing context and summarising the finer details we discussed. Please feel free to write in if I missed or can expand on anything.

Multistream nodes

A recurring topic for the last couple of years has been supporting multistream nodes. The PipeWire API currently offers a pw_stream interface that can offer a node with single input or output (closer to the PulseAudio API), and the pw_filter interface that provides a lower-level freeform API to individually manage ports on a node (closer to the JACK API).

The stream API while convenient, can be a bit unwieldy for realising concepts such as loopbacks and filters, because each set of inputs and outputs needs to be implemented as an individual node. If you’ve ever loaded the loopback module, for example, you would have noticed that there are two nodes created for each instance.

Wim has created a version of the API that allows a node to provide multiple streams, which allows us to keep the conveniences of the stream API, but more easily express ideas like the loopbacks, filters, etc. Each stream is effectively a group of ports on the node, and nodes can have an arbitrary number of input and output streams.

The code on the PipeWire side is ready. The primary idea is there will be a PortConfig param per stream, and this is where the format of the stream, and other metadata expressed on port groups (which is essentially what a stream is) will live.

We discussed what is needed in WirePlumber to make sure the linking logic adapts to this concept, and Julian will be implementing that in the coming weeks.

Settings

PipeWire has a generic metadata system based on the JACK API that is used for storing metadata (allowing you to attach a key/type/value, optionally attached to an object). This is also used by WirePlumber to provide its settings system (see wpctl settings), along with some key features such as a schema and persistence.

We discussed that it might be nicer to have the concept of settings as a first-class citizen, and possibly even standardise some settings for desktop wide usage (such as common processing elements). There was consensus that:

  • A new settings interface (instead of extending metadata) would make sense
  • The API should be asynchronous, and can fail
  • A schema for valid settings and their types could be exposed as a well-known metadata key
  • Implementors of the interface would perform validation

Security

We spoke about the current state of security for applications using PipeWire. For context, PipeWire has a fine-grained permissions model where each client can have selective access to what objects are visible to it, and what actions it may perform. There is also a less granular system, where a “manager” application can connect to the manager socket for full access. We broadly think about restricted security for sandboxed applications (primarily Flatpak).

One scenario is sandboxed PulseAudio applications getting full access via the pipewire-pulse server on the host. The discussion on this concluded that there is a way for pipewire-pulse to forward enough security-related information from sandboxed applications for us to apply sandbox restrictions to them, and we need to make that system work.

There was a discussion that it might be reasonable for our default policies to apply for all applications connecting to the regular PipeWire socket to be restricted (this does not prevent malicious applications from accessing the manager socket, but helps applications not do bad things erroneously).

This might be disruptive to introduce as a default change, so we might implement it via an opt-in setting so that there can be some broader testing and refinement of default permissions before flipping the switch for all users.

There are a number of mechanisms related to how security context properties are relayed, and how those properties are used by WirePlumber to determine permissions. We need to document and verify the expected behaviour here.

Flatpak and Portals

Relatedly there was a discussion about how things should fit in with Flatpak, and Sebastian Wick from the Flatpak team joined us briefly on the second day.

There was some discussion of making sure the PulseAudio socket is provided to the sandbox in a similar way to the PipeWire socket, such that some additional security properties can be assigned from the host in a way that the sandboxed client cannot override.

We agreed that we needed the ability for applications to specify with some granularity what permissions they require (via portals), and for us to grant only that (with user intervention, if needed). Broadly this is:

  • Playback (optionally enumeration of sinks)
  • Capture (optionally enumeration of sources)
  • Default visibility of only the application’s own nodes

We also spoke about how we might want to associate PipeWire objects with applications. With Flatpak moving to using a cgroup for each application, this should become easier. We may also want to be able to have a way to associate a stream with a specific window (to, for example, share a window and its audio), which should be possible.

It was also noted that for some classes of applications, we may want a way for users to allow some of these permissions at install time (for example, a remote desktop application asking permission on every start can be annoying). This is already possible with Flatpak manifests (which are static, but we might need to add some more options here), and there is a potential entitlement system being discussed (for server-driven overrides to be distributed for malicious applications, for example).

Encapsulation and Collections

One topic that came up last year is the ability to encapsulate a group of nodes such that they appear as a single node to other applications in the system. This could be useful for:

  • Collapsing all the output from an application so it appears to be providing a single stream
  • Grouping all the filters for a sink or source node, and making it appear as a single node with all the processing hidden away

One piece to making such a system possible is to have a first-class notion of this group. Julian has an implementation of such an entity, called a “collection”. This is currently implemented on top of PipeWire metadata, but we agree that this is likely worth having an explicit PipeWire interface for.

Once that is in place, we discussed the possibility of having a smarter “proxy” node that can act as the interface that translates from the “outside” of the encapsulated region to the “inside”, so that format selection, volume changes, etc. can properly be proxied to the underlying device, for example.

Tooling improvements

It was noted that the tools we have (such as pw-top and pw-dot) can make it hard to get at some information, such as negotiated formats, rates, etc. They can also be “noisy” when we have a large number of filters and loopbacks.

While we did not have a concrete plan to tackle this, some of us have been playing with LLM-based tooling to generate some helper code for this sort of thing. At least my attempts have been too sloppy to share as yet, but it should be possible to get something useful with a structured approach.

That’s it for now. Watch this space for part 2!

Accessibility Update: Enabling Mono Audio

If you maintain a Linux audio settings component, we now have a way to globally enable/disable mono audio for users who do not want stereo separation of their audio (for example, due to hearing loss in one ear). Read on for the details on how to do this.

Background

Most systems support stereo audio via their default speaker output or 3.5mm analog connector. These devices are exposed as stereo devices to applications, and applications typically render stereo content to these devices.

Visual media use stereo for directional cues, and music is usually produced using stereo effects to separate instruments, or provide a specific experience.

It is not uncommon for modern systems to provide a “mono audio” option that allows users to have all stereo content mixed together and played to both output channels. The most common scenario is hearing loss in one ear.

PulseAudio and PipeWire have supported forcing mono audio on the system via configuration files for a while now. However, this is not easy to expose via user interfaces, and unfortunately remains a power-user feature.

Implementation

Recently, Julian Bouzas implemented a WirePlumber setting to force all hardware audio outputs (MR 721 and 769). This lets the system run in stereo mode, but configures the audioadapter around the device node to mix down the final audio to mono.

This can be enabled using the WirePlumber settings via API, or using the command line with:

wpctl settings node.features.audio.mono true

The WirePlumber settings API allows you to query the current value as well as clear the setting and restoring to the default state.

I have also added (MR 2646 and 2655) a mechanism to set this using the PulseAudio API (via the messaging system). Assuming you are using pipewire-pulse, PipeWire’s PulseAudio emulation daemon, you can use pa_context_send_message_to_object() or the command line:

pactl send-message /core pipewire-pulse:force-mono-output true

This API allows for a few things:

  • Query existence of the feature: when an empty message body is sent, if a null value is returned, feature is not supported
  • Query current value: when an empty message body is sent, the current value (true or false) is returned if the feature is supported
  • Setting a value: the requested setting (true or false) can be sent as the message body
  • Clearing the current value: sending a message body of null clears the current setting and restores the default

Looking ahead

This feature will become available in the next release of PipeWire (both 1.4.10 and 1.6.0).

I will be adding a toggle in Pavucontrol to expose this, and I hope that GNOME, KDE and other desktop environments will be able to pick this up before long.

Hit me up if you have any questions!

Rusty Pipes and Oxidized Wires

In case you missed it, the GStreamer Conference 2025 videos are up!

This includes my talk on the new PipeWire native Rust bindings. You’ll want to skip the first 1:20 to get to the start.

I talk a little bit about the motivation and structure of the project, and discuss my experience writing this low-level library in Rust.

There are a lot of great talks, so it’s worth catching up if you weren’t there (or, if like me, you were there and had to pick between the two tracks with great difficulty).

Comments and feedback are welcome! In the future, I’ll post a more long form update about the state of these bindings here as well.

Asymptotic on hiatus

Asymptotic was started 6 years ago, when I wanted to build something that would be larger than just myself.

We’ve worked with some incredible clients in this time, on a wide range of projects. I would be remiss to not thank all the teams that put their trust in us.

In addition to working on interesting challenges, our goal was to make sure we were making a positive impact on the open source projects that we are part of. I think we truly punched above our weight class (pardon the boxing metaphor), on this front – all the upstream work we have done stands testament to that.

Of course, the biggest single contributor to what we were able to achieve is our team. My partner, Deepa, was instrumental in shaping how the company was formed and run. Sanchayan (who took a leap of faith in joining us first), and Taruntej were stellar colleagues and friends on this journey.

It’s been an incredibly rewarding experience, but the time has come to move on to other things, and we have now paused operations. I’ll soon write about some recent work and what’s next.

The Unbearable Anger of Broken Audio

It should be surprising to absolutely nobody that the Linux audio stack is often the subject of varying levels of negative feedback, ranging from drive-by meme snark to apoplectic rage[1].

A lot of what computers are used for today involves audiovisual media in some form or the other, and having that not work can throw a wrench in just going about our day. So it is completely understandable for a person to get frustrated when audio on their device doesn’t work (or maybe worse, stops working for no perceivable reason).

It is also then completely understandable for this person to turn up on Matrix/IRC/Gitlab and make their displeasure known to us in the PipeWire (and previously PulseAudio) community. After all, we’re the maintainers of the part of the audio stack most visible to you.

To add to this, we have two and a half decades’ worth of history in building the modern Linux desktop audio stack, which means there are historical artifacts in the stack (OSS -> ALSA -> ESD/aRTs -> PulseAudio/JACK -> PipeWire). And a lot of historical animus that apparently still needs venting.

In large centralised organisations, there is a support function whose (thankless) job it is to absorb some of that impact before passing it on to the people who are responsible for fixing the problem. In the F/OSS community, sometimes we’re lucky to have folks who step up to help users and triage issues. Usually though, it’s just maintainers managing this.

This has a number of … interesting … impacts for those of us who work in the space. For me this includes:

  1. Developing thick skin
  2. Trying to maintain equanimity while being screamed at
  3. Knowing to step away from the keyboard when that doesn’t work
  4. Repeated reminders that things do work for millions of users every day

So while the causes for the animosity are often sympathetic, this is not a recipe for a healthy community. I try to be judicious while invoking the fd.o Code of Conduct, but thick skin or not, abusive behaviour only results in a toxic community, so there are limits to that.

While I paint a picture of doom and gloom, most recent user feedback and issue reporting in the PipeWire community has been refreshingly positive. Even the trigger for this post is an issue from an extremely belligerent user (who I do sympathise with), who was quickly supplanted by someone else who has been extremely courteous in the face of what is definitely a frustrating experience.

So if I had to ask something of you, dear reader – the next time you’re angry with the maintainers of some free software you depend on, please get some of the venting out of your system in private (tell your friends how terrible we are, or go for a walk maybe), so we can have a reasonable conversation and make things better.

Thank you for reading!


  1. I’m not linking to examples, because that’s not the point of this post. ↩︎

PipeWire ♥ Sovereign Tech Agency

In my previous post, I alluded to an exciting development for PipeWire. I’m now thrilled to officially announce that Asymptotic will be undertaking several important tasks for the project, thanks to funding from the Sovereign Tech Fund (now part of the Sovereign Tech Agency).

Some of you might be familiar with the Sovereign Tech Fund from their funding for GNOME, GStreamer and systemd – they have been investing in foundational open source technology, supporting the digital commons in key areas, a mission closely aligned with our own.

We will be tackling three key areas of work.

ASHA hearing aid support

I wrote a bit about our efforts on this front. We have already completed the PipeWire support for single ASHA hearing aids, and are actively working on support for stereo pairs.

Improvements to GStreamer elements

We have been working through the GStreamer+PipeWire todo list, fixing bugs and making it easier to build audio and video streaming pipelines on top of PipeWire. A number of usability improvements have already landed, and more work on this front continues

A Rust-based client library

While we have a pretty functional set of Rust bindings around the C-based libpipewire already, we will be creating a pure Rust implementation of a PipeWire client, and provide that via a C API as well.

There are a number of advantages to this: type and memory safety being foremost, but we can also leverage Rust macros to eliminate a lot of boilerplate (there are community efforts in this direction already that we may be able to build upon).

This is a large undertaking, and this funding will allow us to tackle a big chunk of it – we are excited, and deeply appreciative of the work the Sovereign Tech Agency is doing in supporting critical open source infrastructure.

Watch this space for more updates!

A Brimful of ASHA

It’s 2025(!), and I thought I’d kick off the year with a post about some work that we’ve been doing behind the scenes for a while. Grab a cup of $beverage_of_choice, and let’s jump in with some context.

History: Hearing aids and Bluetooth

Various estimates put the number of people with some form of hearing loss at 5% of the population. Hearing aids and cochlear implants are commonly used to help deal with this (I’ll use “hearing aid” or “HA” in this post, but the same ideas apply to both). Historically, these have been standalone devices, with some primitive ways to receive audio remotely (hearing loops and telecoils).

As you might expect, the last couple of decades have seen advances that allow consumer devices (such as phones, tablets, laptops, and TVs) to directly connect to hearing aids over Bluetooth. This can provide significant quality of life improvements – playing audio from a device’s speakers means the sound is first distorted by the speakers, and then by the air between the speaker and the hearing aid. Avoiding those two steps can make a big difference in the quality of sound that reaches the user.

An illustration of the audio path through air vs. wireless audio (having higher fidelity)
Comparison of audio paths

Unfortunately, the previous Bluetooth audio standards (BR/EDR and A2DP – used by most Bluetooth audio devices you’ve come across) were not well-suited for these use-cases, especially from a power-consumption perspective. This meant that HA users would either have to rely on devices using proprietary protocols (usually limited to Apple devices), or have a cumbersome additional dongle with its own battery and charging needs.

Recent Past: Bluetooth LE

The more recent Bluetooth LE specification addresses some of the issues with the previous spec (now known as Bluetooth Classic). It provides a low-power base for devices to communicate with each other, and has been widely adopted in consumer devices.

On top of this, we have the LE Audio standard, which provides audio streaming services over Bluetooth LE for consumer audio devices and HAs. The hearing aid industry has been an active participant in its development, and we should see widespread support over time, I expect.

The base Bluetooth LE specification has been around from 2010, but the LE Audio specification has only been public since 2021/2022. We’re still seeing devices with LE Audio support trickle into the market.

In 2018, Google partnered with a hearing aid manufacturer to announce the ASHA (Audio Streaming for Hearing Aids) protocol, presumably as a stop-gap. The protocol uses Bluetooth LE (but not LE Audio) to support low-power audio streaming to hearing aids, and is publicly available. Several devices have shipped with ASHA support in the last ~6 years.

A brief history of Bluetooth LE and audio

Hot Take: Obsolescence is bad UX

As end-users, we understand the push/pull of technological advancement and obsolescence. As responsible citizens of the world, we also understand the environmental impact of this.

The problem is much worse when we are talking about medical devices. Hearing aids are expensive, and are expected to last a long time. It’s not uncommon for people to use the same device for 5-10 years, or even longer.

In addition to the financial cost, there is also a significant emotional cost to changing devices. There is usually a period of adjustment during which one might be working with an audiologist to tune the device to one’s hearing. Neuroplasticity allows the brain to adapt to the device and extract more meaning over time. Changing devices effectively resets the process.

All this is to say that supporting older devices is a worthy goal in itself, but has an additional set of dimensions in the context of accessibility.

HAs and Linux-based devices

Because of all this history, hearing aid manufacturers have traditionally focused on mobile devices (i.e. Android and iOS). This is changing, with Apple supporting its proprietary MFi (made for iPhone/iPad/iPod) protocol on macOS, and Windows adding support for LE Audio on Windows 11.

This does leave the question of Linux-based devices, which is our primary concern – can users of free software platforms also have an accessible user experience?

A lot of work has gone into adding Bluetooth LE support in the Linux kernel and BlueZ, and more still to add LE Audio support. PipeWire’s Bluetooth module now includes support for LE Audio, and there is continuing effort to flesh this out. Linux users with LE Audio-based hearing aids will be able to take advantage of all this.

However, the ASHA specification was only ever supported on Android devices. This is a bit of a shame, as there are likely a significant number of hearing aids out there with ASHA support, which will hopefully still be around for the next 5+ years. This felt like a gap that we could help fill.

Step 1: A Proof-of-Concept

We started out by looking at the ASHA specification, and the state of Bluetooth LE in the Linux kernel. We spotted some things that the Android stack exposes that BlueZ does not, but it seemed like all the pieces should be there.

Friend-of-Asymptotic, Ravi Chandra Padmala spent some time with us to implement a proof-of-concept. This was a pretty intense journey in itself, as we had to identify some good reference hardware (we found an ASHA implementation on the onsemi RSL10), and clean out the pipes between the kernel and userspace (LE connection-oriented channels, which ASHA relies on, weren’t commonly used at that time).

We did eventually get the proof-of-concept done, and this gave us confidence to move to the next step of integrating this into BlueZ – albeit after a hiatus of paid work. We have to keep the lights on, after all!

Step 2: ASHA in BlueZ

The BlueZ audio plugin implements various audio profiles within the BlueZ daemon – this includes A2DP for Bluetooth Classic, as well as BAP for LE Audio.

We decided to add ASHA support within this plugin. This would allow BlueZ to perform privileged operations and then hand off a file descriptor for the connection-oriented channel, so that any userspace application (such as PipeWire) could actually stream audio to the hearing aid.

I implemented an initial version of the ASHA profile in the BlueZ audio plugin last year, and thanks to Luiz Augusto von Dentz’ guidance and reviews, the plugin has landed upstream.

This has been tested with a single hearing aid, and stereo support is pending. In the process, we also found a small community of folks with deep interest in this subject, and you can join us on #asha on the BlueZ Slack.

Step 3: PipeWire support

To get end-to-end audio streaming working with any application, we need to expose the BlueZ ASHA profile as a playback device on the audio server (i.e., PipeWire). This would make the HAs appear as just another audio output, and we could route any or all system audio to it.

My colleague, Sanchayan Maity, has been working on this for the last few weeks. The code is all more or less in place now, and you can track our progress on the PipeWire MR.

Step 4 and beyond: Testing, stereo support, …

Once we have the basic PipeWire support in place, we will implement stereo support (the spec does not support more than 2 channels), and then we’ll have a bunch of testing and feedback to work with. The goal is to make this a solid and reliable solution for folks on Linux-based devices with hearing aids.

Once that is done, there are a number of UI-related tasks that would be nice to have in order to provide a good user experience. This includes things like combining the left and right HAs to present them as a single device, and access to any tuning parameters.

Getting it done

This project has been on my mind since the ASHA specification was announced, and it has been a long road to get here. We are in the enviable position of being paid to work on challenging problems, and we often contribute our work upstream. However, there are many such projects that would be valuable to society, but don’t necessarily have a clear source of funding.

In this case, we found ourselves in an interesting position – we have the expertise and context around the Linux audio stack to get this done. Our business model allows us the luxury of taking bites out of problems like this, and we’re happy to be able to do so.

However, it helps immensely when we do have funding to take on this work end-to-end – we can focus on the task entirely and get it done faster.

Onward…

I am delighted to announce that we were able to find the financial support to complete the PipeWire work! Once we land basic mono audio support in the MR above, we’ll move on to implementing stereo support in the BlueZ plugin and the PipeWire module. We’ll also be testing with some real-world devices, and we’ll be leaning on our community for more feedback.

This is an exciting development, and I’ll be writing more about it in a follow-up post in a few days. Stay tuned!

GStreamer + PipeWire: A Todo List

I wrote about our time at the GStreamer Conference in October, and one important thing I was able to do is spend some time with all-around great guy George reflecting on where the GStreamer plugins for PipeWire are, and what we need to do to get them to a rock-solid state.

This is a summary of our conversation, in the form of a to-do list of sorts…

Status Quo

Currently, we have two elements: pipewiresrc and pipewiresink. The two plugins work with both audio and video, and instantiate a PipeWire capture and playback stream, respectively. The stream, as with any PipeWire client, appears as a node in the PipeWire.

Buffers are managed in the GStreamer pipeline using bufferpools, and recently Wim re-enabled exposing the stream clock as a GStreamer clock.

There have been a number of issues that have cropped up over time, and we’ve been plugging away at addressing them, but it was worth stepping back and looking at the whole for a bit.

Use Cases

The straightforward uses of these elements might be to represent client streams: pipewiresrc might connect to an audio capture device (like a microphone), or video capture device (like a webcam), and provide the data for downstream elements to consume. Similarly pipewiresink might be used to play audio to the system output (speakers or headphones, perhaps).

Because of the flexibility of the PipeWire API, these elements may also be used to provide a virtual capture or playback device though. So pipewiresrc might provide a virtual audio sink, which applications could connect to to stream audio over the network (like say a WebRTC stream).

Conversely, it is possible to use pipewiresink to provide a virtual capture device – for example, the pipeline might generate a video stream and expose it a virtual camera for other applications to use.

We might even combine the two cases, one might connect to a webcam as a client, apply some custom video processing, and then expose that stream back as a virtual camera source as easily as:

pipewiresrc target-object="MyCamera" ! <some video filters> ! \
  pipewiresink provide=true stream-properties="props,media.class=Video/Source,media.role=Camera"

So we have a minor combinatorial explosion across 3 axes, and all combinations are valid:

  • pipewiresrc vs. pipewiresink
  • audio vs. video
  • stream vs. virtual device

For each of these combinations, we might have different behaviour across the various issues below.

Split ’em up?

Before we look at specific issues, it is worth pointing out that the PipeWire elements are unusual in that they support both audio and video with the same code. This seems like a tantalisingly elegant idea, and it’s quite neat that we are able to get this far with this unified approach.

However, as we examine the specific issues we are seeing, it does seem to emerge that the audio and video paths diverge in several ways. It may be time to consider whether the divergence merits just splitting them up into separate audio and video elements.

Linking

The first issue that comes to mind is how we might want PipeWire or WirePlumber to manage linking the nodes from the GStreamer pipeline with other nodes (devices or streams).

For the playback/capture stream use-cases, we would want the nodes to automatically be connected to a sink/source node when the GStreamer pipeline goes to the PAUSED or PLAYING state, and for that link to be torn down when leaving those states. It might be possible for the link to “move” if, for example, the default playback or capture device changes, though a “move” is really the removal of the current link with a new link following.

For the virtual device use-cases, the pipeline state should likely follow the link state. That is, when a node is connected to our virtual device, we want the GStreamer pipeline to start producing/consuming data, and when disconnected, it should go back to “sleep”, possibly running again later.

The latter is something that a GStreamer application using these plugins might have to manage manually, but simplifying this and supporting this via gst-launch-1.0 for easy command-line use would be nice to have.

There are already the beginnings of support for such usage via the provide property on pipewiresink, but more work is needed for this to make this truly usable.

Bufferpools

Closely related to linking are buffers and bufferpools, as the process of linking nodes is what makes buffers for data exchange available to PipeWire nodes.

While bufferpools are a valuable concept for memory efficiency and avoiding unnecessary memcpy()s, they come with some complexity overhead in managing the pipeline. For one, as the number of buffers in a bufferpool is limited, it is possible to exhaust the set of buffers (with a large queue for example).

There are also some lifecycle complexities that arise from links coming and going, as the corresponding buffers also then go away from under us, something that GStreamer bufferpools are not designed for.

A solution to the first problem might be to avoid using bufferpools for some cases (for example, they might not be very valuable for audio). The solution to the lifecycle problem is a trickier one, and no clear answer is apparent yet, at least with the APIs as they stand.

We might also need to support resizing bufferpools for some cases, and that is not something that is easy to support with how buffer management currently happens in PipeWire (the stream API does not really give us much of a handle on this).

Formats

In order to support the various use-cases, we want to be able to support both a fixed format (if we know what we are providing), or a negotiated format (if we can adapt in the GStreamer pipeline based on what PipeWire has/wants).

There is also a large surface area of formats that PipeWire supports that we need to make sure we support well:

  • There are known issues with some planar video formats being presented correctly from pipewiresrc
  • We do not expose planar audio formats, although both GStreamer and PipeWire support them
  • Support for DSD and passthrough audio (e.g. Dolby/DTS over HDMI) needs to be wired up
  • Support for compressed formats (we added PipeWire support for decode + render on a DSP)

Rate matching

While Wim recently added some rate matching code to pipewiresink, there is work to be done to make sure that if there is skew between the GStreamer pipeline’s data rate and the audio device rate, we can use PipeWire’s rate adaptation features to compensate for such skew. This should work in both pipewiresink and pipewiresrc.

For some background on this topic, check out my talk on clock rate matching from a couple of months ago.

Device provider conflicts

While we are improving the out-of-the-box experience of these elements, unfortunately the PipeWire device provider currently supersedes all others (the GStreamer Device Provider API allows for discovering devices on the system and the elements used to access them).

The higher rank might make sense for video (as system integrators likely want to start preferring PipeWire elements over V4L2), but it can lead to a bad experience for audio (the PulseAudio elements work better today via PipeWire’s PulseAudio emulation layer).

We might temporarily drop the rank of PipeWire elements for audio to avoid autoplugging them while we fix the problems we have.

Probing formats

We create a “probe” stream in pipewiresink while getting ready to play audio, in order to discover what formats the device we would play to supports. This is required to detect supported formats and make decisions about whether to decode in GStreamer, what sample rate and format are preferred, etc.

Unfortunately, that also causes a “false” playback device startup sequence, which might manifest as a click or glitch on some hardware. Having a way to set up a probe that does not actually open the device would be a nice improvement.

Player state

There are a couple of areas where policy actions do not always surface well to the application/UI layer. One instance of this is where a stream is “corked” (maybe because only one media player should be active at a time) – we want to let the player know it has been paused, so it can update its state and let the UI know too. There is limited infrastructure for this already, via GST_MESSAGE_REQUEST_STATE.

Also, more of a session management (i.e. WirePlumber / system integration) thing, we do not really have the concept of a system-wide media player state. This would be useful if we want to exercise policy like “don’t let any media player play while we’re on a call”, and have that state be consistent across UI interactions (i.e. hitting play during a call does not trigger playback, maybe even lets the user know why it’s not working / provides an override).

GStreamer Conference 2024

All of us at Asymptotic are back home from the exciting week at GStreamer Conference 2024 in Montréal, Canada last month. It was great to hang out with the community and see all the great work going on in the GStreamer ecosystem.

Montréal sunsets are 😍

There were some visa-related adventures leading up to the conference, but thanks to the organising team (shoutout to Mark Filion and Tim-Philipp Müller), everything was sorted out in time and Sanchayan and Taruntej were able to make it.

This conference was also special because this year marks the 25th anniversary of the GStreamer project!

Happy birthday to us! 🎉

Talks

We had 4 talks at the conference this year.

GStreamer & QUIC (video)

Sancyahan speaking about GStreamer and QUIC

Sanchayan spoke about his work with the various QUIC elements in GStreamer. We already have the quinnquicsrc and quinquicsink upstream, with a couple of plugins to allow (de)multiplexing of raw streams as well as an implementation or RTP-over-QUIC (RoQ). We’ve also started work on Media-over-QUIC (MoQ) elements.

This has been a fun challenge for us, as we’re looking to build out a general-purpose toolkit for building QUIC application-layer protocols in GStreamer. Watch this space for more updates as we build out more functionality, especially around MoQ.

Clock Rate Matching in GStreamer & PipeWire (video)

Arun speaking about PipeWire delay-locked loops
Photo credit: Francisco

My talk was about an interesting corner of GStreamer, namely clock rate matching. This is a part of live pipelines that is often taken for granted, so I wanted to give folks a peek under the hood.

The idea of doing this talk was was born out of some recent work we did to allow splitting up the graph clock in PipeWire from the PTP clock when sending AES67 streams on the network. I found the contrast between the PipeWire and GStreamer approaches thought-provoking, and wanted to share that with the community.

GStreamer for Real-Time Audio on Windows (video)

Next, Taruntej dove into how we optimised our usage of GStreamer in a real-time audio application on Windows. We had some pretty tight performance requirements for this project, and Taruntej spent a lot of time profiling and tuning the pipeline to meet them. He shared some of the lessons learned and the tools he used to get there.

Simplifying HLS playlist generation in GStreamer (video)

Sanchayan also walked us through the work he’s been doing to simplify HLS (HTTP Live Streaming) multivariant playlist generation. This should be a nice feature to round out GStreamer’s already strong support for generating HLS streams. We are also exploring the possibility of reusing the same code for generating DASH (Dynamic Adaptive Streaming over HTTP) manifests.

Hackfest

As usual, the conference was followed by a two-day hackfest. We worked on a few interesting problems:

  • Sanchayan addressed some feedback on the QUIC muxer elements, and then investigated extending the HLS elements for SCTE-35 marker insertion and DASH support

  • Taruntej worked on improvements to the threadshare elements, specifically to bring some ts-udpsrc element features in line with udpsrc

  • I spent some time reviewing a long-pending merge request to add soft-seeking support to the AWS S3 sink (so that it might be possible to upload seekable MP4s, for example, directly to S3). I also had a very productive conversation with George Kiagiadakis about how we should improve the PipeWire GStreamer elements (more on this soon!)

All in all, it was a great time, and I’m looking forward to the spring hackfest and conference in the the latter part next year!