Tag: work

A Brimful of ASHA

It’s 2025(!), and I thought I’d kick off the year with a post about some work that we’ve been doing behind the scenes for a while. Grab a cup of $beverage_of_choice, and let’s jump in with some context.

History: Hearing aids and Bluetooth

Various estimates put the number of people with some form of hearing loss at 5% of the population. Hearing aids and cochlear implants are commonly used to help deal with this (I’ll use “hearing aid” or “HA” in this post, but the same ideas apply to both). Historically, these have been standalone devices, with some primitive ways to receive audio remotely (hearing loops and telecoils).

As you might expect, the last couple of decades have seen advances that allow consumer devices (such as phones, tablets, laptops, and TVs) to directly connect to hearing aids over Bluetooth. This can provide significant quality of life improvements – playing audio from a device’s speakers means the sound is first distorted by the speakers, and then by the air between the speaker and the hearing aid. Avoiding those two steps can make a big difference in the quality of sound that reaches the user.

Comparison of audio paths: through the air vs. direct wireless streaming (higher fidelity)

Unfortunately, the previous Bluetooth audio standards (BR/EDR and A2DP – used by most Bluetooth audio devices you’ve come across) were not well-suited for these use-cases, especially from a power-consumption perspective. This meant that HA users would either have to rely on devices using proprietary protocols (usually limited to Apple devices), or have a cumbersome additional dongle with its own battery and charging needs.

Recent Past: Bluetooth LE

The more recent Bluetooth LE specification addresses some of the issues with the previous spec (now known as Bluetooth Classic). It provides a low-power base for devices to communicate with each other, and has been widely adopted in consumer devices.

On top of this, we have the LE Audio standard, which provides audio streaming services over Bluetooth LE for both consumer audio devices and HAs. The hearing aid industry has been an active participant in its development, and I expect we’ll see widespread support over time.

The base Bluetooth LE specification has been around since 2010, but the LE Audio specification has only been public since 2021/2022. We’re still seeing devices with LE Audio support trickle into the market.

In 2018, Google partnered with a hearing aid manufacturer to announce the ASHA (Audio Streaming for Hearing Aids) protocol, presumably as a stop-gap. The protocol uses Bluetooth LE (but not LE Audio) to support low-power audio streaming to hearing aids, and is publicly available. Several devices have shipped with ASHA support in the last ~6 years.

A brief history of Bluetooth LE and audio

Hot Take: Obsolescence is bad UX

As end-users, we understand the push/pull of technological advancement and obsolescence. As responsible citizens of the world, we also understand the environmental impact of this.

The problem is much worse when we are talking about medical devices. Hearing aids are expensive, and are expected to last a long time. It’s not uncommon for people to use the same device for 5-10 years, or even longer.

In addition to the financial cost, there is also a significant emotional cost to changing devices. There is usually a period of adjustment during which one might be working with an audiologist to tune the device to one’s hearing. Neuroplasticity allows the brain to adapt to the device and extract more meaning over time. Changing devices effectively resets the process.

All this is to say that supporting older devices is a worthy goal in itself, but has an additional set of dimensions in the context of accessibility.

HAs and Linux-based devices

Because of all this history, hearing aid manufacturers have traditionally focused on mobile devices (i.e. Android and iOS). This is changing, with Apple supporting its proprietary MFi (made for iPhone/iPad/iPod) protocol on macOS, and Windows adding support for LE Audio on Windows 11.

This does leave the question of Linux-based devices, which is our primary concern – can users of free software platforms also have an accessible user experience?

A lot of work has gone into adding Bluetooth LE support in the Linux kernel and BlueZ, and more still to add LE Audio support. PipeWire’s Bluetooth module now includes support for LE Audio, and there is continuing effort to flesh this out. Linux users with LE Audio-based hearing aids will be able to take advantage of all this.

However, the ASHA specification was only ever supported on Android devices. This is a bit of a shame, as there are likely a significant number of hearing aids out there with ASHA support, which will hopefully still be around for the next 5+ years. This felt like a gap that we could help fill.

Step 1: A Proof-of-Concept

We started out by looking at the ASHA specification, and the state of Bluetooth LE in the Linux kernel. We spotted some things that the Android stack exposes that BlueZ does not, but it seemed like all the pieces should be there.

Friend-of-Asymptotic Ravi Chandra Padmala spent some time with us to implement a proof-of-concept. This was a pretty intense journey in itself, as we had to identify good reference hardware (we found an ASHA implementation on the onsemi RSL10), and clean out the pipes between the kernel and userspace (LE connection-oriented channels, which ASHA relies on, weren’t commonly used at the time).
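
For the curious: an LE connection-oriented channel is exposed to userspace as an L2CAP socket over the LE transport. Here’s a minimal sketch of connecting one, using the libbluetooth headers – in practice, the PSM is read from the hearing aid’s ASHA GATT service, and the device address comes from discovery, rather than being passed in like this:

    #include <unistd.h>
    #include <sys/socket.h>
    #include <bluetooth/bluetooth.h>
    #include <bluetooth/l2cap.h>

    /* Sketch: connect an LE connection-oriented channel to a device.
     * In reality, the PSM comes from the ASHA GATT service. */
    static int connect_le_coc(const char *bdaddr_str, uint16_t psm)
    {
        /* An LE CoC is a seqpacket L2CAP socket over the LE transport */
        int sk = socket(AF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_L2CAP);
        if (sk < 0)
            return -1;

        struct sockaddr_l2 addr = { 0 };
        addr.l2_family = AF_BLUETOOTH;
        addr.l2_bdaddr_type = BDADDR_LE_PUBLIC; /* LE, not BR/EDR */
        addr.l2_psm = htobs(psm);
        str2ba(bdaddr_str, &addr.l2_bdaddr);

        if (connect(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(sk);
            return -1;
        }

        return sk; /* audio frames can now be written to this fd */
    }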

We did eventually get the proof-of-concept done, and this gave us the confidence to move on to the next step of integrating this into BlueZ – albeit after a hiatus for paid work. We have to keep the lights on, after all!

Step 2: ASHA in BlueZ

The BlueZ audio plugin implements various audio profiles within the BlueZ daemon – this includes A2DP for Bluetooth Classic, as well as BAP for LE Audio.

We decided to add ASHA support within this plugin. This would allow BlueZ to perform privileged operations and then hand off a file descriptor for the connection-oriented channel, so that any userspace application (such as PipeWire) could actually stream audio to the hearing aid.
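
For A2DP, BlueZ already does this kind of hand-off via the org.bluez.MediaTransport1 D-Bus interface, whose Acquire() method returns the file descriptor along with the read/write MTUs, and the idea is for the ASHA transport to follow the same pattern. A rough client-side sketch using GDBus (the transport object path is hypothetical – in reality it is discovered via the object manager):

    #include <gio/gio.h>
    #include <gio/gunixfdlist.h>

    /* Sketch: acquire the streaming fd for a media transport from BlueZ */
    static int acquire_transport_fd(GDBusConnection *conn, const char *path)
    {
        GError *error = NULL;
        GUnixFDList *fds = NULL;

        GVariant *ret = g_dbus_connection_call_with_unix_fd_list_sync(
            conn, "org.bluez", path, "org.bluez.MediaTransport1", "Acquire",
            NULL, G_VARIANT_TYPE("(hqq)"), G_DBUS_CALL_FLAGS_NONE, -1,
            NULL, &fds, NULL, &error);
        if (!ret) {
            g_printerr("Acquire failed: %s\n", error->message);
            g_clear_error(&error);
            return -1;
        }

        gint32 fd_index;
        guint16 mtu_read, mtu_write;
        g_variant_get(ret, "(hqq)", &fd_index, &mtu_read, &mtu_write);

        int fd = g_unix_fd_list_get(fds, fd_index, NULL);

        g_variant_unref(ret);
        g_object_unref(fds);
        return fd; /* hand this to the audio server for streaming */
    }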

I implemented an initial version of the ASHA profile in the BlueZ audio plugin last year, and thanks to Luiz Augusto von Dentz’ guidance and reviews, the plugin has landed upstream.

This has been tested with a single hearing aid, and stereo support is pending. In the process, we also found a small community of folks with deep interest in this subject, and you can join us on #asha on the BlueZ Slack.

Step 3: PipeWire support

To get end-to-end audio streaming working with any application, we need to expose the BlueZ ASHA profile as a playback device on the audio server (i.e., PipeWire). This would make the HAs appear as just another audio output, and we could route any or all system audio to it.

My colleague, Sanchayan Maity, has been working on this for the last few weeks. The code is all more or less in place now, and you can track our progress on the PipeWire MR.

Step 4 and beyond: Testing, stereo support, …

Once we have the basic PipeWire support in place, we will implement stereo support (the spec does not support more than 2 channels), and then we’ll have a bunch of testing and feedback to work with. The goal is to make this a solid and reliable solution for folks on Linux-based devices with hearing aids.

Once that is done, there are a number of UI-related tasks that would be nice to have in order to provide a good user experience. This includes things like combining the left and right HAs to present them as a single device, and access to any tuning parameters.

Getting it done

This project has been on my mind since the ASHA specification was announced, and it has been a long road to get here. We are in the enviable position of being paid to work on challenging problems, and we often contribute our work upstream. However, there are many such projects that would be valuable to society, but don’t necessarily have a clear source of funding.

In this case, we found ourselves in an interesting position – we have the expertise and context around the Linux audio stack to get this done. Our business model allows us the luxury of taking bites out of problems like this, and we’re happy to be able to do so.

However, it helps immensely when we do have funding to take on this work end-to-end – we can focus on the task entirely and get it done faster.

Onward…

I am delighted to announce that we were able to find the financial support to complete the PipeWire work! Once we land basic mono audio support in the MR above, we’ll move on to implementing stereo support in the BlueZ plugin and the PipeWire module. We’ll also be testing with some real-world devices, and we’ll be leaning on our community for more feedback.

This is an exciting development, and I’ll be writing more about it in a follow-up post in a few days. Stay tuned!

GStreamer Conference 2024

All of us at Asymptotic are back home from the exciting week at GStreamer Conference 2024 in Montréal, Canada last month. It was great to hang out with the community and see all the great work going on in the GStreamer ecosystem.

Montréal sunsets are 😍

There were some visa-related adventures leading up to the conference, but thanks to the organising team (shoutout to Mark Filion and Tim-Philipp Müller), everything was sorted out in time and Sanchayan and Taruntej were able to make it.

This conference was also special because this year marks the 25th anniversary of the GStreamer project!

Happy birthday to us! 🎉

Talks

We had 4 talks at the conference this year.

GStreamer & QUIC (video)

Sanchayan speaking about GStreamer and QUIC

Sanchayan spoke about his work with the various QUIC elements in GStreamer. We already have the quinnquicsrc and quinnquicsink elements upstream, along with a couple of plugins that allow (de)multiplexing of raw streams, as well as an implementation of RTP-over-QUIC (RoQ). We’ve also started work on Media-over-QUIC (MoQ) elements.
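
As a taste of what using these looks like, here’s a sketch of a minimal sender built around quinnquicsink – the property names are my assumptions, so check gst-inspect-1.0 quinnquicsink for the actual interface:

    #include <gst/gst.h>

    int main(int argc, char *argv[])
    {
        gst_init(&argc, &argv);

        /* Sketch: stream Opus-encoded audio over QUIC. The quinnquicsink
         * property names below are assumptions -- run gst-inspect-1.0
         * quinnquicsink for the actual interface. */
        GError *error = NULL;
        GstElement *pipeline = gst_parse_launch(
            "audiotestsrc is-live=true ! opusenc ! "
            "quinnquicsink address=127.0.0.1 port=4433",
            &error);
        if (!pipeline) {
            g_printerr("Failed to build pipeline: %s\n", error->message);
            return 1;
        }

        gst_element_set_state(pipeline, GST_STATE_PLAYING);

        /* Block until error or EOS */
        GstBus *bus = gst_element_get_bus(pipeline);
        GstMessage *msg = gst_bus_timed_pop_filtered(
            bus, GST_CLOCK_TIME_NONE, GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

        if (msg)
            gst_message_unref(msg);
        gst_object_unref(bus);
        gst_element_set_state(pipeline, GST_STATE_NULL);
        gst_object_unref(pipeline);
        return 0;
    }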

This has been a fun challenge for us, as we’re looking to build out a general-purpose toolkit for building QUIC application-layer protocols in GStreamer. Watch this space for more updates as we build out more functionality, especially around MoQ.

Clock Rate Matching in GStreamer & PipeWire (video)

Arun speaking about PipeWire delay-locked loops
Photo credit: Francisco

My talk was about an interesting corner of GStreamer, namely clock rate matching. This is a part of live pipelines that is often taken for granted, so I wanted to give folks a peek under the hood.

The idea for this talk was born out of some recent work we did to decouple the PipeWire graph clock from the PTP clock when sending AES67 streams on the network. I found the contrast between the PipeWire and GStreamer approaches thought-provoking, and wanted to share that with the community.

GStreamer for Real-Time Audio on Windows (video)

Next, Taruntej dove into how we optimised our usage of GStreamer in a real-time audio application on Windows. We had some pretty tight performance requirements for this project, and Taruntej spent a lot of time profiling and tuning the pipeline to meet them. He shared some of the lessons learned and the tools he used to get there.

Simplifying HLS playlist generation in GStreamer (video)

Sanchayan also walked us through the work he’s been doing to simplify HLS (HTTP Live Streaming) multivariant playlist generation. This should be a nice feature to round out GStreamer’s already strong support for generating HLS streams. We are also exploring the possibility of reusing the same code for generating DASH (Dynamic Adaptive Streaming over HTTP) manifests.
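
For reference, the multivariant (formerly “master”) playlist is the small index that points clients at the per-variant media playlists – along these illustrative lines:

    #EXTM3U
    #EXT-X-VERSION:6
    #EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
    720p/media.m3u8
    #EXT-X-STREAM-INF:BANDWIDTH=600000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
    360p/media.m3u8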

Hackfest

As usual, the conference was followed by a two-day hackfest. We worked on a few interesting problems:

  • Sanchayan addressed some feedback on the QUIC muxer elements, and then investigated extending the HLS elements for SCTE-35 marker insertion and DASH support

  • Taruntej worked on improvements to the threadshare elements, specifically to bring some ts-udpsrc element features in line with udpsrc

  • I spent some time reviewing a long-pending merge request to add soft-seeking support to the AWS S3 sink (so that it might be possible to upload seekable MP4s, for example, directly to S3). I also had a very productive conversation with George Kiagiadakis about how we should improve the PipeWire GStreamer elements (more on this soon!)

All in all, it was a great time, and I’m looking forward to the spring hackfest and the conference in the latter part of next year!

Asymptotic: A 2023 Review

It’s been a busy several months, but now that we have some breathing room, I wanted to take stock of what we have done over the last year or so.

This is a good thing for most people and companies to do of course, but being a scrappy, (questionably) young organisation, it’s doubly important for us to introspect. This allows us to both recognise our achievements and ensure that we are accomplishing what we have set out to do.

One thing that is clear to me is that we have been lagging in writing about some of the interesting things that we have had the opportunity to work on, so you can expect to see some more posts expanding on what you find below, as well as some of the newer work that we have begun.

(note: I write about our open source contributions below, but needless to say, none of it is possible without the collaboration, input, and reviews of members of the community)

WHIP/WHEP client and server for GStreamer

If you’re in the WebRTC world, you likely have not missed the excitement around standardisation of HTTP-based signalling protocols, culminating in the WHIP and WHEP specifications.

Tarun has been driving our client and server implementations for both these protocols, and in the process has been refactoring some of the webrtcsink and webrtcsrc code to make it easier to add more signaller implementations. You can find out more about this work in his talk at GstConf 2023 and we’ll be writing more about the ongoing effort here as well.
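
As an illustration, publishing a live test stream to a WHIP endpoint ends up being nearly a one-liner. A sketch – I’m assuming the whipclientsink element and signaller property names here, so check gst-inspect-1.0 before copying:

    /* Sketch: publish a live test stream to a WHIP endpoint. Element and
     * property names are assumptions -- verify with gst-inspect-1.0.
     * Standard gst_init()/main-loop boilerplate elided. */
    GError *error = NULL;
    GstElement *pipeline = gst_parse_launch(
        "videotestsrc is-live=true ! videoconvert ! "
        "whipclientsink signaller::whip-endpoint=https://example.com/whip",
        &error);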

Low-latency embedded audio with PipeWire

Some of our work involves implementing a framework for very low-latency audio processing on an embedded device. PipeWire is a good fit for this sort of application, but we have had to implement a couple of features to make it work.

It turns out that doing timer-based scheduling can be more CPU intensive than ALSA period interrupts at low latencies, so we implemented an IRQ-based scheduling mode for PipeWire. This is now used by default when a pro-audio profile is selected for an ALSA device.
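
If you want to experiment with this, the behaviour hangs off an ALSA node property. Something along these lines in a WirePlumber rule should select IRQ-based scheduling (a sketch – the exact syntax and file location vary across WirePlumber versions):

    # Sketch: select IRQ-based (rather than timer-based) scheduling for
    # ALSA outputs. Syntax varies across WirePlumber versions.
    monitor.alsa.rules = [
      {
        matches = [ { node.name = "~alsa_output.*" } ]
        actions = { update-props = { api.alsa.disable-tsched = true } }
      }
    ]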

In addition to this, we also implemented rate adaptation for USB gadget devices using the USB Audio Class “feedback control” mechanism. This allows USB gadget devices to adapt their playback/capture rates to the graph’s rate without having to perform resampling on the device, saving valuable CPU and latency.
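
The feedback mechanism itself is conceptually simple: the gadget periodically reports how many samples per (micro)frame it actually wants, as a fixed-point number, and the host nudges the amount of data it sends to match. A sketch of the arithmetic – the Q10.14/Q16.16 encodings come from the USB Audio Class spec, and all the endpoint plumbing is elided:

    #include <stdint.h>

    /* Sketch: the value an async USB audio gadget reports on its feedback
     * endpoint -- the desired samples per (micro)frame as a fixed-point
     * number. Full speed uses Q10.14 per 1 ms frame; high speed uses
     * Q16.16 per 125 us microframe. */
    static uint32_t feedback_value(uint32_t sample_rate_hz, int high_speed)
    {
        if (high_speed)
            return (uint32_t)(((uint64_t)sample_rate_hz << 16) / 8000);

        return (uint32_t)(((uint64_t)sample_rate_hz << 14) / 1000);
    }

    /* e.g. 48000 Hz at full speed: 48 << 14 == 0xC0000, i.e. 48.0 samples
     * per frame; the gadget nudges this value up or down so that the host
     * tracks the graph's actual rate. */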

There is likely still some room to optimise things, so expect to hear more on this front soon.

Compress offload in PipeWire

Sanchayan has written about the work we did to add support in PipeWire for offloading compressed audio. This is something we explored in PulseAudio (there’s even an implementation out there), but it’s a testament to the PipeWire design that we were able to get this done without any protocol changes.

This should be useful in various embedded devices that have both the hardware and firmware to make use of this power-saving feature.

GStreamer LC3 encoder and decoder

Tarun wrote a GStreamer plugin implementing the LC3 codec using the liblc3 library. This is the primary codec for next-generation wireless audio devices implementing the Bluetooth LE Audio specification. The plugin is upstream and can already be used to encode and decode LC3 data, but will likely become more useful once the existing Bluetooth plugins that talk to Bluetooth devices gain LE Audio support.
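
A quick encode/decode round-trip for trying the plugin out might look like the below – a sketch in which the element names are my assumptions, so check gst-inspect-1.0:

    /* Sketch: LC3 encode/decode round-trip. The element names (lc3enc,
     * lc3dec) are assumptions -- verify with gst-inspect-1.0. Standard
     * gst_init()/main-loop boilerplate elided. */
    GError *error = NULL;
    GstElement *pipeline = gst_parse_launch(
        "audiotestsrc ! audio/x-raw,rate=48000,channels=1 ! "
        "lc3enc ! lc3dec ! audioconvert ! autoaudiosink",
        &error);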

QUIC plugins for GStreamer

Sanchayan implemented a QUIC source and sink plugin in Rust, allowing us to start experimenting with the next generation of network transports. For the curious, the plugins sit on top of the Quinn implementation of the QUIC protocol.

There is a merge request open that should land soon, and we’re already seeing folks using these plugins.

AWS S3 plugins

We’ve been fleshing out the AWS S3 plugins over the years, and we’ve added a new awss3putobjectsink. This provides a better way to push small or sparse data to S3 (subtitles, for example), without potentially losing data in case of a pipeline crash.
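
For instance, writing out a live subtitle stream as a single S3 object might look like this – a sketch, with property names that are my assumptions, so check gst-inspect-1.0 awss3putobjectsink:

    /* Sketch: write a subtitle stream to S3 as a single object. Property
     * names are assumptions -- verify with gst-inspect-1.0
     * awss3putobjectsink. Boilerplate elided. */
    GError *error = NULL;
    GstElement *pipeline = gst_parse_launch(
        "appsrc name=subs caps=\"text/x-raw,format=utf8\" ! "
        "awss3putobjectsink bucket=my-bucket key=captions/live.vtt "
        "region=us-east-1",
        &error);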

We’re also expecting this to evolve to look a little more like multifilesink, allowing us to arbitrarily split up data and write it to S3 directly as multiple objects.

Update to webrtc-audio-processing

We also updated the webrtc-audio-processing library, based on more recent upstream libwebrtc. This is one of those things that becomes surprisingly hard as you get into it — packaging an API-unstable library correctly, while supporting a plethora of operating system and architecture combinations.

Clients

We can’t always speak publicly about the work we do for our clients, but there have been a few interesting developments we can (and have) spoken about.

Both Sanchayan and I spoke a bit about our work with the WebRTC-as-a-service provider Daily. My talk at the GStreamer Conference was a summary of the work I previously wrote about: what we learned while building Daily’s live streaming, recording, and other backend services. We worked with other clients on similar problems during the year.

Sanchayan spoke about the interesting approach to building SIP support that we took for Daily. This was a pretty fun project, allowing us to build a modern server-side SIP client with GStreamer and SIP.js.

An ongoing project we are working on is building AES67 support using GStreamer for FreeSWITCH, which essentially allows bridging low-latency network audio equipment with existing SIP and related infrastructure.

As you might have noticed from previous sections, we are also working on a low-latency audio appliance using PipeWire.

Retrospective

All in all, we’ve had a reasonably productive 2023. There are things I know we can do better in our upstream efforts to help move merge requests and issues along, and I hope to address this in 2024.

We have ideas for larger projects that we would like to take on. For some of these, we may be able to find clients willing to pay for the work. For the ideas that we think are useful but may not find funding, we will continue to push forward in our spare time.

If you made it this far, thank you, and look out for more updates!

GStreamer for your backend services

For the last year and a half, we at Asymptotic have been working with the excellent team at Daily. I’d like to share a little bit about what we’ve learned.

Daily is a real-time calling platform-as-a-service. One standard feature that users have come to expect in their calls is the ability to record them, or to stream their conversations to a larger audience. This involves mixing together all the audio/video from each participant and then storing it, or streaming it live via YouTube, Twitch, or any other third-party service.

As you might expect, GStreamer is a good fit for building this kind of functionality, where we consume a bunch of RTP streams, composite/mix them, and then send them out to one or more external services (Amazon’s S3 for recordings and HLS, or a third-party RTMP server).
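
In skeletal form, the topology looks something like the below – one illustrative participant each for audio and video, though in reality the pipelines are constructed programmatically as participants join and leave:

    /* Illustrative sketch of the mixing topology: one video and one audio
     * participant in, one composited/mixed stream out over RTMP. Real
     * pipelines are built dynamically; gst_init()/bus handling elided. */
    GError *error = NULL;
    GstElement *pipeline = gst_parse_launch(
        "compositor name=vmix ! x264enc tune=zerolatency ! h264parse ! "
        "flvmux name=mux streamable=true ! "
        "rtmpsink location=rtmp://live.example.com/app/stream-key "
        "audiomixer name=amix ! voaacenc ! mux. "
        "udpsrc port=5004 caps=\"application/x-rtp,media=video,"
        "encoding-name=VP8,clock-rate=90000\" ! rtpvp8depay ! vp8dec ! "
        "videoconvert ! vmix. "
        "udpsrc port=5006 caps=\"application/x-rtp,media=audio,"
        "encoding-name=OPUS,clock-rate=48000\" ! rtpopusdepay ! opusdec ! "
        "audioconvert ! audioresample ! amix.",
        &error);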

I’ve written about how we implemented this feature elsewhere, but I’ll summarise briefly.

This is a slightly longer post than usual, so grab a cup of your favourite beverage, or jump straight to the summary section for the tl;dr.


A Quick Update

Happy 2016 everyone!

While I did mention a while back (almost two years ago, wow) that I was taking a break, I realised recently that I hadn’t posted an update from when I started again.

For the last year and a half, I’ve been providing freelance consulting around PulseAudio, GStreamer, and various other directly and tangentially related projects. There’s a brief list of the kind of work I’ve been involved in.

If you’re looking for help with PulseAudio, GStreamer, multimedia middleware or anything else you might’ve come across on this blog, do get in touch!

Four years

Four years and what seems like a lifetime ago, I jumped aboard the ship Collabora Multimedia, and set sail for adventure and lands unknown. We sailed through strange new seas, to exotic lands, defeated many monsters, and, I feel, had some positive impact on the world around us. Last Friday, on my request, I got dropped back at the shore.

I’ve had an insanely fun time at Collabora, working with absurdly talented and dedicated people. Nevertheless, I’ve come to the point where I feel like I need something of a break. I’m not sure what’s next, other than a month or two of rest and relaxation — reading, cycling, travel, and catching up with some of the things I’ve been promising to do if only I had more time. Yes, that includes PulseAudio and GStreamer hacking as well. :-)

And there’ll be more updates and blog posts too!

PulseAudio 4.0 and more

And we’re back … PulseAudio 4.0 is out! There’s both a short and super-detailed changelog in the release notes. For the lazy, this release brings a bunch of Bluetooth stability updates, better low latency handling, performance improvements, and a whole lot more. :)

One interesting thing is that for this release, we kept a parallel next branch open while master was frozen for stabilising and releasing. As a result, we’re already well on our way to 5.0 with 52 commits since 4.0 already merged into master.

And finally, I’m excited to announce PulseAudio is going to be carrying out two great projects this summer, as part of the Google Summer of Code! We are going to have Alexander Couzens (lynxis) working on a rewrite of module-tunnel using libpulse, mentored by Tanu Kaskinen. In addition to this, Damir Jelić (poljar) will be working on improvements to resampling, mentored by Peter Meerwald.

That’s just some of the things to look forward to in coming months. I’ve got a few more things I’d like to write about, but I’ll save that for another post.

PulseConf Schedule

David has now published a tentative schedule for the PulseAudio Mini-conference (I’m just going to call it PulseConf — so much easier on the tongue).

For the lazy, these are some of the topics we’ll be covering:

  • Vision and mission — where we are and where we want to be
  • Improving our patch review process
  • Routing infrastructure
  • Improving low latency behaviour
  • Revisiting system- and user-modes
  • Devices with dynamic capabilities
  • Improving surround sound behaviour
  • Separating configuration for hardware adaptation
  • Better drain/underrun reporting behaviour

Phew — and there are more topics that we probably will not have time to deal with!

For those of you who cannot attend, the Linaro Connect folks (who are graciously hosting us) are planning on running Google+ Hangouts for their sessions. Hopefully we should be able to do the same for our proceedings. Watch this space for details!

p.s.: A big thank you to my employer Collabora for sponsoring my travel to the conference.

Androidifying your autotools build the easy way

Derek Foreman has finally written up a nice blog post about his Androgenizer tool, which we’ve used for porting PulseAudio, GStreamer, Wayland, Telepathy and most of their dependencies to Android.

If you’ve got an autotools-based project that you’d like to build on Android, whether with the NDK or system-wide, this is really useful.

PulseAudio on Android: Part 2

Some of you might’ve noticed that there has been a bunch of work happening here at Collabora on making cool open source technologies such as GStreamer, Telepathy, Wayland and of course, PulseAudio available on Android.

Since my last blog post on this subject, I got some time to start looking at replacing AudioFlinger (recap: that’s Android’s native audio subsystem) with PulseAudio (recap: that’s the awesome Linux audio subsystem). This work can be broken up into 3 parts: playback, capture, and policy. The roles of playback and capture are obvious. For those who aren’t aware of system internals, the policy bits take care of audio routing, volumes, and other such things. For example, audio should play out of your headphones if they’re plugged in, off Bluetooth if you’ve got a headset paired, or the speakers if nothing’s plugged in. Also, depending on the device, the output volume might change based on the current output path.

I started by tackling the playback problem. I’ve got the first 80% of this done (as we all know, the second 80% takes at least as long ;) ). This is done by replacing the native AudioTrack playback API with a simple wrapper that translates into the libpulse PulseAudio client API. There are bits of the API that seem to be rarely used (loops and markers, primarily), and I’ve not gotten around to those yet. Basic playback works quite well, and here’s a video showing this. (Note: this and the next video will be served with yummy HTML5 goodness if you’ve enabled the YouTube HTML5 beta).

(if the video doesn’t appear, you can watch it on YouTube)
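
To give a flavour of the mapping: conceptually, each AudioTrack ends up backed by a PulseAudio stream, so a write() on the Android side becomes a stream write on the PulseAudio side. Here’s a toy equivalent using the synchronous pa_simple API – the real wrapper uses the asynchronous libpulse API, so this is emphatically not the actual code:

    #include <pulse/simple.h>
    #include <pulse/error.h>

    /* Toy sketch: what an AudioTrack maps to, conceptually. The real
     * wrapper uses the asynchronous libpulse API. */
    static pa_simple *track_open(void)
    {
        pa_sample_spec ss = {
            .format = PA_SAMPLE_S16LE,
            .rate = 44100,
            .channels = 2,
        };
        return pa_simple_new(NULL, "AudioTrack", PA_STREAM_PLAYBACK, NULL,
                             "music", &ss, NULL, NULL, NULL);
    }

    /* AudioTrack::write() becomes, roughly, this */
    static int track_write(pa_simple *s, const void *buf, size_t bytes)
    {
        int err;
        if (pa_simple_write(s, buf, bytes, &err) < 0)
            return -err; /* see pa_strerror(err) for details */
        return bytes;
    }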

Users of PulseAudio might have spotted that this now frees us up to do some fairly nifty things. One such thing is getting remote playback for free. For a long time now, there has been support for streaming audio between devices running PulseAudio. I wrote up a quick app to show this working on the Galaxy Nexus as well. Again, seeing this working is a lot more impressive than me describing it here, so here’s another video:

(if the video doesn’t appear, you can watch it on YouTube)

This is all clearly work in progress, but you can find the code for the AudioTrack wrapper as a patch for now. This will be a properly integrated tree that you can just pull and easily integrate into your Android build when it’s done. The PA Output Switcher app code is also available in a git repository.

I’m hoping to be able to continue hacking on the capture and policy bits. The latter, especially, promises to be involved, since there isn’t always a 1:1 mapping between AudioFlinger and PulseAudio concepts. Nothing insurmountable, though. :) Watch this space for more updates as I wade through the next bit.