
GStreamer Conference 2024

All of us at Asymptotic are back home from an exciting week at the GStreamer Conference 2024 in Montréal, Canada, last month. It was great to hang out with the community and see all the great work going on in the GStreamer ecosystem.

Montréal sunsets are 😍

There were some visa-related adventures leading up to the conference, but thanks to the organising team (shoutout to Mark Filion and Tim-Philipp Müller), everything was sorted out in time and Sanchayan and Taruntej were able to make it.

This conference was also special because this year marks the 25th anniversary of the GStreamer project!

Happy birthday to us! 🎉

Talks

We had 4 talks at the conference this year.

GStreamer & QUIC (video)

Sanchayan speaking about GStreamer and QUIC

Sanchayan spoke about his work with the various QUIC elements in GStreamer. We already have the quinnquicsrc and quinnquicsink upstream, along with a couple of plugins to allow (de)multiplexing of raw streams as well as an implementation of RTP-over-QUIC (RoQ). We’ve also started work on Media-over-QUIC (MoQ) elements.

This has been a fun challenge for us, as we’re looking to build out a general-purpose toolkit for building QUIC application-layer protocols in GStreamer. Watch this space for more updates as we build out more functionality, especially around MoQ.

Clock Rate Matching in GStreamer & PipeWire (video)

Arun speaking about PipeWire delay-locked loops
Photo credit: Francisco

My talk was about an interesting corner of GStreamer, namely clock rate matching. This is a part of live pipelines that is often taken for granted, so I wanted to give folks a peek under the hood.

The idea of doing this talk was born out of some recent work we did to allow decoupling the graph clock in PipeWire from the PTP clock when sending AES67 streams on the network. I found the contrast between the PipeWire and GStreamer approaches thought-provoking, and wanted to share that with the community.

GStreamer for Real-Time Audio on Windows (video)

Next, Taruntej dove into how we optimised our usage of GStreamer in a real-time audio application on Windows. We had some pretty tight performance requirements for this project, and Taruntej spent a lot of time profiling and tuning the pipeline to meet them. He shared some of the lessons learned and the tools he used to get there.

Simplifying HLS playlist generation in GStreamer (video)

Sanchayan also walked us through the work he’s been doing to simplify HLS (HTTP Live Streaming) multivariant playlist generation. This should be a nice feature to round out GStreamer’s already strong support for generating HLS streams. We are also exploring the possibility of reusing the same code for generating DASH (Dynamic Adaptive Streaming over HTTP) manifests.

Hackfest

As usual, the conference was followed by a two-day hackfest. We worked on a few interesting problems:

  • Sanchayan addressed some feedback on the QUIC muxer elements, and then investigated extending the HLS elements for SCTE-35 marker insertion and DASH support

  • Taruntej worked on improvements to the threadshare elements, specifically to bring some ts-udpsrc element features in line with udpsrc

  • I spent some time reviewing a long-pending merge request to add soft-seeking support to the AWS S3 sink (so that it might be possible to upload seekable MP4s, for example, directly to S3). I also had a very productive conversation with George Kiagiadakis about how we should improve the PipeWire GStreamer elements (more on this soon!)

All in all, it was a great time, and I’m looking forward to the spring hackfest and the conference in the latter part of next year!

GStreamer and WebRTC HTTP signalling

The WebRTC nerds among us will remember the first thing we learn about WebRTC: it is a specification for peer-to-peer communication of media and data, but it does not specify how signalling is done.

Or, put more simply, if you want to call someone on the web, WebRTC tells you how you can transfer audio, video and data, but it leaves out the bit about how you make the call itself: how you locate the person you’re calling, let them know you’d like to call them, and the few steps that follow before you can see and talk to each other.

WebRTC signalling

While this allows services to provide their own mechanisms to manage how WebRTC calls work, the lack of a standard mechanism means that general-purpose applications need to individually integrate each service that they want to support. For example, GStreamer’s webrtcsrc and webrtcsink elements support various signalling protocols, including Janus Video Rooms, LiveKit, and Amazon Kinesis Video Streams.

However, having a standard way for clients to do signalling would help developers focus on their application and worry less about interoperability with different services.

Standardising Signalling

With this motivation, the IETF WebRTC Ingest Signalling over HTTPS (WISH) working group has been working on two specifications:

  • WHIP (WebRTC-HTTP Ingestion Protocol)

  • WHEP (WebRTC-HTTP Egress Protocol)

(author’s note: the puns really do write themselves :))

As the names suggest, the specifications provide a way to perform signalling using HTTP. WHIP gives us a way to send media to a server, to ingest into a WebRTC call or live stream, for example.

Conversely, WHEP gives us a way for a client to use HTTP signalling to consume a WebRTC stream – for example to create a simple web-based consumer of a WebRTC call, or tap into a live streaming pipeline.

WHIP and WHEP

With this view of the world, WHIP and WHEP can be used not just for calling applications, but also as an alternative way to ingest or play back live streams, with lower latency and a near-ubiquitous real-time communication API.

In fact, several services already support this, including Dolby Millicast, LiveKit and Cloudflare Stream.

WHIP and WHEP with GStreamer

We know GStreamer already provides developers two ways to work with WebRTC streams:

  • webrtcbin: provides a low-level API, akin to the PeerConnection API that browser-based users of WebRTC will be familiar with

  • webrtcsrc and webrtcsink: provide high-level elements that receive media from and send media to a WebRTC endpoint, respectively

At Asymptotic, my colleagues Tarun and Sanchayan have been using these building blocks to implement GStreamer elements for both the WHIP and WHEP specifications. You can find these in the GStreamer Rust plugins repository.

Our initial implementations were based on webrtcbin, but have since been moved over to the higher-level APIs to reuse common functionality (such as automatic encoding/decoding and congestion control). Tarun covered our work in a talk at last year’s GStreamer Conference.

Today, we have 4 elements implementing WHIP and WHEP.

Clients

  • whipclientsink: This is a webrtcsink-based implementation of a WHIP client, which you can use to send media to a WHIP server. For example, streaming your camera to a WHIP server is a one-line pipeline (see the sketch after this list).
  • whepclientsrc: This is work in progress and allows us to build player applications that connect to a WHEP server and consume media from it. The goal is to make playing a WHEP stream just as simple (also sketched below).
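
For illustration, here is roughly what those pipelines could look like with gst-launch-1.0. The endpoint URLs are placeholders, and the signaller property names are from memory, so do check the element documentation before copying these verbatim:

# Send a test video stream (swap in your camera source) to a WHIP endpoint
gst-launch-1.0 videotestsrc ! videoconvert ! \
    whipclientsink signaller::whip-endpoint="https://example.com/whip/room1"

# Play back a WHEP stream (whepclientsrc is still work in progress)
gst-launch-1.0 whepclientsrc signaller::whep-endpoint="https://example.com/whep/room1" ! \
    videoconvert ! autovideosink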

The client elements fit quite neatly into how we might imagine GStreamer-based clients would work. You could stream arbitrary stored or live media to a WHIP server, and play back any media a WHEP server provides. Both pipelines implicitly benefit from GStreamer’s ability to use the hardware-acceleration capabilities of the platform they are running on.

GStreamer WHIP/WHEP clients

Servers

  • whipserversrc: Allows us to create a WHIP server to which clients can connect and provide media. Each incoming stream is exposed as a GStreamer pad that can be arbitrarily routed and combined as required. We have an example server that can play all the streams being sent to it.

  • whepserversink: Finally, we have ongoing work to publish arbitrary streams over WHEP, so that web-based clients can consume this media.

The two server elements open up a number of interesting possibilities. We can ingest arbitrary media with WHIP, and then decode and process, or forward it, depending on what the application requires. We expect that the server API will grow over time, based on the different kinds of use-cases we wish to support.

GStreamer WHIP/WHEP server

This is all pretty exciting, as we have all the pieces to create flexible pipelines for routing media between WebRTC-based endpoints without having to worry about service-specific signalling.

If you’re looking for help realising WHIP/WHEP based endpoints, or other media streaming pipelines, don’t hesitate to reach out to us!

GStreamer for your backend services

For the last year and a half, we at Asymptotic have been working with the excellent team at Daily. I’d like to share a little bit about what we’ve learned.

Daily is a real-time calling platform as a service. One standard feature that users have come to expect in their calls is the ability to record them, or to stream their conversations to a larger audience. This involves mixing together all the audio/video from each participant and then storing it, or streaming it live via YouTube, Twitch, or any other third-party service.

As you might expect, GStreamer is a good fit for building this kind of functionality, where we consume a bunch of RTP streams, composite/mix them, and then send them out to one or more external services (Amazon’s S3 for recordings and HLS, or a third-party RTMP server).

I’ve written about how we implemented this feature elsewhere, but I’ll summarise briefly.

This is a slightly longer post than usual, so grab a cup of your favourite beverage, or jump straight to the summary section for the tl;dr.


Introducing peerflixsrc

Some of you might have been following all the brouhaha over Popcorn Time. I won’t get into the arguments that can be made for and against at the moment.

While poking around at what it was that Popcorn Time was doing, I stumbled upon peerflix, a Node.js-based application that takes a .torrent file that points to one big video file, and presents that as an HTTP stream. It has its own BitTorrent implementation where it prioritises early chunks of the file so that it is possible to start watching the video before the entire file has been downloaded. It also seeds the file while the video is being watched locally.

Seeing as I was at the GStreamer Hackfest in Munich when this came up in discussions, it seemed topical to have a GStreamer element to wrap this neat bit of functionality. Thus was peerflixsrc born. This is a simple source element that takes a URI to a torrent file (something like torrent+http://archive.org/some/video.torrent), fires up peerflix in the background, and provides the data from the corresponding HTTP stream. Conveniently enough, this can be launched using playbin or Totem (hinting at the possibilities of what can come next!). Here’s what it looks like…

Screenshot of Totem playing a torrent file directly using peerflixsrc

The code is available now. To use it, build this copy of gst-plugins-bad using your favourite way, make sure you have peerflix installed (sudo npm install -g peerflix), and you’re good to go.
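
Something like the following should then work, using the example torrent URI from earlier (any .torrent pointing at a single video file will do):

# Let playbin pick up the torrent+http URI handler registered by peerflixsrc
gst-launch-1.0 playbin uri="torrent+http://archive.org/some/video.torrent"

Or just open the same URI in Totem.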

This is not quite mature enough to go into upstream GStreamer. The ugliest part is firing up a Node.js server to make this work, not least because managing child processes on Linux is not the prettiest code you can write. Maybe someone wants to look at rewriting the torrent bits from peerflix in C? There don’t seem to be any decent C-based libraries for this out there, though.

In the meantime, enjoy this, and comments / patches welcome!

GStreamer Hackfest 2014

Last weekend, I was at the GStreamer Hackfest in Munich. As usual, it was a blast — we got a lot done, and it was a pleasure to once again meet the fine folks who bring you your favourite multimedia framework. Thanks to the conference for providing funding to make this possible!

My plan was to work on making Totem’s support for passthrough audio work flawlessly (think letting your A/V receiver decode AC3/DTS if it supports it, with more complex things coming in the future as we support them). We’ve had the pieces in place in GStreamer for a while now, and not having that just work with Totem has been a bit of a bummer for me.

The immediate blocker so far has been that Totem needs to add a filter (scaletempo) before the audio sink, which forces negotiation to always pick a software decoder. We solved this by adding the ability for applications to specify audio/video filters for playbin to plug in if it can. There’s a now-closed bug about it, for the curious. Hopefully, I’ll get the rest of the work to make Totem use this done soon, so things just work.
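
To sketch what that looks like in practice: with a new enough playbin, the application hands the filter to playbin via its audio-filter (or video-filter) property, the idea being that playbin plugs the filter in only if it can, leaving the passthrough path available otherwise. The file path below is a placeholder, and I’m assuming gst-launch-1.0 can construct the filter element from its name, as it does for the sink properties:

# Ask playbin to plug scaletempo as the audio filter when it can
gst-launch-1.0 playbin uri=file:///path/to/movie.mkv audio-filter=scaletempo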

Now the reason that didn’t happen at the hackfest is that I got a bit … distracted … at the hackfest by another problem. More details in an upcoming post!

Picking your battles

Most of you have no doubt already seen that Mozilla will be changing their position on H.264 support for HTML5 video in future releases. This is an extremely important decision that I’ve been hoping to see for a while now, and I am really glad this is being done.

There is no doubt that we need patent-unencumbered standards for web codecs (or as much as is possible given the dismal patent ecology today), and while much giddy anticipation followed Google/On2’s release of VP8 into the open, I don’t believe it ever made sense to expect the codec landscape to change drastically in the short timespan everyone expected. There’s a lot of hardware and software out there that needs to change (seen any SoCs with VP8 support yet?), not to mention the interests of the MPEG-LA consortium.

I love Firefox, both as a product and for what it means for an open web (for those of you who know me, this might be hard to believe given all my ranting, but it’s true!). I’m glad Mozilla chose to live to fight another day rather than go out in a blaze of glory (or a flicker of irrelevance).

p.s.: these are my views and do not necessarily represent those of my employer

p.p.s.: Alessandro’s been doing some great work to get the GStreamer multimedia backend going again (this makes so much more sense than going the NIH route!)

More PulseAudio power goodness

[tl;dr — if you’re using GNOME or a GStreamer-based player, not using the Rhythmbox crossfading backend, and want to try to save ~0.5 W of power, jump to the end of the post]

Lennart pointed to another blog post about actually putting PulseAudio’s power-saving capabilities to use on your system. That post provides a hack-ish way to increase buffering in PulseAudio to the maximum possible, reducing the number of wakeups. I’m going to talk about that a bit.

Summarising the basic idea, we want music players to decode a large chunk of data and give it to PA so that we can then fill up ALSA’s hardware buffer, sleep till it’s almost completely consumed, fill it again, sleep, repeat. More details in this post from Lennart.

The native GNOME audio/video players don’t talk to PulseAudio directly — they use GStreamer, which has a pulsesink element that actually talks to PulseAudio. We could configure things so that we send a large amount of data (say 2 seconds’ worth) to PulseAudio, sleep, and then wake up periodically to push out more. Now, in the audio player (say Rhythmbox), the user hits next, prev, or pause. We need to effect this change immediately, even though we’ve already sent out 2 seconds of data (it would suck if you hit pause and the actual pause happened 2 seconds later, wouldn’t it?). PulseAudio already solves this because it can internally “rewind” the buffer and overwrite it if required. GStreamer can and does take advantage of this by sending pause and other control messages out of band from the data.

This all works well for relatively simple GStreamer pipelines. However, if you want to do something more complicated, like Rhythmbox’s crossfading backend, things start to break. PulseAudio doesn’t offer an API to do fades, and since we don’t do rewinds in GStreamer, we need to apply effects such as fades with a latency equal to the amount of buffering we’re asking PulseAudio to do. This makes for unhappy users.

Well, all is not as bleak as it seems. There was some discussion on the PA mailing list, and the need for a proper fade API (really, a generic effects API) is clear. There have even been attempts to solve this in GStreamer.

But you want to save 0.5 W of power now! Okay, if you’re not using the Rhythmbox crossfading backend (or are okay with disabling it), this will work for Rhythmbox, Banshee, pre-3.0 Totem, and really any GNOMEy player that uses gconfaudiosink (which will soon be replaced by gsettingsaudiosink, I guess). Just run this on the command line:

gconftool-2 --type string \
    --set /system/gstreamer/0.10/default/musicaudiosink \
    "pulsesink latency-time=100000 buffer-time=2000000"

On my machine, this brings down the number of wakeups per second because of alsa-sink to ~2.7 (corresponding nicely to the ~350ms of hardware buffer that I have). With Totem 3.0, this may or may not work, depending on whether your distribution gives gconfaudiosink a higher rank than pulseaudiosink.

This is clearly just a stop-gap till we can get things done the Right Way™ at the system level, so really, if things break, you get to keep the pieces. If you need to, you can undo this change by running the same command without the latency-time=… and buffer-time=… bits. That said, if something does break, do leave a comment below so I can add it to the list of things that we need to test the final solution with.
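
For reference, the undo amounts to setting the same key back to a plain pulsesink:

gconftool-2 --type string \
    --set /system/gstreamer/0.10/default/musicaudiosink \
    "pulsesink"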