MoQ is “Media over QUIC,” although I suspect it’ll eventually go the npm route and just end up meaning “MoQ means MoQ.”
Depending on the use case, you could think of it as an alternative to WebRTC with lower-level control, but honestly it’s a lot more open-ended than that.
That framing helps. When people compare MoQ with WebRTC, is the main attraction lower-level control over transport/media semantics, or are there cases where MoQ is expected to be materially better for latency or reliability?
I’m trying to understand whether it’s mainly a replacement for specific WebRTC use cases, or more of a building block for new kinds of real-time systems.
There are a few cases where WebRTC falls apart that I think MoQ could help with.
It doesn't work so well for low-latency broadcast. Your choices right now are: use WebRTC and deploy selective forwarding units, which are going to be something custom and likely involve spinning up a bunch of geographically distributed virtual machines, figuring out signalling, and whatnot; or use HLS so you can use more standard HTTP CDN tech, but gain orders of magnitude of latency.
MoQ should allow for a standardized CDN stack, meaning we should be able to have a more abstract service (instead of spinning up VMs, you just employ some company's CDN service and tell it where to get media from).
There are a lot of other little issues with WebRTC for certain specific applications. For example, last I tried it, browsers will subtly speed up audio/video to keep everything in sync, and there are scenarios where you'd rather just let the viewer fall behind a bit and skip ahead later (say you're listening to music; speeding it up isn't ideal). There's a rough sketch of that idea below.
Or say you want to have a group call and capture each participant's audio individually, to edit together later for something like a podcast. It's been a while since I've tried this, but I recall it being pretty difficult to do with WebRTC. All the mixing would happen in the browser's libwebrtc and I had really limited control over things.
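To make that skip-ahead policy concrete, here's roughly what I mean, sketched in TypeScript against a plain video element. The one-second poll and the three-second threshold are made-up numbers:

```typescript
// Hypothetical catch-up policy: instead of letting the browser speed
// playback up (which distorts music), let the viewer drift behind,
// then jump forward in one step.
const video = document.querySelector("video")!;
const MAX_DRIFT_SECONDS = 3; // assumption: how far behind we tolerate

setInterval(() => {
  if (video.buffered.length === 0) return;
  // Treat the end of the last buffered range as (roughly) the live edge.
  const liveEdge = video.buffered.end(video.buffered.length - 1);
  const drift = liveEdge - video.currentTime;
  if (drift > MAX_DRIFT_SECONDS) {
    // One audible skip instead of minutes of subtly sped-up audio.
    video.currentTime = liveEdge - 0.5;
  }
}, 1000);
```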
> use WebRTC and deploy selective forwarding units, which are going to be something custom
Would you mind explaining more? If you are doing WHIP/WHEP you should be able to drop in Broadcast Box, MediaMTX, etc., and switch out servers without anyone noticing. You can use browser/mobile/ffmpeg/OBS and get the same behavior. I care a lot about the broadcast space and want to learn about other problems.
> subtly speed up audio/video to keep everything in sync
Regarding SFUs: with something like HLS, I can really easily scale up using a caching CDN (not entirely sure that's the right term). But the idea goes: I distribute the HLS media playlist with the media segment entries prefixed with a caching/CDN service. The service is configured with the actual origin server, and when a segment isn't in the CDN, the CDN fetches it from the origin on demand. That was a nice option when I was doing Owncast streaming, since I only paid based on viewership and just had to make sure I had the correct cache-related headers on my media segments.
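To make the header part concrete, here's a rough Node/TypeScript sketch of the origin side. The paths and TTLs are made up; the point is just that playlists get a tiny TTL while segments get cached hard:

```typescript
import { createServer } from "node:http";
import { createReadStream } from "node:fs";
import { join } from "node:path";

const HLS_ROOT = "./hls"; // assumption: playlists + segments live here

createServer((req, res) => {
  const url = req.url ?? "/";
  if (url.endsWith(".m3u8")) {
    // The playlist changes every few seconds; never let the CDN hold it long.
    res.setHeader("Cache-Control", "public, max-age=1");
  } else {
    // Segments are written once and never modified; cache them hard.
    res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
  }
  createReadStream(join(HLS_ROOT, url))
    .on("error", () => {
      res.statusCode = 404;
      res.end();
    })
    .pipe(res);
}).listen(8080);
```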
Or alternatively, I can push media segments up to a CDN and distribute that way, using an S3-compatible service, rsyncing to a server with better bandwidth, etc. One thing I didn't care for, again back when I was broadcasting with Owncast, was that I needed to make sure old media segments were expired, otherwise I would rack up an insane bill. I had a 24/7 Owncast stream, and if you're not on top of expiring media segments with your CDN, it gets expensive fast.
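For the push-style setup, the expiry part can live in a bucket lifecycle rule instead of your own cleanup job. A rough sketch with the AWS SDK, where the bucket name, prefix, and one-day window are all made up:

```typescript
// One-time setup: have the bucket expire old segments automatically so a
// 24/7 stream can't accumulate storage forever.
import {
  S3Client,
  PutBucketLifecycleConfigurationCommand,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({});
await s3.send(new PutBucketLifecycleConfigurationCommand({
  Bucket: "my-hls-segments",
  LifecycleConfiguration: {
    Rules: [{
      ID: "expire-old-segments",
      Status: "Enabled",
      Filter: { Prefix: "live/" }, // only the segment directory
      Expiration: { Days: 1 },     // lifecycle granularity is days
    }],
  },
}));
```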
The overall idea is that serving HLS is ultimately serving files, and there's a good amount of tooling for that.
Now that you mention it, I think WHIP/WHEP can solve some of that. I just don't know of any service with that same cache/CDN-like experience, where either the CDN connects to the origin as needed and fans out, or I push up and let the service distribute. (Though now I'm googling "webrtc sfu as a service" and see that it is a thing!)
Didn't know about the playout delay extension.
Whether capturing individual audio is easier with RtpTransport or insertable streams, I'm unsure. Possibly? I just figure that since MoQ is going to rely on things like WebCodecs/WebAudio, there's hopefully a bit more control over what happens with audio as it comes in.
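This is the kind of control I mean, sketched in TypeScript. It assumes Chrome's MediaStreamTrackProcessor and one remote MediaStreamTrack per participant; with WebCodecs you can encode each participant's audio separately instead of taking whatever mixed output you're given:

```typescript
// Sketch: encode one participant's audio track to its own Opus stream,
// rather than letting the browser mix everyone together.
function recordParticipant(
  track: MediaStreamTrack,
  onChunk: (chunk: EncodedAudioChunk) => void,
) {
  const encoder = new AudioEncoder({
    output: onChunk, // persist per-participant chunks separately
    error: (e) => console.error(e),
  });
  encoder.configure({ codec: "opus", sampleRate: 48000, numberOfChannels: 1 });

  // MediaStreamTrackProcessor exposes raw AudioData frames from the track.
  const processor = new MediaStreamTrackProcessor({ track });
  processor.readable.pipeTo(new WritableStream({
    write(frame) {
      encoder.encode(frame);
      frame.close();
    },
  }));
}
```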
I'll admit, though, I've started noticing how often podcasts are clearly recorded with something that doesn't allow per-participant recordings, and I'm guessing that as long as the quality is good enough, most aren't worrying about it.
EDIT: I feel like I should mention that Pion rules. I used it a few years ago to put together an SRT-to-WebRTC thing and an RTMP-to-WebRTC thing to use with Janus Gateway, and it was so easy.
Ah...you're scratching at some scabs with this totally reasonable question.
We learned some tough lessons with media-chrome[1] and Mux Player, where we tried to just write web components. The React side of things was a bit of a thorn, so we created React shims that provided a more idiomatic React experience and rendered the web components...which was mostly fine, but created a new set of issues. The reason we chose web components was to not have to write framework-specific code, and then we found ourselves doing both anyway.
With VJS 10 I think we've landed on a pretty reasonable middle ground. The core library is "headless," and the rendering layer sits on top of it. The benefit is true React components and nice web components.
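To illustrate the shape of it (this is not the actual video.js code, just a toy version of the headless pattern): the core owns state and knows nothing about any framework, and the React layer is a thin real-React wrapper rather than a shim around a web component:

```typescript
import { useSyncExternalStore } from "react";

type PlayerState = { paused: boolean; currentTime: number };

// Framework-free "headless" core: state + subscriptions, no rendering.
class PlayerCore {
  private state: PlayerState = { paused: true, currentTime: 0 };
  private listeners = new Set<() => void>();

  getState = () => this.state;
  subscribe = (fn: () => void) => {
    this.listeners.add(fn);
    return () => this.listeners.delete(fn);
  };
  play() {
    this.state = { ...this.state, paused: false };
    this.listeners.forEach((fn) => fn());
  }
}

// React rendering layer: a real hook over the same core.
export function usePlayer(core: PlayerCore): PlayerState {
  return useSyncExternalStore(core.subscribe, core.getState);
}
```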
I'm a little shocked that someone like 1Password hasn't released a vault access + approvals model. There are a lot of little menial tasks that I'd love for an agent to take care of ("book me a hair appointment next week when my calendar says I'm free"). Agent has access to a locally synced calendar and can see the existence of a password for the booking portal in my vault, asks to use it, I get a push notification and can approve.
These kinds of tasks aren't common enough for me to want to set up a programmatic policy, and they're also low-sensitivity enough that I don't mind granting access to complete them. If it later asks to log into my bank, I decline.
I know the devil's in the details for how to actually do this well, but I would love if someone figured it out.
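Something like the following is the shape of the flow I'm imagining. Every name here is hypothetical; none of this is a real 1Password API:

```typescript
type Decision = "approved" | "denied";

// Hypothetical broker: the agent can see that a credential exists, but
// actually reading it blocks on a human push-notification approval.
interface VaultBroker {
  listItemTitles(): Promise<string[]>; // existence only, no secrets
  requestSecret(title: string, reason: string): Promise<Decision>;
  readSecret(title: string): Promise<string>; // only valid after approval
}

async function bookAppointment(vault: VaultBroker) {
  const titles = await vault.listItemTitles();
  if (!titles.includes("Hair Salon Portal")) return;

  // This is where the push notification goes out and the agent waits.
  const decision = await vault.requestSecret(
    "Hair Salon Portal",
    "Booking a hair appointment for a free slot next week",
  );
  if (decision !== "approved") return; // the bank-login case: declined

  const password = await vault.readSecret("Hair Salon Portal");
  // ...log into the booking portal with it
}
```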
Big +1 to Pi[1]. The simplicity makes it really easy to extend yourself too, so at this point I have a pretty nice little setup that's very specific to my personal workflows. The monorepo for the project also has other nice utilities like a solid agent SDK. I also use other tools like Claude Code for "serious" work, but I do find myself reaching for Pi more consistently as I've gotten more confident with my setup.
> generate_gazette.sh: Calls OpenAI to generate "The Civ Chronicle" — an era-appropriate, unreliable wartime newspaper article for each turn.
For a long-running game like this, that's a pretty clever little twist to keep the group engaged. I have extremely low confidence I could convince enough friends to do it with me for long enough to get through a game, but this seems like such a fun idea.
This was a friend's suggestion after I initially proposed something that exposed a bit too much detail. I wanted to show the diplomacy states and unit/city additions per player as a highlight on the home page, but instead we kept the raw files that generated those UI elements and fed them into OpenAI with the prompt, and the Gazette was born.
You can read the code, but my suggestion on how to do this is to just ingest the save game files. They're super easy to work with, especially now that we have the magic of LLMs.
In my case I'm creating various JSON files that describe game state, players, diplomacy, attendance, etc. Then I just throw those at the LLM and give it the goal of writing a newspaper article about where the game is at. The JSON files are incremental, so I'm not re-reading all the past versions on every turn; at the end of the current turn it just appends the current turn's data to the JSON file and then does the generation. These files are also what power the frontend UI, so it's all super lightweight and fast.
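The actual project does this from generate_gazette.sh, but the shape of it is roughly this TypeScript sketch, where the file names and prompt wording are made up:

```typescript
import { readFile, writeFile } from "node:fs/promises";

async function generateGazette(turn: number) {
  // The same incremental JSON that powers the frontend UI.
  const state = await readFile("game_state.json", "utf8");

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{
        role: "user",
        content:
          "Write an era-appropriate, unreliable wartime newspaper article " +
          `("The Civ Chronicle") for turn ${turn}, based on this game state:\n` +
          state,
      }],
    }),
  });
  const data = await res.json();
  await writeFile(`gazette_turn_${turn}.md`, data.choices[0].message.content);
}
```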
To make the graphs, you need to know the state at the end of the current turn, since the save files don't retain any history, so I store the most recent save file from each turn to achieve this.
It's been a journey of reverse engineering this game, but that's kind of the joy of it all.
We don't use New Relic or Datadog (and never have, afaik), so I'm not sure what post you could be referring to for those two? We have talked publicly about our Grafana use, though, and going from an in-house stack to their cloud product. Actual OP can probably hop in later with a better answer, but it was hitting rate limits on the logging agent, not the logging system.
I went down this path a bit the other night, so I'm curious what OP's answer is. My mental model was that they could be complementary: Jido for agent lifecycle, supervision, state management, etc., and LangChain for the LLM interactions, prompt chains, RAG, etc. It looks like you could do everything in Jido 2.0, but if you like or are familiar with LangChain, it seems like they could work well together.
I haven't used Jido for anything yet, but it's one of those projects I check in on once a month or so. BEAM does seem like a perfect fit for an agent framework, but the seemingly limited ecosystem has held me back from going too far down that path. Excited to see 2.0!
Just a heads up, some of your code samples seem to be having an issue with entity escaping.