Hacker News
Untangling the WebRTC Flow (pkcsecurity.com)
55 points by pkcsecurity on Oct 22, 2014 | 7 comments


Your article is very good, but I only know that because I've been through this myself. I'm not sure I would have completely understood WebRTC before implementing it in an application for the first time. Once you get it, it's actually quite simple. I think the problem is the unfamiliarity of terms like ICE, STUN/TURN, and SDP, and the way everyone treats the topic as if you should either A) already know what those are, or B) be able to figure out the implications once given the acronym expansion. If you're coming from WebSockets to WebRTC, there's a lot of new terminology to swallow all at once.

But you definitely helped clear up some of the rougher points for me. For example, I didn't realize that createOffer should be called after the negotiationneeded event fires. I had just been calling it immediately after creating my DataChannel (I'm doing text-only transmissions as part of a system that combines inputs from multiple devices to play a game). I had wondered what negotiationneeded was all about, but none of the articles I'd found said anything clear about it, and it wasn't in any of the example code I came across.
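The pattern described above can be sketched roughly like this. This is a hedged sketch, not code from the article: `setUpDataChannel` and `sendToPeer` are names I've made up, `pc` stands in for an RTCPeerConnection, and `sendToPeer` is a placeholder for whatever signaling transport (e.g. a WebSocket) the app already has.

```javascript
// Sketch: wire createOffer to the "negotiationneeded" event instead of
// calling it immediately after createDataChannel.
function setUpDataChannel(pc, sendToPeer) {
  // Creating a data channel (like adding a media track) is what
  // triggers "negotiationneeded" on the connection.
  const channel = pc.createDataChannel("game-input");

  pc.onnegotiationneeded = async () => {
    // Only now create the offer, set it locally, and ship it
    // to the remote peer over the signaling channel.
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    sendToPeer({ sdp: pc.localDescription });
  };

  return channel;
}
```

The point of deferring to the event is that the browser decides when renegotiation is actually needed, so the same handler keeps working if tracks or channels are added later.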

And I found all of this through MDN! The existing documentation for WebRTC there is apparently really bad. That's quite disappointing, because I've usually found MDN to be either excellent or merely incomplete, not actively poor.

Again, thanks for this.


Thanks! Yeah, the major motivation behind this was to fill in some gaps in the documentation. Unfortunately there's a bunch of incomplete and/or inaccurate info out there (maybe because the spec is still in flux).


Does anyone happen to know how one is expected to choose the output audio device? Seems like every WebRTC app just uses the default output which is completely the wrong thing to do when using a headset.


Why would any sane person bring SDP into WebRTC? They had a fresh slate to get things right, and instead they ended up making it worse than the SIP headache. SDP, a spec that suggests people might use it to set up chess games, is a terrible format. In fact, unless they changed the offer/answer model, there's actually no way to determine which single codec is in use. It's one of those things where every vendor sorta does something and relies on tons of interop testing and guessing to actually come up with an implementation. Plus it's yet another custom parser with a strange format, one deliberately clipped in verbosity (for a text format) because the SIP authors were terrified of IP fragmentation and actually spec'd a hardcoded MTU size.
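To make the complaint concrete: in the hypothetical audio offer below (my own example, not from the thread), the m= line lists three payload types, each mapped to a codec by an a=rtpmap line, and nothing in the offer itself pins down which single codec ends up in use. Even pulling the codec names out requires exactly the kind of one-off, line-oriented parser the comment is grumbling about.

```javascript
// A made-up fragment of a WebRTC-style audio offer.
const sdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "a=rtpmap:8 PCMA/8000",
].join("\r\n");

// Minimal custom parser: map each payload type to its codec name.
function codecsOffered(sdp) {
  const codecs = {};
  for (const line of sdp.split("\r\n")) {
    const m = line.match(/^a=rtpmap:(\d+) ([^/]+)/);
    if (m) codecs[m[1]] = m[2];
  }
  return codecs;
}
```

Here `codecsOffered(sdp)` yields three candidates (opus, PCMU, PCMA); the answer can only narrow that set, so what actually flows on the wire is still a matter of convention between implementations.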

Edit: Frustration aside, this is a really nice explanation, thanks.


Where is the MTU of SDP specified? I can't find it in RFC 4566.

Furthermore, the SDP configurations that Windows Media Services emits can include a multi-kilobyte base64-encoded ASF configuration element. So it's not like applications can't choose to ignore such a rule, or agree on a standard that drops such a limitation.

If I had to guess why they used SDP it's because 1) it's simple to parse and 2) WebRTC also uses RTP, and RTP libraries already typically include or are accompanied by SDP libraries.

I dislike standards that provide so much rope for people to hang themselves with. Multimedia standards are horrible rat's nests of options, sub-options, and sub-sub-options, the vast majority of which nobody uses or even parses correctly. And while SDP is simple, it aids and abets such behavior by being such a generic format.

Still, it's common and there are plenty of standards which define various SDP configurations.


The MTU is specified in the SAP protocol, which is even stranger: a mixed binary-and-ASCII system, rife with the IETF's poorly conceived SHOULD and MAY clauses, which serve no purpose other than to cause interop issues. Anyways, perhaps networks in 1999 didn't know how to handle IP fragmentation. SIP made the MTU issue worse, requiring a switch to TCP on-the-fly when a message exceeded the arbitrary MTU. I'm saying perhaps these considerations are why SDP has such strange formatting choices.

In the context of WebRTC, SDP provides no benefits. They might as well have used a JSON representation of the media negotiation, which would require no additional parser. Things are already ambiguous and poorly handled when using SDP in the real world. I think there must be something about "simple" permissive protocols that encourages implementation issues.
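To illustrate the JSON suggestion, here's a hedged sketch (my own invention, not any real spec or proposal) of how the same information an "m=audio ... 111 0 8" section carries could travel as plain JSON, readable with JSON.parse and no custom parser:

```javascript
// Hypothetical JSON encoding of an audio codec offer.
const offer = JSON.stringify({
  media: [{
    kind: "audio",
    codecs: [
      { payloadType: 111, name: "opus", clockRate: 48000, channels: 2 },
      { payloadType: 0, name: "PCMU", clockRate: 8000 },
      { payloadType: 8, name: "PCMA", clockRate: 8000 },
    ],
  }],
});

// The receiving side needs no line-oriented parsing at all:
const parsed = JSON.parse(offer);
const names = parsed.media[0].codecs.map((c) => c.name);
```

Whatever the merits of this particular shape, the structural point stands: every field is named and typed, so there's far less room for the "every vendor sorta does something" interop guessing.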


I found http://peerjs.com/ really, really helpful when attempting a WebRTC project. Like most people, I found the spec to be a lot of gibberish, and even the majority of examples are hard to follow. PeerJS "just works."



