RTCPeerConnection - WebRTC Explained

Written by OnSIP | April 29, 2014 at 5:02 PM

WebRTC offers unprecedented opportunities for developers who want to incorporate real-time communications into their apps. The WebRTC APIs - getUserMedia, RTCPeerConnection, and RTCDataChannel - each play a part in capturing, transmitting, and streaming real-time data (from a computer’s webcam and microphone) to another browser, without requiring a user to download plugins or add-ons. RTCPeerConnection is responsible for connecting two browsers so that they can share real-time media. It sounds straightforward, but the path that WebRTC packets take across the internet is difficult to navigate. This is where RTCPeerConnection and a reliable signaling platform come in.

Agnostic Signaling: The Premise of RTCPeerConnection

RTCPeerConnection coordinates the exchange of crucial metadata between two browsers. This data identifies each browser’s publicly reachable IP address and port number so that real-time media can be exchanged. For two WebRTC endpoints to begin talking to each other, three kinds of information must be relayed. Session control information determines when to initialize, close, and modify communications sessions. Network data relays the IP address and port number of each endpoint so that callers can find callees. Media data describes which codecs and media types the callers have in common.
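In practice, the network data usually travels as ICE candidate strings. As a rough illustration, the sketch below pulls the IP address and port out of one such string. The `parseCandidate` helper and the sample values are assumptions made for this example, not part of the WebRTC API.

```javascript
// Hypothetical helper: extract the transport address from an ICE candidate line.
// Field order: candidate:<foundation> <component> <transport> <priority>
//              <address> <port> typ <type> ...
function parseCandidate(line) {
  var fields = line.replace(/^a=/, '').trim().split(/\s+/);
  var looksValid = fields.length >= 8 &&
    fields[0].indexOf('candidate:') === 0 &&
    fields[6] === 'typ';
  if (!looksValid) {
    return null; // not a candidate line this sketch understands
  }
  return {
    transport: fields[2].toLowerCase(),
    address: fields[4],
    port: parseInt(fields[5], 10),
    type: fields[7] // e.g. "host", "srflx" (server reflexive), "relay"
  };
}

// Made-up sample; 203.0.113.0/24 is an address range reserved for documentation.
var sampleCandidate = 'candidate:842163049 1 udp 1677729535 203.0.113.7 46154 ' +
  'typ srflx raddr 10.0.0.5 rport 46154';
var parsedCandidate = parseCandidate(sampleCandidate);
console.log(parsedCandidate.address + ':' + parsedCandidate.port);
```

An application rarely needs to parse candidates itself; the browser handles them once they are relayed to the other side. The sketch only shows what kind of information is being passed along.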

In order for the connection to work, RTCPeerConnection must acquire local media conditions (resolution and codec capabilities, for instance) for metadata, and gather possible network addresses for the application's host. But the signaling mechanism that passes this data from one browser to the next is not built into the RTCPeerConnection API. WebRTC is signaling agnostic, which means that signaling is mandatory, but there are no defined signaling standards that all WebRTC applications are required to use. So there’s no way of sidestepping the signaling issue, but there are pre-built signaling platforms that handle the complex task of ushering WebRTC packets across the internet.
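To make the "signaling agnostic" idea concrete, here is a minimal sketch of a stand-in signaling channel that lives entirely in memory. Everything in it (`createSignalingPair`, the `kind` field, the sample messages) is hypothetical. A real application would replace it with a WebSocket, SIP, or any other transport that can carry a message from one browser to the other.

```javascript
// Hypothetical in-memory stand-in for a signaling transport.
// Whatever one endpoint sends is delivered to the other endpoint's onmessage.
function createSignalingPair() {
  function makeEndpoint() {
    var endpoint = {
      peer: null,
      onmessage: null,
      send: function (message) {
        // Serialize and parse, as a real network transport would force us to do
        var wire = JSON.stringify(message);
        var target = endpoint.peer;
        if (target && typeof target.onmessage === 'function') {
          target.onmessage(JSON.parse(wire));
        }
      }
    };
    return endpoint;
  }
  var a = makeEndpoint();
  var b = makeEndpoint();
  a.peer = b;
  b.peer = a;
  return [a, b];
}

// Relay one made-up message for each kind of information signaling carries.
var pair = createSignalingPair();
var received = [];
pair[1].onmessage = function (message) {
  received.push(message);
};
pair[0].send({ kind: 'session', action: 'invite' });
pair[0].send({ kind: 'network', address: '203.0.113.7', port: 46154 });
pair[0].send({ kind: 'media', codecs: ['opus', 'VP8'] });
console.log(received.length + ' messages relayed');
```

RTCPeerConnection does not care how these messages travel, only that they arrive. That is why the transport can be swapped without touching the rest of the WebRTC code.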

RTCPeerConnection Line by Line

The RTCPeerConnection API is built for JavaScript. The commands themselves are relatively intuitive and easy to learn for the average web developer, and Google’s documentation covers the various WebRTC commands. How does RTCPeerConnection work on a line-by-line level? Let’s take a look at some code.

After obtaining a media stream through the getUserMedia API, a WebRTC app’s next step is to transmit the captured real-time data to another browser. RTCPeerConnection essentially standardizes this process. After an RTCPeerConnection is created and a local stream (from getUserMedia) is added to it, an offer is created and extended to the other browser. This offer enumerates the potential codecs, encryption methods, and other initiating information available for a WebRTC session. The offer is written in the Session Description Protocol (SDP), which is also used by SIP. After this description is generated, it is sent to a potential peer via a signaling method:


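var callerPC = new RTCPeerConnection();
// Add every track of the local stream (addTrack replaces the deprecated addStream)
localStream.getTracks().forEach(function (track) {
  callerPC.addTrack(track, localStream);
});
// createOffer returns a promise in current browsers
callerPC.createOffer().then(gotSDP).catch(function (error) {
  console.error('Could not create the offer:', error);
});

function gotSDP(description) {
  return callerPC.setLocalDescription(description).then(function () {
    /* Now the description must be sent to the peer.
    This tells the peer how to connect. It could be anything,
    but here we'll name this method 'invite' */
    invite(description);
  });
}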
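The offer is plain SDP text, so the codecs it enumerates can be read straight out of its `a=rtpmap` lines. The sketch below is a simplified illustration under that assumption. The `listCodecs` and `commonCodecs` helpers and both sample descriptions are made up for this example, and real negotiation (which the browser performs for you) also compares clock rates and format parameters.

```javascript
// Hypothetical helper: list the codecs announced in an SDP description.
function listCodecs(sdp) {
  var codecs = [];
  var mediaType = null;
  sdp.split(/\r?\n/).forEach(function (line) {
    var media = /^m=(\w+) /.exec(line); // start of an audio/video section
    if (media) {
      mediaType = media[1];
      return;
    }
    var rtpmap = /^a=rtpmap:(\d+) ([^\/]+)\/(\d+)/.exec(line);
    if (rtpmap && mediaType) {
      codecs.push({
        media: mediaType,
        payloadType: Number(rtpmap[1]),
        name: rtpmap[2],
        clockRate: Number(rtpmap[3])
      });
    }
  });
  return codecs;
}

// Simplified comparison: match on media type and codec name only.
function commonCodecs(localSdp, remoteSdp) {
  var remoteKeys = listCodecs(remoteSdp).map(function (codec) {
    return codec.media + '/' + codec.name.toLowerCase();
  });
  return listCodecs(localSdp).filter(function (codec) {
    return remoteKeys.indexOf(codec.media + '/' + codec.name.toLowerCase()) !== -1;
  }).map(function (codec) {
    return codec.name;
  });
}

// Trimmed, made-up descriptions; real ones contain many more lines.
var sampleOffer = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111 0',
  'a=rtpmap:111 opus/48000/2',
  'a=rtpmap:0 PCMU/8000',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000'
].join('\r\n');
var sampleAnswer = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96 102',
  'a=rtpmap:96 VP8/90000',
  'a=rtpmap:102 H264/90000'
].join('\r\n');

console.log(commonCodecs(sampleOffer, sampleAnswer).join(', ')); // prints "opus, VP8"
```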

On the callee side, the description is received, again through an unspecified means:


var calleePC;

// This function would be called when receiving a remote offer
function processRemoteDescription(description) {
  // Callee creates PeerConnection
  calleePC = new RTCPeerConnection();
  // Register the handler first so no incoming track is missed
  calleePC.ontrack = function (event) {
    // Show the caller's stream (event.streams[0]) in a video/audio element
  };
  // Apply the offer, then create and store our answer
  calleePC.setRemoteDescription(description)
    .then(function () {
      return calleePC.createAnswer();
    })
    .then(function (localDescription) {
      return calleePC.setLocalDescription(localDescription).then(function () {
        /* We need to send our own SDP back to the caller.
        This method is left up to the application to create. Here,
        we name the function 'okay' */
        okay(localDescription);
      });
    })
    .catch(function (error) {
      console.error('Could not answer the offer:', error);
    });
}

Lastly, the caller needs to receive the answer and add the new remote media stream:


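function onOkay(description) {
  // Apply the callee's answer; setRemoteDescription returns a promise
  callerPC.setRemoteDescription(description).catch(function (error) {
    console.error('Could not apply the answer:', error);
  });
}
// The handler receives an event, not the stream itself
callerPC.ontrack = function (event) {
  // Show the callee's stream (event.streams[0]) in a video/audio element
};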
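The snippets above leave `invite` and `okay` up to the application. One possible approach, sketched here under stated assumptions, is to send each description as a small JSON message and route incoming messages by their `type`. The `encodeDescription`, `decodeDescription`, and `dispatch` helpers are hypothetical names, and the handlers are passed in as arguments so the sketch runs on its own.

```javascript
// Hypothetical wire format: a session description reduced to { type, sdp }.
function encodeDescription(description) {
  return JSON.stringify({ type: description.type, sdp: description.sdp });
}

function decodeDescription(wire) {
  var message = JSON.parse(wire);
  var validType = message.type === 'offer' || message.type === 'answer';
  if (!validType || typeof message.sdp !== 'string') {
    throw new Error('Not a session description');
  }
  return message; // plain { type, sdp } object
}

// Hypothetical dispatcher: offers go to the callee logic, answers to the caller.
function dispatch(wire, handlers) {
  var description = decodeDescription(wire);
  if (description.type === 'offer') {
    handlers.processRemoteDescription(description);
  } else {
    handlers.onOkay(description);
  }
}

// Usage with stub handlers and placeholder SDP text.
var handled = [];
var stubHandlers = {
  processRemoteDescription: function (description) {
    handled.push('callee got ' + description.type);
  },
  onOkay: function (description) {
    handled.push('caller got ' + description.type);
  }
};
dispatch(encodeDescription({ type: 'offer', sdp: 'v=0\r\n' }), stubHandlers);
dispatch(encodeDescription({ type: 'answer', sdp: 'v=0\r\n' }), stubHandlers);
console.log(handled.join('; '));
```

In current browsers, setRemoteDescription accepts a plain object with `type` and `sdp` fields, so the decoded message can be handed to it directly.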

OnSIP for Developers: Powering RTCPeerConnection

RTCPeerConnection’s signaling-agnostic design gives developers a wide array of relay options when creating a WebRTC-based app. But the task of actually building your own signaling architecture, complete with a complex server-side solution composed of expensive equipment that is costly to set up, is perhaps one of the primary factors discouraging developers from incorporating WebRTC into their apps. Relying on a proprietary platform is also undesirable for developers who want flexibility and control over their applications.

OnSIP defuses these concerns by offering developers a simple solution to the surprisingly involved process of getting two browsers to exchange metadata. OnSIP has a mature, reliable SIP platform that has fully incorporated WebRTC into its core architecture. Developers can harness our pre-built network of geographically distributed SIP proxies to initiate WebRTC connections behind NATs and firewalls, bridge compatibility gaps between browsers, expand WebRTC apps on a massive scale, and track communications for information and accuracy. If RTCPeerConnection is the vehicle that allows browsers to stream real-time data without plugins, then OnSIP's platform is simply the fuel that allows this machine to run.