One persistent source of confusion about WebRTC is the claim that it is real-time communications for browsers. I have heard this repeatedly in discussions and seen it in public comments. While browsers are the first place the WebRTC standard is being implemented, nothing in the standard limits WebRTC to particular locations or devices. The standard is written to provide two basic defined functions: a standard set of APIs that a server can use to manage endpoint behavior, and a set of protocols that enable peers to talk. There is no requirement that a peer be a browser, only that it conform to a very basic set of RTP and API standards.
From early on, the standards took a path that did not preclude peer endpoints other than browsers from being part of a WebRTC solution. This was obviously required for things like conference bridging or centralized recording. In this way, a peer can be any endpoint that can use WebRTC. However, there are no standards other than the W3C APIs for how to control that peer endpoint. When more than a "few" endpoints collaborate using WebRTC, a media server must manage the multiple video feeds, either through an MCU-type mixer or through video switching/routing. In either case, that central video media server acts as a peer to the other endpoints, receiving and returning video streams, which reduces the bandwidth and processing impact of having too many peer streams. The result is that in this type of video conference, each peer-to-peer connection has a browser at one end and a media server at the other. Uberconference has implemented precisely this architecture by using WebRTC to integrate browser users into its audio conferencing solution.
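To make the bandwidth argument concrete, here is a minimal sketch that counts video streams per endpoint under three arrangements: every browser connected directly to every other, a mixing media server, and a switching/routing media server. It assumes each participant sends one video feed and watches everyone else; the function name and the labels "mesh", "mixer", and "switch" are illustrative choices, not terms taken from the WebRTC standards.

```python
def stream_counts(participants: int, topology: str) -> dict:
    """Count video streams per endpoint and peer connections overall.

    Assumes every participant sends one feed and views all the others.
    The topology labels are hypothetical names used only for this sketch.
    """
    if participants < 2:
        raise ValueError("a conference needs at least two participants")
    n = participants

    if topology == "mesh":
        # Every endpoint connects directly to every other endpoint.
        uplink, downlink = n - 1, n - 1
        connections = n * (n - 1) // 2
    elif topology == "mixer":
        # MCU-type server: one stream up, one composited stream back.
        uplink, downlink = 1, 1
        connections = n
    elif topology == "switch":
        # Switching/routing server: one stream up, the others forwarded down.
        uplink, downlink = 1, n - 1
        connections = n
    else:
        raise ValueError(f"unknown topology: {topology!r}")

    return {"uplink": uplink, "downlink": downlink, "connections": connections}


if __name__ == "__main__":
    for size in (3, 6, 12):
        for name in ("mesh", "mixer", "switch"):
            print(size, name, stream_counts(size, name))
```

In both server arrangements each endpoint holds a single peer connection, and the far end of that connection is the media server rather than another browser.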
Vonage announced this week that it has used the lower-level WebRTC native stack to implement WebRTC directly in a mobile client. By doing this and using common codecs, that mobile device can now communicate seamlessly with the Vonage backend system. This allows WebRTC to be used on some devices over common RTP, while other parts of the system use other APIs and signaling. The native code base, sometimes referred to as the Google Media Engine, was open sourced by Google in 2011 and is the basis of its WebRTC implementation. I think we should expect to see many other traditional communications players adopt this media engine to implement innovative WebRTC solutions that are not browser based.
It is interesting to reflect on the use cases for WebRTC and how many of them may not have a traditional browser at both ends. For example, a camera could include the WebRTC API, protocols, and codec and therefore be controllable by a server using the WebRTC APIs and simple JavaScript. It would be simple to have a browser "subscribe" to a video feed from the camera using WebRTC. This could have many uses in surveillance and other environments. While the browser user would have an HTML(5) interface, the camera might have only the basic APIs for control of the media and peer connections. Similarly, a device that is only a sensor could be controlled using WebRTC and send its data through the data channel. In the end, there may be many more non-browser devices using WebRTC and adding value.
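As a rough illustration of the sensor case, the sketch below packs a reading into a JSON text message of the kind a device could hand to a data channel. WebRTC does not define a payload format for sensors, so the field names and the functions `encode_reading`, `decode_reading`, and `publish` are all assumptions made for this example, and the `send` callable is only a stand-in for whatever data channel send method a real device stack provides.

```python
import json
import time
from typing import Callable, Optional

# Hypothetical message fields; not part of any WebRTC specification.
REQUIRED_FIELDS = {"sensor_id", "value", "unit", "timestamp"}


def encode_reading(sensor_id: str, value: float, unit: str,
                   timestamp: Optional[float] = None) -> str:
    """Serialize one sensor reading as a compact JSON string."""
    reading = {
        "sensor_id": sensor_id,
        "value": value,
        "unit": unit,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    return json.dumps(reading, separators=(",", ":"))


def decode_reading(message: str) -> dict:
    """Parse a message on the receiving peer and check its fields."""
    reading = json.loads(message)
    missing = REQUIRED_FIELDS - reading.keys()
    if missing:
        raise ValueError(f"reading is missing fields: {sorted(missing)}")
    return reading


def publish(send: Callable[[str], None], sensor_id: str, value: float,
            unit: str, timestamp: Optional[float] = None) -> None:
    """Encode a reading and pass it to the supplied send function."""
    send(encode_reading(sensor_id, value, unit, timestamp))


if __name__ == "__main__":
    outbox = []  # stands in for the data channel in this sketch
    publish(outbox.append, "door-7", 1.0, "open")
    print(decode_reading(outbox[0]))
```

The receiving peer, whether a browser or a server, would call something like `decode_reading` on each incoming message; the device itself needs no user interface at all.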
Edited by Rich Steeves