WebRTC Expert Feature

August 07, 2013

The Poverty of Hollywood (Squares)


You are supposed to know you have “real” video conferencing when everyone turns up on your screen in their own little “Hollywood Squares” box. Each box gets smaller and smaller as more people arrive, and the boxes become really hard to see once people start sharing content as well. This is called “continuous presence” and is intended to be a great advance over “active speaker” video, where you see only one speaker at a time and the view switches according to who is speaking.

For smaller conferences with fewer people, continuous presence works and feels just fine. However, in larger conferences (above about six people in my experience) I get some benefit from the “introductions” as people arrive, when we can see that everyone is lookin’ good and we all say hi, but I don’t feel I am getting much productivity benefit from all the little squares, especially once the content sharing kicks in. Now, some video clients give each user control of the screen layout (I am familiar with Avaya/Radvision Scopia, which does this well, and I am sure others do too), which allows you to set up whom you want to see the most and where on the screen. But in larger conferences do I really need to see everyone else, who is picking their nose, who is asleep, and those who have failed to comprehend the following tips?

On the server end, continuous presence is far and away the most expensive way to handle the session, since it demands an MCU (hardware, software or cloud) to mix all the user streams together and produce the full continuous presence effect on all endpoints. But why is this a good idea? I know it’s great for all the vendors selling or renting MCU resources, but why pay for this if you are not getting a significant additional productivity benefit in large meetings? We have started to see some pushback against the Hollywood model as vendors have introduced scalable video coding (SVC) and “video routers” where the objective is to dramatically increase scale and lower cost by routing and “slicing” video streams without ever having to decode, mix, and re-encode them. These vendors’ SVC interfaces have often started with an active speaker model or a limited number of active speakers. However, the feeling still seems to be there in the industry that it would be “better” if one could somehow get back closer to full continuous presence.



WebRTC now re-asks the same question: are “real” WebRTC-based video conferences always going to need centralized cloud MCU resources to mix all the streams, just like traditional conferencing, or should we just rely on how much we can do by meshing clients together? Once again, this is not a big issue for small conferences (four people on the screen looks fine and can be meshed with just a bit of complexity and bandwidth), but what about larger conferences?
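To put rough numbers on the mesh-versus-MCU question, here is a back-of-the-envelope sketch. The names `perClientLoad` and `meshConnections` are hypothetical helpers, not part of any WebRTC API. It assumes a full mesh where every client sends one video stream to every other client, and an MCU that returns a single mixed stream. It ignores audio, content sharing and any bandwidth adaptation.

```typescript
// Hypothetical helpers for illustration only; not part of any WebRTC API.
type Topology = "mesh" | "mcu";

interface StreamLoad {
  upstreams: number;   // video streams each client sends
  downstreams: number; // video streams each client receives
}

// Rough per-client video stream counts for an n-party conference.
function perClientLoad(participants: number, topology: Topology): StreamLoad {
  if (!Number.isInteger(participants) || participants < 2) {
    throw new RangeError("a conference needs at least two participants");
  }
  if (topology === "mesh") {
    // Full mesh: one stream out to, and one stream in from, every other peer.
    const peers = participants - 1;
    return { upstreams: peers, downstreams: peers };
  }
  // MCU: one stream up to the mixer, one mixed stream back down.
  return { upstreams: 1, downstreams: 1 };
}

// Total number of peer-to-peer connections in a full mesh.
function meshConnections(participants: number): number {
  return (participants * (participants - 1)) / 2;
}
```

Under these assumptions a four-person mesh asks each client for three streams up and three down over six connections in total, while a ten-person mesh asks for nine each way over forty-five connections. That is the arithmetic behind "fine for small conferences, but what about larger ones?"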

I want to suggest that there is a poverty of video conferencing design patterns for more productive large conference interfaces that lie between the extremes of single active speaker and full continuous presence. We as an industry just don’t seem to have thought about it enough (although I look forward to comments from those saying they have already solved the problem). So I think the WebRTC “innovation space” of millions of Web developers experimenting with new approaches to communications is an excellent laboratory for experimenting with new user interface models for productive collaboration!

To throw some random starter ideas out there (untested, so take with enough salt): how many visible speakers do many large conferences really need? Let’s assume we can see three people. That would allow us to highlight the conference owner (maybe the “boss”) and keep her/him in constant view so we can see the scowls, and then show the two most recent active speakers, who don’t have to flip during dialogs (and the boss never displaces either of them). That leaves only three concurrent streams, two of them changing, to manage and distribute along with some speaker identification logic across all the clients, while leaving plenty of screen real estate for all the content sharing that everyone needs to be looking at. How about making it easy for the conference owner to pin someone into a fourth, or back on the third, stream view so that the interviewee, or executive reviewer, or honored guest stays visible to all during the session even when not talking?

Still want to see the nose picking or gauge audience reactions? How about a not-too-obtrusive thumbnail that automatically rotates through everyone on the call? That will still keep people on their toes! If more reaction is needed, then why not have more of the Web conferencing style visual feedback options and just show this somewhere in the interface and roster? That is more effective than video for sentiment accumulation.

Introductions? How about the owner gets an “intro” button that automatically cycles through everyone on the call, and each person gets a “red light” and 10 seconds to say hello (the new video version of 140 characters, and if you don’t talk you lose your moment of fame)? And the interface could cycle through people arriving before the owner starts, to provide some benefit for getting there early.
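To make the "owner plus two most recent speakers" idea a little more concrete, here is a minimal sketch of the selection logic only. Everything in it is an assumption for illustration: the `ManagedStreams` class, its method names and the `Layout` shape are hypothetical, and it takes for granted that something else (audio level detection, a server, whatever) reports who the active speaker is. It keeps the owner always visible, leaves two people in a back-and-forth dialog in their slots, replaces the least recently heard speaker when a third person talks, and lets the owner pin a guest into an extra view.

```typescript
type ParticipantId = string;

interface Layout {
  owner: ParticipantId;                 // always visible
  speakers: (ParticipantId | null)[];   // fixed slot positions; null = empty
  pinned: ParticipantId | null;         // optional extra view set by the owner
}

// Hypothetical "managed streams" selector; not part of any WebRTC API.
class ManagedStreams {
  private owner: ParticipantId;
  private slots: (ParticipantId | null)[];
  private lastSpoke: number[];          // logical time each slot was last heard
  private pinned: ParticipantId | null = null;
  private clock = 0;

  constructor(owner: ParticipantId, speakerSlots: number = 2) {
    if (!Number.isInteger(speakerSlots) || speakerSlots < 1) {
      throw new RangeError("need at least one speaker slot");
    }
    this.owner = owner;
    this.slots = new Array<ParticipantId | null>(speakerSlots).fill(null);
    this.lastSpoke = new Array<number>(speakerSlots).fill(0);
  }

  // Call whenever the (assumed) speaker detection reports a new active speaker.
  onActiveSpeaker(id: ParticipantId): void {
    // The owner and the pinned guest are already on screen; nothing moves.
    if (id === this.owner || id === this.pinned) return;
    this.clock += 1;
    let slot = this.slots.indexOf(id);                 // already visible: stay put
    if (slot === -1) slot = this.slots.indexOf(null);  // otherwise use an empty slot
    if (slot === -1) {
      // Otherwise replace whoever has been silent the longest.
      slot = this.lastSpoke.indexOf(Math.min(...this.lastSpoke));
    }
    this.slots[slot] = id;
    this.lastSpoke[slot] = this.clock;
  }

  // Owner pins a guest (or passes null to unpin). A pinned guest leaves the speaker slots.
  pin(id: ParticipantId | null): void {
    this.pinned = id === this.owner ? null : id;
    const slot = id === null ? -1 : this.slots.indexOf(id);
    if (slot !== -1) {
      this.slots[slot] = null;
      this.lastSpoke[slot] = 0;
    }
  }

  layout(): Layout {
    return { owner: this.owner, speakers: [...this.slots], pinned: this.pinned };
  }
}
```

For example, with owner "boss", if "ann" and "bob" trade remarks several times they stay in their slots the whole time. When "cy" then speaks, "cy" takes over the slot of whichever of the two spoke least recently. The rotating thumbnail and the "intro" button would sit on top of the same roster, but they are left out here to keep the sketch short.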

Now I believe that some existing MCU-based video systems may have a few of these underlying capabilities, but I am not aware of any that put a simple well-thought-through set of “managed streams” collaboration models and controls directly in the hands of ordinary mortals driving conferences. I am not saying that any of my ideas are particularly good, and they are certainly not user-tested, but I believe even this short thought-experiment reveals a large unexplored terrain of possibly better collaborative models for combining people, video, content and feedback to truly make larger conference sessions more productive. And I think the HTML5 and WebRTC developer space has the user interface skills and application innovation to properly explore this terrain and bring back the gold!


