WebRTC Expert Feature

June 16, 2021

Data Annotation: Why We Can't Take the Human Out of the Loop


So much of machine learning appears, from the outside, to be grounded in quantitative datasets and code that those on the periphery could be forgiven for assuming it represents a wholly technical world of numbers and objective fact. While this holds true to a certain extent – we cannot, after all, communicate with and educate machines without speaking their language – it is not the full picture.

The term ‘human in the loop’ (HITL) refers to an integral human element in many of the processes involved in creating an artificial intelligence. The extent to which human feedback is required depends entirely on the project itself and, of course, on which stage developers have reached in training their AI. But, whatever the capacity in which it is used, the ongoing value of positioning a human within the data loop is clear to see.

This holds particularly true in the realm of data annotation for the purpose of creating a vision AI capable of perceiving, interpreting, and extracting meaning from image and video.

Automation, and the Role of Auditing

Data annotation is a vital aspect of machine learning. Carefully curated datasets represent the teaching tools for any vision AI as, in sufficient quantities, they create, refine, and augment a machine’s ability to ‘perceive’ the world through visual data.    

This visual data must, of course, be converted from raw images and video into comprehensible, labelled datasets – a little like translating text into a language understood by the receiver. Compiling labelled data on the scale required to create a powerful and accurate vision AI is no mean feat – enterprises require massive quantities of data, or their efforts will likely be wasted.

This is where a well-measured combination of machine and human intelligence becomes so valuable. Machines, while capable of sorting through far greater quantities of data than humans, require auditing from humans in order to achieve the levels of accuracy to which all developers aspire. The same holds true whether we consider AI for medical diagnostics, production chains, autonomous cars, content algorithms, or any other system that rests on advanced machine learning.
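One common way this combination plays out is confidence-based routing: the model pre-labels each item, and anything it is unsure about is queued for a human auditor. The sketch below is purely illustrative – the function names, data shapes, and the 0.9 threshold are assumptions, not a specific product's pipeline – but it captures the division of labour described above.

```python
CONFIDENCE_THRESHOLD = 0.9  # assumed cutoff; tuned per project in practice


def route_annotations(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Split model pre-labels into accepted labels and a human review queue.

    `predictions` is a list of (item, label, confidence) triples, such as
    a vision model's inference pass might produce. Confident machine
    labels are accepted outright; the rest go to a human annotator.
    """
    accepted, needs_human = [], []
    for item, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((item, label))
        else:
            needs_human.append(item)
    return accepted, needs_human


# Example: one confident prediction is accepted, one uncertain
# prediction is routed to the human review queue.
preds = [("img1.jpg", "cat", 0.97), ("img2.jpg", "dog", 0.55)]
accepted, queue = route_annotations(preds)
# accepted -> [("img1.jpg", "cat")]; queue -> ["img2.jpg"]
```

The machine handles the bulk of the volume, while the human effort is concentrated exactly where the model is least reliable.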

An Adaptable Role

While the significance of the human in the loop is clear to see, the role itself is a flexible one. As the machine learning pipeline grows, develops, and yields increasingly accurate results – and the datasets continue to grow in size and number – the vision AI’s capabilities, and the role of human intelligence, both begin to change.

While human intelligence played a more central role in the early days of collecting datasets, successful development means that the human can recede to the periphery and label only those issues that occur at extreme operating thresholds, for instance. In other words, machine intelligence improves with use – and, with that improvement, the human element can take more of a backseat.
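That receding role can be made concrete as a review rate that shrinks as audited accuracy improves. The rule below is an assumption for illustration only – real projects would choose their own policy and numbers – but it shows the idea of the human stepping back without ever leaving the loop entirely.

```python
def review_fraction(audit_accuracy, floor=0.02, ceiling=1.0):
    """Map measured model accuracy on audited samples to a human review rate.

    A model that agrees with human auditors 99% of the time needs far less
    oversight than one that agrees 70% of the time; the small floor keeps
    a human in the loop even for a mature model. Values are illustrative.
    """
    rate = 1.0 - audit_accuracy
    return min(ceiling, max(floor, rate))
```

Early on, a model at 70% audited accuracy would send roughly 30% of items for review; once it reaches 99%, the floor keeps only a residual 2% under human eyes.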

There are many reasons why the human in the loop remains essential to AI development, from detecting biases and preventing them from taking hold, to ensuring that the most specialist knowledge is applied to training what will, in the future, prove to be an equally specialized AI. Thus, from relatively simple processes to incredibly complex technologies and services, the human remains central to any machine intelligence.


