Executive Summary
The AI lifecycle moves from proof of concept to proof of scale, then to model in production. This is a cyclical process, and people play an important role. Humans in the loop (HITL) inspect, validate, and make changes to algorithms to improve outcomes. They also collect, label, and conduct quality control (QC) on data. The benefits of humans in the loop begin with model development and extend across the AI lifecycle, from proof of concept to model in production.
- Case study 1: A managed team builds a dataset of images to accelerate model development.
- Case study 2: Workforce teams label and check the quality of data annotations that previously required farming expertise.
- Case study 3: Data analyst teams resolve exceptions on sports player analytics.
Poor utilization of people in the AI lifecycle is costly and yields low-quality data and poor model performance. It is best to consider the workforce early in the AI lifecycle, before model development begins, to achieve higher-quality outcomes.
Read the full guide below, or download a PDF version of the guide you can reference later.
The AI Lifecycle Challenge
Building any solution that applies artificial intelligence (AI) is a perilous journey. The development team is one among a sea of protagonists, most of whom will fail in their quest to use machine learning to solve a problem, launch a product, or transform an industry.
They will encounter many challenges: data scarcity, dirty data, workforce burdens, algorithm and quality control (QC) failures, and more. At every stage of the AI lifecycle, they will make decisions about their data and their workforce that will determine success or failure. Many of those choices will dictate their ability to quickly make changes in their process that could improve model performance.
Innovators have applied external workforces and technology to tackle these challenges and accelerate the AI lifecycle. Some have attempted to speed data processing at scale with crowdsourcing or automation, both of which require in-house staff and ongoing management. Others have used traditional business process outsourcing (BPO) providers. Within the last decade, managed outsourced teams have emerged.
No matter the approach, how development teams deploy people throughout the AI lifecycle is an important decision with far-reaching consequences. Knowing when and where to do so is key to reducing the number of failed AI projects and the time that is wasted on them.
In this guide, we share what we have learned from annotating data for hundreds of AI projects, from computer vision to natural language processing (NLP). To learn about how to design a human-in-the-loop workforce to build and maintain high-performing machine learning models, read on.
If you would like to speak with a workforce expert now, get in touch.
Human-in-the-Loop Machine Learning (HITL)
The term “human in the loop” originated to describe any model that requires human interaction. Where the human-in-the-loop (HITL) approach is used in real time and models are continually trained and calibrated, people are essential to achieving desired results. Examples include chatbots and recommendation engines.
Similarly, human-in-the-loop machine learning refers to the stages of the model development process that require a person to inspect, validate, or change some part of the process to train and deploy a model into production. An example is when a data engineer verifies a machine learning model's predictions before moving it to the next stage of development.
HITL also can extend to the people who prepare and structure data for machine learning.
In fact, human-in-the-loop machine learning includes everyone who works with the data a model will use, including the people who collect, label, and conduct quality control (QC) on that data.
Humans in the Loop: Data Preparation and Quality
There is a lot of thankless work that goes into the creation of a dataset, whether it will be used to train a model or to maintain one in production. Data drives the model development process, and working with data requires people.
Indeed, people are involved across the AI lifecycle, from exploratory data analysis to machine learning production. About 80% of total project time is consumed with data preparation, labeling, and processing, and the majority of that work is done by people, according to analyst firm Cognilytica.
Quality is an important factor because a machine learning model is only as good as the data that trains and retrains it. Quality data is the key to creating high-performance AI. Traditionally, scaling a people operation for this kind of work was difficult, so many AI project teams turned to crowdsourcing. This approach gave them access to a large number of workers, which provided one element of the required scale.
But data quality often suffered because workers were anonymous and changed over time. The resulting lack of accountability, and workers’ unreliable communication with the AI development team, stalled development and became costly, burdening valuable internal resources and causing data annotation rework, low-quality data, and poor model performance.
Poor utilization of people across the AI lifecycle, starting with the model development process, can lead to poor-quality data, higher costs, and model failure.
The majority (96%) of AI project failures occur midway through the model development process, due to poor data quality, labeling, or modeling, according to Dimensional Research.
Today, many organizations understand how people are involved in data labeling. Yet, many haven’t yet leveraged the benefits of human-in-the-loop machine learning across the AI lifecycle.
Deploying people strategically at other stages in the model development process, from testing to model maintenance and optimization, can improve models and their outcomes. Specifically, a trained and managed workforce can speed the time to market and reduce the burden of data work on internal resources.
The benefits of humans in the loop extend across the AI model development process:
- Continuous improvement of machine learning and deep learning models requires people for data labeling, quality control (QC), and modeling.
- Automation can speed the development process, but it requires people to build and maintain it and to monitor for exceptions.
- People design the processes that integrate an AI solution into an organization.
- When AI models fail, people must step in to mitigate risk and resolve problems.
Accelerating Model Development with Human in the Loop
In the model development process, there are seven key areas where a human in the loop can make a significant impact. These areas are represented by the blue boxes in the image. In this context, human in the loop refers specifically to a data workforce, acknowledging the enormous amount of work performed by data scientists and engineers beyond the scope of this simplified process.
Many teams outsource data gathering, cleaning, and annotation so they can assign data scientists and other high-cost talent to projects that require more advanced skills. However, people are helpful far beyond these stages in the process. Teams can strategically deploy a managed workforce at every phase of model development to scale and accelerate the process.
Automation is helpful, and the optimal approach blends it with people. When applying automation, it is important to weigh how much time it will save against what it will cost in terms of maintaining the tools, managing the people who use them, and monitoring for exceptions. Let’s take a closer look at how people are involved, and where they can be more strategically applied to accelerate the process.
HITL AND THE AI MODEL DEVELOPMENT PROCESS
Design and Build
Vast amounts of quality data are required for peak algorithm performance. In fact, 80% of AI project time is spent in this phase of the model development process, including data acquisition, cleaning, and annotation. A human-in-the-loop workforce can complete much of the day-to-day data work required during these early, critical stages of development.
Whether labels are applied by manual or automated means, a human-in-the-loop workforce can acquire and prepare data for annotation. Data cleansing and enrichment in particular involve subjectivity and edge cases, so nuanced human judgment is needed to complete this work accurately.
A human workforce plays a critical role in the acquisition and cleansing process, especially in cases where web scraping or data enrichment is involved, to ensure that only the most relevant, complete data is applied to the project.
A company using computer vision to build analytics solutions needed to acquire a dataset of images of objects being held by human hands.
CASE STUDY 1:
Data Acquisition and Cleansing
A company uses computer vision to build analytics solutions that transform video into reliable information for their clients. They needed to acquire a dataset of images to train their algorithm to identify different objects being held by human hands. Given the unique requirements for the data needed, the company turned to CloudFactory’s managed team to build a dataset of hundreds of thousands of images to their exact specifications. This helped them minimize the strain on their in-house development team and accelerate model development.
Deploy and Operationalize
This next phase of the model development process is where AI projects are put to the test and where most failures occur. Given that 80% of project time is spent in the first phase, it is frustrating that 96% of failures don’t occur until teams reach this point in the process. Innovators are turning to a human-in-the-loop workforce to avoid the pitfalls of poor data quality and the unreliable models that result.
Manual data labeling is the most readily accepted use for a human-in-the-loop workforce, but an HITL workforce also can conduct QC on data that has been auto-labeled. Auto-labeling is a feature of data annotation tools that uses AI to pre-label a dataset. These tools can augment the work of humans in the loop to save time and lower costs.
To achieve the highest quality results, auto-labeling should be paired with an HITL workforce to ensure the automation is performing as anticipated. By the same token, during this phase HITL workforces can be deployed to validate the outputs of the model itself, determining model accuracy and identifying areas for improvement.
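For illustration, here is a minimal sketch of one common way to pair auto-labeling with an HITL workforce: accept pre-labels above a confidence threshold and route everything else to human reviewers. The class, field, and function names and the 0.9 threshold are hypothetical; the right cutoff depends on the annotation tool and the project’s quality targets.

```python
# Minimal sketch (hypothetical names): route auto-labeled items to human QC
# when the pre-labeling model's confidence is below a project-defined threshold.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AutoLabel:
    item_id: str
    label: str
    confidence: float  # 0.0-1.0, as reported by the auto-labeling tool

def split_for_review(labels: List[AutoLabel],
                     threshold: float = 0.9) -> Tuple[List[AutoLabel], List[AutoLabel]]:
    """Accept high-confidence pre-labels; queue the rest for human review."""
    accepted = [l for l in labels if l.confidence >= threshold]
    needs_review = [l for l in labels if l.confidence < threshold]
    return accepted, needs_review

batch = [
    AutoLabel("img_001", "hand_holding_cup", 0.97),
    AutoLabel("img_002", "hand_holding_phone", 0.62),
]
accepted, needs_review = split_for_review(batch)
print(f"{len(accepted)} auto-accepted, {len(needs_review)} routed to human QC")
```

The same pattern works in reverse for model validation: sample a slice of the model’s production predictions, have the HITL team label them independently, and compare the two to estimate accuracy.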
An AgTech company uses imagery from drones and satellites to provide farmers with crop analytics to help them increase their yields.
CASE STUDY 2:
Data Annotation and Quality Control
Hummingbird Technologies is an agriculture technology (AgTech) company that uses imagery from drones and satellites to provide farmers with crop analytics that can help them increase their yields. They originally relied on agronomists, who are experts in soil and crop management, and their own data scientists to perform manual annotations on their datasets. Using an HITL workforce, Hummingbird outsourced its image annotation work and annotation QC to a dedicated CloudFactory team, allowing their own team to focus on accelerating and supporting new model development and improvements.
Refine and Optimize
In the final phase of the model development process, teams need to start thinking about what is needed for their model in production. They are monitoring to detect and resolve model drift and ensure high performance of the AI system. They also are thinking about how to handle automation exceptions.
They may need to gather, annotate, and train the model again to respond to changes in the real-world environment. Humans in the loop can support all of these important steps, monitoring outcomes, helping to maintain a healthy data pipeline, and identifying and resolving exceptions.
A SportsTech company uses machine learning to provide video analysis and coaching tools to sports teams.
CASE STUDY 3:
Validation, Exceptions, and Optimization
A sports technology (SportsTech) company uses machine learning to provide video analysis and coaching tools to sports teams. Player tracking algorithms identify key moments in the games for coaches and athletes. Outsourcing vital tasks allowed the development team more time to train their model and optimize performance, so they could focus on improving user experience.
CloudFactory’s team validated the model in production by ensuring player tracking labeling was accurate. They also handled exceptions by adding manual labels when the model was not able to do it due to, for example, poor video quality or players standing too close to one another. Data analysts also corrected labels with poor player tracking. All of this led to a better experience for users and a reduction in burdensome work for the model development team.
The AI Lifecycle and Human in the Loop
Now that we have outlined where a human-in-the-loop workforce can play an important role in the model development process, let’s look at the unique needs presented at each stage of the AI lifecycle. Each stage builds on a subset of the key areas we have covered, in a cyclical manner.
Proof of Concept
Establishing proof of concept requires an understanding of the problem to be solved and the data required to use machine learning to address it. This typically includes exploratory data analysis, model development, and experimentation by data scientists or machine learning researchers.
In this stage, an HITL workforce should be considered as an option for data acquisition, cleansing, and annotation needs. Some development teams may also lean on an HITL workforce for light QC and model validation.
In the proof of concept stage of the AI lifecycle, people are needed for data acquisition, cleansing, annotation, and sometimes, annotation QC.
In supervised machine learning, labeled training data is used to create a proof-of-concept model. It is likely the model will require improvements before it is deployed to a production system. For example, the development team may discover that a new class of objects is needed in a labeled image dataset to improve model performance or outcomes.
Automation can be helpful in this phase for data collection and labeling. Web scraping tools can be used, for example, to collect data. Auto-labeling can augment the work of humans in the loop to save time and money. However, auto-labeling has limitations, and it’s important to use people to review those outputs.
Keep in mind, machine learning is experimental. AI project teams may try out various features, algorithms, modeling techniques, and parameter configurations to find what works best. The ideal workforce for proof of concept can stay in communication with developers, understands the changing parameters of the project, and is highly adaptable to changes and iterations in process.
CHALLENGES
Challenges teams are likely to encounter at this phase include:
- Data acquisition or gathering
- Data cleansing, annotation, labeling, or augmentation
- Tracking experiments to learn what worked and what didn’t work
- Reproducing what worked
- Reusing code to minimize the need to write new code every time they run an experiment
- Determining what level of data quality is sufficient to achieve desired results
Testing a machine learning system is more complex than typical software development. Testing must move through stages of data validation and model quality evaluation. This, like all phases of the model development process and the AI lifecycle, is iterative and cyclical in nature. As the model is refined, evolved, and improved, there is learning through the process. A human-in-the-loop workforce must be able to adapt to those changes.
Proof of Scale
Now is the time when AI project teams consider how their model will operate in a production environment, where it consumes unlabeled and unstructured data in real time. Here lies a big hurdle: proving that the model can and will scale, and that it can do so profitably.
In the proof of scale stage of the AI lifecycle, people are needed to perform the earlier tasks, plus model validation and sometimes, automation exception processing.
Here, it’s important to deploy humans in the loop strategically, particularly in handling edge cases and exceptions. In addition to the tools mentioned in the previous stage, there will be opportunities to apply automation to scale and operationalize QC.
Furthermore, machine learning requires a multi-step data pipeline to retrain and deploy models. The sheer complexity of that pipeline will require automation of the steps that were manually completed in the experimentation, training, and testing stages. That automation will need to be monitored and maintained by people.
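As a rough illustration of what that automation might look like, the sketch below chains a few placeholder pipeline steps and escalates any failure to a person instead of continuing silently. Every step, name, and threshold here is hypothetical; a real pipeline would use the team’s own steps and, typically, an orchestration tool.

```python
# Minimal sketch (all names and steps are hypothetical placeholders) of a
# retraining pipeline whose stages were once manual. Each step is automated,
# but failures are surfaced to people rather than papered over.
from typing import Any, Callable, List, Tuple

def acquire(_: Any) -> Any:
    return ["raw record", "", "raw record"]          # placeholder: pull new production data

def clean(data: Any) -> Any:
    return [r for r in data if r]                    # placeholder: drop empty records

def label(data: Any) -> Any:
    return [(r, "label") for r in data]              # placeholder: auto-label plus human QC

def train_and_evaluate(data: Any) -> Any:
    return {"model": "candidate", "accuracy": 0.93}  # placeholder: training job + metrics

PIPELINE: List[Tuple[str, Callable[[Any], Any]]] = [
    ("acquire", acquire),
    ("clean", clean),
    ("label", label),
    ("train_and_evaluate", train_and_evaluate),
]

def run_pipeline() -> Any:
    artifact: Any = None
    for name, step in PIPELINE:
        try:
            artifact = step(artifact)
        except Exception as exc:
            # Automation failed: hand the exception to a person instead of guessing.
            print(f"Step '{name}' needs human attention: {exc}")
            raise
    return artifact

result = run_pipeline()
if result["accuracy"] < 0.90:  # project-specific quality bar
    print("Below the quality bar: route to human review before deploying")
else:
    print("Quality bar met: candidate model ready for deployment review")
```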
CHALLENGES
Likely challenges at this phase include those from the previous phase and a few new ones, including:
- Decisions about how the model will fit into existing operations and IT infrastructures where it will be deployed
- Determining how often to retrain models
- Identifying where and how to apply automation to data pipelines
- Operationalizing exception handling
- Determining the many ways scale must be applied, including technology, automation, data pipelines, and humans in the loop
The answers to these considerations will determine how people, processes, and technology will be involved in operationalizing the AI solution and how much change the organization will face in the production phase.
Model in Production
Here, AI project teams achieve and sustain production of the machine learning model. One of the biggest challenges is maintaining an integrated system that can be continuously operated in production. Humans in the loop can help in production in all of the ways they have helped develop the model up to this point. When teams work with the same data analysts throughout the full process, they benefit from the improvements and optimizations HITL teams can help them deploy across the entire model development process.
When it comes to data, machine learning models are more vulnerable to decay than standard software systems because their performance relies on data that is constantly evolving.
When ground truth is substantially different from the data that trained the model, it affects model quality.
This happened in the wake of the COVID-19 pandemic. In Spring 2020, lockdowns around the world cancelled events, closed restaurants and workplaces, and decreased automobile traffic. Machine learning models that were trained on pre-pandemic data were not prepared to handle the unpredictable data being generated in real time. People’s weird behavior was breaking AI models.
When a model is in production, people are needed to perform the earlier tasks, plus model pipeline optimization.
In addition to the tools mentioned in the earlier phases, development teams may apply automation to monitor model drift. Even with automation, they will need people to review outcomes and determine if the model technically degraded or if ground truth has changed. They may decide to change or remove some data features. As in the earlier phases, they will need labeled ground-truth data.
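As one illustration of what automated drift monitoring could look like, here is a minimal, pure-Python sketch that compares a feature’s production distribution against its training distribution using the population stability index (PSI) and flags the feature for a person to review. The sample values are invented, and the 0.2 alert threshold is a commonly cited rule of thumb rather than a standard.

```python
# Minimal sketch: flag a numeric feature for human review when its production
# distribution drifts from the training distribution (population stability index).
import math
from typing import List

def psi(expected: List[float], actual: List[float], bins: int = 10) -> float:
    """Population stability index between two samples of a numeric feature."""
    lo, hi = min(expected), max(expected)

    def bucket_fractions(values: List[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            i = int((v - lo) / (hi - lo) * bins) if hi > lo else 0
            counts[min(max(i, 0), bins - 1)] += 1
        # A small floor keeps the logarithm defined when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_fractions(expected), bucket_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

training_scores = [0.2, 0.3, 0.35, 0.4, 0.5, 0.55, 0.6, 0.7]      # invented data
production_scores = [0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]  # invented data

score = psi(training_scores, production_scores)
if score > 0.2:  # hypothetical alert threshold; a rule of thumb, not a standard
    print(f"PSI={score:.2f}: possible drift, flag for human review")
```

Whatever the detection method, the alert only tells people where to look; a reviewer still has to decide whether the model has degraded or ground truth itself has changed.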
CHALLENGES
At this phase, development teams are likely to encounter all of the challenges from the first two phases and decisions about:
- Who should be involved to monitor the AI system
- How to establish continuous improvement and the criteria for sustaining it, considering return on investment each time the model development process is completed
- What kind of manual work is involved to create ground truth on production data and compare it to what the AI is predicting
- How much of the monitoring process can be automated and what is likely to require human intervention
- How to scale the AI with the confidence to detect and protect against model drift
Managing these considerations helps guard against low performance and other risks. The iterative, cyclical nature of this process continues as teams track the online performance of their model, check outcomes for accuracy, and continue to retrain the model when outcomes don’t meet expectations.
To learn more about the humans in the loop across the AI lifecycle, watch these webinars: Data Prep: What Data Scientists Wish You Knew and Building Your Next Machine Learning Data Set.
Best Practices for an HITL Workforce
It is best to consider the workforce early in the AI lifecycle and before model development begins, if possible.
Some workforce traits are particularly important at each phase of the AI lifecycle.
Workforce Traits: Human in the Loop
PHASE 1
Proof of Concept
- Quick start
- Low commitment
- Agile team
- Experimentation
PHASE 2
Proof of Scale
- Managed team
- Scalable team
- Quality at scale
- Tooling services
- Performance metrics
- Continuous improvement
PHASE 3
Model in Production
- Managed team
- Scalable team
- Quality at scale
- Tooling services
- Performance metrics
- Continuous improvement
- Process focus
- Data pipeline
- Monitor KPIs
- Exception process
- Fast turnaround
At the proof-of-concept phase, a workforce should be able to start fast and have the agility to iterate the process, as there will be plenty of experimentation. It’s also best for the workforce to come with low commitment; that is, it should not lock development teams into a contract with restrictive terms.
During proof of scale, a managed team can provide the iterative abilities that are critical to being able to make continuous improvements. It is important to use an HITL workforce that can provide quality work and can use the tools that will be deployed at scale. Workers must be able to measure performance so quality can be increased over time.
When a model is in production, an HITL workforce should have all of the capabilities required in the earlier phases and should also have the process focus to be able to drive continuous improvement. Monitoring key performance indicators (KPIs) and handling exception processes will be important. The faster the data workforce can operate and produce quality outcomes, the better.
At each phase, development teams will build on and often repeat the work done in the previous phase. An HITL workforce used consistently throughout the process can improve upon what was done before.
Next Steps
Once a team decides that a workforce is the right choice for their project, the next step is to find the right-fit workforce.
To learn how to evaluate workforce options, download The Outsourcers’ Guide to Quality: Tips for Ensuring Quality Data Labels from Outsourced Teams. Ready to talk to a workforce expert? Connect with us.
About CloudFactory
CloudFactory combines people and technology to provide cloud workforce solutions. Our professionally managed and trained teams can process data for machine learning with high accuracy using virtually any tool. We complete millions of tasks a day for innovators including Microsoft, GoSpotCheck, Hummingbird Technologies, Ibotta, and Luminar. We exist to create meaningful work for one million talented people in developing nations, so we can earn, learn, and serve our way to become leaders worth following.
Contact Sales
Fill out this form to speak to our team about how CloudFactory can help you reach your goals.