Diego Rivera. “Electric Power”
The fact that supervised machine learning requires a lot of hand-labeled data is considered a nuisance by most AI companies. Models may require millions of labeled data points to reach high accuracy, while the labeling process is costly, time-consuming, and can introduce a lot of noise into the data.
That is why there are concerted efforts across the AI industry to automate the process, find ways for users to contribute labels for free, or drastically reduce the number of labels a model requires. Even so, sometimes there is no option but to source human labeling.
This need has fueled the growth of two industries: crowdsourcing (online gig workers working on platforms like Amazon’s Mechanical Turk) and digital sweatshops (business process outsourcing companies in the Global South which employ workers to perform the labeling).
The picture is not entirely grim, though: some data labeling companies have built glamorous reputations and reached unicorn valuations. At the same time, an entire impact sourcing subsector has consolidated, in which companies channel this type of work to disadvantaged groups such as people with disabilities, women from rural areas, and slum dwellers.
Ghost work
Even though labeling platforms and agencies may be trying to uplift the status of the services they provide, the truth is that most data labelers still work for low pay, in insecure employment, and with no opportunity for career advancement.
Even impact sourcing companies struggle to secure contracts for their workforce and compete in an industry characterized by a “race to the bottom”. As annotation company Alegion has put it: “This whole industry is very, very competitive; everybody tries to find that little cheaper labor force somewhere else in the world.”
The problems in the industry stem from the fact that the human labor powering AI models is frequently kept out of sight. It has even been referred to as AI’s “dirty little secret”, as if the involvement of humans in training and monitoring AI systems meant that these systems are a fraud rather than “real” artificial intelligence.
Human annotators are an essential part of the “fauxtomation” illusion in technology today. AI systems are presented as smarter than they actually are, while the human labor used to train them has become invisible and trivialized: “ghost work”, as Mary L. Gray and Siddharth Suri call it in their book of the same name.
The entire data labeling industry is plagued by one deceptive assumption: that manual labels are quick and easy to produce. The very name of the assignments on Mechanical Turk is based on this premise – a “HIT” is a “human intelligence task”, one that is difficult for computers but supposedly easy for humans.
Other crowdsourcing platforms brand themselves as an “API for human labor” or “human-as-a-service”, which further dehumanizes the anonymous workers who perform this so-called “heteromated labor”. It also helps reinforce the perception that such tasks don’t need to be highly remunerated (the minimum payment for a HIT on Mechanical Turk is $0.01).
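To make the “API for human labor” framing concrete, here is a minimal sketch of how a requester can post a human task programmatically through the boto3 SDK for Mechanical Turk. The question file and parameter values are illustrative placeholders, not a recommended configuration:

```python
import boto3

# A human task is created with the same ergonomics as any web API call.
# MTurk's API is only served from the us-east-1 region.
mturk = boto3.client("mturk", region_name="us-east-1")

response = mturk.create_hit(
    Title="Label objects in an image",
    Description="Draw bounding boxes around all vehicles in the image.",
    Reward="0.01",                    # the platform minimum: one cent per task
    MaxAssignments=3,                 # how many workers complete the same item
    LifetimeInSeconds=86400,          # task stays available for one day
    AssignmentDurationInSeconds=600,  # each worker gets ten minutes
    # Placeholder: a QuestionForm/HTMLQuestion XML file you would author yourself.
    Question=open("bounding_box_question.xml").read(),
)
print(response["HIT"]["HITId"])
```

Five lines of configuration, and a stranger somewhere in the world is paid one cent to do the work: the abstraction is precisely the problem the critics describe.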

Diego Rivera. “Detroit Industry Mural, North Wall.” Detail.
The professionalization of labeling
Is data labeling truly an easy task? As the AI field progresses, the labels it needs are getting more and more complex. Data labeling is no longer about labeling cats and dogs in images; it is about navigating complex class taxonomies, using a variety of tools, and developing the expertise to handle unexpected edge cases.
Trivial “human intelligence” tasks are already being automated at great speed, and tech giants like Google, Microsoft, IBM, and Amazon offer a variety of pre-built generic models for object detection, OCR, NLP, etc. To compete, small and medium-sized AI companies therefore need to build customized solutions for particular domains and use cases where generic models don’t offer enough granularity or accuracy.
Common labeling tasks now require workers to acquire a high level of expertise: distinguishing between types of crops and weeds in fields from a particular region, annotating and analyzing all the elements of architectural floorplans, spending hours segmenting the components of a single aerial image, classifying cars on the road into hundreds of makes and models, etc.
All of these tasks are non-trivial. They require project-specific training, familiarity with the dataset, careful study of instructions and examples, communication among labelers to agree on standardized and consistent handling of difficult cases, and frequent iteration and feedback.
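One reason this work is skilled rather than mechanical: teams typically measure whether annotators apply the guidelines consistently before trusting the labels. A minimal, self-contained sketch of one common agreement statistic, Cohen’s kappa, with invented annotator labels for illustration:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each labeled at random with their own class frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical labels from two annotators on the same ten field images.
ann_1 = ["weed", "crop", "crop", "weed", "crop", "weed", "crop", "crop", "weed", "crop"]
ann_2 = ["weed", "crop", "weed", "weed", "crop", "weed", "crop", "crop", "crop", "crop"]
print(f"kappa = {cohens_kappa(ann_1, ann_2):.2f}")  # ~0.58: only moderate agreement
```

Even on a two-class toy example, two conscientious annotators reach only moderate agreement; real projects with dozens of look-alike classes demand exactly the instruction-reading, discussion, and iteration described above.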
Data labeling is slowly becoming professionalized, materializing into a job title with its own set of specifications, requirements, and minimum skill set. And as human-in-the-loop pipelines become more commonplace across the industry, they will require professional humans in the loop to monitor and audit AI models and deal with a great degree of subtlety and gray areas.
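As a rough sketch of what such a pipeline can look like (the threshold, labels, and queue below are illustrative stand-ins, not any specific product’s API): predictions the model is confident about are accepted automatically, while gray areas are routed to a professional reviewer.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.90  # illustrative; tuned per project in practice

@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float

def route(prediction: Prediction, review_queue: list) -> str:
    """Accept confident predictions; escalate gray areas to a human reviewer."""
    if prediction.confidence >= CONFIDENCE_THRESHOLD:
        return prediction.label          # auto-accepted
    review_queue.append(prediction)      # a human makes the final call
    return "pending_human_review"

# Hypothetical model outputs on three items.
queue = []
for pred in [
    Prediction("img_001", "sedan", 0.97),
    Prediction("img_002", "hatchback", 0.62),   # edge case: escalated
    Prediction("img_003", "pickup_truck", 0.99),
]:
    print(pred.item_id, "->", route(pred, queue))
print(f"{len(queue)} item(s) awaiting human review")
```

The interesting work in such a pipeline lives in the review queue: everything routine is automated away, and what reaches the human is, by construction, the hard and ambiguous residue.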
To ensure the professionalization of data labeling work, it is essential to train and upskill workers so that they are aware of their role in the AI lifecycle and recognize the impact of their work (and mistakes). Empowered labelers who know the application and goal of the AI model they are working on can bring great value by spotting edge cases or by recognizing and reporting harmful biases in the data that might otherwise go unnoticed.
Making data labeling visible
There is a general consensus in the AI field that data is the critical infrastructure for building AI systems and that it directly impacts their performance, fairness, robustness, safety, and scalability. And yet novel model development is the most glamorized and celebrated work in AI, while data labeling is widely considered grunt work.
A new publication by Google Research, “Everyone wants to do the model work, not the data work”, exposes this dichotomy and calls for recognition of the human labor involved in preparing, curating, and nurturing the data that powers AI models, through a proactive focus on data excellence.
Indeed, recognizing the importance of data work will make it visible and release it from the shackles of “ghost work”. When this recognition is coupled with the professionalization of the field, it holds the promise of better remuneration and more dignified working conditions for all data labelers.

Diego Rivera. “Man, Controller of the Universe”/“Man at the Crossroads”. Detail.
Responsible sourcing
One way to make this work visible has been the establishment of industry standards for ethical sourcing. The Global Impact Sourcing Coalition (GISC) is the main body uniting suppliers of impact sourcing services, and it proposes a standard they can adopt.
However, much less has been done at the level of the AI companies that source labeled data. While tech giants like Microsoft, Facebook, and Google are part of GISC, the vast majority of AI companies are unaware of such sourcing standards; they continue to perpetuate questionable practices for the acquisition and appropriation of data and to pressure suppliers to keep labeling costs at a minimum.
One of the few notable examples of an AI business that practices responsible sourcing and publicly shares its policies, under a “Social responsibility” section on its website, is the Germany-based company Understand.ai. Its managing director and co-founder Philip Kessler shares:
“When evaluating potential labeling partners fair wages is an equal criteria to the quality of annotated boxes on our scorecard. Candidate companies are being subjected to pre-contract due diligence and our employees and distributors are sensitized to spot any foul behaviour when dealing with the partner on a daily basis. Where possible and safe for all parties, especially in the current pandemic situation, we conduct in-person checks.”
By committing to similar practices and standards, AI companies have the power to transform the data industry by exerting pressure on their suppliers. The multistakeholder organization Partnership on AI (PAI) has recently launched an initiative on responsible sourcing across the data supply line, which seeks to develop actionable resources for data scientists and AI engineers to ensure quality working conditions for data labelers.
Raising the perceived value of data labeling work among AI practitioners is the first step toward sustainable change in the industry. Combined with public pressure and regulation, incentivizing AI companies to create and abide by responsible sourcing policies is the only viable way to transition from marginalized “ghost work” to the professional human in the loop of the near future.
ABOUT THE AUTHOR
Iva Gumnishka is the founder of Humans in the Loop, a social enterprise that provides training data and human input for the MLOps lifecycle: anything from dataset collection and annotation to output verification and real-time edge case handling. Iva is also the founder of Humans in the Loop’s charity arm, which uses the profits generated by the social enterprise to provide training and support to the venture’s workers.