If your AI seems smarter, it’s thanks to smarter human trainers By Reuters

By Supantha Mukherjee and Anna Tong

STOCKHOLM/SAN FRANCISCO (Reuters) – In the early years, getting AI models like ChatGPT or its rival Cohere to produce human-like responses required large teams of low-cost workers to help the models distinguish basic facts, such as whether a photo showed a car or a carrot.

But more sophisticated updates to AI models in a highly competitive arena now require a rapidly expanding network of human trainers with specialized knowledge — from historians to scientists, some with doctorates.

“A year ago, we could have successfully hired undergrads, just to generally teach AI how to improve,” said Cohere co-founder Ivan Zhang, speaking about the company’s in-house human trainers.

“Now we have licensed doctors teaching the models how to behave in medical settings, or financial analysts or accountants.”

For further training, Cohere, which was last valued at more than $5 billion, is working with a startup called Invisible Tech. Cohere is one of the main competitors of OpenAI and specializes in enterprise AI.

Invisible Tech employs thousands of trainers, working remotely, and has become one of the key partners of AI companies from AI21 to Microsoft in training their AI models to reduce errors, known in the AI world as hallucinations.

“We have 5,000 people in more than 100 countries around the world who are PhDs, master’s degree holders and knowledge work professionals,” said Francis Pedraza, founder of Invisible.

Invisible pays up to $40 per hour, depending on the worker’s location and the complexity of the work. Some companies like Outlier pay up to $50 per hour, while another called Labelbox said it pays up to $200 per hour for “highly expert” topics like quantum physics, but starts at $15 for basic topics.

Invisible was founded in 2015 as a workflow automation company serving clients such as food delivery company DoorDash, for which it digitized delivery menus. But things changed when it was approached by a relatively unknown research company called OpenAI in the spring of 2022, before the public launch of ChatGPT.

“OpenAI came to us with a problem: when you asked a question in an early version of ChatGPT, it would hallucinate. You couldn’t trust the answer,” Pedraza told Reuters.

“They needed an advanced AI training partner to provide reinforcement learning with human feedback.”

OpenAI did not respond to a request for comment.

Generative AI produces new content based on the data used to train it. However, it sometimes cannot distinguish between true and false information and generates false outputs known as hallucinations. In one notable example, in 2023, Google’s chatbot shared inaccurate information in a promotional video about which satellite first captured images of a planet outside Earth’s solar system.

AI companies realize that hallucinations can hamper GenAI’s appeal to businesses and are trying various ways to reduce them, including using human trainers to teach the concept of fact and fiction.

Since partnering with OpenAI, Invisible says it has become an AI training partner for most GenAI companies, including Cohere, AI21, and Microsoft. Cohere and AI21 have confirmed that they are customers. Microsoft has not confirmed that it is a customer of Invisible.

“These are all companies that have had training challenges where the first cost is computing power and the second cost is high-quality training,” Pedraza said.

How does it work?

OpenAI, which started the craze around GenAI, has a team of researchers called the “Human Data Team” that works with AI trainers to collect specialized data to train its models like ChatGPT.

A source familiar with the company’s operations said OpenAI researchers come up with various experiments, such as reducing hallucinations or improving writing style, and work with AI trainers from Invisible and other vendors.

At any given time, dozens of experiments are being conducted, some using tools developed by OpenAI and others using vendors’ tools, the person said.

Depending on what the AI companies need, whether getting better at Swedish history or doing financial modeling, Invisible assigns workers with relevant degrees to those projects, relieving the AI companies of the burden of managing hundreds of trainers.

“OpenAI has some of the most amazing computer scientists in the world, but they are not necessarily experts on Swedish history or chemistry questions or biology questions or anything you can ask,” Pedraza said, adding that more than 1,000 contract workers provide their services to OpenAI alone.

Cohere’s Zhang said he personally used Invisible’s trainers to find a way to teach his GenAI model to find relevant information from a large data set.

A race

Competitors in this space include Scale AI, a private startup recently valued at $14 billion, which provides AI companies with training data sets. It has also ventured into the business of providing AI trainers, and counts OpenAI as one of its clients. Scale AI did not respond to interview requests for this story.

Invisible, which has been profitable since 2021, has raised just $8 million in seed capital.

“We are 70% owned by the team, and only 30% is owned by investors,” Pedraza said. “We facilitate secondary rounds, and the last trading price was at a half-billion-dollar valuation.” Reuters was unable to confirm that valuation.

Human trainers first got into AI training through data labeling work, which required fewer qualifications and paid less, sometimes as little as $2 per hour, and was mostly done by people in African and Asian countries.

As AI companies launch more advanced models, demand for specialized trainers across dozens of languages is rising, creating a well-paying niche where people with expertise in a variety of subjects can become AI trainers without even knowing how to code.

Demand from AI companies is giving rise to more companies that provide similar services.

“My inbox is full of new companies popping up here and there,” Zhang said. “I see this as a new space where companies are just hiring humans to generate data for AI labs like ours.”