On May 31, OpenAI announced its efforts to enhance ChatGPT’s mathematical problem-solving capabilities, with the goal of reducing artificial intelligence (AI) hallucinations. OpenAI emphasized hallucination mitigation as a critical step toward developing aligned AI.
The announcement came via a post on the OpenAI website. In March, the introduction of the latest version of ChatGPT – GPT-4 – propelled AI into the mainstream. However, generative AI chatbots have long grappled with factual accuracy, occasionally producing false information, commonly referred to as “hallucinations.”
AI hallucinations refer to instances where AI systems generate output that is factually incorrect, misleading, or not supported by real-world data. These hallucinations can appear in various forms, such as generating false information, making up events or people that do not exist, or giving inaccurate details about certain topics.
OpenAI’s research examines the effectiveness of two types of feedback: “outcome supervision” and “process supervision.” Outcome supervision provides feedback based only on the end result, while process supervision provides feedback for each step in the chain of thought. OpenAI evaluated these models using math problems, generating multiple solutions and selecting the highest-ranked solution according to each model’s feedback.
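The distinction can be illustrated with a short, hypothetical sketch. The Python below is not OpenAI’s code: the `rate_step` and `rate_answer` callables are toy stand-ins for reward models trained with process and outcome supervision, the candidate solutions stand in for a model’s sampled chains of thought, and the example only shows the best-of-N selection idea described above.

```python
from typing import Callable, List

Solution = List[str]  # one string per chain-of-thought step


def outcome_score(solution: Solution, rate_answer: Callable[[str], float]) -> float:
    """Outcome supervision: feedback is based only on the final result."""
    return rate_answer(solution[-1])


def process_score(solution: Solution, rate_step: Callable[[str], float]) -> float:
    """Process supervision: every step receives feedback; the weakest step caps the score."""
    return min(rate_step(step) for step in solution)


def best_of_n(candidates: List[Solution], scorer: Callable[[Solution], float]) -> Solution:
    """Generate many candidate solutions, then keep the one the scorer ranks highest."""
    return max(candidates, key=scorer)


if __name__ == "__main__":
    # Two toy candidate solutions to "What is 3 * 4 + 5?"; the second slips on its first step.
    candidates = [
        ["3 * 4 = 12", "12 + 5 = 17", "answer: 17"],
        ["3 * 4 = 13", "13 + 5 = 18", "answer: 18"],
    ]

    def rate_step(step: str) -> float:
        # Toy stand-in for a process-supervised reward model: check whether a
        # step's arithmetic holds; non-arithmetic steps are left unpenalized.
        lhs, sep, rhs = step.partition("=")
        if not sep:
            return 1.0
        try:
            return 1.0 if eval(lhs) == float(rhs) else 0.0
        except (SyntaxError, ValueError, NameError):
            return 1.0

    best = best_of_n(candidates, lambda s: process_score(s, rate_step))
    print(best)  # -> ['3 * 4 = 12', '12 + 5 = 17', 'answer: 17']
```

In this sketch, a single wrong intermediate step sinks a whole candidate under process scoring, whereas an outcome-only scorer would see nothing but the final answer.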
After a thorough analysis, the research team found that process supervision delivered superior performance, as it encouraged the model to adhere to a human-approved process. In contrast, outcome supervision proved more difficult to scrutinize consistently.
OpenAI acknowledged that the implications of process supervision extend beyond mathematics and that further investigation is needed to understand its effects in other domains. The company suggested that, if the observed results hold in broader contexts, process supervision could offer a favorable combination of performance and alignment compared to outcome supervision. To facilitate research, the company has publicly released the complete dataset of process supervision, inviting exploration and study in this area.
Related: AI demand briefly catapults Nvidia into the $1 trillion club
Although OpenAI did not point to specific cases that prompted its investigation of hallucinations, two recent events illustrate the problem in real-life scenarios.
In a recent incident, attorney Steven Schwartz admitted to relying on the chatbot as a research resource in Mata v. Avianca Airlines. However, the information provided by ChatGPT turned out to be completely fabricated, highlighting the problem at hand.
OpenAI’s ChatGPT isn’t the only example of AI systems experiencing hallucinations. During a demonstration of its chatbot technology in March, Microsoft’s Bing AI chatbot analyzed earnings reports and generated inaccurate figures for companies like Gap and Lululemon.
Magazine: 25K traders bet on ChatGPT’s stock picks, AI sucks at dice throws, and more