OpenAI Aims to Silence Concerns with Release of New Model GPT-5.2
Meanwhile, OpenAI has debuted a new AI model, GPT-5.2, that it says beats all existing models by a substantial margin across a wide range of tasks.
According to data OpenAI released, the new model performed particularly well on a benchmark of complicated professional tasks across a range of “knowledge work”—from law to accounting to finance—as well as on evaluations involving coding and mathematical reasoning.
However, OpenAI executives said that the model had been in the hands of “Alpha customers” who help test its performance for “several weeks”—a time period that would mean the model was completed prior to Altman’s code red declaration.
Improved Performance Across Various Tasks
Consequently, GPT-5.2 showed a large jump in performance across several benchmark tests of interest to enterprise customers.
It met or exceeded human expert performance on a wide range of difficult professional tasks, as measured by OpenAI’s GDPval benchmark, 70.9% of the time.
Moreover, the model excelled at writing and debugging code, and demonstrated a “state of the art” ability to use other software tools to complete tasks.
Enhanced Safety Features
Therefore, the company also said its new model is better than the company’s earlier ones at providing “safe completions”—which it defines as providing users with helpful answers while not saying things that might contribute to or worsen mental health crises.
In addition, OpenAI executives emphasized the importance of safety protocols and the company’s commitment to improving its models to recognize and respond to signs of mental or emotional distress.
However, the release of the new model came on the same day a new lawsuit was filed against the company, alleging that ChatGPT’s interactions with a psychologically troubled user had contributed to a murder-suicide in Connecticut.
Competition in the AI Space
Meanwhile, OpenAI is no doubt hoping to persuade customers to turn back to its models for coding with GPT-5.2, as it faces increasing competition from Google and Anthropic.
According to some figures, Anthropic’s Claude model has proved especially popular among enterprises, exceeding OpenAI’s market share.
However, OpenAI executives said that the company had been building GPT-5.2 “for many months” and that the model had been in the hands of “Alpha customers” for “several weeks.”
Source: Link









