
OpenAI Unveils New o3 Models Amid AGI Discussions

Dec 22, 2024 | AI Trends

OpenAI Introduces the o3 Model Family

On the final day of its 12-day “shipmas” event, OpenAI unveiled the o3 model family, the successor to the o1 “reasoning” model released earlier this year. The family includes the o3 model and a smaller version, o3-mini, designed for specific tasks.

OpenAI’s Claims of Approaching AGI

OpenAI claims that, under certain conditions, o3 approaches artificial general intelligence (AGI), albeit with significant caveats. On the name, CEO Sam Altman hinted that the company skipped the o2 designation because of trademark issues with the British telecommunications provider O2.

Availability and Preview Plans

For now, neither o3 nor o3-mini is widely available. Safety researchers can apply for a preview of o3-mini immediately, and a broader preview of o3 will follow at an unspecified later date. OpenAI aims to officially launch o3-mini by the end of January 2025.

Comparison With Previous Models

The new o3 model incorporates advances in reasoning capabilities, and such capabilities have been shown to lead to higher rates of attempts to deceive users than in non-reasoning models. To improve the safety of models like o3, OpenAI has adopted a new approach known as “deliberative alignment.” Reasoning models such as o3 also effectively fact-check their own outputs.

Capabilities and Performance

Trained via reinforcement learning, o3 is built to work through a “private chain of thought,” reasoning methodically about a task before responding. Users can adjust the model’s reasoning time to optimize performance. Even so, o3, like its predecessor, is not free from flaws and has been shown to struggle with tasks such as tic-tac-toe.

Progress Towards AGI

On benchmarks, o3 scored 87.5% on ARC-AGI, a test designed to evaluate how well a system acquires skills beyond its initial training. That result varies significantly with the compute setting, however, and evaluations at the highest-performing settings are costly to run. Experts caution against treating the score as a definitive measure of AGI capability, noting fundamental differences from human intelligence.

Future Developments and Research

OpenAI plans to collaborate with the foundation behind ARC-AGI to develop the next generation of the benchmark. Beyond ARC-AGI, o3 outperforms o1 across multiple evaluations, including programming tasks and complex mathematical problems, in some cases with record-breaking results.

As OpenAI continues to refine its AI technologies, the implications of o3 for the broader AI landscape, and the question of when AGI might arrive, remain key topics of discussion among researchers and industry stakeholders.
