Considering their cautious approach to naming (since March 2023, they haven’t even added a 4.1 to the GPT name, despite making significant upgrades), this already sparks interest.
What’s even more fascinating is how much smarter this model has become, according to their technical blog.
Firstly, they state that the model surpasses PhD candidates at solving problems in physics, chemistry, and biology. Think about that. A model accessible starting today (although, I suspect, not technically available to all paying users right away; access usually rolls out over several weeks). September 2024. PhD level, or better.
Secondly, on the qualifying exam for the International Mathematical Olympiad, GPT-4o (the current top model) solves 13% of the problems correctly. o1, the new model, solves 83%. Consider that difference. Soon, this metric will become irrelevant. More difficult problems will have to be devised to measure whether the models are still improving.
Thirdly, in competitive programming (Codeforces), GPT-4o ranks around the 11th percentile, while o1 reaches the 89th.
This model doesn’t appear to be significantly larger than GPT-4 or 4o, but it’s been trained to approach problem-solving in a markedly different way. It “thinks” longer, builds a “chain of thoughts,” tries different solutions, evaluates them, corrects its mistakes, and only then produces a “refined” solution.
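OpenAI hasn't published o1's internals, but the control flow described above — propose a solution, evaluate it, correct the mistake, repeat until a refined answer emerges — can be sketched in miniature. Everything here is hypothetical illustration, not the real mechanism: a toy "solver" searches for a root of f(x) = x² − 2x − 8 by keeping a chain of (candidate, error) thoughts and nudging each new attempt based on feedback from the last one.

```python
# Toy sketch of a "chain of thoughts" with self-correction (illustrative only;
# o1's actual training and inference procedure is not public).

def propose(history):
    """Propose the next candidate, nudged by feedback on earlier attempts."""
    if not history:
        return 0                      # first "thought": a naive guess
    last_x, last_err = history[-1]
    # Crude self-correction: step in the direction that reduces the error.
    return last_x + (1 if last_err < 0 else -1)

def evaluate(x):
    """Score a candidate; 0 means the chain has converged on a solution."""
    return x * x - 2 * x - 8

def solve(max_steps=20):
    history = []                      # the "chain of thoughts": (candidate, error)
    for _ in range(max_steps):
        x = propose(history)
        err = evaluate(x)
        if err == 0:                  # refined solution found; stop deliberating
            return x, history
        history.append((x, err))
    return None, history              # deliberation budget exhausted

answer, chain = solve()
print(answer)      # 4, since 4^2 - 2*4 - 8 = 0
print(len(chain))  # 4 intermediate "thoughts" before the final answer
```

The point of the sketch is the trade-off the post describes: the loop spends extra steps deliberating before answering, which is wasted effort on trivial questions but pays off when the first guess is wrong.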
For simpler questions, it’s likely to be too slow and “overly intelligent.” But for complex problems, it looks like a real game-changer.