On Tuesday, Google announced Gemini 2.5, a new family of AI reasoning models that pause to “think” before answering questions.
To kick off the new model family, Google is launching Gemini 2.5 Pro Experimental, a multimodal reasoning AI model. The model will be available starting Tuesday on the company’s developer platform, Google AI Studio, as well as in the Gemini app for subscribers to Gemini Advanced, the company’s $20-a-month AI plan.
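For developers wondering what access might look like, here is a minimal sketch using Google’s google-generativeai Python SDK; the model ID string is an assumption, since Google has not confirmed the exact identifier for the experimental release.

```python
# Minimal sketch of calling the new model through Google's Python SDK
# (google-generativeai). The model ID below is a hypothetical placeholder.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key issued via Google AI Studio

# Hypothetical identifier for the experimental release; check AI Studio
# for the actual model name.
model = genai.GenerativeModel("gemini-2.5-pro-exp")

response = model.generate_content(
    "Build a single-page web app that visualizes sorting algorithms."
)
print(response.text)
```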
Going forward, Google says all of its new AI models will have reasoning capabilities built in.
Ever since OpenAI launched the first AI reasoning model, o1, in September 2024, the tech industry has raced to match or exceed its capabilities. Today, Anthropic, DeepSeek, Google, and xAI all have AI reasoning models, which use additional computing power and time to fact-check and reason through problems before delivering answers.
Reasoning techniques have helped AI models achieve new heights in math and coding tasks. Many in the tech world believe reasoning models are a key component of AI agents: autonomous systems that can perform tasks largely without human intervention. However, these models are also more expensive to run.
Google has experimented with AI reasoning models before, releasing a “thinking” version of Gemini in December. But Gemini 2.5 represents the company’s most serious attempt yet to best OpenAI’s “o” series of models.
Google claims that Gemini 2.5 Pro outperforms its previous frontier AI models, and some of its major competitors’ AI models, on several benchmarks. Specifically, Google says it designed Gemini 2.5 to excel at creating visually compelling web apps and agentic coding applications.
On one evaluation measuring code editing, called Aider Polyglot, Google says Gemini 2.5 Pro scores 68.6%, beating top AI models from OpenAI, Anthropic, and the Chinese AI lab DeepSeek.
However, on another test measuring software development capabilities, SWE-bench Verified, Gemini 2.5 Pro scores 63.8%, surpassing OpenAI’s o3-mini and DeepSeek’s R1 but falling short of Anthropic’s Claude 3.7 Sonnet, which scored 70.3%.
On Humanity’s Last Exam, a multimodal test consisting of thousands of crowdsourced questions relating to mathematics, the humanities, and the natural sciences, Google says Gemini 2.5 Pro scores 18.8%, performing better than most rival flagship models.
According to Google, Gemini 2.5 Pro ships with a 1-million-token context window, meaning the model can take in roughly 750,000 words in a single prompt, longer than the entire “Lord of the Rings” book series. And soon, Gemini 2.5 Pro will support double that input length (2 million tokens).
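For a rough sense of that scale, here is a back-of-the-envelope conversion using the common heuristic of about 0.75 words per English-language token; the ratio is an approximation, not an official figure from Google.

```python
# Rough tokens-to-words conversion, assuming ~0.75 words per token
# (a common heuristic for English text, not a spec).
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    return int(tokens * WORDS_PER_TOKEN)

print(tokens_to_words(1_000_000))  # ~750,000 words: the current window
print(tokens_to_words(2_000_000))  # ~1,500,000 words: the planned window
```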
Google has not published API pricing for Gemini 2.5 Pro. The company says it will share more in the coming weeks.