Gemini 3 Makes a Grand Entrance, Fully Geared for Learning, Construction, and Planning

Home

Finance

ICV

Smart City

Digital Live

Cloud

Optics

Home Finance AI ICV Smart City Digital Live Cloud Optics

12/01 2025 574

Today, the head of Google Developer Relations and the leader of Google AI Studio ignited a social media frenzy by posting a tweet with the single word 'Gemini'.

Just yesterday, Musk unveiled Grok 4.1, adding a touch of rivalry to the air as Gemini 3's release now seems to carry a hint of confrontation.

Sam Altman also took to social media to congratulate the launch of Gemini 3. One netizen quipped, "Why no congratulations for Musk's Grok 4.1? I sense a grudge."

Setting aside the controversies for a moment, Gemini 3 is making a strong impression this time around. The official team claims it can "turn any concept into reality." This new model is engineered to grasp profound meanings and subtleties, whether it's detecting nuanced cues in creativity or dissecting intricate problems step by step.

Gemini 3 also excels at comprehending the context and underlying intent of requests, fetching the necessary information without the need for excessive prompts.

In all major AI benchmark tests, Gemini 3's reasoning and multimodal capabilities surpass those of the 2.5 Pro.

For textual reasoning, it tops the LMArena leaderboard with a score of 1501 Elo. Without utilizing any tools, it achieved a 37.5% score in "Humanity's Last Exam." It also scored an impressive 91.9% in the GPQA Diamond test. Additionally, it set a new all-time high score of 23.4% in the MathArena Apex test, showcasing doctoral-level reasoning prowess.

Beyond textual reasoning, Gemini 3 Pro achieved an accuracy rate of 81% in the MMMU-Pro test and 87.6% in the Video-MMMU test, redefining multimodal reasoning capabilities.

It also scored a leading 72.1% in the SimpleQA Verified test, demonstrating significant progress in factual accuracy. This means Gemini 3 Pro can reliably solve complex problems spanning numerous fields, such as science and mathematics.

Every response from Gemini 3 Pro is intelligent, concise, and direct, eschewing clichés and flattery, and offering some truly profound insights.

It can also write code to visualize plasma flow in a tokamak and compose a poem that captures the physical principles of fusion.

In "Humanity's Last Exam," the Gemini 3 deep thinking model scored 41.0% without using tools. It scored a remarkable **93.8%** in GPQA Diamond, outperforming Gemini 3 Pro in both categories. Additionally, it achieved an unprecedented score of 45.1% on ARC-AGI-2.

Gemini 3 integrates reasoning, visual and spatial understanding abilities, leading multilingual performance, and a million-scale context window, further pushing the boundaries of multimodal reasoning.

If you're keen to learn how to cook, Gemini 3 can interpret and translate handwritten recipes in various languages and generate new ones. It can also create interactive flashcards, visualizations, or code snippets for academic papers, lengthy video lectures, or tutorials. It can even analyze pickleball match videos, pinpoint areas for improvement, and devise training plans to help users enhance their skills comprehensively.

The official team stated that Gemini 3 is the best Vibe coding and agent coding model developed to date. It tops the WebDev Arena leaderboard with a score of 1487 Elo. Additionally, it achieved a score of 54.2% in the Terminal-Bench 2.0 test, which evaluates the model's ability to use tools to operate a computer via a terminal. Meanwhile, it also significantly outperformed the 2.5 Pro version in the coding ability test SWE-bench Verified.

Compared to other cutting-edge models, Gemini 3 Pro exhibits superior long-term planning capabilities and can generate higher returns.

By combining deeper reasoning with more refined and consistent tool usage, Gemini 3 can handle more complex multi-step workflows from start to finish—such as booking local services or organizing an inbox.

Gemini 3 is the model with the most comprehensive safety assessment among all Google AI models to date. The model shows a lower tendency to please, stronger resistance to prompt injection, and greater resilience against cyberattack abuse.

Currently, Gemini 3 is available on the developer platforms of the Gemini app, AI Studio, and Vertex AI, as well as Google's new agent development platform, Google Antigravity.

References:

https://blog.google/products/gemini/gemini-3/#gemini-3

Solemnly declare: the copyright of this article belongs to the original author. The reprinted article is only for the purpose of spreading more information. If the author's information is marked incorrectly, please contact us immediately to modify or delete it. Thank you.

Newest

Links