Gemini 3.1 Pro just changed the AI pricing game. In roughly 90 days, Google shipped the most lopsided leap in AI reasoning the industry has ever seen — and priced it at half of what Claude charges.
Gemini 3.1 Pro achieves a verified ARC-AGI-2 score of 77.1% — more than double the reasoning performance of Gemini 3 Pro, released just three months prior. That’s not a minor update. That’s a generational jump compressed into a single quarter.
And the kicker? It’s priced the same as Gemini 3 Pro — $2 per million input tokens and $12 per million output tokens — less than half the price of Claude Opus 4.6.
Gemini 3.1 Pro is Google DeepMind’s most advanced reasoning model, released on February 19, 2026. It’s the first major point update in the Gemini 3 series — and it targets one specific objective: dramatically improving reasoning efficiency without increasing cost.
Gemini 3 Pro launched in November 2025. Three months later, Google introduced Gemini 3.1 Pro with a redesigned reasoning core.
This update wasn’t incremental. It focused on improving structured problem-solving, abstract logic, and multi-step inference — the types of tasks that define modern agentic AI systems.
The model maintains a massive 1 million token input context window.
That means you can feed it:
Large codebases
Long legal contracts
Multi-document research datasets
Extended transcripts
On top of that, Gemini 3.1 Pro increases its output limit to 65,000 tokens — enabling large-scale generation without losing thread continuity.
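To get a feel for those limits, here is a minimal sketch that checks whether a document bundle fits the advertised 1 million token input window. The 4-characters-per-token heuristic is a rough approximation for English text, not the model’s real tokenizer:

```python
# Rough token-budget check against Gemini 3.1 Pro's advertised limits.
INPUT_LIMIT = 1_000_000  # input context window, in tokens
OUTPUT_LIMIT = 65_000    # maximum output tokens

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return len(text) // 4

def fits_context(documents: list[str]) -> bool:
    """True if the combined documents fit within the input window."""
    return sum(estimate_tokens(d) for d in documents) <= INPUT_LIMIT

# Example: a ~600k-character contract bundle (~150k tokens) fits easily.
contracts = ["lorem ipsum " * 50_000]
print(fits_context(contracts))  # True
```

In practice you would reserve headroom for the system prompt and the response itself before filling the window to the brim.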
This isn’t a chatbot-first model.
Gemini 3.1 Pro is built for:
Agentic workflows
Tool-using systems
Research automation
Enterprise-grade analytics
Large document reasoning
The design intent is clear: optimize intelligence per dollar spent.
Benchmarks don’t always tell the full story. But sometimes, they shift industry narratives overnight.
Gemini 3.1 Pro’s benchmark performance did exactly that.
ARC-AGI-2 is widely considered one of the toughest reasoning benchmarks in AI.
It tests generalization — the ability to solve novel logic patterns the model hasn’t seen before.
Gemini 3.1 Pro achieved a verified score of 77.1%.
For context:
Gemini 3 Pro scored 31.1%
GPT-5.1 scored 17.6%
That’s not incremental improvement.
That’s a generational leap compressed into a single quarter.
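Taking the scores above at face value, the jump works out to roughly 2.5× — a quick sanity check of the “more than double” claim:

```python
# ARC-AGI-2 scores quoted above (percent).
scores = {"gemini-3.1-pro": 77.1, "gemini-3-pro": 31.1, "gpt-5.1": 17.6}

ratio = scores["gemini-3.1-pro"] / scores["gemini-3-pro"]
print(f"Improvement over Gemini 3 Pro: {ratio:.2f}x")  # 2.48x
```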
In competitive coding environments, Gemini 3.1 Pro posts an Elo rating of 2887 — significantly ahead of GPT-5.2 and Gemini 3 Pro.
This positions it strongly for:
Code generation
Debugging
Refactoring
Competitive programming environments
On advanced domain knowledge evaluation:
Gemini 3.1 Pro: 44.4%
Gemini 3 Pro: 37.5%
GPT-5.2: 34.5%
Across 12 of 18 tracked benchmarks, Gemini 3.1 Pro leads.
This is the real story.
Performance jumps like this usually take years.
So what changed?
Google enhanced its reasoning framework called “Deep Think.”
This system enables structured internal reasoning paths designed to solve complex multi-step problems more efficiently.
Instead of simply increasing compute, Gemini 3.1 Pro extracts more insight per reasoning token.
In practical terms:
It reaches correct answers in fewer internal steps.
It reduces unnecessary reasoning chains.
It lowers cost per successful inference.
The Gemini API now includes a field called total_thought_tokens — an encrypted representation of internal reasoning used in multi-turn agentic workflows.
This indicates that reasoning isn’t an add-on feature.
It’s embedded in the infrastructure.
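As a sketch of how that might look to a client, here is a hypothetical usage payload. Only the total_thought_tokens field is named above; the surrounding structure and the other keys are illustrative guesses, not the documented API schema:

```python
# Hypothetical response payload; only total_thought_tokens comes from the
# article. All other keys are illustrative, not the documented API shape.
response = {
    "usage_metadata": {
        "prompt_token_count": 1_200,
        "candidates_token_count": 450,
        "total_thought_tokens": 2_800,  # internal reasoning usage
    }
}

usage = response["usage_metadata"]
# Reasoning tokens are typically billed as output tokens, so track them
# separately when estimating cost per successful inference.
visible = usage["candidates_token_count"]
reasoning = usage["total_thought_tokens"]
print(f"visible output: {visible} tokens, reasoning overhead: {reasoning} tokens")
```

The point of surfacing a number like this is cost attribution: in an agentic loop, reasoning overhead can dwarf the visible output.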
Here’s where the industry calculus changes.
Gemini 3.1 Pro:
$2 per million input tokens
$12 per million output tokens
Claude Sonnet 4.6:
$3 per million input
$15 per million output
Claude Opus 4.6:
$15 per million input
Up to $75 per million output (varies by source)
Even against GPT-5.2, Gemini offers competitive or lower pricing with stronger reasoning performance.
Google offers a Batch API with a 50% discount.
That reduces effective cost to:
$1 per million input tokens
$6 per million output tokens
At scale, this creates a structural cost advantage for enterprise workloads.
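To make the cost gap concrete, here is a minimal calculator using the per-million-token rates quoted above (the workload of 500M input and 100M output tokens per month is hypothetical, and the Opus figure takes the top of its quoted output range):

```python
# (input $/M tokens, output $/M tokens), from the rates quoted above.
PRICES = {
    "gemini-3.1-pro":       (2.00, 12.00),
    "gemini-3.1-pro-batch": (1.00, 6.00),   # with the 50% Batch API discount
    "claude-sonnet-4.6":    (3.00, 15.00),
    "claude-opus-4.6":      (15.00, 75.00), # upper bound of the quoted range
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a token volume at the listed per-million rates."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical workload: 500M input and 100M output tokens per month.
for model in PRICES:
    print(f"{model:22s} ${monthly_cost(model, 500_000_000, 100_000_000):>10,.2f}")
```

At that volume the batch rate works out to $1,100 per month against $15,000 for Claude Opus 4.6 at the top of its range — a gap of more than 13×.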
No model dominates every category. But Gemini 3.1 Pro clearly excels in several areas:
ARC-AGI-2 leadership shows strong generalization.
Efficient reasoning infrastructure makes it ideal for autonomous systems.
The 1 million token window enables document-heavy enterprise applications.
LiveCodeBench leadership reinforces its coding capabilities.
Claude remains strong in specific areas.
Claude tends to produce more fluid, emotionally nuanced outputs.
On Arena-style human preference tests, Claude Opus 4.6 still leads Gemini in text-based tasks.
For long-form editorial storytelling, Claude often feels more natural.
It depends on your workload.

Choose Gemini 3.1 Pro if you:
Run high-volume API pipelines
Build AI agents
Process large documents or codebases
Need top-tier logic and math reasoning
Operate within Google Cloud infrastructure

Choose Claude if you:
Prioritize creative storytelling
Need nuanced human-like tone
Optimize for Arena-style human preference metrics
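As a sketch, that workload-based decision could be encoded as a simple routing rule. The criteria tags and the tie-breaking toward Gemini are illustrative choices, not a recommendation engine:

```python
def pick_model(workload: set[str]) -> str:
    """Route a workload to a model based on the criteria listed above."""
    gemini_signals = {"high-volume", "agents", "large-docs", "reasoning", "gcp"}
    claude_signals = {"creative", "tone", "human-preference"}
    gemini_score = len(workload & gemini_signals)
    claude_score = len(workload & claude_signals)
    # Ties go to the cheaper model.
    return "gemini-3.1-pro" if gemini_score >= claude_score else "claude-opus-4.6"

print(pick_model({"agents", "large-docs"}))  # gemini-3.1-pro
print(pick_model({"creative", "tone"}))      # claude-opus-4.6
```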
The real headline isn’t just performance.
It’s velocity.
Gemini 3 Pro launched in November 2025. Gemini 3.1 Pro arrived in February 2026 with more than double the reasoning score.
If this pace continues, each generation becomes the floor for the next.
That’s compounding intelligence growth.
And competitors are watching closely.
Gemini 3.1 Pro is available via:
The Gemini API, with production usage billed per token
Vertex AI, for enterprise-scale deployments
The Gemini app, through Google AI Pro and Ultra subscriptions
Pricing pressure is intensifying.
When a model doubles reasoning performance while maintaining price parity — and undercuts premium competitors — it shifts enterprise decision-making.
Developers increasingly evaluate:
Cost per inference
Benchmark reliability
Infrastructure integration
Long-term scalability
Gemini 3.1 Pro forces the industry to compete not just on capability — but on economics.
What is Gemini 3.1 Pro?
Google’s advanced reasoning-focused AI model, released February 19, 2026.

How much better is it than Gemini 3 Pro?
It more than doubled ARC-AGI-2 reasoning performance (77.1% vs 31.1%).

Is it cheaper than Claude?
Yes, it costs significantly less per million tokens than Claude Opus 4.6.

Does it beat Claude at everything?
No. Claude still leads in creative writing and some human preference benchmarks.

What is ARC-AGI-2?
A benchmark measuring novel reasoning and generalization capability.
Google didn’t just release another model.
It reset the cost-performance baseline for frontier AI systems.
Gemini 3.1 Pro isn’t perfect for every task. But it dramatically changes the economics of building intelligent systems at scale.
If you’re still defaulting to premium-priced models out of habit, this is the moment to reevaluate.
The AI race isn’t slowing down.
It’s compounding.