google

Gemini 3.1 Pro

Gemini 3.1 Pro is Google's elite multimodal model featuring DeepThink reasoning, a 2M context window, and native Veo 3.1 video integration for advanced tasks.

Multimodal · Deep Reasoning · Video Generation · Workspace AI · Google Gemini
Google · Gemini · February 19, 2026
Context: 2.0M tokens
Max Output: 8K tokens
Input Price: $2.50 / 1M tokens
Output Price: $15.00 / 1M tokens
Modality: Text, Image, Audio, Video
Capabilities: Vision, Tools, Streaming, Reasoning
Benchmarks
GPQA
94.3%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Gemini 3.1 Pro scored 94.3% on this benchmark.
HLE
44.4%
HLE: Humanity's Last Exam. A benchmark of extremely difficult, expert-written questions spanning a wide range of academic disciplines, created to stay challenging after earlier benchmarks saturated. Tests frontier-level reasoning and specialized professional knowledge. Gemini 3.1 Pro scored 44.4% on this benchmark.
MMLU
85.9%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Gemini 3.1 Pro scored 85.9% on this benchmark.
MMLU Pro
89.5%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Gemini 3.1 Pro scored 89.5% on this benchmark.
SimpleQA
72.1%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. Gemini 3.1 Pro scored 72.1% on this benchmark.
IFEval
89.2%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Gemini 3.1 Pro scored 89.2% on this benchmark.
AIME 2025
100%
AIME 2025: American Invitational Math Exam. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Gemini 3.1 Pro scored 100% on this benchmark.
MATH
91.1%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Gemini 3.1 Pro scored 91.1% on this benchmark.
GSM8k
95.9%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Gemini 3.1 Pro scored 95.9% on this benchmark.
MGSM
92.4%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Gemini 3.1 Pro scored 92.4% on this benchmark.
MathVista
69.8%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Gemini 3.1 Pro scored 69.8% on this benchmark.
SWE-Bench
80.6%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Gemini 3.1 Pro scored 80.6% on this benchmark.
HumanEval
84.1%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Gemini 3.1 Pro scored 84.1% on this benchmark.
LiveCodeBench
73%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Gemini 3.1 Pro scored 73% on this benchmark.
MMMU
71.1%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. Gemini 3.1 Pro scored 71.1% on this benchmark.
MMMU Pro
58.4%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Gemini 3.1 Pro scored 58.4% on this benchmark.
ChartQA
86.5%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Gemini 3.1 Pro scored 86.5% on this benchmark.
DocVQA
93.1%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. Gemini 3.1 Pro scored 93.1% on this benchmark.
Terminal-Bench
52%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Gemini 3.1 Pro scored 52% on this benchmark.
ARC-AGI
77.1%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Gemini 3.1 Pro scored 77.1% on this benchmark.

About Gemini 3.1 Pro

Learn about Gemini 3.1 Pro's capabilities, features, and how it can help you achieve better results.

Gemini 3.1 Pro represents a landmark shift in Google's generative AI roadmap, released in February 2026 as the flagship of the Gemini 3 series. Engineered to bridge the gap between versatile multimodal assistance and PhD-level reasoning, the model introduces the DeepThink engine, which significantly reduces hallucinations in complex logic and mathematical modeling tasks through advanced chain-of-thought processing.

Boasting a massive 2,048,000 token context window, Gemini 3.1 Pro can ingest and reason across hour-long video files, massive codebases, or thousands of pages of documentation with near-perfect retrieval. A key differentiator is its native integration with Veo 3.1, allowing it to generate high-fidelity video directly from text prompts without requiring separate video generation models.
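As a rough illustration of that long-context video workflow, here is a minimal sketch using the @google/genai Files API. The file name, prompt, and polling interval are placeholder assumptions, and the exact processing-state fields may vary by SDK version.

import { GoogleGenAI, createUserContent, createPartFromUri } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

async function summarizeLecture() {
  // Upload the long video once; the Files API returns a URI the model can reference.
  let file = await ai.files.upload({
    file: "lecture.mp4",
    config: { mimeType: "video/mp4" },
  });

  // Large video files take a moment to process before they can be prompted against.
  while (file.state === "PROCESSING") {
    await new Promise((resolve) => setTimeout(resolve, 5000));
    file = await ai.files.get({ name: file.name });
  }

  const result = await ai.models.generateContent({
    model: "gemini-3.1-pro-preview",
    contents: createUserContent([
      createPartFromUri(file.uri, file.mimeType),
      "List every formula derived in this lecture, each with its timestamp.",
    ]),
  });
  console.log(result.text);
}

summarizeLecture();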

Optimized for agentic workflows, the model features a specialized custom tools endpoint for high-reliability tool use in software engineering and automated research. It is designed to work seamlessly within the Google Workspace ecosystem, providing secure grounding in private Docs, Gmail, and Drive data for substantial productivity gains.
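The "custom tools endpoint" is not a documented public API name, so the sketch below falls back to the SDK's standard function declarations to show the general agentic pattern; search_issues is a hypothetical tool, not a real integration.

import { GoogleGenAI, Type } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

// A hypothetical tool the model can choose to call during a research or triage step.
const searchIssues = {
  name: "search_issues",
  description: "Search the team's issue tracker for open tickets matching a query.",
  parameters: {
    type: Type.OBJECT,
    properties: {
      query: { type: Type.STRING, description: "Free-text search query" },
    },
    required: ["query"],
  },
};

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview",
  contents: "Find open tickets about the memory leak in the ingestion service.",
  config: { tools: [{ functionDeclarations: [searchIssues] }] },
});

// If the model decides to call the tool, execute it and send the result back in a follow-up turn.
console.log(response.functionCalls);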


Use Cases for Gemini 3.1 Pro

Discover the different ways you can use Gemini 3.1 Pro to achieve great results.

Collaborative Software Engineering

Orchestrating with other models via MCP servers for deep code analysis and validation of formatting across massive repositories.

Automated Workspace Triage

Using side panels in Gmail and Docs to summarize massive email threads and extract actionable reports via @ tagging.

Multimedia Content Creation

Generating professional website-ready animated SVGs and 3D simulations directly from text prompts using the integrated Veo 3.1 engine.

PhD-Level Research Synthesis

Leveraging NotebookLM for dense document analysis, generating audio overviews, and providing precise citations for technical papers.

Strategic Market Intelligence

Performing Deep Research to analyze competitor pricing and market trends to build structured, cited reports from dozens of sources.

Full-Stack Web Prototyping

Turning wireframes and UI photos into functional, interactive web applications with complex lighting physics and spatial understanding.
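For the wireframe-to-prototype use case above, here is a minimal sketch of passing a UI screenshot to the model as inline image data; the file path and prompt are placeholders.

import { GoogleGenAI } from "@google/genai";
import { readFileSync } from "node:fs";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

// Read a wireframe screenshot and send it alongside the instruction.
const imageBase64 = readFileSync("wireframe.png").toString("base64");

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview",
  contents: [
    {
      role: "user",
      parts: [
        { inlineData: { mimeType: "image/png", data: imageBase64 } },
        { text: "Turn this wireframe into a single-file HTML/CSS/JS prototype." },
      ],
    },
  ],
});

console.log(response.text);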

Strengths

Limitations

Elite Logic Mastery: Achieved a 77.1% on ARC-AGI 2, demonstrating ability to solve novel patterns that stump most other frontier models.
Reasoning Latency: DeepThink mode can result in slower response times, sometimes taking several minutes to provide an answer for complex math.
Massive 2M Context: Industry-leading 2 million token context window allows for deep analysis of long-form documents and entire codebases in one prompt.
Ecosystem Dependency: Maximum utility and grounding features are heavily dependent on being within the Google Workspace ecosystem (Docs, Sheets, Gmail).
Deep Workspace Grounding: Native integration with Google Workspace data using @ tagging allows for high-context task completion without data leakage.
Subscription Complexity: Many advanced agentic features are locked behind specific premium tiers or enterprise Workspace licenses.
Cost-Efficient Performance: Despite being a top-tier reasoning model, its $2.50/$15.00 pricing is highly efficient relative to current market competitors.
Prompt Adherence Laziness: Occasionally swaps complex language requests for easier alternatives unless strictly constrained in the prompt.

API Quick Start

google/gemini-3.1-pro-preview

View Documentation
google SDK
import { GoogleGenAI } from "@google/genai";

// The @google/genai SDK takes an options object and exposes generateContent via ai.models.
const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

async function run() {
  const prompt = "Synthesize the logic of this quantum physics paper.";
  const result = await ai.models.generateContent({
    model: "gemini-3.1-pro-preview",
    contents: prompt,
    config: {
      tools: [{ codeExecution: {} }],
    },
  });
  console.log(result.text);
}

run();

Install the SDK and start making API calls in minutes.
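The SDK is published on npm as @google/genai, so setup is a single command before running the snippet above:

npm install @google/genai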

What People Are Saying About Gemini 3.1 Pro

See what the community thinks about Gemini 3.1 Pro

Gemini 3.1 is a way bigger jump than the '.1' suggests, including ARC AGI 2 going from 31% to 77%. This is the defining shift of 2026.
Jake Lindsay
twitter
Gemini 3.1 Pro's ability to handle interactive SVGs makes it the best tool for technical educational content right now.
u/2doapp
reddit
The individual particle effects of the water simulation in the naval combat test is significantly better than anything I've ever seen.
Stavy Lapis
youtube
I'm seeing 3.1 Pro actually correcting its own logic loops in DeepThink mode, which 3.0 just couldn't do.
u/LLM_Tester
reddit
The native integration with Workspace data makes this the first model that actually feels like a personal research assistant.
DevReviewer
hackernews
Context window is the killer feature here. I dropped a whole 1.5M token codebase and it found a memory leak in seconds.
dev_guru_2026
hackernews

Videos About Gemini 3.1 Pro

Watch tutorials, reviews, and discussions about Gemini 3.1 Pro

Clearly they've made a lot of impact with the DeepThink models and are putting it into the main pro model.

If you have thinking set to high, this acts almost like a mini version of Gemini DeepThink.

Benchmark scores for coding are up nearly 15% across the board.

It is a big update that basically gets the model back into the same sort of competitive area as Opus 4.6.

That's roughly half of what Deep think used to take for the same problem.

Gemini 3.1 Pro without any tool use scored by far the highest on Humanity's Last Exam.

By far the best model for 3D and spatial understanding, creating highly detailed animations from a single image.

The multimodality here isn't just a bolt-on; it's native and it shows.

We are seeing near-perfect retrieval across the entire 2-million token window.

Currently the best model out there you can use in terms of intelligence index.

3.1 Pro achieved a score of 77.1 versus the previous 31.1... this is a massive leap.

The individual particle effects of the cannonballs landing in the water is better than anything I've seen.

It's not just following instructions; it's understanding the underlying physics of the scene.

Google is really leaning into the agentic capabilities with the new tool-use endpoint.

The model implemented separate splash effects for individual cannonballs... a level of granular detail I hadn't seen.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips for Gemini 3.1 Pro

Expert tips to help you get the most out of Gemini 3.1 Pro and achieve better results.

Configure Thinking Levels

Use the thinking_level parameter and set it to 'High' for math and logic puzzles to activate the DeepThink mini mode.
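A minimal sketch of how that might look in the JS SDK follows; the thinkingConfig / thinkingLevel field names are assumptions mapped from the thinking_level parameter described above, so check them against your SDK version.

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview",
  contents: "Prove that the sum of the first n odd numbers is n squared.",
  // Assumed field names for the thinking_level setting; "high" trades latency for deeper reasoning.
  config: { thinkingConfig: { thinkingLevel: "high" } },
});

console.log(response.text);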

Leverage @ Tagging

In Google Workspace, use the @ symbol to ground the model in specific files to avoid manual copy-pasting and maintain context.

Chain with Flash for Speed

Use Gemini 3.1 Pro for high-level architectural planning, then switch to the Flash variant for repetitive boilerplate generation.
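One way to wire that up is sketched below; the Flash model ID is a placeholder assumption and should be replaced with the real identifier from the model list.

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

// Step 1: let the Pro model produce the architectural plan.
const plan = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview",
  contents: "Design the module layout for a REST API that tracks inventory.",
});

// Step 2: hand the plan to the cheaper, faster Flash variant for boilerplate.
// "gemini-3-flash-preview" is a placeholder ID, not a confirmed model name.
const boilerplate = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: `Generate TypeScript scaffolding for this plan:\n\n${plan.text}`,
});

console.log(boilerplate.text);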

Use Google AI Studio Canvas

Enable canvas mode in AI Studio to allow the model to output multi-file scripts and more expansive codebases during development.

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.

Related AI Models

openai

GPT-5.2 Pro

OpenAI

GPT-5.2 Pro is OpenAI's 2025 flagship reasoning model featuring Extended Thinking for SOTA performance in mathematics, coding, and expert knowledge work.

400K context
$21.00/$168.00/1M
moonshot

Kimi K2 Thinking

Moonshot

Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...

256K context
$0.15/1M
openai

GPT-5.2

OpenAI

GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.

400K context
$1.75/$14.00/1M
google

Gemini 3 Pro

Google

Google's Gemini 3 Pro is a multimodal powerhouse featuring a 1M token context window, native video processing, and industry-leading reasoning performance.

1M context
$2.00/$12.00/1M
deepseek

DeepSeek-V3.2-Speciale

DeepSeek

DeepSeek-V3.2-Speciale is a reasoning-first LLM featuring gold-medal math performance, DeepSeek Sparse Attention, and a 131K context window. Rivaling GPT-5...

131K context
$0.28/$0.42/1M
anthropic

Claude Opus 4.6

Anthropic

Claude Opus 4.6 is Anthropic's flagship model featuring a 1M token context window, Adaptive Thinking, and world-class coding and reasoning performance.

200K context
$5.00/$25.00/1M
google

Gemini 3 Flash

Google

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

1M context
$0.50/$3.00/1M
anthropic

Claude Sonnet 4.6

Anthropic

Claude Sonnet 4.6 offers frontier performance for coding and computer use with a massive 1M token context window for only $3/1M tokens.

1M context
$3.00/$15.00/1M

Frequently Asked Questions About Gemini 3.1 Pro

Find answers to common questions about Gemini 3.1 Pro