Gemini 3 Flash

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

Google | Gemini 3 | December 17, 2025
Context: 1.0M tokens
Max Output: 66K tokens
Input Price: $0.50 / 1M tokens
Output Price: $3.00 / 1M tokens
Modality: Text, Image, Audio, Video
Capabilities: Vision, Tools, Streaming, Reasoning
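
The listed prices translate into a per-request cost with simple arithmetic. The sketch below is only an illustration: the two price constants are copied from the listing above, while the function name and the example workload are hypothetical.

```javascript
// Prices copied from the listing above (USD per 1M tokens).
const INPUT_PRICE_PER_M = 0.5;
const OUTPUT_PRICE_PER_M = 3.0;

// Estimate the cost of one request from its token counts.
function estimateCostUSD(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * INPUT_PRICE_PER_M;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M;
  return inputCost + outputCost;
}

// Hypothetical workload: a 200K-token prompt with a 4K-token answer.
console.log(estimateCostUSD(200_000, 4_000).toFixed(4)); // prints 0.1120
```

Output tokens cost six times as much as input tokens at these rates, so long generated answers dominate the bill sooner than long prompts do.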
Benchmarks
GPQA
90.4%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Gemini 3 Flash scored 90.4% on this benchmark.
HLE
43.5%
HLE: Humanity's Last Exam. A benchmark of expert-written questions spanning many specialized academic domains. It evaluates deep understanding of complex topics that require professional-level knowledge. Gemini 3 Flash scored 43.5% on this benchmark.
MMLU
91.8%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Gemini 3 Flash scored 91.8% on this benchmark.
MMLU Pro
72.5%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Gemini 3 Flash scored 72.5% on this benchmark.
SimpleQA
68.7%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. Gemini 3 Flash scored 68.7% on this benchmark.
IFEval
88.2%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Gemini 3 Flash scored 88.2% on this benchmark.
AIME 2025
99.7%
AIME 2025: American Invitational Mathematics Examination. Competition-level mathematics problems from the AIME, an exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Gemini 3 Flash scored 99.7% on this benchmark.
MATH
58%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Gemini 3 Flash scored 58% on this benchmark.
GSM8k
94%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Gemini 3 Flash scored 94% on this benchmark.
MGSM
92.4%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Gemini 3 Flash scored 92.4% on this benchmark.
MathVista
65.4%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Gemini 3 Flash scored 65.4% on this benchmark.
SWE-Bench
78%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Gemini 3 Flash scored 78% on this benchmark.
HumanEval
84.1%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Gemini 3 Flash scored 84.1% on this benchmark.
LiveCodeBench
77.2%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Gemini 3 Flash scored 77.2% on this benchmark.
MMMU
81.2%
MMMU: Massive Multi-discipline Multimodal Understanding. Tests vision-language models on college-level problems across 30 subjects that require both image understanding and expert knowledge. Gemini 3 Flash scored 81.2% on this benchmark.
MMMU Pro
81.2%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Gemini 3 Flash scored 81.2% on this benchmark.
ChartQA
86.5%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Gemini 3 Flash scored 86.5% on this benchmark.
DocVQA
93.1%
DocVQA: Document Visual Question Answering. Tests the ability to extract and reason about information from document images, including forms, reports, and scanned text. Gemini 3 Flash scored 93.1% on this benchmark.
Terminal-Bench
47.6%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Gemini 3 Flash scored 47.6% on this benchmark.
ARC-AGI
33.6%
ARC-AGI: Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Gemini 3 Flash scored 33.6% on this benchmark.

About Gemini 3 Flash

Learn about Gemini 3 Flash's capabilities, features, and how it can help you achieve better results.

The Performance Powerhouse of Gemini 3

Gemini 3 Flash is Google's frontier-class multimodal model optimized for extreme speed and massive scalability. Developed by Google DeepMind, it serves as the efficiency-first workhorse of the Gemini 3 ecosystem, delivering high-quality reasoning and native multimodal processing across text, code, images, and audio. It is specifically designed for high-volume enterprise workloads where low latency and cost-effectiveness are paramount.

Unprecedented Context and Agency

The model features a 1-million-token context window, allowing it to process entire code repositories, hours of video, or thousands of pages of documentation in a single prompt. It is also engineered for agency: paired with browser automation tools such as Stagehand and Nano Browser, it can autonomously navigate the web, execute multi-step digital tasks, and interact with live web elements as a human would.

Elite Scientific Reasoning

While optimized for speed, Gemini 3 Flash does not sacrifice intelligence. Through the specialized Deep Think activation protocol, the model can trigger internal chain-of-thought processes to solve PhD-level problems in math, science, and logic. This dual nature allows it to switch between rapid data extraction and sophisticated, expert-level analysis with simple system instructions.

Use Cases for Gemini 3 Flash

Discover the different ways you can use Gemini 3 Flash to achieve great results.

Autonomous Browser Automation

Executing multi-step web tasks like lead generation and complex data scraping via Stagehand and Nano Browser APIs.

High-Volume Data Extraction

Processing massive datasets or long-form documents using the 1M token context window for seamless information synthesis.

Real-Time Voice Interaction

Powering responsive, low-latency AI assistants with native audio-to-audio capabilities and low speech-to-text latency.

Rapid Prototyping and Coding

Generating and testing boilerplate code and UI components in developer environments using the integrated Canvas mode.

Search and Information Synthesis

Enhancing AI Overviews with quick, multimodal reasoning across diverse text, image, and video sources.

Agentic Workflow Orchestration

Serving as a lightweight executor for complex, multi-agent digital task forces requiring rapid tool-calling.
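
The tool-calling use cases above share one piece of plumbing: the application receives the calls the model requests and routes each one to local code. A minimal sketch of that routing step follows. It assumes each call arrives as an object with `name` and `args` fields, and the two tools in the registry are hypothetical stand-ins, not part of any SDK.

```javascript
// Hypothetical local tools, keyed by the name the model would request.
const tools = {
  getPageTitle: ({ url }) => `Title of ${url}`,
  add: ({ a, b }) => a + b,
};

// Run each requested call and collect the results. Unknown tools produce an
// error entry instead of throwing, so one bad call does not abort the batch.
function dispatchToolCalls(calls, registry) {
  return calls.map(({ name, args }) => {
    const known = Object.prototype.hasOwnProperty.call(registry, name);
    if (!known || typeof registry[name] !== "function") {
      return { name, error: `unknown tool: ${name}` };
    }
    return { name, result: registry[name](args ?? {}) };
  });
}

// Example batch, shaped the way this sketch assumes model output would be.
console.log(
  dispatchToolCalls(
    [
      { name: "add", args: { a: 2, b: 3 } },
      { name: "deleteEverything", args: {} },
    ],
    tools
  )
);
```

Checking for own properties keeps a requested name such as `toString` from resolving to an inherited method on the registry object.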

Strengths

Extreme Efficiency: Runs 3x faster than Gemini 2.5 Pro while offering significant cost reductions for high-volume enterprise tasks.
Massive Context Capacity: The 1-million-token window enables the processing of entire repositories or lengthy video transcripts in a single prompt.
Elite Reasoning Performance: Achieves a PhD-level 90.4% on GPQA Diamond, indicating high scientific accuracy when using the Deep Think protocol.
Agentic Mastery: Superior ability to perform autonomous browser actions and tool-calling via deep integration with the Stagehand framework.

Limitations

Hyper-Conciseness by Default: Defaults to extremely brief responses which may require significant prompt engineering or XML tags for creative tasks.
Context Drift Susceptibility: Vulnerable to the "lost in the middle" syndrome in long prompts if specific contextual anchoring techniques are not applied.
Safety Evaluation Gaps: Demonstrated a 97.3% jailbreak success rate during red-team evaluation of early versions, posing potential security risks.
Sub-human Execution Depth: While strong at planning, it can still struggle with execution in complex, dynamic, non-verifiable digital environments.

API Quick Start

google/gemini-3-flash

Google SDK (JavaScript)
import { GoogleGenAI } from "@google/genai";

// The client takes an options object; the API key is read from the environment.
const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY });

async function run() {
  const prompt = "Analyze the core logic in this codebase for efficiency.";
  // Model ID as listed on this page; confirm the exact ID in the official docs.
  const response = await ai.models.generateContent({
    model: "gemini-3-flash",
    contents: prompt,
  });
  // In @google/genai, `text` is a property on the response, not a method.
  console.log(response.text);
}

run();

Install the SDK and start making API calls in minutes.
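
Streaming is listed among the model's capabilities, and for long answers it lets an application show text as it arrives. The helper below is a sketch rather than official usage. It assumes `ai` is a `GoogleGenAI` client from `@google/genai`, whose `models.generateContentStream` call yields chunks with a `text` property. The helper name `streamText` and the `onChunk` callback are hypothetical.

```javascript
// `ai` is expected to be a GoogleGenAI client created elsewhere, e.g.:
//   const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY });
async function streamText(ai, model, prompt, onChunk) {
  const stream = await ai.models.generateContentStream({
    model,
    contents: prompt,
  });

  let full = "";
  for await (const chunk of stream) {
    // Some chunks may carry no text (for example, metadata-only chunks).
    const piece = chunk.text ?? "";
    full += piece;
    onChunk(piece);
  }
  return full; // the concatenated response text
}

// Usage (requires a real client and API key):
//   await streamText(ai, "gemini-3-flash", "Summarize this log.", (t) =>
//     process.stdout.write(t)
//   );
```

Passing the client in as an argument keeps the helper easy to exercise with a stub in tests, without network access.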

What People Are Saying About Gemini 3 Flash

See what the community thinks about Gemini 3 Flash

"The Pareto frontier of intelligence-per-dollar has effectively verticalized with Flash" (OrdinaryLavishness11, Reddit)
"Gemini 3 Flash CLI turns your terminal into a full AI studio" (JamMasterJulian, Reddit)
"It isn't just cheap; it is elite, scoring 90.4% on GPQA Diamond" (OrdinaryLavishness11, Reddit)
"We are effectively automating the automation of science" (alexwg, X/Twitter)
"Do not confuse the muzzle for the mind when interacting with Gemini 3" (uberzak, Reddit)
"The web automation capabilities through Stagehand are a game changer" (AIBuilder99, Hacker News)

Videos About Gemini 3 Flash

Watch tutorials, reviews, and discussions about Gemini 3 Flash

This isn't a plugin. It's the next generation of the web itself — a browser that reads, clicks, types, scrolls, and builds entirely on its own.

Stagehand translates it into visual coordinates and simulates the click.

It handles CAPTCHAs and dynamic loading better than any previous agent I've tested.

The latency between the command and the first click is under 800 milliseconds.

This turns every website into a structured API for your agents.

Google just brought Gemini’s brain straight into your terminal.

It’s like having an AI lab — inside your terminal.

You can pip or npm install this right now and start piping logs directly to the model.

The Flash model is perfect for this because it won't break your bank even with 50,000 line logs.

It's actually capable of writing and executing its own bash scripts safely.

Gemini 3 Flash demonstrates that speed and scale don't have to come at the cost of intelligence.

I built a full content automation tool using Gemini 3 Flash... Before: took 3 hours. After: under 2 minutes.

The GPQA scores for a 'Flash' model are honestly terrifying for the competition.

Its ability to maintain coherence over 1 million tokens is its secret weapon.

If you are building high-volume SaaS apps, this is the default choice now.

Pro Tips

Expert tips to help you get the most out of this model and achieve better results.

Deep Think Protocol

Use the system instruction <deep_think_activation: true> when the model needs to solve complex PhD-level problems to trigger its extended reasoning phase.
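
As a sketch of how this tip could be wired into a request: the tag text is copied verbatim from the tip above and is not confirmed by official documentation, so treat its effect as unverified. The request shape (`model`, `contents`, `config.systemInstruction`) follows what `ai.models.generateContent` in `@google/genai` accepts. The helper name `buildRequest` is hypothetical.

```javascript
// Tag text copied from the tip above; not confirmed by official documentation.
const DEEP_THINK_TAG = "<deep_think_activation: true>";

// Build a request object, adding the system instruction only when requested.
function buildRequest(prompt, { deepThink = false, model = "gemini-3-flash" } = {}) {
  const request = { model, contents: prompt };
  if (deepThink) {
    request.config = { systemInstruction: DEEP_THINK_TAG };
  }
  return request;
}

// Quick lookups stay plain; hard problems opt in to the extended phase.
console.log(buildRequest("What is the capital of France?"));
console.log(buildRequest("Derive the partition function.", { deepThink: true }));
```

Keeping the switch in one helper makes it easy to compare answers with and without the instruction before relying on it.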

XML Output Specification

To counteract the model's default hyper-conciseness, wrap your length and style requirements in explicit <output_verbosity> XML tags.
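
One way to apply this tip consistently is a small prompt wrapper. In the sketch below, only the `<output_verbosity>` tag name comes from the tip. The inner `<length>` and `<style>` tags and the helper name `withVerbosity` are hypothetical choices for illustration.

```javascript
// Prepend length and style requirements, wrapped in <output_verbosity> tags.
function withVerbosity(prompt, { length, style }) {
  return [
    "<output_verbosity>",
    `  <length>${length}</length>`, // hypothetical inner tag
    `  <style>${style}</style>`, // hypothetical inner tag
    "</output_verbosity>",
    "",
    prompt,
  ].join("\n");
}

console.log(
  withVerbosity("Write a product description for a trail running shoe.", {
    length: "about 300 words",
    style: "warm, descriptive, second person",
  })
);
```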

Contextual Anchoring

When utilizing the full 1M token context, reference specific anchor points or file names in the prompt to prevent information drift.
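
A simple way to create anchor points is to label every file explicitly when assembling a long prompt, then name those labels again next to the question. The sketch below shows one possible layout. The delimiter format and the helper name `buildAnchoredPrompt` are hypothetical, not a documented convention.

```javascript
// Wrap each file in named delimiters, then repeat the names beside the question.
function buildAnchoredPrompt(files, question) {
  const sections = files.map(
    ({ name, text }) =>
      `=== FILE: ${name} ===\n${text}\n=== END FILE: ${name} ===`
  );
  const anchors = files.map(({ name }) => name).join(", ");
  return (
    `${sections.join("\n\n")}\n\n` +
    `Question (refer to these files by name: ${anchors}):\n${question}`
  );
}

console.log(
  buildAnchoredPrompt(
    [
      { name: "auth.js", text: "export function login() { /* ... */ }" },
      { name: "db.js", text: "export function query() { /* ... */ }" },
    ],
    "In auth.js, which function calls into db.js?"
  )
);
```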

Terminal Integration

Utilize the Gemini 3 Flash CLI to automate local file processing and shell scripts directly from your terminal environment.

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan
Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim
CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington
CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen
Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park
Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez
Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.

Related AI Models

Gemini 3 Pro (Google)
Google's Gemini 3 Pro is a multimodal powerhouse featuring a 1M token context window, native video processing, and industry-leading reasoning performance.
1M context
$2.00/$12.00/1M

GPT-5.1 (OpenAI)
GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...
400K context
$1.25/$10.00/1M

Kimi K2 Thinking (Moonshot)
Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...
256K context
$0.15/1M

GPT-5.2 (OpenAI)
GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.
400K context
$1.75/$14.00/1M

GPT-5.2 Pro (OpenAI)
GPT-5.2 Pro is OpenAI's 2025 flagship reasoning model featuring Extended Thinking for SOTA performance in mathematics, coding, and expert knowledge work.
400K context
$21.00/$168.00/1M

Grok-4 (xAI)
Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.
2M context
$3.00/$15.00/1M

Claude Opus 4.5 (Anthropic)
Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.
200K context
$5.00/$25.00/1M

GLM-4.7 (Zhipu)
GLM-4.7 by Zhipu AI is a flagship 358B MoE model featuring a 200K context window, elite 73.8% SWE-bench performance, and native Deep Thinking for agentic...
200K context
$0.60/$2.20/1M

Frequently Asked Questions

Find answers to common questions about this model