Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. It features a 1M-token context window, native multimodality, and an output speed of 363 tokens/sec for high-volume workloads.

Tags: Multimodal, High Speed, Cost Efficient, Google Gemini
Google | Gemini 3.1 | March 3, 2026
Context: 1.0M tokens
Max Output: 66K tokens
Input Price: $0.25 / 1M tokens
Output Price: $1.50 / 1M tokens
Modality: Text, Image, Audio, Video
Capabilities: Vision, Tools, Streaming, Reasoning
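
The listed prices translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the two prices come from the listing above, while the helper name estimateCostUsd is hypothetical and the calculation ignores any discounts, caching, or surcharges the provider may apply.

```typescript
// Prices taken from the listing above (USD per 1M tokens).
const INPUT_PRICE_PER_MILLION = 0.25;
const OUTPUT_PRICE_PER_MILLION = 1.5;
const TOKENS_PER_MILLION = 1_000_000;

// Hypothetical helper: estimates the cost of a single request in USD.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  const inputCost = (inputTokens / TOKENS_PER_MILLION) * INPUT_PRICE_PER_MILLION;
  const outputCost = (outputTokens / TOKENS_PER_MILLION) * OUTPUT_PRICE_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a prompt that fills the 1M-token window, with a 2,000-token answer.
// 0.25 for input plus 0.003 for output.
console.log(estimateCostUsd(1_000_000, 2_000).toFixed(4));
```

Because output tokens cost six times as much as input tokens, long generations dominate the bill faster than long prompts do.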
Benchmarks
GPQA
86.9%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Gemini 3.1 Flash-Lite scored 86.9% on this benchmark.
HLE
25%
HLE: Humanity's Last Exam. Tests a model's ability to demonstrate expert-level reasoning across specialized domains. Evaluates deep understanding of complex topics that require professional-level knowledge. Gemini 3.1 Flash-Lite scored 25% on this benchmark.
MMLU
89.2%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Gemini 3.1 Flash-Lite scored 89.2% on this benchmark.
MMLU Pro
83%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Gemini 3.1 Flash-Lite scored 83% on this benchmark.
SimpleQA
43.3%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. Gemini 3.1 Flash-Lite scored 43.3% on this benchmark.
IFEval
90.1%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Gemini 3.1 Flash-Lite scored 90.1% on this benchmark.
AIME 2025
75%
AIME 2025: American Invitational Mathematics Examination. Competition-level mathematics problems from the 2025 AIME, a prestigious exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Gemini 3.1 Flash-Lite scored 75% on this benchmark.
MATH
88.5%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Gemini 3.1 Flash-Lite scored 88.5% on this benchmark.
GSM8k
96.2%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Gemini 3.1 Flash-Lite scored 96.2% on this benchmark.
MGSM
92%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Gemini 3.1 Flash-Lite scored 92% on this benchmark.
MathVista
64.1%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Gemini 3.1 Flash-Lite scored 64.1% on this benchmark.
SWE-Bench
52%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Gemini 3.1 Flash-Lite scored 52% on this benchmark.
HumanEval
90.5%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Gemini 3.1 Flash-Lite scored 90.5% on this benchmark.
LiveCodeBench
72%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Gemini 3.1 Flash-Lite scored 72% on this benchmark.
MMMU
76.8%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. Gemini 3.1 Flash-Lite scored 76.8% on this benchmark.
MMMU Pro
76.8%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Gemini 3.1 Flash-Lite scored 76.8% on this benchmark.
ChartQA
85.5%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Gemini 3.1 Flash-Lite scored 85.5% on this benchmark.
DocVQA
92.2%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. Gemini 3.1 Flash-Lite scored 92.2% on this benchmark.
Terminal-Bench
51%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Gemini 3.1 Flash-Lite scored 51% on this benchmark.
ARC-AGI
8.5%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Gemini 3.1 Flash-Lite scored 8.5% on this benchmark.

About Gemini 3.1 Flash-Lite

Learn about Gemini 3.1 Flash-Lite's capabilities, features, and how it can help you achieve better results.

Gemini 3.1 Flash-Lite is engineered for high-volume AI applications where processing speed is the primary technical requirement. Unlike larger Pro models, Flash-Lite uses a streamlined architecture that prioritizes throughput, reaching 363 tokens per second. It serves as a specialized tool for developers building real-time voice agents, automated content moderation systems, and large-scale data extraction pipelines that must remain cost-effective under heavy traffic.

Despite its lite designation, the model maintains a 1 million token context window. It can ingest raw audio files, hour-long videos, and hundreds of pages of PDFs in a single request. By introducing Thinking Levels, Google allows users to choose between near-instant responses for simple tasks and a deeper reasoning phase for complex logic. This provides multiple performance profiles within a single API endpoint to balance cost and accuracy.

The model is natively multimodal, which eliminates the need for external tools to transcribe audio or describe images before processing. This native capability improves performance on visual tasks like document question answering and chart analysis. Developers can use the thinking_level parameter to adjust internal reasoning time, effectively scaling the model's effort based on the specific complexity of each query.
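
Before sending a large batch of documents, it can help to check roughly whether they fit the limits quoted on this page. The following is a minimal sketch under stated assumptions: the 4-characters-per-token ratio is a crude heuristic rather than the model's real tokenizer, the output cap uses the listing's rounded "66K" figure, and the names estimateTokens and checkBudget are hypothetical.

```typescript
const CONTEXT_WINDOW_TOKENS = 1_000_000; // "1.0M tokens" from the listing
const MAX_OUTPUT_TOKENS = 66_000; // the listing's rounded "66K tokens"
const ASSUMED_CHARS_PER_TOKEN = 4; // ASSUMPTION: rough heuristic, not a tokenizer

interface BudgetCheck {
  estimatedInputTokens: number;
  fitsContext: boolean;
  outputWithinLimit: boolean;
}

// Hypothetical helper: approximates a token count from character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Hypothetical helper: checks text inputs against the listed limits.
function checkBudget(documents: string[], requestedOutputTokens: number): BudgetCheck {
  const estimatedInputTokens = documents.reduce(
    (sum, doc) => sum + estimateTokens(doc),
    0,
  );
  return {
    estimatedInputTokens,
    fitsContext: estimatedInputTokens <= CONTEXT_WINDOW_TOKENS,
    outputWithinLimit: requestedOutputTokens <= MAX_OUTPUT_TOKENS,
  };
}

// Example: two short documents and a 2,000-token summary request.
console.log(checkBudget(["First report text", "Second report text"], 2_000));
```

The estimate only covers text. Audio, video, and image inputs are tokenized differently, so a real pipeline should rely on the provider's own token counting rather than this approximation.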

Use Cases

Discover the different ways you can use Gemini 3.1 Flash-Lite to achieve great results.

High-Volume Translation

Processing thousands of multilingual chat messages or support tickets in real-time with sub-second latency.

Intelligent Model Routing

Acting as a fast classifier to determine if incoming queries need to be escalated to more expensive models.

Multimodal Content Moderation

Scanning large batches of user-generated images and videos for safety compliance at low cost.

Real-Time UI Prototyping

Generating functional React or Tailwind components from hand-drawn wireframes or verbal descriptions.

Long-Document Summarization

Condensing massive legal archives or technical manuals without losing context across the 1M token window.

Live Audio Transcription

Converting hours of meetings or lecture recordings into structured summaries and action items in one pass.
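
The model-routing use case above can be sketched as a small decision function wrapped around a classifier call. Everything named here is hypothetical: the Classifier type stands in for whatever call asks Flash-Lite to label a query, and the 0.8 confidence threshold is an illustrative value, not a recommendation from Google.

```typescript
type Route = "flash-lite" | "escalate";

interface Classification {
  complexity: "simple" | "complex";
  confidence: number; // 0..1, as reported by the classifier
}

// Hypothetical signature: in practice this would call the fast model.
type Classifier = (query: string) => Promise<Classification>;

const MIN_CONFIDENCE = 0.8; // ASSUMPTION: illustrative threshold

// Pure decision step: keep only confident "simple" queries on the cheap model.
function decideRoute(result: Classification): Route {
  if (result.complexity === "complex") {
    return "escalate";
  }
  return result.confidence >= MIN_CONFIDENCE ? "flash-lite" : "escalate";
}

// Async wrapper: if classification fails, fall back to the stronger model.
async function routeQuery(query: string, classify: Classifier): Promise<Route> {
  try {
    return decideRoute(await classify(query));
  } catch {
    return "escalate";
  }
}

// Example with a stub classifier that treats short queries as simple.
const stubClassifier: Classifier = async (query) => ({
  complexity: query.length < 80 ? "simple" : "complex",
  confidence: 0.9,
});
routeQuery("What are your opening hours?", stubClassifier).then((route) =>
  console.log(route),
);
```

Separating the decision from the network call keeps the routing policy easy to unit test and to tune without touching the client code.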

Strengths

Blistering Performance: At 363 tokens per second, it is one of the fastest models in the industry for real-time responsiveness.
Advanced Reasoning: Achieving 86.9% on GPQA Diamond, it provides PhD-level scientific logic in a lightweight tier.
Dynamic Cost Control: The Thinking Levels parameter allows for granular control over compute spend on a per-request basis.
Unified Multimodality: Native ingestion of audio, video, and PDFs eliminates the need for complex multi-model orchestration pipelines.

Limitations

Low Factual Recall: A SimpleQA score of 43.3% indicates a high risk of hallucinations for general knowledge without grounding.
Price Increase: It is significantly more expensive than the Gemini 2.5 Flash-Lite predecessor it replaces in the lineup.
Higher Latency in High-Thinking: Using the high thinking level adds roughly 7 to 10 seconds of pre-computation before generation begins.
Safety Refusals: Internal testing shows a 21.7% drop in image-to-text safety consistency during red-teaming exercises.
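
The throughput and high-thinking figures above combine into a rough response-time estimate. This is a back-of-the-envelope sketch: it uses only the 363 tokens/sec and 7 to 10 second numbers quoted on this page, ignores network latency and time to first token, and the function name estimateLatencySeconds is hypothetical.

```typescript
const TOKENS_PER_SECOND = 363; // throughput quoted on this page
const HIGH_THINKING_DELAY_SECONDS = { min: 7, max: 10 }; // range quoted above

interface LatencyRange {
  min: number;
  max: number;
}

// Hypothetical helper: generation time plus the optional thinking delay.
function estimateLatencySeconds(outputTokens: number, highThinking: boolean): LatencyRange {
  const generationSeconds = outputTokens / TOKENS_PER_SECOND;
  if (!highThinking) {
    return { min: generationSeconds, max: generationSeconds };
  }
  return {
    min: generationSeconds + HIGH_THINKING_DELAY_SECONDS.min,
    max: generationSeconds + HIGH_THINKING_DELAY_SECONDS.max,
  };
}

// Example: a 726-token answer takes about 2 s to generate,
// or roughly 9 to 12 s with the high thinking level.
console.log(estimateLatencySeconds(726, false));
console.log(estimateLatencySeconds(726, true));
```

For short answers the thinking delay dominates total latency, which is why the high level suits complex tasks better than real-time classification.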

API Quick Start

google/gemini-3.1-flash-lite-preview

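Google SDK
// Uses the Google Gen AI SDK for JavaScript: npm install @google/genai
import { GoogleGenAI } from "@google/genai";

// The client takes an options object; the key is read from the environment.
const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });

// The model name and thinking settings are passed with each request.
const response = await ai.models.generateContent({
  model: "gemini-3.1-flash-lite-preview",
  contents: "Create a weather dashboard UI.",
  config: {
    // "high" trades extra latency for deeper reasoning.
    thinkingConfig: { thinkingLevel: "high" },
  },
});

console.log(response.text);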

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about Gemini 3.1 Flash-Lite

"The coding capability of 3.1 Flash-Lite is surprisingly good for front-end development; it coded a 360-degree viewer perfectly." (WorldofAI, YouTube)
"Gemini 3.1 Flash-Lite is the model to build always-on multimodal AI Agents. It reads, connects, and consolidates everything." (Shubham Saboo, Twitter)
"Pricing is a massive shock. A 3.75x jump on output tokens is going to sting if you're on a tight cloud budget." (Binary Verse AI, YouTube)
"It shifts the burden of complexity from your engineering team's architecture right onto Google's infrastructure." (Julian Goldie, YouTube)
"Another price drop for intelligence. High speed, low cost, high intelligence. A great model for agentic routing." (ctgtplb, Twitter)
"The 1M context is still the killer feature here. I can dump entire repo folders and it just works with sub-second TTFT." (DevFlow_26, Reddit)

Related Videos

Watch tutorials, reviews, and discussions about Gemini 3.1 Flash-Lite

It seems like they have been able to squeeze in a lot of intelligence into this model somehow.

I would use it for high throughput workloads which are very well defined.

The front-end capability of Flash-Lite is even better than most models that I have actually worked with.

It literally created a fully functional viewer in one shot.

This model is ideal for those who need speed without sacrificing all the logic.

This model is what we would call a workhorse model... specifically designed for high throughput tasks.

If you run this on minimal thinking budget, it basically works as a non-reasoning model and it's extremely fast.

It did a remarkably good job at the website that we have as an output.

The speed-to-cost ratio is the real reason why you would move your production apps here.

It handles multimodal inputs natively which is a huge advantage over competitors.

Hitting nearly 87% on GPQA Diamond with a model labeled lite disrupts our entire categorization system.

Do not use this model as a factual oracle... you have to bring the facts to it.

With 3.1 Flash-Lite, you avoid firing three other microservices... that simplicity is worth real money.

The 45 percent increase in output speed is felt immediately in the streaming response.

You are getting 1M context for pennies, which still feels like magic in production.

Pro Tips

Expert tips to help you get the most out of Gemini 3.1 Flash-Lite and achieve better results.

Set Thinking Levels

Use minimal thinking for classification to reduce costs but switch to high for complex coding tasks.

Enable Grounding

Always use Google Search grounding for tasks requiring factual recall since base factual accuracy is lower.

Upload Raw Files

Avoid pre-processing audio or video into text and instead upload raw files to leverage native multimodality.

Use System Instructions

Spell out the required JSON schema in the system_instruction parameter so that responses need fewer corrective follow-up requests and waste fewer output tokens.
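
The tips above can be combined into one request-building helper. Treat this as a sketch rather than the SDK's definitive shape: the field names (systemInstruction, thinkingConfig.thinkingLevel, and the googleSearch tool) are assumptions modelled on the Google Gen AI SDK for JavaScript, and TaskKind, pickThinkingLevel, and buildRequest are hypothetical names introduced for illustration.

```typescript
type TaskKind = "classification" | "complex-coding";
type ThinkingLevel = "minimal" | "high";

// ASSUMPTION: field names mirror the JavaScript SDK's request shape.
interface RequestSketch {
  model: string;
  contents: string;
  config: {
    systemInstruction: string;
    thinkingConfig: { thinkingLevel: ThinkingLevel };
    tools?: Array<{ googleSearch: Record<string, never> }>;
  };
}

// Tip "Set Thinking Levels": minimal for classification, high for complex coding.
function pickThinkingLevel(task: TaskKind): ThinkingLevel {
  return task === "complex-coding" ? "high" : "minimal";
}

// Hypothetical helper that assembles the request object without sending it.
function buildRequest(
  task: TaskKind,
  prompt: string,
  systemInstruction: string,
  needsFacts: boolean,
): RequestSketch {
  const request: RequestSketch = {
    model: "gemini-3.1-flash-lite-preview",
    contents: prompt,
    config: {
      systemInstruction,
      thinkingConfig: { thinkingLevel: pickThinkingLevel(task) },
    },
  };
  // Tip "Enable Grounding": attach Google Search only when facts are needed.
  if (needsFacts) {
    request.config.tools = [{ googleSearch: {} }];
  }
  return request;
}

// Example: a cheap classification request with a JSON-shaped reply.
const ticketRequest = buildRequest(
  "classification",
  "Is this ticket about billing? Ticket: 'I was charged twice.'",
  'Reply only with JSON of the form {"label": "billing" | "other"}.',
  false,
);
console.log(JSON.stringify(ticketRequest, null, 2));
```

Building the request as plain data keeps the per-task policy in one place and lets it be inspected or tested before any API call is made.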

Related AI Models

anthropic

Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.

200K context
$5.00/$25.00/1M
xai

Grok-4

xAI

Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.

2M context
$3.00/$15.00/1M
moonshot

Kimi K2.5

Moonshot

Discover Moonshot AI's Kimi K2.5, a 1T-parameter open-source agentic model featuring native multimodal capabilities, a 262K context window, and SOTA reasoning.

256K context
$0.60/$3.00/1M
zhipu

GLM-5

Zhipu (GLM)

GLM-5 is Zhipu AI's 744B parameter open-weight powerhouse, excelling in long-horizon agentic tasks, coding, and factual accuracy with a 200k context window.

200K context
$1.00/$3.20/1M
openai

GPT-5.1

OpenAI

GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...

400K context
$1.25/$10.00/1M
openai

GPT-5.2

OpenAI

GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.

400K context
$1.75/$14.00/1M
alibaba

Qwen3.5-397B-A17B

Alibaba

Qwen3.5-397B-A17B is Alibaba's flagship open-weight MoE model. It features native multimodal reasoning, a 1M context window, and a 19x decoding throughput...

1M context
$0.40/$2.40/1M
moonshot

Kimi K2 Thinking

Moonshot

Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...

256K context
$0.60/$2.50/1M

Frequently Asked Questions

Find answers to common questions about Gemini 3.1 Flash-Lite