How much does Gemini 3.5 Flash cost for developers?

The model costs $1.50 per 1 million input tokens and $9.00 per 1 million output tokens. Google offers a 90% discount for cached input tokens, making repetitive queries highly economical.
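The pricing above can be turned into a quick per-request estimate. This is a minimal sketch, assuming the 90% cache discount applies to the list input price (so cached input is billed at $0.15 per 1 million tokens). The function name `estimateCostUSD` and its parameter names are illustrative, not part of any SDK.

```javascript
// Prices quoted above, in USD per 1 million tokens.
const INPUT_PER_MILLION = 1.5;
const OUTPUT_PER_MILLION = 9.0;
// Assumption: the 90% cache discount applies to the list input price.
const CACHED_INPUT_PER_MILLION = INPUT_PER_MILLION * 0.1;

/**
 * Estimate the cost of one request in USD.
 * cachedInputTokens is the portion of inputTokens served from cache.
 */
function estimateCostUSD({ inputTokens, cachedInputTokens = 0, outputTokens }) {
  if (cachedInputTokens > inputTokens) {
    throw new RangeError("cachedInputTokens cannot exceed inputTokens");
  }
  const freshInputTokens = inputTokens - cachedInputTokens;
  return (
    (freshInputTokens * INPUT_PER_MILLION +
      cachedInputTokens * CACHED_INPUT_PER_MILLION +
      outputTokens * OUTPUT_PER_MILLION) /
    1_000_000
  );
}

// Example: a 1M-token prompt with 800K tokens cached and a 50K-token reply.
const cachedCost = estimateCostUSD({
  inputTokens: 1_000_000,
  cachedInputTokens: 800_000,
  outputTokens: 50_000,
});
// Same request with nothing cached, for comparison.
const uncachedCost = estimateCostUSD({
  inputTokens: 1_000_000,
  outputTokens: 50_000,
});
console.log(cachedCost.toFixed(2), uncachedCost.toFixed(2));
```

Under these assumptions, caching 80% of the prompt brings the example request from about $1.95 down to about $0.87.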

What is the context window for this model?

Gemini 3.5 Flash supports a context window of 1,048,576 tokens. This allows users to input roughly 700,000 words or several hours of video content in one request.
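For a rough pre-flight check, the word-to-token ratio implied above (about 700,000 words per 1,048,576 tokens, or roughly 1.5 tokens per word) can be applied to a document's word count. This is only a heuristic sketch: real token counts depend on the tokenizer and the language of the text, and the helper names here are hypothetical. A token-counting call from the SDK would be the better source for an exact figure.

```javascript
const CONTEXT_WINDOW_TOKENS = 1_048_576;
// Heuristic taken from the "roughly 700,000 words" figure above.
const WORDS_PER_WINDOW = 700_000;

/** Estimate tokens from a word count (multiplying first keeps the math exact). */
function estimateTokens(wordCount) {
  return Math.ceil((wordCount * CONTEXT_WINDOW_TOKENS) / WORDS_PER_WINDOW);
}

/** True when the estimated token count fits in a single request. */
function fitsInContext(wordCount) {
  return estimateTokens(wordCount) <= CONTEXT_WINDOW_TOKENS;
}

console.log(estimateTokens(120_000)); // 179756
console.log(fitsInContext(120_000)); // true
console.log(fitsInContext(900_000)); // false
```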

Can this model process video and audio files natively?

Yes, it supports direct input for video, audio, image, and PDF files. It analyzes these streams natively to maintain spatial and temporal context during reasoning.
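As a sketch of what direct media input can look like, the helper below builds a request body that pairs one base64-encoded file with a text question. It assumes the inline-data part shape (`inlineData` with `mimeType` and `data`) used by the `@google/genai` SDK also applies to this model. The helper name and the placeholder payload are hypothetical, and large files typically go through a separate upload step rather than inline data.

```javascript
/**
 * Build the `contents` array for a request that pairs one media file
 * with a text question. The part shape is an assumption for this model.
 */
function buildMediaContents(base64Data, mimeType, question) {
  return [
    {
      role: "user",
      parts: [
        { inlineData: { mimeType, data: base64Data } },
        { text: question },
      ],
    },
  ];
}

// Example with a tiny placeholder payload instead of a real audio file.
const placeholder = Buffer.from("not a real audio file").toString("base64");
const mediaContents = buildMediaContents(
  placeholder,
  "audio/mpeg",
  "Summarize the speaker's main points."
);
console.log(JSON.stringify(mediaContents, null, 2));

// Hypothetical call, given an @google/genai client named `ai`:
// const response = await ai.models.generateContent({
//   model: "gemini-3.5-flash",
//   contents: mediaContents,
// });
```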

How does it compare to Gemini 3.1 Pro?

Gemini 3.5 Flash outperforms the 3.1 Pro model on many agentic and coding benchmarks. It also generates output tokens roughly four times faster, without giving up reasoning quality on those tasks.

Does Gemini 3.5 Flash support function calling?

Yes, it is highly optimized for tool use and function calling. It can interact with external APIs, IDEs, and terminal environments for multi-step workflows.
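A tool-use loop usually has two local pieces: a declaration that describes each tool to the model, and a dispatcher that runs whatever calls the model asks for. The sketch below shows both. It assumes the function-declaration shape accepted by the `@google/genai` SDK (passed under `config.tools[].functionDeclarations`) and a list of `{ name, args }` calls in the response. The `get_weather` tool, its stubbed handler, and the simulated model output are all hypothetical.

```javascript
// Hypothetical tool declaration; the name and schema are illustrative only.
const getWeatherDeclaration = {
  name: "get_weather",
  description: "Return the current temperature for a city.",
  parameters: {
    type: "OBJECT",
    properties: { city: { type: "STRING", description: "City name" } },
    required: ["city"],
  },
};

// Local implementations keyed by tool name (stubbed for the example).
const handlers = {
  get_weather: ({ city }) => ({ city, temperatureC: 21 }),
};

/** Run every function call the model requested and collect the results. */
function dispatchFunctionCalls(functionCalls, handlerMap) {
  return functionCalls.map(({ name, args }) => {
    const known = Object.prototype.hasOwnProperty.call(handlerMap, name);
    if (!known) {
      return { name, response: { error: `Unknown tool: ${name}` } };
    }
    return { name, response: handlerMap[name](args ?? {}) };
  });
}

// Simulated model output; a real response would supply this list.
const simulatedCalls = [
  { name: "get_weather", args: { city: "Zurich" } },
  { name: "delete_everything", args: {} },
];
const toolResults = dispatchFunctionCalls(simulatedCalls, handlers);
console.log(JSON.stringify(toolResults));
```

Returning an error object for unknown tool names, rather than throwing, lets the results go back to the model so it can correct itself on the next turn of a multi-step workflow.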

What is the maximum output token limit?

The model can generate up to 65,536 tokens in a single response. This capacity is sufficient for producing complete applications or extensive technical reports.

Is there a reasoning or thinking mode available?

Yes, the model features a chain-of-thought thinking mode that can be toggled on. This allows developers to audit the internal planning process of the model.
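To audit the planning output, a client needs to request thought summaries and then separate them from the final answer. The sketch below assumes that the `thinkingConfig.includeThoughts` setting and the `thought` flag on response parts, which the `@google/genai` SDK exposes for earlier Gemini thinking models, carry over to this model. The helper names and the simulated parts are hypothetical.

```javascript
/** Request config asking the model to return thought summaries. */
function buildThinkingConfig(includeThoughts = true) {
  return { thinkingConfig: { includeThoughts } };
}

/** Separate thought parts from answer parts in one candidate's content. */
function splitThoughts(parts) {
  const thoughts = [];
  const answer = [];
  for (const part of parts) {
    if (typeof part.text !== "string") continue; // skip non-text parts
    (part.thought ? thoughts : answer).push(part.text);
  }
  return { thoughts: thoughts.join("\n"), answer: answer.join("\n") };
}

// Simulated response parts; a real response would supply these.
const simulatedParts = [
  { text: "Plan: compare both sorting approaches.", thought: true },
  { text: "Use merge sort for stable ordering." },
];
const audited = splitThoughts(simulatedParts);
console.log(buildThinkingConfig(), audited);
```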

Gemini 3.5 Flash

Gemini 3.5 Flash is Google's high-speed multimodal model with a 1M context window, optimized for sub-second agentic loops and complex coding tasks.

Multimodal AI | Agentic Workflows | 1M Context | High-Speed LLM

Google | Gemini | May 19, 2026

Context: 1.0M tokens
Max Output: 66K tokens
Input Price: $1.50 / 1M tokens
Output Price: $9.00 / 1M tokens
Modality: Text, Image, Audio, Video
Capabilities: Vision, Tools, Streaming, Reasoning

Benchmarks

GPQA: 74%
HLE: 34%
MMLU: 89%
MMLU Pro: 83%
SimpleQA: 76.7%
IFEval: 88%
AIME 2025: 68%
MATH: 88%
GSM8k: 97%
MGSM: 92%
MathVista: 74%
SWE-Bench: 55.1%
HumanEval: 92%
LiveCodeBench: 56%
MMMU: 84%
MMMU Pro: 88.3%
ChartQA: 89%
DocVQA: 94%
Terminal-Bench: 76.2%
ARC-AGI: 12%


About Gemini 3.5 Flash

Learn about Gemini 3.5 Flash's capabilities, features, and how it can help you achieve better results.

High-Efficiency Agentic Performance

Gemini 3.5 Flash is a multimodal model designed for speed and complex reasoning. It supports a 1-million-token context window, enabling users to process massive data sets including hour-long videos and entire code repositories in a single prompt. The architecture is optimized for sub-second latency, targeting developers building interactive AI agents and automated workflows.

Native Multimodality and Reasoning

This model introduces a Thinking mode for advanced chain-of-thought logic. It natively processes text, images, audio, video, and PDFs, which removes the need for separate preprocessing pipelines. Benchmarks indicate that it outperforms the previous Gemini 3.1 Pro in coding and tool-use tasks while maintaining the efficiency of the Flash tier.

Production-Ready Scaling

At $1.50 per million input tokens and $9.00 per million output tokens, it provides a cost-effective path for high-volume applications. It is tuned for function calling and terminal-based tasks, scoring 55.1% on SWE-Bench and 76.2% on Terminal-Bench. This makes it a strong fit for real-time coding assistants and data curation systems.

Use Cases

Discover the different ways you can use Gemini 3.5 Flash to achieve great results.

Automated Newsroom Curation

Scanning thousands of RSS feeds and social threads to score and rank stories based on specific editorial profiles in real-time.

High-Volume Document Analysis

Processing massive archives like legal case histories to extract structured summaries and actionable insights without losing context.

Real-time Music Synthesis

Generating interactive audio tools and musical interfaces using native understanding of music theory and audio waveforms.

Interactive Browser OS Generation

Creating fully functional operating system simulations and complex UI dashboards from natural language prompts.

Rapid Code Refactoring

Executing logic updates across large codebases without consuming the higher credits required by flagship models.

Agentic Terminal Automation

Performing multi-step system tasks and coding iterations using a terminal harness to orchestrate development environments.

Strengths

Massive 1M Token Context: Supports deep analysis of long-form data including full-length videos and entire software repositories.

Exceptional Synthesis Logic: Leading performance in generating complex interactive audio tools and modern browser-based operating system simulations.

Sub-Second Latency: Optimized for extreme throughput, reaching output speeds up to 1,500 tokens per second in production environments.

Agentic Performance Gains: Outperforms many larger flagship models on real-world coding tasks and terminal-based agentic benchmarks.

Limitations

Increased Pricing: Token costs have tripled compared to previous Flash preview models, moving to $1.50 input and $9 output per million tokens.

Arithmetic Inaccuracy: Occasionally struggles with basic mathematical operations, failing simple prompts that specialized reasoning models solve easily.

Context Window Degradation: Users report that retrieval reliability can diminish slightly as the context window approaches the 1-million-token limit.

3D Lighting Inconsistencies: Can produce overly dark or poorly lit environments in complex 3D simulations, requiring iterative prompting to correct.

API Quick Start

google/gemini-3.5-flash


Google SDK

import { GoogleGenAI } from "@google/genai";

// The @google/genai client takes an options object, not a bare key string.
const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY });

async function run() {
  const prompt = "Build a fully interactive 3D synthwave landscape using Three.js.";
  // The model name and generation settings are passed with each request.
  const response = await ai.models.generateContent({
    model: "gemini-3.5-flash",
    contents: prompt,
    config: { maxOutputTokens: 65536 },
  });
  console.log(response.text);
}

run().catch(console.error);

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about Gemini 3.5 Flash

“Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on real-world agentic tasks.”

— Artificial Analysis

twitter

“Gemini 3 is brilliant for UK business use. It captures nuanced politeness levels and UK-specific tax assumptions better than US-centric models.”

— Efficient_Degree9569

“This model is so like it loves music stuff. It is very, very fast and the audio synthesizer it generated had me completely sold.”

— Bjaman

youtube

“Gemini 3.5 Flash is definitely outperforming the previous Pro model on coding related things, which is huge for agentic developers.”

— DevGuru99

“Google just released Gemini 3.5 Flash. The interesting part is not just that it’s faster. Google is positioning this as the agentic king.”

— TestingCatalog

twitter

“Gemini 3.5 Flash is super strong model for its class. Beats Gemini 3.1 Pro on so many benchmarks.”

— AI_Expert

twitter


Pro Tips

Expert tips to help you get the most out of Gemini 3.5 Flash and achieve better results.

Enable Thinking Mode

Toggle the thinking setting in the API or Google AI Studio to activate advanced chain-of-thought reasoning for engineering problems.

Leverage Native Multimodality

Upload raw audio or video files directly for analysis to preserve temporal and tonal data instead of using external transcripts.

Specify Constraints Verbatim

The model follows negative constraints strictly. Use instructions like 'No explanations' for raw code output to minimize latency.

Apply The High-Low Strategy

Use Flash for high-volume tasks like UI drafting and only use Pro models for final architectural verification.
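The high-low strategy can be expressed as a small routing function that sends most work to Flash and reserves the Pro tier for a short list of review tasks. This is a sketch only: the task labels are hypothetical, and the Pro model identifier is an assumed placeholder rather than a confirmed API name.

```javascript
const FLASH_MODEL = "gemini-3.5-flash";
// Assumption: placeholder identifier for a Pro-tier model.
const PRO_MODEL = "gemini-3.1-pro";

// Hypothetical task labels that justify the more expensive model.
const HIGH_STAKES_TASKS = new Set(["architecture-review", "final-verification"]);

/** Pick a model for a task: Pro for final checks, Flash for everything else. */
function pickModel(taskType) {
  return HIGH_STAKES_TASKS.has(taskType) ? PRO_MODEL : FLASH_MODEL;
}

console.log(pickModel("ui-draft")); // gemini-3.5-flash
console.log(pickModel("architecture-review")); // gemini-3.1-pro
```

Keeping the list of escalated tasks explicit makes it easy to see, and to audit, where the higher per-token cost is being spent.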



Related AI Models

Claude 3.7 Sonnet

Anthropic

Claude 3.7 Sonnet is Anthropic's first hybrid reasoning model, delivering state-of-the-art coding capabilities, a 200k context window, and visible thinking.

200K context

$3.00/$15.00/1M

Claude Sonnet 4.5

Anthropic

Anthropic's Claude Sonnet 4.5 delivers world-leading coding (77.2% SWE-bench) and a 200K context window, optimized for the next generation of autonomous agents.

200K context

$3.00/$15.00/1M

GPT-5.3 Codex

OpenAI

GPT-5.3 Codex is OpenAI's 2026 frontier coding agent, featuring a 400K context window, 77.3% Terminal-Bench score, and superior logic for complex software...

400K context

$1.75/$14.00/1M

Kimi K2.7 Code

Moonshot

Kimi K2.7 Code is a 1T parameter MoE model from Moonshot AI. It features a 262k context window and 30% more efficient reasoning for software engineering.

262K context

$0.95/$4.00/1M

GLM-5.2

Zhipu (GLM)

GLM-5.2 is Zhipu AI's flagship open-weight model featuring a 1M context window and specialized agentic coding capabilities under an MIT license.

1M context

$1.40/$4.40/1M

Qwen3.5-Omni

Alibaba

Qwen3.5-Omni is a natively omnimodal AI by Alibaba Cloud, offering seamless audio-visual reasoning, real-time voice chat, and 256k context for low-latency apps.

256K context

$0.40/$4.80/1M

GPT-5.4

OpenAI

GPT-5.4 is OpenAI's frontier model featuring a 1.05M context window and Extreme Reasoning. It excels at autonomous UI interaction and long-form data analysis.

1M context

$2.50/$15.00/1M

Kimi K2 Thinking

Moonshot

Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...

256K context

$0.60/$2.50/1M

