
GPT-4o mini

OpenAI's most cost-efficient small model, GPT-4o mini offers multimodal intelligence and high-speed performance at a significantly lower price point.

Small Model · Cost-Efficient · Vision-Capable · Fast AI · Multimodal
OpenAI · GPT-4o family · July 18, 2024
Context: 128K tokens
Max Output: 16K tokens
Input Price: $0.15 / 1M tokens
Output Price: $0.60 / 1M tokens
Modality: Text, Image
Capabilities: Vision, Tools, Streaming
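To make the pricing above concrete, here is a small sketch of per-request cost math using the published rates; `estimateCostUSD` is a hypothetical helper, not part of any SDK.

```javascript
// Estimate the cost of a GPT-4o mini call from the per-million-token
// prices listed above. Hypothetical helper for illustration only.
const INPUT_PRICE_PER_M = 0.15;  // USD per 1M input tokens
const OUTPUT_PRICE_PER_M = 0.60; // USD per 1M output tokens

function estimateCostUSD(inputTokens, outputTokens) {
  return (
    (inputTokens / 1_000_000) * INPUT_PRICE_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M
  );
}

// e.g. a 10,000-token prompt with a 1,000-token reply:
console.log(estimateCostUSD(10_000, 1_000).toFixed(4)); // "0.0021"
```

At these rates, even a million-token workload costs well under a dollar, which is why the model is popular for high-volume classification and summarization.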
Benchmarks
GPQA
40.2%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). GPT-4o mini scored 40.2% on this benchmark.
HLE
5.3%
HLE: Humanity's Last Exam. A frontier benchmark of expert-written questions spanning dozens of specialized academic domains, designed to remain difficult even for state-of-the-art models. Tests the limits of professional-level knowledge and reasoning. GPT-4o mini scored 5.3% on this benchmark.
MMLU
82%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. GPT-4o mini scored 82% on this benchmark.
MMLU Pro
60%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. GPT-4o mini scored 60% on this benchmark.
SimpleQA
8.6%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. GPT-4o mini scored 8.6% on this benchmark.
IFEval
84%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. GPT-4o mini scored 84% on this benchmark.
MATH
70.2%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. GPT-4o mini scored 70.2% on this benchmark.
GSM8k
91%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. GPT-4o mini scored 91% on this benchmark.
MGSM
87%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. GPT-4o mini scored 87% on this benchmark.
MathVista
62.5%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. GPT-4o mini scored 62.5% on this benchmark.
SWE-Bench
33.2%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. GPT-4o mini scored 33.2% on this benchmark.
HumanEval
87.2%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. GPT-4o mini scored 87.2% on this benchmark.
LiveCodeBench
31.4%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. GPT-4o mini scored 31.4% on this benchmark.
MMMU
59.4%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. GPT-4o mini scored 59.4% on this benchmark.
MMMU Pro
45.8%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. GPT-4o mini scored 45.8% on this benchmark.
ChartQA
85.1%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. GPT-4o mini scored 85.1% on this benchmark.
DocVQA
92.4%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. GPT-4o mini scored 92.4% on this benchmark.
Terminal-Bench
25%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. GPT-4o mini scored 25% on this benchmark.
ARC-AGI
4%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. GPT-4o mini scored 4% on this benchmark.

About GPT-4o mini

Learn about GPT-4o mini's capabilities, features, and how it can help you achieve better results.

A New Standard for Small Models

GPT-4o mini represents a significant leap in AI efficiency, designed to replace GPT-3.5 Turbo as the go-to model for developers. Built with a native multimodal architecture, it delivers GPT-4 class performance at a fraction of the cost and latency. It features a 128,000-token context window and supports outputs of up to 16,384 tokens, making it ideal for processing long-form documents and high-volume data streams.
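As a rough sketch of what that budget means in practice, the check below uses the common 4-characters-per-token heuristic to decide whether a document plausibly fits; `fitsInContext` is a hypothetical helper, and a real tokenizer (e.g. tiktoken) should be used for precise counts.

```javascript
// Rough context-budget check for GPT-4o mini: does a document fit in the
// 128,000-token window once we reserve room for the model's reply?
// The chars/4 ratio is a heuristic, not an exact token count.
const CONTEXT_WINDOW = 128_000;
const MAX_OUTPUT = 16_384;

function fitsInContext(text, reservedOutput = MAX_OUTPUT) {
  const approxTokens = Math.ceil(text.length / 4);
  return approxTokens + reservedOutput <= CONTEXT_WINDOW;
}

console.log(fitsInContext("a".repeat(400_000))); // ~100K tokens + 16,384 reserved → true
console.log(fitsInContext("a".repeat(600_000))); // ~150K tokens → false
```

Documents that fail the check can be chunked and summarized in passes, a common pattern for long-form processing with this model.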

Intelligence Meets Affordability

Unlike previous small models that sacrificed intelligence for speed, GPT-4o mini maintains high reasoning capabilities across text and vision tasks. It is 60% cheaper than GPT-3.5 Turbo and significantly more capable, scoring 82% on the MMLU benchmark. This model is specifically optimized for applications where low latency and high reliability are paramount, such as real-time customer assistants and large-scale data classification engines.

Use Cases

Discover the different ways you can use GPT-4o mini to achieve great results.

Customer Support Automation

Handling high volumes of customer inquiries with low latency and high accuracy at a fraction of the cost.

Content Summarization

Processing large documents or long-form content into concise summaries within the 128k context window.

Data Extraction

Converting unstructured text or images into structured data formats like JSON for database ingestion.

Multilingual Translation

Providing real-time translation across dozens of languages for chat applications and global communication.

Educational Tutoring

Serving as an interactive study assistant for students needing help with math, science, and language arts.

Basic Vision Tasks

Analyzing images to identify objects, extract text via OCR, or provide descriptions for accessibility.

Strengths

Incredible Price to Performance: At $0.15 per million input tokens, it offers frontier-level reasoning with an 82% MMLU score.
High Throughput Speed: The model delivers responses with extremely low latency, making it ideal for real-time user interfaces.
Large Context Window: Maintains a full 128K context window, allowing for complex document processing rarely found in small models.
Native Vision Support: Includes multimodal capabilities in a small form factor, excelling at image analysis and OCR tasks.

Limitations

Complex Reasoning Gaps: Trails larger models like GPT-4o or o1 in expert-level science, scoring 40.2% on GPQA.
Coding Limitations: Lacks the deep architectural understanding needed for complex software engineering compared to Claude 3.5 Sonnet.
Reduced Output Window: The 16K output limit can be restrictive for tasks requiring massive code migrations or book-length generation.
Hallucination Risk: Smaller models remain more prone to hallucinations in niche domains than their flagship counterparts.

API Quick Start

openai/gpt-4o-mini

OpenAI SDK (Node.js)
import OpenAI from "openai";

// Reads the OPENAI_API_KEY environment variable by default.
const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Explain quantum physics." }],
  });

  // The reply text lives on the first choice's message.
  console.log(completion.choices[0].message.content);
}

main();

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about GPT-4o mini

GPT-4o mini has basically killed the market for fine-tuning older models for basic RAG. The costs are too low to ignore.
AI_Dev_Central
reddit
The speed is just insane. I'm getting tokens back almost instantly for my translation agent.
TechCruncher
twitter
OpenAI really forced the hands of Anthropic and Google with this pricing. $0.15 for 1M tokens is a new floor.
hn_reader_99
hackernews
I swapped out 3.5 for mini and the logic improvement was visible in the first five minutes of testing.
PromptEngineerPro
youtube
It is finally cheap enough to use LLMs for basic data cleaning at scale without a massive cloud bill.
DataVizWiz
reddit
The vision performance for OCR is actually better than some specialized models that cost 10x more.
VisionDev
twitter

Related Videos

Watch tutorials, reviews, and discussions about GPT-4o mini

It is faster and cheaper than GPT-3.5 Turbo across the board.

The vision capabilities for a model this small are genuinely surprising.

Pricing is basically a race to zero now with this release.

It manages to keep the context window massive while being tiny.

Benchmarks show it beating Claude Haiku in almost every category.

GPT-4o mini is a lightweight model, so it's much faster than GPT-4o.

It's way faster than GPT-4.

For daily tasks, most users won't even notice the reasoning difference.

The image recognition is very consistent for basic objects.

It handles complex instructions much better than the old 3.5 model.

It currently outperforms GPT-4 on chat preferences in the LMSYS leaderboard.

Everything looks perfect, and this particular receipt reads like a typical receipt.

The response time is practically sub-second for short prompts.

It is very effective at summarizing long PDFs through the API.

You can run millions of tokens for just a few dollars.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips

Expert tips to help you get the most out of GPT-4o mini and achieve better results.

Use for RAG

Utilize the low input cost to perform extensive Retrieval Augmented Generation without high expenses.

Structure with JSON Mode

Use the JSON mode or function calling parameters to ensure consistent data structures for backend workflows.
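As a minimal sketch of this tip, assuming the standard Chat Completions request shape, a JSON-mode request body could look like the object below. It is built locally here and would be passed to `openai.chat.completions.create(...)`; the `extractionRequest` name and prompts are illustrative. Note that JSON mode requires the prompt itself to mention JSON.

```javascript
// Illustrative Chat Completions request body using JSON mode.
// response_format json_object forces the model to emit valid JSON;
// the system prompt must explicitly ask for JSON output.
const extractionRequest = {
  model: "gpt-4o-mini",
  response_format: { type: "json_object" },
  messages: [
    {
      role: "system",
      content: "Extract name and email from the user's text. Respond in JSON.",
    },
    { role: "user", content: "Reach me at jane@example.com. Regards, Jane Doe" },
  ],
};

console.log(extractionRequest.response_format.type); // "json_object"
```

For stricter guarantees on field names and types, function calling (tools) with a defined parameter schema is usually the more robust choice.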

Batch Processing

Employ OpenAI's Batch API with this model to reduce costs by 50% for non-urgent tasks.
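A sketch of what the Batch API input looks like: each line of the uploaded JSONL file is one self-contained request with a `custom_id` for matching results back. The prompts and IDs below are illustrative.

```javascript
// Build the JSONL input for OpenAI's Batch API: one JSON object per line,
// each targeting the /v1/chat/completions endpoint.
const prompts = ["Summarize doc A", "Summarize doc B"];

const batchLines = prompts.map((content, i) =>
  JSON.stringify({
    custom_id: `task-${i}`,
    method: "POST",
    url: "/v1/chat/completions",
    body: {
      model: "gpt-4o-mini",
      messages: [{ role: "user", content }],
    },
  })
);

// Write this string to e.g. batch_input.jsonl, upload it with the Files
// API (purpose "batch"), then create the batch job against that file.
const jsonl = batchLines.join("\n");
console.log(batchLines.length); // 2
```

Results arrive asynchronously (within 24 hours) at half the synchronous price, which is why batching suits non-urgent workloads like bulk summarization.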

Temperature Tuning

Set a lower temperature between 0.1 and 0.3 for factual extraction tasks to maximize accuracy.

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of our most used RPA tools, both internally and externally. It saves us countless hours of work, and we realized it could do the same for other startups, so we chose Automatio for most of our automation needs.

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, and Automatio is the jack of all trades! It can be your scraping bot in the morning, become your VA by noon, and run your automations in the evening. It's amazing!

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use for extracting data from any website. It allowed me to replace a developer and do tasks myself, since they take only a few minutes to set up and forget about. Automatio is a game changer!

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.


Related AI Models

Qwen3-Coder-Next

Alibaba

Qwen3-Coder-Next is Alibaba Cloud's elite Apache 2.0 coding model, featuring an 80B MoE architecture and 256k context window for advanced local development.

262K context
$0.12/$0.75/1M
GLM-4.7

Zhipu (GLM)

GLM-4.7 by Zhipu AI is a flagship 358B MoE model featuring a 200K context window, elite 73.8% SWE-bench performance, and native Deep Thinking for agentic...

200K context
$0.60/$2.20/1M
MiniMax M2.5

MiniMax

MiniMax M2.5 is a SOTA MoE model featuring a 1M context window and elite agentic coding capabilities at disruptive pricing for autonomous agents.

1M context
$0.15/$1.20/1M
Gemini 3.1 Flash Live Preview

Google

Gemini 3.1 Flash Live Preview is Google's ultra-low-latency, audio-to-audio model featuring a 131K context window, high-fidelity multimodal reasoning, and...

131K context
$0.75/$4.50/1M
GPT-5.4

OpenAI

GPT-5.4 is OpenAI's frontier model featuring a 1.05M context window and Extreme Reasoning. It excels at autonomous UI interaction and long-form data analysis.

1M context
$2.50/$15.00/1M
Gemini 3.1 Flash-Lite

Google

Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. Features 1M context, native multimodality, and 363 tokens/sec speed for scale.

1M context
$0.25/$1.50/1M
GPT-5.3 Instant

OpenAI

Explore GPT-5.3 Instant, OpenAI's "Anti-Cringe" model. Features a 128K context window, 26.8% fewer hallucinations, and a natural, helpful tone for everyday...

128K context
$1.75/$14.00/1M
Gemini 3.1 Pro

Google

Gemini 3.1 Pro is Google's elite multimodal model featuring the DeepThink reasoning engine, a 1M+ context window, and industry-leading ARC-AGI logic scores.

1M context
$2.00/$12.00/1M

Frequently Asked Questions

Find answers to common questions about GPT-4o mini