ChatGPT API Pricing Explained: Models, Tokens, and Costs in 2025

A complete guide to how OpenAI API pricing works, which model to choose, and how to calculate your monthly bill before you start building. Updated March 2025.

How OpenAI API Pricing Works

OpenAI charges per token, where roughly 4 characters equals one token (about 750 words per 1,000 tokens). Pricing is listed per million tokens and splits between input tokens (what you send to the model) and output tokens (what the model generates). Output tokens consistently cost more than input tokens, often 4x more, because generating text is computationally more expensive than processing it.
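The 4-characters-per-token rule of thumb is enough for budgeting. A minimal sketch (for exact counts you would use a real tokeniser such as OpenAI's tiktoken library; the function name here is ours):

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count using the ~4 characters per token heuristic.

    This is a budgeting estimate only; real tokenisers split on
    subword units and will differ slightly.
    """
    return max(1, round(len(text) / 4))


# A 4,000-character document is roughly 1,000 tokens.
print(estimate_tokens("a" * 4000))  # 1000
```

At $0.60 per million output tokens (GPT-4o mini), an estimate like this is accurate enough to size a monthly budget before writing any API code.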

There is no monthly minimum. You pay only for what you use, and billing is calculated at the end of each calendar month. New accounts receive a small free credit to get started.

Current Model Pricing (March 2025)

Model          Input      Output     Context
GPT-4o         $2.50/M    $10.00/M   128K
GPT-4o mini    $0.15/M    $0.60/M    128K
o1             $15.00/M   $60.00/M   200K
o1-mini        $3.00/M    $12.00/M   128K
o3-mini        $1.10/M    $4.40/M    200K
GPT-3.5 Turbo  $0.50/M    $1.50/M    16K

o1 is OpenAI's reasoning model. It is dramatically more expensive because it spends time internally thinking through problems before responding. The thinking tokens are invisible to you but still billed as output. For tasks that genuinely require multi-step reasoning (complex maths, logic puzzles, code debugging), o1 can justify its cost. For most tasks, GPT-4o or GPT-4o mini is more cost-effective.

Batch API: 50% Discount for Non-Real-Time Work

OpenAI's Batch API lets you submit large volumes of requests that are processed within 24 hours. The tradeoff: you cannot use batch for real-time applications. In return, you get a 50% discount on both input and output tokens across supported models.
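Batch requests are submitted as a JSONL file where each line is one self-contained request with a custom_id you use to match results back to your records. A sketch of building that file (the helper name and product list are ours; the line format follows OpenAI's documented batch input shape):

```python
import json


def build_batch_line(custom_id: str, model: str, prompt: str) -> str:
    """Serialise one request as a JSONL line for the Batch API.

    custom_id ties the eventual result back to your own record,
    since batch results are not returned in submission order.
    """
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    return json.dumps(request)


# Example: one request per product description (illustrative data).
products = ["red running shoes", "steel water bottle"]
lines = [
    build_batch_line(f"product-{i}", "gpt-4o-mini",
                     f"Write a product description for: {p}")
    for i, p in enumerate(products)
]
# You would then write "\n".join(lines) to a .jsonl file, upload it,
# and create a batch job with a 24-hour completion window.
```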

This makes the batch API highly attractive for common production tasks:

  • Bulk content generation (product descriptions, summaries, metadata)
  • Classification and tagging pipelines running on historical data
  • Embedding generation for document corpora
  • Overnight analysis jobs that do not need immediate results

With batch pricing, GPT-4o drops to $1.25/M input and $5.00/M output. GPT-4o mini drops to $0.075/M input and $0.30/M output. At high volumes, switching from real-time to batch can halve your monthly bill without any change to output quality.
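The 50% discount applies uniformly, so comparing real-time and batch costs is a one-line calculation. A sketch (function name is ours; rates match the table above):

```python
BATCH_DISCOUNT = 0.5  # Batch API: 50% off both input and output tokens


def monthly_cost(million_in: float, million_out: float,
                 input_rate: float, output_rate: float,
                 batch: bool = False) -> float:
    """Monthly cost in dollars for a token volume given in millions."""
    cost = million_in * input_rate + million_out * output_rate
    return cost * BATCH_DISCOUNT if batch else cost


# GPT-4o mini at 150M input + 150M output tokens per month
# (the 10,000 requests/day scenario below, over 30 days):
realtime = monthly_cost(150, 150, 0.15, 0.60)        # $112.50
batched = monthly_cost(150, 150, 0.15, 0.60, True)   # $56.25
```

If even a portion of your traffic tolerates a 24-hour turnaround, routing it through batch is the cheapest optimisation available: no prompt changes, no quality trade-off.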

Which Model Should You Use?

The single most impactful cost decision you will make is model selection. Here is a practical guide:

GPT-4o mini: 95% of use cases

At $0.15/M input and $0.60/M output, GPT-4o mini is the default choice for most production applications. It handles writing, summarisation, extraction, classification, and code generation capably. It is 16x cheaper than GPT-4o for input tokens. Start here and only upgrade if you identify tasks where it consistently fails.

GPT-4o: Complex or high-stakes tasks

Use GPT-4o when accuracy matters enough to justify the cost premium. Good for legal drafting, nuanced analysis, tasks requiring strong instruction following, and anything where errors carry real cost. At $2.50/M input, it is 16x the price of mini. Reserve it for tasks where that quality difference is measurable.

o1 / o3-mini: Reasoning tasks only

o1 and o3-mini are reasoning models that think before they answer. They excel at maths, coding challenges, multi-step logic, and scientific analysis. They are slower and significantly more expensive. o3-mini at $1.10/M input offers the best reasoning value. Only use o1 at $15/M when the task genuinely demands it.

Estimating Your Monthly API Bill

The formula is straightforward:

Monthly cost = (input tokens / 1,000,000 x input rate) + (output tokens / 1,000,000 x output rate)

To estimate token counts: a 500-word prompt is roughly 650 tokens. A 1,000-word output is roughly 1,300 tokens. If your application sends 10,000 requests per day with average 500-token prompts and 500-token outputs:

  • Daily input: 10,000 x 500 = 5 million tokens
  • Daily output: 10,000 x 500 = 5 million tokens
  • GPT-4o mini daily cost: (5 x $0.15) + (5 x $0.60) = $0.75 + $3.00 = $3.75
  • GPT-4o mini monthly cost: $3.75 x 30 = $112.50
  • GPT-4o, same scenario: $62.50/day x 30 = $1,875/month
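The worked example above can be packaged as a small estimator you can rerun with your own traffic numbers (function name and defaults are ours; rates match the pricing table):

```python
def monthly_bill(requests_per_day: int, in_tokens: int, out_tokens: int,
                 input_rate: float, output_rate: float,
                 days: int = 30) -> float:
    """Estimated monthly cost in dollars.

    in_tokens / out_tokens are average tokens per request;
    rates are dollars per million tokens.
    """
    daily_in = requests_per_day * in_tokens / 1_000_000    # millions/day
    daily_out = requests_per_day * out_tokens / 1_000_000  # millions/day
    return days * (daily_in * input_rate + daily_out * output_rate)


# 10,000 requests/day, 500-token prompts, 500-token outputs:
mini = monthly_bill(10_000, 500, 500, 0.15, 0.60)   # $112.50
full = monthly_bill(10_000, 500, 500, 2.50, 10.00)  # $1,875.00
```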

This illustrates why model selection matters so much. For 10,000 daily requests, using GPT-4o mini instead of GPT-4o saves over $1,700/month. Use our homepage calculator to run these numbers for your specific usage.

Rate Limits and Usage Tiers

OpenAI uses a tiered rate limit system based on how much you have spent. New accounts start at Tier 1 with modest limits. As you spend more, you automatically progress to higher tiers with greater requests-per-minute and tokens-per-minute allowances.

Tier    Qualification          GPT-4o RPM
Tier 1  $5 paid                500
Tier 2  $50 paid, 7 days       5,000
Tier 3  $100 paid, 7 days      5,000
Tier 4  $250 paid, 14 days     10,000
Tier 5  $1,000 paid, 30 days   10,000

Rate limits are enforced per model, per minute. For applications that need to scale quickly, contact OpenAI's sales team to discuss custom limits before hitting production traffic spikes.
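In the meantime, production clients should handle per-minute limits gracefully rather than failing. A minimal retry sketch with exponential backoff and jitter, a standard pattern for 429 responses (the RateLimitError class here is a stand-in, not a specific SDK exception):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the 429 error an API client raises when a
    per-minute request or token limit is exceeded (illustrative)."""


def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry a callable on rate-limit errors with exponential backoff.

    Waits base_delay * 2^attempt seconds, with random jitter so that
    many clients retrying at once do not synchronise their requests.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # exhausted retries; surface the error
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            time.sleep(delay)


# Usage: result = with_backoff(lambda: client.chat.completions.create(...))
```

Combined with request batching and model fallbacks, this keeps a Tier 1 application stable while spend accumulates towards the higher tiers.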