AI Token Counter for Markdown
Count tokens in real-time for GPT-4, Claude, Gemini, and Llama. Track context window usage and estimate API costs instantly.
Try It Now
Type or paste your text below to count tokens in real-time
Why Token Counting Matters for AI Development
When working with Large Language Models (LLMs) like GPT-4, Claude, or Gemini, understanding token usage is crucial. Each model has a context window — a maximum number of tokens it can process in a single request. Exceeding this limit means your prompt gets truncated or rejected.
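As a rough illustration, a common rule of thumb for English text is about four characters per token. The sketch below uses that heuristic to estimate a count and check it against a context window. It is an approximation only, not a real tokenizer, and the `estimate_tokens` and `fits_context` helpers are hypothetical names, not part of any model's API:

```python
# Rough token estimate using the common ~4-characters-per-token
# rule of thumb for English text. This is a heuristic, NOT a real
# tokenizer; use the model's own tokenizer for exact counts.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

def fits_context(text: str, context_window: int) -> bool:
    """Check whether the estimated token count fits a context window."""
    return estimate_tokens(text) <= context_window

prompt = "Summarize the following markdown document. " * 100
print(estimate_tokens(prompt))      # rough estimate, not an exact count
print(fits_context(prompt, 8_000))  # does it fit GPT-4's 8K window?
```

Real tokenizers produce different counts for the same text, which is why per-model counting (as in the tables below) matters.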
Real-Time Token Counting
See your token count update as you type. No need to submit or wait — instant feedback for every keystroke.
Multi-Model Support
Support for GPT-4, Claude 3, Gemini 2.0, Llama 3, and more, with counts based on each model's own tokenizer.
Context Window Visualization
See at a glance how much of each model's context window you're using with visual percentage bars.
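A usage bar like this reduces to a simple ratio of tokens used to window size. A minimal text-only sketch (the `context_usage_bar` helper is a hypothetical name for illustration):

```python
def context_usage_bar(tokens_used: int, context_window: int, width: int = 20) -> str:
    """Render a text percentage bar showing context window usage."""
    pct = min(tokens_used / context_window, 1.0)  # cap at 100%
    filled = round(pct * width)
    return f"[{'#' * filled}{'.' * (width - filled)}] {pct * 100:5.1f}%"

# A quarter of Claude's 200K window:
print(context_usage_bar(50_000, 200_000))
```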
API Cost Estimation
Estimate API costs before you send. Know exactly how much each request will cost across different providers.
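The arithmetic behind a cost estimate is token count times the provider's per-token price, usually quoted per million tokens. A sketch with illustrative figures (the prices below are examples only, not current rates; always check each provider's pricing page):

```python
# Illustrative per-million-token prices in USD. Real prices vary by
# provider and change over time -- these are example figures only.
SAMPLE_PRICES = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD for the given token counts."""
    p = SAMPLE_PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

print(f"${estimate_cost('gpt-4o', 10_000, 2_000):.4f}")
```

Note that input and output tokens are typically priced differently, so estimating output length matters as much as counting your prompt.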
Supported AI Models & Context Windows (December 2025)
OpenAI Models
| Model | Context Window | Tokenizer | Notes |
|---|---|---|---|
| GPT-4 | 8K tokens | cl100k_base | Original GPT-4 |
| GPT-4 Turbo | 128K tokens | cl100k_base | Enhanced speed |
| GPT-4o / GPT-4o-mini | 128K tokens | o200k_base | Multimodal |
| GPT-4.1 / mini / nano | 1M tokens | o200k_base | Million-token context |
| GPT-5 / GPT-5.1 | 400K tokens | o200k_base | 400K input, 128K output |
| GPT-5.2 | 400K tokens | o200k_base | Latest flagship (Dec 2025) |
| o1 / o1-preview / o1-mini | 128K - 200K tokens | o200k_base | Reasoning models |
| o1-pro | 200K tokens | o200k_base | Extended reasoning |
| o3 / o3-mini / o3-pro | 200K tokens | o200k_base | Advanced reasoning |
| o4-mini | 200K tokens | o200k_base | Next-gen reasoning |
Anthropic Claude Models
| Model | Context Window | Tokenizer | Notes |
|---|---|---|---|
| Claude 3 (Opus/Sonnet/Haiku) | 200K tokens | Claude tokenizer | Claude 3 generation |
| Claude 3.5 (Sonnet/Haiku) | 200K tokens | Claude tokenizer | Enhanced capabilities |
| Claude 3.7 Sonnet | 200K tokens | Claude tokenizer | Extended thinking |
| Claude 4 | 200K tokens | Claude tokenizer | Claude 4 base |
| Claude Opus 4.6 | 200K tokens | Claude tokenizer | Most capable (1M beta) |
| Claude Opus 4.5 | 200K tokens | Claude tokenizer | Previous flagship |
| Claude Sonnet 4.5 | 200K tokens | Claude tokenizer | Balanced (1M beta) |
| Claude Haiku 4.5 | 200K tokens | Claude tokenizer | Latest fast |
Google Gemini Models
| Model | Context Window | Tokenizer | Notes |
|---|---|---|---|
| Gemini 1.5 Pro | 2M tokens | Gemini tokenizer | Long context |
| Gemini 1.5 Flash | 1M tokens | Gemini tokenizer | Fast inference |
| Gemini 2.0 Flash | 1M tokens | Gemini tokenizer | Multimodal |
| Gemini 2.0 Flash Thinking | 1M tokens | Gemini tokenizer | Reasoning mode |
| Gemini 2.5 Pro/Flash/Flash-Lite | 1M tokens | Gemini tokenizer | Enhanced 2.5 series |
Meta Llama Models
| Model | Context Window | Tokenizer | Notes |
|---|---|---|---|
| Llama 3.1 (8B/70B/405B) | 128K tokens | Llama tokenizer | Open weights |
| Llama 3.2 (1B/3B) | 128K tokens | Llama tokenizer | Mobile/edge models |
| Llama 3.2 Vision (11B/90B) | 128K tokens | Llama tokenizer | Multimodal |
| Llama 3.3 70B | 128K tokens | Llama tokenizer | Latest 3.x |
| Llama 4 Scout | 10M tokens | Llama tokenizer | Ultra-long context |
| Llama 4 Maverick | 1M tokens | Llama tokenizer | Multimodal flagship |
Other Popular Models
| Model | Context Window | Tokenizer | Notes |
|---|---|---|---|
| Mistral Large 3 | 256K tokens | Mistral tokenizer | Mistral flagship (Dec 2025) |
| Mistral Nemo | 128K tokens | Mistral tokenizer | Open source |
| Codestral | 32K tokens | Mistral tokenizer | Code-focused |
| Pixtral 12B/Large | 128K tokens | Mistral tokenizer | Multimodal |
| Grok-2 / Grok-2 mini | 128K tokens | Grok tokenizer | xAI models |
| Grok-3 | 1M tokens | Grok tokenizer | xAI flagship |
| Command R / R+ | 128K tokens | Cohere tokenizer | RAG-optimized |
| Command A | 256K tokens | Cohere tokenizer | Cohere flagship |
| DeepSeek-V3 / V3.1 / R1 | 128K tokens | DeepSeek tokenizer | Open source reasoning |
How Our Token Counter Works
- Write or paste your markdown content — Type directly or paste existing markdown into the editor
- Select your target model — Choose from GPT-4, Claude, Gemini, or other models
- See instant results — Token count, context window usage, and cost estimates
- Optimize and export — Adjust your content to fit context limits, then export
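The steps above can be sketched as a single report: estimate a token count (again using the rough ~4 characters/token heuristic, not a real tokenizer) and show how much of each model's window it would consume. The windows below are taken from the tables above; the `report` helper is a hypothetical name:

```python
# Context windows from the tables above (tokens).
WINDOWS = {"GPT-4": 8_000, "GPT-4o": 128_000, "Claude Sonnet 4.5": 200_000}

def report(text: str) -> dict:
    """Rough usage share of each model's context window for `text`."""
    tokens = max(1, round(len(text) / 4))  # ~4 chars/token heuristic
    return {model: f"{tokens / window:.1%}" for model, window in WINDOWS.items()}

print(report("word " * 2000))  # usage share per model
```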
Frequently Asked Questions
What is a token in AI/LLM context?
A token is the basic unit of text that AI models process. It can be a word, part of a word, or punctuation. For example, 'running' is one token, while 'extraordinarily' might be split into multiple tokens.
Why do different models have different token counts?
Each AI model uses its own tokenizer with different rules. GPT-4 uses tiktoken, Claude uses its own tokenizer, and Gemini uses SentencePiece. The same text can result in different token counts.
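To see why counts diverge, compare two toy tokenizers on the same string: one splits on whitespace, the other chunks into fixed four-character pieces. Both are purely illustrative and are not the actual tiktoken, Claude, or SentencePiece algorithms:

```python
import math

TEXT = "Tokenization rules differ across models."

# Toy tokenizer A: split on whitespace (word-level tokens).
count_a = len(TEXT.split())

# Toy tokenizer B: fixed 4-character chunks (subword-like granularity).
count_b = math.ceil(len(TEXT) / 4)

print(count_a, count_b)  # same text, two different "token" counts
```

Real tokenizers sit between these extremes, merging frequent character sequences into single tokens, which is why common words cost one token while rare words split into several.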
How accurate is this token counter?
Our token counter uses the same tokenizers as the models themselves where they are publicly available (tiktoken for GPT models, for example) and close approximations where they are not, so counts closely match what you'll see in production.
Is this token counter free?
Yes! Completely free with no limits, no signup required, and no hidden costs. Count tokens for unlimited documents forever.
Ready to Count Tokens?
Start writing with real-time token counting. No signup, no cost, no limits.