
Anthropic API Documentation


URL: https://docs.anthropic.com

At a glance:

- 1M token context window
- 3 model tiers (Haiku / Sonnet / Opus)
- Up to 90% cost reduction via prompt caching
- REST API format

What is this resource?

The Anthropic API Documentation is the official developer reference for Claude, the large language model family built by Anthropic. It is a complete technical guide covering everything required to integrate Claude into a software application — from API authentication and your first message request, to advanced features like tool use, vision processing, Extended Thinking, and prompt caching for cost optimization. As of April 2026, the current model lineup is Claude Opus 4.7 (released April 2026, $5/$25 per 1M tokens), Claude Sonnet 4.6 ($3/$15), and Claude Haiku 4.5 ($1/$5) — with Opus 4.7 and Sonnet 4.6 both supporting a full 1-million-token context window.

Anthropic was founded with a specific focus on AI safety, and that philosophy shapes Claude's behavior: the model is designed to be helpful, harmless, and honest. Claude's flagship differentiator in 2025–2026 is Extended Thinking — a mode where Claude explicitly externalizes its reasoning chain before answering, similar to OpenAI's o-series models but with more developer control. You can set a token budget for thinking, and Claude uses that budget to reason through complex problems before producing its final response. Anthropic also released Adaptive Thinking on Haiku 4.5, which dynamically decides how much thinking to apply based on question complexity — making it highly efficient for mixed workloads.

What's in it?

The core of the documentation is the Messages API reference, which covers how to send prompts to Claude and receive responses. Unlike OpenAI's API, Anthropic separates the system prompt from the conversation messages — the system parameter is a top-level field in the request, not a message in the messages array. This structural difference is small but meaningful: it makes it easier to keep your behavioral instructions cleanly separated from the conversation history.
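The structural difference described above can be sketched with plain request dicts, no network calls needed. The model names below are illustrative placeholders taken from this page, not verified identifiers:

```python
# Anthropic shape: the system prompt is a TOP-LEVEL field, beside "messages".
anthropic_request = {
    "model": "claude-haiku-4-5",
    "max_tokens": 1024,
    "system": "You are a helpful coding assistant.",  # top-level field
    "messages": [
        {"role": "user", "content": "Explain REST in one sentence."}
    ],
}

# OpenAI-style shape: the system prompt travels INSIDE the messages array.
openai_style_request = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Explain REST in one sentence."},
    ],
}

def swap_system(req: dict, new_system: str) -> dict:
    """On the Anthropic shape, replacing the system prompt never touches
    conversation history; on the OpenAI shape you would have to filter
    the messages list instead."""
    out = dict(req)
    out["system"] = new_system
    return out
```

This is why the separation is "small but meaningful": behavioral instructions and conversation history live in different fields, so they can be versioned, cached, or swapped independently.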

The Models overview page is one of the most practically useful pages in the docs. It describes the three tiers: Haiku 4.5 (smallest, fastest, cheapest at $1/$5 — ideal for high-volume tasks with Adaptive Thinking), Sonnet 4.6 ($3/$15, the production workhorse with 1M context), and Opus 4.7 ($5/$25, the most capable for complex reasoning, research, and long-document analysis, also 1M context). Choosing the right model for each task is one of the most impactful optimizations available to any developer using the API.
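A minimal sketch of what that tier choice looks like in code, using the model IDs and per-million-token prices listed on this page (these are assumptions from the page, not values confirmed against the live API):

```python
# Hypothetical task-to-tier router plus a rough cost estimator.
MODELS = {
    "light":   {"id": "claude-haiku-4-5",  "in_per_m": 1.0, "out_per_m": 5.0},
    "default": {"id": "claude-sonnet-4-6", "in_per_m": 3.0, "out_per_m": 15.0},
    "complex": {"id": "claude-opus-4-7",   "in_per_m": 5.0, "out_per_m": 25.0},
}

def pick_model(tier: str) -> str:
    """Map a task tier to a model ID."""
    return MODELS[tier]["id"]

def estimate_cost(tier: str, in_tokens: int, out_tokens: int) -> float:
    """Approximate USD cost of one call at the listed per-1M rates."""
    m = MODELS[tier]
    return in_tokens / 1e6 * m["in_per_m"] + out_tokens / 1e6 * m["out_per_m"]

# A 10k-in / 1k-out call costs roughly 5x more on the top tier than
# the bottom one, which is why routing matters at volume.
for tier in MODELS:
    print(tier, pick_model(tier), round(estimate_cost(tier, 10_000, 1_000), 4))
```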

The Extended Thinking section, new for 2025, is one of the most important in the docs. You enable the feature by passing "thinking": {"type": "enabled", "budget_tokens": N} in your request, and Claude produces a thinking content block (visible to you for debugging) before its final response. This dramatically improves performance on multi-step reasoning, math, and complex analysis tasks. The Prompt Caching feature lets you cache repeated content (system prompts, large documents) server-side, cutting those tokens' cost by up to 90%. The Batch API processes requests asynchronously at a 50% discount and supports up to 300,000 output tokens per batch, making it ideal for bulk document processing.
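The prompt-caching request shape is worth seeing concretely. The sketch below builds the request as a plain dict: the cache_control field is the documented caching mechanism, while the model ID and document text are placeholders for illustration:

```python
# Mark a large, stable system block as cacheable so repeated calls
# reuse it; cached input tokens are billed at a steep discount
# (up to 90% per the docs).
LARGE_DOC = "...imagine a multi-hundred-page contract pasted here..."

request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    # With caching, "system" becomes a list of content blocks instead
    # of a single string.
    "system": [
        {"type": "text", "text": "You are a contract-analysis assistant."},
        {
            "type": "text",
            "text": LARGE_DOC,
            # Everything up to and including this block becomes a
            # cacheable prefix on subsequent requests.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "messages": [
        {"role": "user", "content": "Summarize the termination clauses."}
    ],
}
```

The design point: only the stable prefix (instructions plus document) is cached, while the user message at the end changes freely between calls.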

How is it relevant to your purpose?

For developers building AI-powered applications, treating OpenAI as the only option is a strategic mistake. Claude offers several capabilities that matter specifically in application development contexts. The most significant is the 1-million-token context window on Opus 4.7 and Sonnet 4.6 — enough to process an entire codebase, a multi-hundred-page legal document, or a year's worth of transcripts in a single API call. GPT-5.4 (OpenAI's 2026 flagship) now reaches 1.05M tokens, making this a parity feature rather than the differentiator it once was — but Claude's context handling is generally praised as more reliable at extreme lengths.

Claude's strongest differentiator for developers in 2026 is Extended Thinking: the ability to see Claude's reasoning process, set a thinking budget, and get dramatically better results on hard problems without changing your prompt structure. This is especially useful for coding, architecture decisions, and analysis tasks where you want transparency into why the model reached a conclusion. The API structure is also intentionally similar to OpenAI's, meaning LangChain and other frameworks support both providers through the same interface — switching is a one-line change. Knowing both APIs makes you a more versatile and competitive developer.

Extended Thinking: Claude's reasoning mode

Enable Extended Thinking by adding "thinking": {"type": "enabled", "budget_tokens": 5000} to your API request. Claude will then produce a thinking block (its internal scratchpad) before the final response. Unlike OpenAI's o-series where reasoning is hidden, Claude's thinking is visible to you — you can log it, display it to advanced users, or use it for debugging. On tasks like multi-step code generation, math, or architectural decisions, Extended Thinking can lift quality as much as upgrading from Sonnet to Opus would.

Recommended Watch

Getting Started with the Claude API

A practical introduction to the Anthropic API using Python — covers the Messages API structure, model selection, tool use basics, and how Claude compares to OpenAI's API from a developer's perspective.

Quick Start: Messages API + Extended Thinking (Python)

The system parameter is top-level (not in messages[]). The second example shows Extended Thinking — Claude's visible reasoning mode. Install with pip install anthropic.

```python
import anthropic

# Initialize the client - API key from console.anthropic.com
client = anthropic.Anthropic(api_key="sk-ant-...")

# --- Standard message (Haiku 4.5 - cheapest, $1/$5 per 1M) ---
message = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    system="You are a helpful coding assistant.",
    messages=[
        {"role": "user", "content": "Explain what an API is in two sentences."}
    ],
)

# Response text lives in message.content[0].text
print(message.content[0].text)
print(f"Tokens - in: {message.usage.input_tokens}, out: {message.usage.output_tokens}")

# --- Extended Thinking (Sonnet 4.6 - reasoning visible to developer) ---
thinking_msg = client.messages.create(
    model="claude-sonnet-4-6",  # $3/$15 per 1M - 1M context window
    max_tokens=8000,            # must exceed the thinking budget below
    thinking={"type": "enabled", "budget_tokens": 5000},
    messages=[
        {
            "role": "user",
            "content": "Design the database schema for a multi-tenant SaaS app.",
        }
    ],
)

for block in thinking_msg.content:
    if block.type == "thinking":
        print(f"[Thinking] {block.thinking[:200]}...")  # visible reasoning
    elif block.type == "text":
        print(block.text)  # final answer
```