Claude Opus 4.6 — Anthropic's New Flagship with 1M Context and Agent Teams
Anthropic has released Claude Opus 4.6 today, the company's most capable model for enterprise workflows and agentic software development. Its headline features are a 1-million-token context window (in beta), Agent Teams, and PowerPoint integration, all aimed at long-running, AI-powered knowledge work.
What's New in Opus 4.6?
1M Token Context Window (Beta)
Opus 4.6 is the first Opus model with an extended context window of 1 million tokens. Previous Opus models were limited to 200K tokens — a frequent bottleneck during long coding sessions and extensive document analysis.
In a needle-in-a-haystack test (MRCR v2) across 1M tokens, Opus 4.6 scored 76% — compared to just 18.5% for Sonnet 4.5. This means less context compaction, fewer interrupted sessions, and more reliable results on complex tasks.
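To make the difference between the two limits concrete, here is a minimal Python sketch that estimates whether a set of documents fits in a 200K or a 1M token window. The 4-characters-per-token ratio is a rough rule of thumb for English text and is an assumption, not Anthropic's tokenizer; the function names and the optional reserved output budget are hypothetical.

```python
# Rough context-budget check. CHARS_PER_TOKEN is a heuristic assumption,
# not the real tokenizer; use an actual token counter for production work.
CHARS_PER_TOKEN = 4

WINDOW_200K = 200_000    # previous Opus limit cited above
WINDOW_1M = 1_000_000    # Opus 4.6 beta limit cited above


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(documents: list[str], window: int, reserved_output: int = 0) -> bool:
    """Return True if all documents plus a reserved output budget fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_output <= window


if __name__ == "__main__":
    # Hypothetical corpus: 30 documents of 100,000 characters each (~750K tokens).
    corpus = ["x" * 100_000 for _ in range(30)]
    print(fits(corpus, WINDOW_200K))  # False
    print(fits(corpus, WINDOW_1M))    # True
```

Under these assumptions the corpus would need to be split or compacted for a 200K window but fits in a single 1M-token request.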
Agent Teams
Perhaps the most significant addition: Agent Teams enable parallel coordination of multiple AI agents in Claude Code. Instead of a single agent working through tasks sequentially, multiple agents can now work simultaneously on different subtasks.
Scott White, Head of Product at Anthropic, compared the feature to having a talented team of professionals: each agent owns its piece and coordinates directly with the others. Agent Teams are currently available as a research preview for API users and subscribers.
Real-world example: At Rakuten, Opus 4.6 autonomously closed 13 issues and assigned 12 issues to the right team members — in a single day, across an organization of approximately 50 people and 6 repositories.
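The contrast between one sequential agent and several parallel agents can be sketched with plain `asyncio`. This is a generic illustration of fanning subtasks out to concurrent workers, not the Claude Code Agent Teams interface; `run_agent`, the agent names, and the simulated delay are all hypothetical stand-ins.

```python
import asyncio


async def run_agent(name: str, subtask: str, delay: float = 0.01) -> str:
    """Hypothetical stand-in for one agent working on one subtask."""
    await asyncio.sleep(delay)  # simulates the time the agent spends working
    return f"{name} finished: {subtask}"


async def run_sequential(subtasks: list[str]) -> list[str]:
    """Single-agent baseline: subtasks are handled one after another."""
    return [await run_agent("agent-0", task) for task in subtasks]


async def run_team(subtasks: list[str]) -> list[str]:
    """Team version: one agent per subtask, all running concurrently."""
    jobs = [run_agent(f"agent-{i}", task) for i, task in enumerate(subtasks)]
    return await asyncio.gather(*jobs)  # results keep the input order


if __name__ == "__main__":
    for line in asyncio.run(run_team(["fix tests", "update docs"])):
        print(line)
```

In the sequential version the total time grows with the number of subtasks, while in the team version it is bounded by the slowest subtask. Real agents would also need to share state and resolve conflicts, which this sketch leaves out.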
PowerPoint Integration
Claude is now available directly inside PowerPoint as a side panel. The model can read layouts, fonts, and slide masters, making edits that stay on-brand and on-template. Capabilities include:
- Building slides from corporate templates
- Restructuring storylines
- Converting bullets into diagrams
- Generating complete presentations from descriptions
The PowerPoint integration is available as a research preview via a waitlist.
Adaptive Thinking and Effort Controls
Developers gain finer control over performance and cost through new mechanisms:
- Adaptive Thinking: The model automatically adjusts its reasoning depth based on task complexity
- Effort Controls: Allow tuning intelligence, latency, and cost per use case
- Context Compaction: Improved compaction for longer, more stable sessions
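Context compaction, in general terms, means replacing older conversation turns with a shorter summary once a token budget is exceeded. The sketch below shows that general idea only; it is not Anthropic's implementation, and `compact`, the `summarize` callback, and the character-based token estimate are assumptions made for illustration.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: ~4 characters per token)."""
    return -(-len(text) // 4)


def compact(
    messages: list[str],
    budget: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Keep the newest messages that fit in `budget`; summarize the rest.

    The summary's own size is not counted against the budget here.
    """
    kept: list[str] = []
    used = 0
    # Walk backwards so the most recent turns are preserved verbatim.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    if not older:
        return kept  # everything fit, nothing to compact
    return [summarize(older)] + kept
```

A caller would pass its own `summarize` function, for example one that asks a model to condense the older turns. The larger the budget, the less often this path is taken, which is why a bigger context window reduces compaction.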
Benchmarks: Significant Progress
Opus 4.6 improves on Opus 4.5 on every benchmark below where both have scores, and it leads GPT-5.2 and Gemini 3 Pro on ARC AGI 2, the only row with competitor results:
| Benchmark | Opus 4.5 | Opus 4.6 | GPT-5.2 | Gemini 3 Pro |
|---|---|---|---|---|
| Terminal Bench 2.0 | 59.8% | 65.4% | — | — |
| OSWorld | 66.3% | 72.7% | — | — |
| ARC AGI 2 | 37.6% | 68.8% | 54.2% | 45.1% |
| BigLaw Bench | — | 90.2% | — | — |
Particularly impressive: The ARC AGI 2 score jumped from 37.6% to 68.8%, a gain of 31.2 percentage points, or roughly 83% in relative terms. This benchmark measures the ability to solve problems that are easy for humans but extremely difficult for AI. Opus 4.6 surpasses both GPT-5.2 (54.2%) and Gemini 3 Pro (45.1%).
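A score change can be expressed either in percentage points or as a relative increase, and the two are easy to confuse. The short calculation below uses the ARC AGI 2 scores from the table to show both; the helper names are illustrative.

```python
def point_gain(old: float, new: float) -> float:
    """Absolute gain in percentage points."""
    return new - old


def relative_gain(old: float, new: float) -> float:
    """Relative gain as a percentage of the old score."""
    return (new - old) / old * 100


# ARC AGI 2 scores from the table above (Opus 4.5 -> Opus 4.6).
old_score, new_score = 37.6, 68.8
print(f"{point_gain(old_score, new_score):.1f} percentage points")     # 31.2 percentage points
print(f"{relative_gain(old_score, new_score):.1f}% relative increase")  # 83.0% relative increase
```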
Note: Small regressions were observed on SWE-bench Verified and the MCP Atlas benchmark, so the improvements are not uniform across all evaluations.
Enterprise Validation
Several companies have already reported impressive results with Opus 4.6:
- Harvey (Legal AI): 90.2% BigLaw Bench score with 40% perfect scores and 84% above 0.8. The highest result of any Claude model for legal reasoning.
- Box: 10-percentage-point lift on high-reasoning tasks (68% versus a 58% baseline), with near-perfect scores in technical domains.
- Rakuten: Closed 13 issues autonomously and assigned 12 more to the right team members in a single day, across 6 repositories and an organization of approximately 50 people.
Pricing and Availability
Opus 4.6 is available now on:
- claude.ai (web interface and mobile app)
- Anthropic API ($5/$25 per million input/output tokens — unchanged from 4.5)
- Microsoft Azure (via Microsoft Foundry)
- Other major cloud platforms, including Amazon Bedrock
Maximum output has been increased to 128K tokens, which is especially relevant for coding and document tasks.
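As a rough illustration of what the quoted rates mean per request, the sketch below computes a cost from token counts. It assumes the listed $5/$25 base rates apply to the entire request and ignores any long-context, caching, or batch pricing tiers that may apply; the function name and the example token counts are hypothetical.

```python
INPUT_USD_PER_MTOK = 5.00    # input price quoted above
OUTPUT_USD_PER_MTOK = 25.00  # output price quoted above


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed base rates."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    )


# Hypothetical request: 150K input tokens, 8K output tokens.
print(f"${request_cost(150_000, 8_000):.2f}")  # $0.95
```

Because output tokens cost five times as much as input tokens at these rates, a response that uses the full 128K output limit would cost more than the 150K-token prompt in this example.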
What This Means for Developers
Opus 4.6 marks a turning point for agentic workflows:
- Longer sessions with fewer interruptions: 1M context means complex coding projects can run with far fewer compaction cycles
- Real team collaboration: Agent Teams enable splitting large projects into parallel workstreams
- Enterprise-ready: The combination of improved benchmarks, PowerPoint integration, and strong legal reasoning results makes Opus 4.6 a credible choice for enterprise deployments
- Same price, more power: No price increase despite significant improvements
Conclusion
With Opus 4.6, Anthropic takes a decisive step from AI as an assistant to AI as a team member. The combination of a 1M context window, Agent Teams, and the ability to autonomously handle complex enterprise tasks positions Claude as a serious platform for professional knowledge work.
The question is no longer whether AI will be integrated into enterprise workflows — but how quickly teams can adapt their existing processes.
Frequently Asked Questions
What is the context window size of Claude Opus 4.6?
Claude Opus 4.6 supports up to 1 million tokens in beta — a 5x increase from the previous 200K limit. It scored 76% on needle-in-a-haystack tests across the full 1M context, compared to just 18.5% for Sonnet 4.5.
What are Agent Teams in Claude Opus 4.6?
Agent Teams enable parallel coordination of multiple AI agents in Claude Code. Instead of one agent working sequentially, multiple agents work simultaneously on different subtasks — similar to a development team where each member owns their piece of the project.
What are the main improvements over Claude Opus 4.5?
Key improvements include the 1M token context window (up from 200K), Agent Teams for parallel task execution, PowerPoint integration for enterprise workflows, and improved performance on coding and reasoning benchmarks.
Is Claude Opus 4.6 available for enterprise use?
Yes. It's designed for enterprise workflows, with extended context for large document analysis, Agent Teams for complex development tasks, and PowerPoint integration. It is available through the Anthropic API, Claude for Enterprise, and Amazon Bedrock, as well as claude.ai and Microsoft Azure (via Microsoft Foundry).