Claude Opus 4.6 Is Getting Slower — And Opus 4.7 Is Coming

Claude Opus 4.6 has dropped from 2nd to 10th place on the Bridgebench hallucination accuracy leaderboard. It scored 68.3% in the latest run, down from 83.3%, a loss of 15 percentage points. Meanwhile, WorldofAI reports that Anthropic is internally testing Claude Opus 4.7, and Claude Code has shipped a rapid series of releases (v2.1.100 through v2.1.105) to address a token inflation regression.

Bridgebench Results: A 15-Point Accuracy Drop

Claude Opus 4.6 was retested on Bridgebench on April 13, 2026, scoring 68.3% — down from 83.3%. That moved Claude Opus 4.6 from 2nd to 10th place overall on the hallucination accuracy leaderboard.
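To put the drop in perspective, the short sketch below separates the absolute change (in percentage points) from the relative change, using only the two scores quoted above. The function name is illustrative and not part of any Bridgebench tooling.

```python
def score_drop(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = previous - current
    relative = absolute / previous * 100
    return round(absolute, 1), round(relative, 1)


points, percent = score_drop(83.3, 68.3)
print(f"{points} points absolute, {percent}% relative")
```

The distinction matters when comparing models: a 15-point loss from a baseline of 83.3% is a relative decline of roughly 18%.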

As the WorldofAI research team noted in their April 13 report: "The regression is consistent with reduced inference-time compute allocation — a pattern we observe regularly when Anthropic shifts engineering resources toward next-generation model training."

For enterprises that selected Claude Opus 4.6 specifically for its accuracy, in production workflows where hallucinations carry real cost, this is a material change. By this benchmark, Claude Opus 4.6 was the second most factually reliable model on the leaderboard. It now ranks below several competitors it previously outperformed.

The likely explanation is resource reallocation. When Anthropic prepares a successor, compute and engineering attention shift toward the new version. Under this hypothesis, Claude Opus 4.6 does not get worse architecturally; it receives less RLHF (Reinforcement Learning from Human Feedback) optimization and a reduced inference-time compute budget.

What We Know About Claude Opus 4.7

WorldofAI reported on April 13, 2026 that Claude Opus 4.7 has been spotted in internal Anthropic testing. No official announcement has been made. What we know:

The Claude Opus 4.7 designation appeared in internal tooling references
No benchmark scores have been published for Claude Opus 4.7
The timing aligns with Anthropic's Project Glasswing announcement

Claude Opus 4.5 preceded Claude Opus 4.6 by approximately 4 months. A similar cadence puts Claude Opus 4.7 on track for Q2 or Q3 2026. If Anthropic's historical release velocity of roughly one major model generation every 18-22 weeks holds, Q2 2026 is the more likely of the two, though this is an extrapolation from a small number of releases rather than a forecast.

Claude Code Token Inflation Regression

Separately, Claude Code shipped versions 2.1.100 through 2.1.105 between April 12 and 13, 2026. A token inflation regression in v2.1.100 injects approximately 20,000 additional tokens per request — a 40-60% context overhead increase depending on session complexity. Developers on usage-based plans hit rate limits significantly faster.

According to Anthropic's pricing documentation, API pricing varies by model and can change over time. A 20,000-token overhead per request can still materially increase per-developer costs, so teams should recalculate with current pricing before budgeting.
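The arithmetic can be sketched as follows. This is a rough illustration, not a pricing tool: the request volume and the price per million tokens are placeholder values chosen for the example, and the 40-60% figure is interpreted here as overhead relative to the pre-regression context size.

```python
def implied_baseline_tokens(overhead_tokens: int, overhead_fraction: float) -> float:
    """Baseline context size implied by a relative overhead (0.4 means 40%)."""
    return overhead_tokens / overhead_fraction


def extra_cost_usd(overhead_tokens: int, requests: int,
                   usd_per_million_input_tokens: float) -> float:
    """Added input-token cost for a number of requests at an assumed price."""
    return overhead_tokens * requests * usd_per_million_input_tokens / 1_000_000


OVERHEAD = 20_000  # extra tokens, as reported for v2.1.100

# A 40-60% overhead implies a baseline somewhere between these two sizes.
low = implied_baseline_tokens(OVERHEAD, 0.60)
high = implied_baseline_tokens(OVERHEAD, 0.40)
print(f"Implied baseline: {low:,.0f} to {high:,.0f} tokens")

# PLACEHOLDER inputs: 500 requests per day and $10 per million input tokens
# are made-up values for illustration, not Anthropic's actual pricing.
print(f"Extra cost per day: ${extra_cost_usd(OVERHEAD, 500, 10.0):.2f}")
```

Substitute your own request counts and the current rate from Anthropic's pricing page before drawing budget conclusions.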

Workaround: Pin to v2.1.98 using npx @anthropic-ai/claude-code@2.1.98 until the regression is resolved. (Claude Code is published on npm under the scoped name @anthropic-ai/claude-code.)

Production Action Plan for Claude Opus 4.6

If your team relies on Claude Opus 4.6 for agentic workflows or AI-powered automation:

Accuracy: Claude Opus 4.6 may produce more hallucinations than during its peak. This matters most for legal analysis, financial reporting, or any workflow where incorrect claims cause real damage.

Token costs: Check consumption if running Claude Code v2.1.100+. Pin to v2.1.98 as a temporary measure.

Transition planning: Build your testing and validation pipeline now so you can evaluate Claude Opus 4.7 quickly when it ships.

The Pre-Successor Pattern

Claude Opus 4.6 is following a pattern seen across the industry. OpenAI users reported similar observations with GPT-4o before GPT-5 arrived. Google's Gemini 2.0 Pro showed inconsistencies before Gemini 2.5 launched. AI labs operate with finite compute budgets, and training a successor requires enormous resources — distributed training infrastructure at the scale of thousands of H100 GPUs.

At Context Studios, we recommend automated quality checks that run on every model update. Catching Claude Opus 4.6 regressions early through automated benchmark regression testing is far cheaper than discovering them through customer complaints.
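A minimal sketch of such a check is shown below. It assumes you already have a baseline accuracy score and a fresh score from your own evaluation run; the 2-point tolerance, the class, and the function name are illustrative choices, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class RegressionResult:
    baseline: float
    current: float
    drop: float
    failed: bool


def check_regression(baseline: float, current: float,
                     max_drop: float = 2.0) -> RegressionResult:
    """Flag a regression when accuracy falls more than `max_drop` points."""
    drop = baseline - current
    return RegressionResult(baseline, current, drop, drop > max_drop)


# Example using the Bridgebench scores quoted in this article.
result = check_regression(baseline=83.3, current=68.3)
if result.failed:
    print(f"Regression: accuracy fell {result.drop:.1f} points (limit 2.0)")
```

Wired into a CI job or a scheduled task, a failed result can block a deployment or page the owning team before customers notice.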

FAQ

Is Claude Opus 4.6 still usable for production?

Yes. Claude Opus 4.6 remains capable for most workloads. Increase validation layers for hallucination-critical use cases.

When will Claude Opus 4.7 be released?

No official date. Based on the ~4-month cadence between Claude Opus 4.5 and Claude Opus 4.6, Q2 or Q3 2026 is plausible.

Should I pin Claude Code to v2.1.98?

If you are experiencing high token consumption since v2.1.100, pinning with npx @anthropic-ai/claude-code@2.1.98 is a reasonable interim measure.

Why does Claude Opus 4.6 performance change before a successor launches?

The likely explanation is that resources shift toward training the successor. Claude Opus 4.6 does not change architecturally, but it may receive less RLHF optimization and a reduced inference-time compute allocation, both of which can affect hallucination accuracy and response latency.

How can I monitor Claude Opus 4.6 quality?

Implement automated accuracy checks on a representative sample after every model update. Use public benchmarks such as Bridgebench, or an evals API, for regression testing so that you detect drift before it reaches production. Your production data is the ground truth.
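One way to score a representative sample is sketched below. It assumes you maintain a small set of labeled prompt and answer pairs and can wrap your model client in a function that takes a prompt and returns text; `fake_model` is a hypothetical stand-in so the example runs without network access.

```python
from typing import Callable, Sequence


def sample_accuracy(
    examples: Sequence[tuple[str, str]],
    ask_model: Callable[[str], str],
) -> float:
    """Percentage of examples where the model's answer matches the label.

    Matching is a case-insensitive exact comparison, which is a
    simplification; free-form answers usually need a more tolerant grader.
    """
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(
        ask_model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in examples
    )
    return 100.0 * correct / len(examples)


# Hypothetical stand-in for a real model call; replace with your own client.
def fake_model(prompt: str) -> str:
    canned = {"Capital of France?": "Paris", "2 + 2?": "5"}
    return canned.get(prompt, "")


labeled = [("Capital of France?", "paris"), ("2 + 2?", "4")]
print(f"Accuracy: {sample_accuracy(labeled, fake_model):.1f}%")
```

Storing each run's score with a timestamp gives you a history to compare against whenever the model or the tooling changes.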

Conclusion

Claude Opus 4.6 is not broken — but it is measurably less accurate than at its peak. The 15-point Bridgebench loss (83.3% → 68.3%), combined with the 20,000-token inflation regression and Claude Opus 4.7 testing reports, signals that Anthropic is in transition mode.

Pin Claude Code to v2.1.98 if affected. Add hallucination monitoring to your Claude Opus 4.6 pipelines. Start planning your Claude Opus 4.7 evaluation framework now. The model you are using on April 14, 2026 will not be the same in six months.
