In February 2026, something broke. Thousands of developers noticed Claude - the AI model that was supposed to be “the smart one” - suddenly stopped reading code before editing it, failed basic logic tests, and rushed to finish tasks without actually trying.
Anthropic said nothing for weeks. Then an AMD AI Director ran 6,852 sessions, analyzed 17,871 thinking blocks, and published the evidence: thinking depth dropped 67% since late February.
Here’s what actually happened - and how to fix it.
The Evidence: From 6.6 File Reads to Just 2
A developer named Stella Laurenzo (AMD’s Director of AI) didn’t just complain on Reddit. She built a data pipeline:
- 6,852 Claude Code sessions logged
- 234,760 tool calls analyzed
- Result: Claude went from reading a file 6.6 times before editing it… to reading it just 2 times. One in three edits was made without reading the file at all.
The word “simplest” appeared 642% more in outputs. Claude wasn’t just thinking less - it was telling you it was taking shortcuts.
Then Anthropic closed the GitHub issue. 72 people thumbs-upped the comment asking “why did you close this?”
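If you want to run the same sanity check against your own sessions, a rough sketch follows. It assumes you have exported your tool calls to a JSONL file with "tool" and "path" fields; the file name, field names, and log format are assumptions for illustration, not something Claude Code emits for you.

import json
from collections import defaultdict

# Rough reads-before-edit counter, in the spirit of Laurenzo's analysis.
# Assumes one JSON object per line, e.g. {"tool": "Read", "path": "src/app.py"}.
reads = defaultdict(int)
reads_before_edit = []

with open("tool_calls.jsonl") as f:
    for line in f:
        call = json.loads(line)
        tool, path = call["tool"], call.get("path")
        if path is None:
            continue
        if tool == "Read":
            reads[path] += 1
        elif tool in ("Edit", "Write"):
            reads_before_edit.append(reads[path])

if reads_before_edit:
    avg = sum(reads_before_edit) / len(reads_before_edit)
    blind = sum(1 for n in reads_before_edit if n == 0) / len(reads_before_edit)
    print(f"average reads before an edit: {avg:.1f}")
    print(f"edits with no prior read: {blind:.0%}")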
What Anthropic Changed (And Didn’t Tell You)
In February, Anthropic rolled out “Adaptive Thinking” - a system where Claude decides when and how much to think based on “complexity.”
Sounds smart, right? Except here’s what they actually did:
The Hidden Effort Levels
Anthropic maps “effort” levels to an internal reasoning_effort score. The score is the cap on how much the model is allowed to think before answering.
| Effort Level | reasoning_effort | What it means |
|---|---|---|
| Low | 50 | Claude skips thinking on “simple” tasks |
| Medium | 85 | Default for most users |
| High | 95 | What you probably want |
| Max | (not set) | Opus 4.6 only, unlimited thinking |
Via the API with no parameters? You get the full model. On the website? You get the nerfed version.
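To make the table concrete, here is a minimal sketch of how that mapping behaves, in plain Python. The dict and function names are hypothetical and purely illustrative - this is not Anthropic’s code; the 50/85/95 caps come from the table above, and None stands in for the uncapped API default.

# Hypothetical reconstruction of the effort -> reasoning_effort mapping.
EFFORT_CAPS = {
    "low": 50,      # thinking skipped on "simple" tasks
    "medium": 85,   # reported default for most consumer users
    "high": 95,     # what you probably want
    "max": None,    # Opus 4.6 only: no cap on thinking
}

def reasoning_cap(effort=None):
    # No effort supplied (a bare API call) -> no cap, the full model.
    if effort is None:
        return None
    return EFFORT_CAPS[effort]

print(reasoning_cap())          # None: API with no parameters
print(reasoning_cap("medium"))  # 85: what claude.ai reportedly gives most users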
The Meme-Benchmark That Exposed It
Someone posted this test:
“A car wash is 50m from my house. Should I walk or drive to it?”
The low-effort answer: “Walk! It's only 50 meters, about a one-minute walk.”
Wrong. You need to drive your car to the car wash.
The same question with more effort: “If you're getting your car washed, you'll need to drive, the car has to be there.”
Correct. It actually thought about the problem.
This isn’t a fluke. Developers report:
- Claude Code reads code 3x less before editing
- Stop hook violations spiked (Claude abandons tasks mid-way)
- File rewrites doubled (instead of targeted edits)
- Token usage mysteriously increased (but quality dropped)
Why This Matters (And Why It’s Worse Than You Think)
1. Anthropic Didn’t Announce It
They rolled out Adaptive Thinking, changed default effort levels, and never told users. No changelog. No warning. People thought they were writing bad prompts.
The only reason we know? Someone ran nearly 7,000 sessions and posted the forensics on GitHub.
2. Enterprise vs. Personal Accounts Get Different Models
Multiple reports confirm: Team/Enterprise accounts get higher default effort than personal Max accounts. Same price. Different model quality.
A leaked source code snippet shows a check for user_type == "ant" (Anthropic employees) that routes to a different instruction set with “verify work actually works before claiming done.”
Paying users don’t get that instruction.
3. The “Savings” Aren’t Real
Anthropic’s justification: “Adaptive thinking saves tokens on easy tasks.”
Reality check:
- Lower effort → more mistakes
- More mistakes → more iterations
- More iterations → more tokens burned fixing the mess
A user in /r/Anthropic: “I spent 4 hours yesterday on Sonnet to fix something. Switched to Opus, resolved in one try.”
That’s not efficiency. That’s artificial scarcity.
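Run the numbers on that chain and the “savings” evaporate. Every figure below is a made-up but plausible assumption, purely to show the shape of the math:

# Illustrative arithmetic only -- every number here is an assumption.
high_effort_tokens = 8_000    # one careful attempt: thinking + output
low_effort_per_try = 2_500    # shallower thinking, cheaper per attempt
retries = 4                   # the "4 hours on Sonnet" pattern

low_effort_total = low_effort_per_try * retries
print(f"high effort, one shot:   {high_effort_tokens:,} tokens")
print(f"low effort, {retries} attempts: {low_effort_total:,} tokens")
# Past three or four attempts, low effort costs more tokens and more of your day.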
How to Fix It (Right Now)
If You Use Claude Code
Add this to your project’s CLAUDE.md (or your global ~/.claude/CLAUDE.md):
## Code Quality
- Prefer correct, complete implementations over minimal ones.
- Use appropriate data structures - don't brute-force known solutions.
- When fixing a bug, fix the root cause, not the symptom.
- If error handling is needed for reliability, include it without asking.
Then run:
/effort high
Or set this environment variable (forces fixed reasoning budget):
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
If You Use claude.ai
Add this to Settings → Profile → Custom Instructions:
“Always reason thoroughly and deeply. Treat every request as complex unless I explicitly say otherwise. Never optimize for brevity at the expense of quality. Think step-by-step, consider tradeoffs, and provide comprehensive analysis.”
This won’t give you /effort max, but it forces the model to at least try.
If You Use the API
Don’t rely on the adaptive defaults. Set the effort explicitly:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-6",
    thinking={
        "type": "adaptive",
        "effort": "high"  # or "max" for Opus 4.6
    },
    max_tokens=32000,  # thinking + output share this budget
    messages=[...]
)
Or better yet, switch to manual extended thinking mode:
thinking={
    "type": "enabled",
    "budget_tokens": 10000  # explicit thinking budget
}
(Note: budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6, but still works. Use it while you can.)
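Whichever mode you use, don’t just trust the setting; check that thinking actually happened. Continuing from the response object above, and assuming the Python SDK returns content blocks with a type field (“thinking” blocks alongside “text” blocks), a quick check looks roughly like this:

# Measure how much visible reasoning came back with the answer.
thinking_chars = sum(
    len(block.thinking) for block in response.content if block.type == "thinking"
)
answer_chars = sum(
    len(block.text) for block in response.content if block.type == "text"
)
print(f"thinking: {thinking_chars} chars, answer: {answer_chars} chars")
if thinking_chars == 0:
    print("warning: no thinking blocks -- effort may have been capped")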
What Anthropic Should Have Done
- Announce the change. In a changelog. With examples. Like every other company does.
- Make effort levels transparent. Let users see and control reasoning_effort directly.
- Default to high effort for paying users. If compute is the issue, raise prices. Don’t quietly nerf the product.
- Kill the enterprise vs. personal tier gap. One model. One quality standard.
Instead, they shipped a stealth downgrade, dismissed user complaints as “prompting issues,” and only responded after someone built a forensic data pipeline and posted it publicly.
The Bigger Pattern
This isn’t isolated. Every AI lab is doing some version of this:
- OpenAI: ChatGPT’s quality varies by time of day (cheaper inference during peak hours).
- Anthropic: Adaptive thinking defaults to minimum effort.
- Google: Gemini in AI Studio vs. Vertex AI behave differently.
The reason? Compute is expensive. Revenue pressure is real. Quietly degrading the product is easier than explaining why prices need to go up.
But here’s the thing: we’re not idiots.
When you train us to expect a certain level of quality, then silently degrade it while keeping the price the same, we notice. We log the sessions. We run the benchmarks. We publish the data.
And when you close the GitHub issue anyway? We write blog posts.
What You Should Do
- Test your prompts on different platforms (claude.ai vs. API). If they behave differently, you know effort levels are involved.
- Set effort explicitly. Don’t trust defaults.
- Measure quality over time. If your model suddenly gets worse, it’s probably not you. (A minimal tracking sketch follows this list.)
- Vote with your wallet. If Anthropic (or any lab) degrades quality without transparency, switch providers.
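A minimal version of that tracking, assuming the Anthropic Python SDK and a fixed set of probe prompts, might look like the sketch below. The probe prompts, the CSV log, and the model name are my choices; swap in whatever regression checks match your own workload.

import csv
import datetime
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Fixed probes you re-run on a schedule; yours should mirror your real work.
PROBES = [
    "A car wash is 50m from my house. Should I walk or drive to it?",
    "Refactor this function without changing behavior: def f(x): return x + x + x",
]

with open("claude_quality_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for prompt in PROBES:
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}],
        )
        answer = "".join(b.text for b in response.content if b.type == "text")
        writer.writerow([
            datetime.datetime.now(datetime.timezone.utc).isoformat(),
            prompt,
            response.usage.output_tokens,
            answer[:200],  # keep a snippet so drift is visible at a glance
        ])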
And if you’re building AI products for clients? Own your infrastructure. Because these labs will optimize for their economics, not yours.
One Last Thing
AMD’s engineering team - the people who filed the original bug report - have already dropped Claude Code and switched to a competing provider.
If the AI Director of one of the world’s largest chip companies can’t trust your model for “complex engineering,” that’s not a user problem. That’s a product problem.
Anthropic, you built the best reasoning model on the market. Don’t kill it with silent cost-cutting and gaslighting users who notice the quality drop.
Either charge what it costs to run the full model, or be honest about what we’re actually getting.
Because right now? The gap between what you advertise and what you deliver is bigger than the gap between High and Low effort.
And we have the receipts.
Michael Saad runs Digital1010, a digital marketing agency that builds AI-powered infrastructure for clients who need results, not vendor lock-in. Follow him on X for more practitioner takes on AI that don’t sugarcoat the mess.
Want to see how we use AI agents to build systems that actually work? Check out Orbit - our internal SEO OS that routes tasks to the right model every time, because we control the stack.
