Anthropic on PHP Boy Scout

They switched it off while it was fixing my code

Sat, 13 Jun 2026 00:00:00 +0000

I woke up this morning to a one-line message from my own tooling:

Claude Fable 5 is currently unavailable. Learn more: https://www.anthropic.com/news/fable-mythos-access

I followed the link expecting a status page about a wobble in someone’s data centre. Instead it was Anthropic, explaining that the evening before, at 5:21pm Eastern, the US government had ordered them to suspend all access to Fable 5 and Mythos 5 on national security grounds. Globally. Every user. Their own staff included.

I’d spent the previous day with Fable doing one very specific thing: pointing it at my own codebase and asking it to read the code and fix the flaws it found. That, very nearly word for word, is the thing it has now been banned for.

Three days late to the only model that mattered

Fable came out on the 9th. I didn’t get to it properly until the 12th, which is the sort of timing I specialise in. By the time I sat down with it, I had about a day of real use before it vanished. One day to form a view on what people were calling the most capable coding model anyone had shipped. So treat everything below as the read of a man who got three days’ notice and used one of them.

What I had it doing was unsexy and exactly the kind of work I care about: a full security audit of go-tool-base, the same “leave the codebase better than you found it” pass I’d normally run myself. Find the flaws, then start fixing them.

And it was good. Genuinely good. It surfaced issues that previous passes with Opus had walked straight past, and in a couple of cases the flaw was sitting in code that Opus itself had written. There is something bracing about one model quietly marking another’s homework, and being right.

Good, but let’s not get carried away

Here is where I have to be fair, because the anger that came later is only worth anything if the praise before it is honest.

Fable is not magic. The class of bug it found is not some exotic thing only it can see. Plenty of models, from plenty of providers, are perfectly capable of reading a codebase and pulling out the same problems, and there is a mountain of evidence that they do, every day. Anthropic say as much themselves: the capability is “widely available from other models (including OpenAI’s GPT-5.5)” and “is used every day by the defenders who keep systems safe.” I’d already arrived at that conclusion from my own keyboard before I read their statement. Fable was excellent. It was not unique. Hold that thought, because the whole argument turns on it.

It kept slipping out of my hands

The other thing I learned in my one day is that having Fable and using Fable were not the same thing.

I set my main working thread to Fable and got on with it. What I didn’t know, because nothing on screen told me, is that partway through the evening it had quietly handed me back to Opus. The only reason I know now is that the session log records it in black and white:

2026-06-12T06:57:22Z {"type":"fallback","from":{"model":"claude-fable-5"},"to":{"model":"claude-opus-4-8"}}
2026-06-12T18:50:08Z {"type":"fallback","from":{"model":"claude-fable-5"},"to":{"model":"claude-opus-4-8"}}
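If you want to check your own session logs for the same thing, here is a minimal sketch of how it could be done. It assumes every line has exactly the shape of the two entries above (a timestamp, a space, then a one-line JSON object), and it uses naive string matching rather than a real JSON parser. `FallbackEvent`, `parse_fallback` and `fallbacks_from` are names invented for this illustration, not part of any real tool.

```rust
/// One silent model switch recorded in a session log.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct FallbackEvent {
    pub timestamp: String,
    pub from: String,
    pub to: String,
}

/// Returns the text between `prefix` and the next double quote, if present.
fn value_after<'a>(haystack: &'a str, prefix: &str) -> Option<&'a str> {
    let start = haystack.find(prefix)? + prefix.len();
    let rest = &haystack[start..];
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// Parses one log line, assuming the exact shape shown above.
/// Naive string matching: a differently formatted line yields `None`.
pub fn parse_fallback(line: &str) -> Option<FallbackEvent> {
    let (timestamp, payload) = line.trim().split_once(' ')?;
    if !payload.contains(r#""type":"fallback""#) {
        return None;
    }
    let from = value_after(payload, r#""from":{"model":""#)?;
    let to = value_after(payload, r#""to":{"model":""#)?;
    Some(FallbackEvent {
        timestamp: timestamp.to_string(),
        from: from.to_string(),
        to: to.to_string(),
    })
}

/// Collects every fallback away from the model you asked for.
pub fn fallbacks_from(log: &str, wanted: &str) -> Vec<FallbackEvent> {
    log.lines()
        .filter_map(parse_fallback)
        .filter(|event| event.from == wanted)
        .collect()
}
```

Run over a whole session log with `fallbacks_from(&log, "claude-fable-5")`, a non-empty result means at least part of the session ran on something other than the model you chose.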

A whole evening of work I thought I was doing on Fable was, in fact, Opus wearing Fable’s badge. The audit itself launched on the wrong model first; I only caught it because I happened to be watching the workflow panel. I killed it and relaunched it on Fable, where it chewed through an entire five-hour quota in about forty minutes, then spent the $50 of usage credits I’d been saving in about five minutes more. Even the run that worked was visibly flaky: of the 282 little agents that audit fanned out into, well over half failed outright and had to be retried.

Then, in the small hours, it started refusing entirely. My tooling caught the moment before I did:

Now failing instantly. Fable appears to be temporarily unavailable for subagents (the first three succeeded). The user explicitly required Fable, so I won’t downgrade… rather than silently switch models.

It managed three of the fixes before it went, each one green on tests, the race detector and the linter. Three real improvements to my code, written by Fable, sitting in my git history. The other three were finished by Opus, because by morning there was nothing left to finish them with.

Capable, and almost impossible to build on

There was a second wall, and I hit it before any of this, on the day Fable launched, when I tried to make it go-tool-base’s default model.

Most of what you build on top of a model isn’t a chat window. You need it to hand your code an answer in a fixed shape, the same fields in the same places every time, so the program on the other end can rely on what comes back. The usual way to guarantee that is to force the model’s hand: you don’t ask politely for the structure and hope, you require it, so a wrong-shaped answer fails outright instead of quietly slipping through.

Fable won’t be forced. Ask it to commit to a guaranteed structure and it declines, flat out. As I understand it the reasoning is a safety one: letting anyone compel a model into a precise, mandated output is itself a lever, a way to march it toward saying something it shouldn’t. Reasonable enough on paper. In practice it meant the most capable model I’d touched couldn’t drive the structured parts of my own tool, and by that first afternoon I’d quietly set the default back to Opus. It was the same refusal, I realised later, that had collapsed half of that audit’s agents.

And it is not a niche complaint. Guaranteed structure is a hard requirement for a vast swathe of what people are actually building on these models. Not everyone is making another Claude Code. Plenty of us are wiring models into systems that have to get a clean, predictable contract back every single time, and a model that reserves the right to freestyle the shape of its answer is one you simply cannot put in that seat.

The part they banned is my bread and butter

So let’s be precise about what got pulled, because the precision is the whole point.

Anthropic describe the government’s concern as “a narrow potential jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.”

Read that again. Reading a codebase and fixing its flaws. That is not some dark-web misuse I have to strain to imagine. That is my bread and butter, the literal, boring, defensive job I had Fable doing in the open, on my own project, when the shutters came down.

And here is where that earlier point earns its keep. If the banned capability were unique to Fable, you could at least follow the logic, however much you disagreed. But it isn’t, and it isn’t even close: give Opus enough time, enough budget and a patient enough hand on the prompts, and it would get to most of the same findings in the end. Fable just did it more efficiently, a difference of degree, not of kind. So banning one company’s model, for something every competitor ships and every blue team already relies on, makes precisely nobody safer. The exploit-writers keep their tools. The defenders lose one of theirs.

When the thing you have banned is available everywhere else, the ban has stopped being about safety. It is theatre. And given who is currently in charge of the theatre, it has the distinct whiff of a knee-jerk reaction, dressed as a national security triumph, by people who do not appear to understand the tool they are confiscating.

Who I’m not angry at

I want to be careful where I point this, because it would be lazy to spray it around.

I’m not angry at Anthropic. They put Fable through more than a thousand hours of external testing, with US government agencies and the UK’s AI Safety Institute among the people kicking the tyres, before it ever reached me. They satisfied every requirement put in front of them, and when the order came they complied under protest while saying, plainly, that applying this standard across the board “would essentially halt all new model deployments for all frontier model providers.” I’m a daily Claude user and an advocate for the work, and I am not going to hang the US administration’s decision around the neck of the company that did the diligence and then got told to switch the lights off anyway.

I’ll allow myself one small dig at them, over something there was nothing quiet about. Moving Fable behind a paywall on the 22nd was openly announced and planned well ahead, and the free window was never charity. It was a taster: a few days of the new addiction on the house, enough to hook the punters, before the price went up. That is a bit of a dick move, however neatly it tests in a spreadsheet. Moot now, mind, with no model left to charge for.

I’ll even grant the other side its strongest point. A government looking at agentic systems that can chain reconnaissance into working exploits has something real to be twitchy about. I get the worry. I just don’t accept that yanking one vendor’s model, for a thing every vendor does, is a coherent answer to it.

There is a grim irony in how we got here, and it loops back to something I wrote in the spring. When Anthropic first showed Mythos off, I called the fanfare what it looked like: a closed model sold on a press release, a result you couldn’t independently check, marketing until proven otherwise. Fable 5 was Anthropic finally answering that, handing the rest of us something we could actually test. But all those years of selling Mythos as too dangerous to let out were marketing too, and that half landed rather better than they can have wanted. The US administration appears to have swallowed it whole and pulled the lever. Anthropic have ended up a victim of their own hype, and the reaction that hype provoked is, there is no gentler word for it, ludicrous.

What it comes down to

The lesson I’m taking from my one day isn’t about how clever Fable was. It’s about how little that cleverness is worth if you can’t rely on the thing being there.

I couldn’t trust which model I was actually talking to from one hour to the next. I couldn’t trust it to stay up for a full overnight run. And it turns out I couldn’t trust it to still exist by the weekend. You cannot evaluate, depend on, or build a workflow around a model that gets silently swapped out one evening and switched off by the state a few days later. Capability was never the hard part. Availability is.

And underneath all of it sits the thing I keep coming back to. A classifier cannot tell a defender from an attacker, because the two of them type the same commands. It turns out a government export control can’t tell them apart either. The only thing that ever could is a human being, paying attention, who can be held responsible for the judgement. There wasn’t one of those anywhere in this loop. There was a letter, sent at 5:21pm, and by morning the best tool I had for keeping my own code honest was gone, with a polite link where it used to be.

Supporting a provider, or actually using it

Sat, 02 May 2026 00:00:00 +0000

If your CLI tool talks to an AI model, you don’t want to hard-wire one vendor. So you reach for a single client interface over several providers, which is the right call. The trap is the next step: build that interface on only what every provider has in common, and you quietly throw away the very features that made you want a particular provider in the first place. rust-tool-base’s rtb-ai refuses to make that trade.

The pull toward one interface

If your CLI tool talks to an AI model, hard-wiring one vendor is a poor bet. One user has an Anthropic key, another an OpenAI key. Someone’s on Gemini. Someone runs Ollama locally because their data can’t leave the building. Someone points at an OpenAI-compatible endpoint from a provider you’ve never heard of. You don’t want a separate code path for each, so you want one AiClient that all of them slot behind.

rtb-ai gets that unification from the genai crate, which already speaks to Anthropic, OpenAI, Gemini, Ollama and OpenAI-compatible endpoints. One interface, five providers, the tool author picks one in config. The Go sibling makes the same bet: go-tool-base’s chat package also unifies several providers, behind an interface deliberately kept to four methods. So far this is the obvious design, and if it were the whole design there’d be nothing to write about.

What “unified” quietly costs you

Here’s the catch in any unified interface. It can only expose what every provider behind it has in common.

The common subset is plain chat. Messages go in, text comes out, optionally streamed token by token. That’s real and it’s useful and every provider does it. But the common subset is also the floor, and the features that make a particular provider worth choosing are almost never on the floor. They’re the things only that provider does.

Anthropic is the sharp example, because it has three features that matter and not one of them is common-subset.

Prompt caching. You can mark the stable parts of a request, the system prompt and the tool list, as cacheable. The provider keeps them warm, and on the next turn you aren’t billed full price to re-send and re-process text that didn’t change; cached reads are charged at a reduced rate instead. On a long agent loop, where the same large system prompt rides along on every single turn, that’s a substantial saving in both cost and latency.

Extended thinking. The model works through a hard problem in a visible, budgeted reasoning pass before it commits to an answer, and you can see that reasoning.

Citations. Structured references back to source material in the response.

A client built strictly on the common subset can’t express any of those. It has no field for them, because four of the five providers wouldn’t know what to do with the field. So a purely lowest-common-denominator client would “support” Anthropic and then use it badly, leaving its best features unreachable. Support as a checkbox, not as the point.
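To put rough shape on the prompt-caching saving, here is a small sketch of the token arithmetic for an agent loop. It counts only the stable prefix (system prompt plus tool list) and assumes the cache never expires between turns. `PrefixUsage` and `prefix_usage` are illustrative names, and nothing here models actual prices, which depend on the provider's current rates for cache writes and reads.

```rust
/// How many prefix tokens are processed at full price versus read back
/// from cache over a whole loop. Illustrative model, not a billing tool.
#[derive(Debug, PartialEq)]
pub struct PrefixUsage {
    pub full_price_tokens: u64,
    pub cache_read_tokens: u64,
}

/// `turns` is the number of requests in the loop; `prefix_tokens` is the
/// size of the unchanging system prompt and tool list.
pub fn prefix_usage(turns: u64, prefix_tokens: u64, caching: bool) -> PrefixUsage {
    if turns == 0 {
        return PrefixUsage { full_price_tokens: 0, cache_read_tokens: 0 };
    }
    if caching {
        // The first turn processes the prefix and writes the cache;
        // every later turn reads it back instead of re-processing it.
        PrefixUsage {
            full_price_tokens: prefix_tokens,
            cache_read_tokens: (turns - 1) * prefix_tokens,
        }
    } else {
        // Without caching, the whole prefix is re-processed on every turn.
        PrefixUsage {
            full_price_tokens: turns * prefix_tokens,
            cache_read_tokens: 0,
        }
    }
}
```

With a hypothetical 8,000-token prefix over 50 turns, the uncached loop processes 400,000 prefix tokens at full price. The cached loop processes 8,000 once and reads the other 392,000 from cache. How much money that saves depends on what the provider charges for the cache write and for each read.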

The escape hatch

rtb-ai’s answer is to not choose. It runs two implementations under one interface.

For OpenAI, Gemini, Ollama and OpenAI-compatible endpoints, calls route through genai, the unified path. For Anthropic, every method drops to a direct reqwest implementation straight against the Messages API. Same AiClient on the surface, a different implementation underneath, selected by which provider the config names.
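The shape of that split can be sketched in a few lines. This is a simplified stand-in, not rtb-ai's actual code: the real trait is presumably async and does network I/O, whereas the stubs below just return strings. `Provider`, `UnifiedClient`, `AnthropicDirectClient`, `client_for` and the `path` method are all hypothetical names chosen for the illustration.

```rust
/// The five providers named in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Provider {
    Anthropic,
    OpenAi,
    Gemini,
    Ollama,
    OpenAiCompatible,
}

/// Simplified, synchronous stand-in for the shared client interface.
pub trait AiClient {
    fn chat(&self, prompt: &str) -> String;
    /// Which implementation is answering; here only to make the split visible.
    fn path(&self) -> &'static str;
}

/// Stub for the path that would go through the unified library.
struct UnifiedClient {
    provider: Provider,
}

/// Stub for the path that would call the provider's own API directly.
struct AnthropicDirectClient;

impl AiClient for UnifiedClient {
    fn chat(&self, prompt: &str) -> String {
        // Real code would hand the request to the unified library here.
        format!("[{:?} via unified] {}", self.provider, prompt)
    }
    fn path(&self) -> &'static str {
        "unified"
    }
}

impl AiClient for AnthropicDirectClient {
    fn chat(&self, prompt: &str) -> String {
        // Real code would build and send a direct HTTP request here.
        format!("[Anthropic direct] {}", prompt)
    }
    fn path(&self) -> &'static str {
        "direct"
    }
}

/// Picks the implementation from the provider named in config.
/// Callers only ever see `dyn AiClient`.
pub fn client_for(provider: Provider) -> Box<dyn AiClient> {
    match provider {
        Provider::Anthropic => Box::new(AnthropicDirectClient),
        other => Box::new(UnifiedClient { provider: other }),
    }
}
```

The point of the sketch is that the choice happens once, at construction. Everything downstream calls `chat` without knowing or caring which implementation it got.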

And the request type has deliberate room for the difference:

pub struct ChatRequest {
    pub system: Option<String>,
    pub messages: Vec<Message>,
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    /// Anthropic-only: enables prompt caching at every stable point.
    /// Ignored on non-Anthropic providers.
    pub cache_control: bool,
    /// Anthropic-only: extended-thinking budget. `None` disables.
    /// Ignored on non-Anthropic providers.
    pub thinking: Option<ThinkingMode>,
}

Set cache_control and the Anthropic-direct path inserts cache breakpoints at the three stable points: the system prompt, the tool list, and the first message. Set thinking and it adds the thinking block, and streaming surfaces a separate ThinkingToken event so you can show the reasoning apart from the answer. On a non-Anthropic provider, both fields are simply ignored. The interface carries them; only the implementation that understands them acts on them.
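A rough sketch of "carried by the interface, acted on by one implementation" might look like this. It is an assumption-laden illustration, not rtb-ai's code: `Message` and `ThinkingMode` are minimal stand-ins for types the post doesn't show, and `Path`, `Extras` and `extras_for` are hypothetical names. The sketch only reports which provider-specific extras each path would act on; it builds no real request.

```rust
/// Minimal stand-in for the message type (not shown in the post).
#[derive(Debug, Clone, Default)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Hypothetical stand-in: a thinking mode that is just a token budget.
#[derive(Debug, Clone, PartialEq)]
pub enum ThinkingMode {
    Budget(u32),
}

#[derive(Debug, Clone, Default)]
pub struct ChatRequest {
    pub system: Option<String>,
    pub messages: Vec<Message>,
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub cache_control: bool,
    pub thinking: Option<ThinkingMode>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Path {
    Unified,
    AnthropicDirect,
}

/// What a given path would add on top of plain chat.
#[derive(Debug, PartialEq, Default)]
pub struct Extras {
    pub cache_points: Vec<&'static str>,
    pub thinking_budget: Option<u32>,
}

pub fn extras_for(path: Path, req: &ChatRequest) -> Extras {
    match path {
        // The unified path has nowhere to put these fields, so they are inert.
        Path::Unified => Extras::default(),
        Path::AnthropicDirect => Extras {
            // The three stable points named in the text.
            cache_points: if req.cache_control {
                vec!["system", "tools", "first_message"]
            } else {
                Vec::new()
            },
            thinking_budget: match &req.thinking {
                Some(ThinkingMode::Budget(tokens)) => Some(*tokens),
                None => None,
            },
        },
    }
}
```

A caller opts in with something like `ChatRequest { cache_control: true, ..Default::default() }`. Leaving both fields at their defaults produces the same `Extras` on either path, which is what keeps the request portable.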

A hatch, not a leak

It’s worth being precise about why this isn’t the thing it superficially resembles, which is a leaky abstraction.

A leaky abstraction is one where implementation details bleed through that you didn’t intend and can’t reason about. The abstraction quietly fails to abstract, and you’re left guessing which provider you’re really talking to.

This is the opposite of that. The two Anthropic-only fields aren’t a leak. They’re named, documented as Anthropic-only, inert everywhere else, and right there in the public type for anyone to see. The interface is uniform for the common case and deliberately, visibly non-uniform at exactly the points where uniformity would have cost you the good features. You opt into provider-specifics by setting a field. You stay fully portable by leaving it at its default. Nothing bleeds; you decide.

The same design line explains what does stay in the unified path. Structured output, chat_structured::<T>, sends a JSON Schema derived from your Rust type with the request and validates the reply against it before handing you a typed T. That’s a portability win that costs nothing across providers, so it belongs in the common interface. The split isn’t “Anthropic versus the rest”. It’s “features that are free to unify go in the unified path; features that aren’t get a designed door”. Prompt caching and extended thinking get the door, because flattening them away would be the expensive kind of convenient.
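The validate-before-return idea can be shown with a standard-library-only sketch. To stay dependency-free it swaps JSON Schema for a deliberately tiny `key=value;key=value` format, so treat it as an analogy for the technique rather than how `chat_structured` is implemented. The `Structured` trait, the `Finding` type and `chat_structured_sketch` are all invented for the example, and the model is passed in as a plain closure.

```rust
/// A type that can describe its expected shape and parse a reply into itself.
/// Stand-in for what a schema derive plus a JSON deserializer would provide.
pub trait Structured: Sized {
    fn schema() -> &'static str;
    fn parse(reply: &str) -> Result<Self, String>;
}

/// Hypothetical result type for an audit finding.
#[derive(Debug, PartialEq)]
pub struct Finding {
    pub file: String,
    pub severity: u8,
}

impl Structured for Finding {
    fn schema() -> &'static str {
        "file=<path>;severity=<1-5>"
    }

    fn parse(reply: &str) -> Result<Self, String> {
        let mut file = None;
        let mut severity = None;
        for part in reply.trim().split(';') {
            match part.split_once('=') {
                Some(("file", value)) if !value.is_empty() => file = Some(value.to_string()),
                Some(("severity", value)) => {
                    let n: u8 = value
                        .parse()
                        .map_err(|_| format!("severity is not a number: {}", value))?;
                    if !(1..=5).contains(&n) {
                        return Err(format!("severity out of range: {}", n));
                    }
                    severity = Some(n);
                }
                // Anything else is a wrong-shaped answer and fails outright.
                _ => return Err(format!("unexpected field: {}", part)),
            }
        }
        Ok(Finding {
            file: file.ok_or("missing file")?,
            severity: severity.ok_or("missing severity")?,
        })
    }
}

/// Sends the prompt plus the expected shape, then validates the reply
/// before the caller ever sees it.
pub fn chat_structured_sketch<T, F>(model: F, prompt: &str) -> Result<T, String>
where
    T: Structured,
    F: Fn(&str) -> String,
{
    let full_prompt = format!("{}\nReply exactly as: {}", prompt, T::schema());
    T::parse(&model(&full_prompt))
}
```

The caller gets either a typed `Finding` or an error, never free-form text to pick through. Because the check lives in the client and not in any one provider's API, it costs nothing to offer across all of them.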

To sum up

A CLI tool that integrates AI wants one client over several providers, and a unified interface can only expose what those providers share. The shared floor is plain chat, and the features worth choosing a provider for, like Anthropic’s prompt caching, extended thinking and citations, are never on the floor.

rtb-ai keeps both. genai provides the unified path across five providers; an Anthropic-direct reqwest path drops below the abstraction for the features genai can’t reach, and ChatRequest carries the Anthropic-only fields openly, ignored elsewhere. Uniform where uniformity is free, with a designed escape hatch where it isn’t. That’s the difference between supporting a provider and actually using it.