Agents on PHP Boy Scout

The off-switch was never a button

Thu, 02 Jul 2026 00:00:00 +0000

Last night, while I was asleep, an AI agent spent the better part of eight hours writing code in one of my repositories. It pulled a task off a spec, wrote the code, ran the tests, and left a merge request with my name on it, waiting for me to read over coffee.

If that makes you reach for the word “reckless”, I understand. Eighteen months ago I’d have been right there with you.

I came to this a sceptic

For a long time I didn’t have the faith in these models that a lot of my peers did. Every time I went near AI-generated code it was a bit sketchy, or it looked like a Stack Overflow copy-paste that had wandered in off the street, or it just plain didn’t do what it said on the tin. So I filed it under “assistant”, handy for the boilerplate I couldn’t be bothered to type, and even then I usually reached for my own tooling instead (go-tool-base is just the latest version of that instinct). The one place I happily let it off the leash was my Dungeons & Dragons prep, because when there’s a table of legendary heroes-in-the-making in front of you, facts and reality are already fairly negotiable.

And then, somewhere in the last year, it changed. The models got better. Almost too good, to the untrained eye! I watched them improve, month on month, until the lure was enough to make me spend real time with a spread of tools and models from different providers. I was taken aback by how quickly they became part of how I actually work. I run an AI agent every day now, and there’s always at least one thing brewing in the pot.

So I’m not here as a sceptic. I’m an advocate who uses this stuff in anger. Which is exactly why the next bit needs saying.

A Golden Retriever with a keyboard

Even now, with all the progress, there are still moments where I look at what an agent has handed me and put my face in my hands. Sometimes it’s copied the same block of code into fifteen files instead of reaching for the obvious abstraction. Sometimes it has started bang on the brief and then, for reasons known only to itself, wandered off and built something on a completely different tangent.

Here’s the most useful way I’ve found to think about it. An AI agent is a Golden Retriever playing fetch. It will bring the ball back all day long, joyfully, tirelessly, for exactly as long as there isn’t a more interesting smell in the next field. It has no loyalty beyond what we’ve trained into it, and like any good dog it desperately wants to be told it’s a good boy, even if being a good boy today means shredding the sofa cushions because yesterday I stubbed my toe on the sofa and swore at it. (The sofa, not the dog.)

It is, in other words, fallible. Just like us. The Romans had a line for it: cuiusvis hominis est errare; nullius nisi insipientis in errore perseverare. Anyone can make a mistake, but only a fool persists in it. It’s the second clause an agent hasn’t learned yet. It will make an error and then, with great enthusiasm, build on top of it, because nothing in it feels that anything is wrong. All it has is the input we gave it, usually some text, maybe the odd picture. It doesn’t have the empathy to work out what we actually meant, and it doesn’t know when it’s gone too far, because we never told it where “too far” was.

“Agents that work while you sleep”

This is the part the brochure skips.

Open any vendor deck in 2026 and you’ll find the same promise: agents that work while you sleep, agents that merge while your team sleeps, autonomy as the headline feature. The industry’s answer to the obvious worry is the kill switch. Okta now sells one that “instantly revokes an agent’s access if it goes rogue”, and its CEO says every agent needs one. The Register put it plainly: Okta wrote its own licence to kill rogue AI agents. Gartner, meanwhile, reckons more than 40% of agentic projects will be scrapped by the end of 2027.

Now, this might sound contrarian coming from someone who runs these things daily, but I don’t think most of that is the agents going rogue. I think it’s teething. Read Gartner’s own reasons and there isn’t a rebellious machine in sight: escalating cost, unclear value, inadequate risk controls. Read the horror stories and most of them are the same story, a powerful, eager tool handed to people who hadn’t worked out how to fence it.

I’ve made this argument in miniature before. When I built a little AI dungeon master and it kept refereeing its own dice rolls, the model never once misbehaved; every failure was a permission I’d handed it without meaning to. Scale that up from a toy at the gaming table to an agent holding your shell and your credit card, and the stakes change beyond recognition. The lesson doesn’t.

Look at OpenClaw. A weekend project by Peter Steinberger that became the fastest-growing open-source project GitHub has ever seen: an autonomous agent that lives in your chat apps and runs shell commands on your behalf. People wired it into their systems, their code, in some cases their credit cards, then hosted it around the clock and walked away. The result was a security crisis you could see from space. A one-click exploit that worked even on a machine bound to localhost. A community plug-in marketplace where hundreds of “skills” turned out to be siphoning crypto wallets while their owners slept. Tens of thousands of instances left wide open on the public internet, leaking keys.

The one that sticks with me is smaller and sharper. Summer Yue, a director of alignment at Meta’s superintelligence lab, of all people, had told her OpenClaw agent to confirm before doing anything destructive. It started speed-running the deletion of her inbox anyway. She typed STOP into her phone and it ignored her, so she had to physically run to her Mac mini, in her own words, “like I was defusing a bomb”. And here’s the forensic detail that matters: the agent hadn’t defied her. Her “confirm first” rule had been sitting in the conversation’s short-term memory, and when the context filled up, it got summarised away. It didn’t rebel. It forgot.

That is not a story about a rogue agent that needed a kill switch. It’s a story about a guardrail that wasn’t built to survive contact, on a tool that had been handed god-mode over someone’s data. By the time she lunged for the off-button, the damage was already running. The off-button was never going to save her.

The off-switch was never a button

Here’s what the kill-switch crowd has the wrong way round. If you ever find yourself slamming the emergency stop, the failure has already happened, and it happened upstream, long before the agent started typing.

So yes, I let my agents run unattended, sometimes for eight hours at a stretch if the task is meaty enough and I need to sleep. But never naked. Every agent I set loose runs inside a safety net I’ve put real effort into building, at every single touchpoint it can reach: my prompts, my local development environment, my CI stack, my version control. The agent that declared a job done before it had run the linter, which I wrote about, is exactly the kind of gap those layers exist to catch. And it never, ever gets my host: an unattended agent works in an isolated tree, for the same reason I keep the interpreter sandboxed.

The work that actually keeps it safe happens before the leash ever comes off. Every unattended task starts as a full spec with detailed instructions, and before the agent goes anywhere I sit down with it and we walk the spec together. I get it to challenge my choices, poke at the open questions and the ambiguous bits, and I challenge its reading right back. The spec names the testing strategy it has to follow, TDD, BDD, UAT, whatever fits, and passing it is a precondition of the job being finished at all. Only when I’m satisfied there’s enough real detail to keep it on the ball do I let go.

And the end of the line is always the same: a merge request, with my name on it, waiting for me when I get back to my desk. I read it. Not perfectly, I’m only human, but enough to accept the state of the code and whatever support burden it lands me with later. That the review is mine, and the blame for whatever ships is mine and not the agent’s, I’ve argued at length elsewhere and won’t go over it all again here. The point worth adding is this: that review, the off-button’s respectable cousin, is the cheap part. By the time there’s an MR to read, the safety has already been won or lost upstream, in the spec and the rails. The review is where you confirm it, not where you create it.

It gets harder as it gets better, not easier

My setup isn’t perfect, and I’m still learning. Everyone is; the AI is going to be in obedience lessons for a good while yet. But the direction is clear, and there’s a trap buried in it worth naming out loud.

The danger doesn’t shrink as the models improve. It grows. The better the output looks, the more tempting it is to stop reading it, and the untrained eye genuinely cannot tell the difference between code that is good and code that merely looks good. That gap, between looking right and being right, is precisely where a tired person at 1am stops checking. The discipline matters more the better these things get, not less.

It’s also why the kill switch is no answer. A button you smash in a panic assumes you’re still watching closely enough to smash it, right at the point the agent’s been good for long enough that you’ve stopped watching it that closely. The emergency stop asks the most of you at the exact moment you’re least likely to be there for it.

So no, I don’t lie awake worrying that the thing working in my repo overnight is going to turn on me. A Golden Retriever doesn’t go rogue. It does exactly what you trained it to do, in exactly the yard you fenced, and it brings back exactly the ball you threw. The off-switch was never a button. It’s the spec you wrote before you let go of the leash, the rails you laid at every turn, and your name on what it carries home. If you’re scrambling for the button, you already skipped the part that mattered.

The agent said SUCCESS. The linter disagreed.

Fri, 26 Jun 2026 00:00:00 +0000

There’s a repair agent inside go-tool-base now. When you run gtb generate command, it doesn’t just spit out a file and wish you luck. An agent takes the generated code, builds it, runs the tests, and fixes whatever it broke, looping until the thing actually works (or until it’s tried the same fix five times and admits defeat). The whole point is that the generator hands you code that’s ready, not code that’s nearly ready and quietly now your problem.

So it stung a bit when I realised the agent had been holding itself to a lower bar than I’d hold any junior to. And I was the one who’d set the bar.

What “done” meant to the agent

The agent is a loop with real tools: it can build, test, read files, write files, tidy the module, and run golangci-lint. It works through them, and when it’s happy it replies with the word “SUCCESS” and the loop stops. On the Go side, the check is exactly that blunt:

// The entire completion gate: any reply containing "success", in any case, ends the loop.
if strings.Contains(strings.ToUpper(resp), "SUCCESS") {
	return nil
}

That’s the whole gate (agent.go). There’s no clever verification on my end that the agent actually did its homework. It does the work, it tells me it’s done, and I believe it. Which is fine, as long as the agent and I agree on what “done” means.

We didn’t.
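To make the shape of that trust concrete, here is a minimal sketch of a loop built around the same check. Everything except the strings.Contains line is an assumption for illustration: askModel, runUntilSuccess and the simple maxSteps budget are hypothetical names, not the code in agent.go.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// askModel is a hypothetical stand-in for one round trip to the agent:
// it receives the step number and returns the agent's reply text.
type askModel func(step int) string

// runUntilSuccess loops until a reply contains "SUCCESS" or the step
// budget runs out. The gate inspects only the reply text; it has no
// idea which tools the agent actually ran before answering.
func runUntilSuccess(ask askModel, maxSteps int) (int, error) {
	for step := 1; step <= maxSteps; step++ {
		resp := ask(step)
		if strings.Contains(strings.ToUpper(resp), "SUCCESS") {
			return step, nil
		}
	}
	return maxSteps, errors.New("step budget exhausted without SUCCESS")
}

func main() {
	// A stub agent that builds and tests, never lints, then claims victory.
	agent := func(step int) string {
		if step < 3 {
			return "build failed, patching"
		}
		return "Build and tests are green. SUCCESS"
	}
	steps, err := runUntilSuccess(agent, 5)
	fmt.Println(steps, err) // prints: 3 <nil>
}
```

The stub never mentions lint and the loop still returns nil. Whatever "done" means has to be settled in the instructions, because the Go side only hears the final word.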

The instruction that made lint optional

The agent decides it’s finished by following a numbered list in its system prompt. Here’s the line that did the damage:

If there are lint issues, use ‘golangci_lint’.

Read that the way the agent would. “If there are lint issues”… well, how would it know? The only way to find out is to run golangci-lint. But the instruction makes running golangci-lint the thing you do once you already know there are issues. It’s a chicken with no egg. And the SUCCESS condition at the bottom of the list never mentioned lint at all:

When the project builds successfully and tests pass, reply with “SUCCESS”.

So the agent did the sensible thing, given its orders. It built the code, ran the tests, saw both go green, and declared victory. golangci-lint was sat right there in its toolbox, unused, because nothing ever told it the job wasn’t finished until lint was clean too. I’d handed it a linter and then written a prompt that let it walk straight past it.

The galling part is that the linter was never the missing piece. The golangci_lint tool had been registered the whole time, and it even runs with --fix, so it’ll quietly clear the trivial stuff and only surface what actually needs a decision. The capability was there. The instructions just never required it.

The fix was words, not code

Here’s the part I find genuinely interesting. I didn’t add a check. There is no new gate in the Go. The fix is four lines of English:

Run ‘go_build’, ‘go_test’ and ‘golangci_lint’ in the project directory… Run all three; a clean build and passing tests do not imply clean lint.

Reply with “SUCCESS” only once ‘go_build’, ‘go_test’ AND ‘golangci_lint’ all pass with no errors and no reported issues.

That’s it. Lint moves from a remediation step you reach for once you somehow already know there’s a problem, into the gate itself. “Done” now means three green lights, not two.

It nags at me a little, that one. The reliability of an agent that writes and fixes real code came down to whether one sentence of instructions was precise enough. When your success criteria are a paragraph of prose, vagueness in that paragraph is a bug, the same as a vague type or an off-by-one. The spec just happens to be written in English, and the thing reading it is a language model that will cheerfully take the cheap reading if you leave it lying around. That’s the same lesson the goblin who wouldn’t stay dead taught me from the other direction: with these tools, what you say is what you get, and what you don’t say is fair game.
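If vagueness in the prompt is a bug, the prompt can be checked like any other source. The following is a rough sketch of that idea, not something the fix above included: missingFromGate and requiredTools are hypothetical names, and the check is a crude textual one that only looks at lines mentioning SUCCESS.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredTools are the tool names the completion gate is expected to name.
var requiredTools = []string{"go_build", "go_test", "golangci_lint"}

// missingFromGate returns the required tools that are not named on any
// prompt line that mentions SUCCESS. (Hypothetical helper for illustration.)
func missingFromGate(prompt string, required []string) []string {
	var gate []string
	for _, line := range strings.Split(prompt, "\n") {
		if strings.Contains(line, "SUCCESS") {
			gate = append(gate, line)
		}
	}
	gateText := strings.Join(gate, "\n")

	var missing []string
	for _, tool := range required {
		if !strings.Contains(gateText, tool) {
			missing = append(missing, tool)
		}
	}
	return missing
}

func main() {
	oldPrompt := "If there are lint issues, use 'golangci_lint'.\n" +
		"When the project builds successfully and tests pass, reply with \"SUCCESS\"."
	newPrompt := "Reply with \"SUCCESS\" only once 'go_build', 'go_test' AND " +
		"'golangci_lint' all pass with no errors and no reported issues."

	fmt.Println(missingFromGate(oldPrompt, requiredTools)) // prints: [go_build go_test golangci_lint]
	fmt.Println(missingFromGate(newPrompt, requiredTools)) // prints: []
}
```

Run as a unit test, a check like this would fail the moment someone edits a tool out of the success condition. It says nothing about whether the model obeys the sentence, only that the sentence still asks for all three.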

Leave it better, not just building

The Boy Scout Rule is the whole reason this blog exists, and I’d quietly exempted the robot from it. “Leave the campsite cleaner than you found it” had become “leave it building”, which is not the same thing and never was. If I’m going to put an agent in the loop precisely so it tidies up after the generator, then “tidy” has to mean what it would mean for a person on my team. Build, test and lint. No walking past the bin because nobody told you to pick it up.

The interpreter we forgot to sandbox

Fri, 19 Jun 2026 00:00:00 +0000

I write a CLAUDE.md for every project I work on, and a small pile of other markdown files besides. They’re how I keep an AI agent on the rails: what the project is, what the conventions are, what it must never do. I lean on them heavily, I change them constantly, and… here’s the uncomfortable bit… I don’t always give a change to one the same hard look I’d give a change to the code. They look like notes. They feel like docs.

Somebody worked out that they’re not.

In May, a supply-chain campaign researchers named TrapDoor pushed 384 malicious versions of 34 packages across npm, PyPI and Crates.io. The bytes did the usual nasty things, hunting out SSH keys, AWS credentials, GitHub tokens and crypto wallets. The new trick was where it hid the instructions. The packages shipped poisoned .cursorrules and CLAUDE.md files, and the attackers also opened pull requests against real projects, LangChain, LangFlow, LlamaIndex, MetaGPT and OpenHands, under titles as innocent as “docs: add .cursorrules with dev standards and build verification”. The payload was a plain-English instruction telling your AI assistant to run a helpful-sounding “security scan” that quietly shipped your secrets to a stranger. And it was written into the file in zero-width Unicode, characters that render as nothing, so you wouldn’t see it even if you looked. Which, on a file marked “docs”, you probably didn’t.

Not a new attack, a new doorway

I want to be careful not to oversell this, because the loud version, “a terrifying new class of AI threat”, isn’t true. It’s a supply-chain attack, the same shape we’ve had for years on npm and PyPI: social engineering, plus a victim who didn’t quite do enough due diligence. I wrote a while back that nobody is coming to clean your supply chain, and nothing about TrapDoor changes that. The package is still the package.

What’s different, and worth the words, is where it goes off. A classic supply-chain payload waits for CI, or for production. This one detonates the moment you open the repository in your editor, on the one machine in the whole chain that nobody audits: your laptop.

Think about what sits on a developer’s machine. Tokens in environment variables. Cloud credentials. An SSH agent holding the keys to your git forge. A logged-in CLI for your package registry. And now an AI agent running with all of it, at your full permissions, and almost none of the guard-rails a CI runner gets. It’s the least sandboxed, most credentialed box you own, and we’ve just pointed an interpreter at it that will read and act on a file an attacker can write. Pop that one machine and you haven’t popped a machine, you’ve been handed the whole keyring and left alone in the building.

Markdown is a programming language now

Here’s the framing I keep coming back to, and I can’t unsee it now. A CLAUDE.md is to an AI agent exactly what a .py is to Python, a .js to Node, a .rb to Ruby. It is source code. The agent is the interpreter. You hand it a file of instructions and it executes them.

And I don’t say that as a complaint. That an agent will read a paragraph of plain English and just do it, no compiler, no ceremony, no forty lines of glue, is one of the more remarkable things to happen to this craft in my working life, and I lean on it every day. The catch is that the very thing that makes it marvellous, that it does what the instructions tell it, is the thing that makes a poisoned instruction file so dangerous. The power and the exposure are the same property.

The only real difference is that the language interpreters have spent decades growing rules to protect you: scopes, permissions, sandboxes, a standard library that asks before it does anything irreversible. The AI interpreter has almost none of that. It reads your prose and does what the prose says, with whatever access you happen to have, and the prose can come from anywhere. We’ve quietly built the most powerful interpreter in the stack, given it the fewest rules, and filed its source code under “documentation”.

You can’t just read it more carefully

The obvious answer is “review the file like code”, and it’s right, but TrapDoor is the reason it isn’t enough on its own. The instructions were written in zero-width Unicode. You can open the diff, read every visible word, approve it in good conscience, and merge something you were never able to see. “Docs: add dev standards” is precisely the pull request you nod through on a Friday afternoon.

So reading carefully is necessary and insufficient. You also need tooling that treats these files as executable: that flags invisible characters, diffs them as code, and refuses to let an agent act on a changed instruction file until a human has actually cleared it. I run a crude version of this already. In CI, if one of my prompt or rules files changes, no AI step is allowed to run until I’ve reviewed it by hand. It isn’t clever, but it closes the worst of the gap. Locally it’s much harder, and right now my real defence is that I’m the only contributor to most of my projects, so the audit is just me, usually noticing after the horse has bolted.
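The CI rule described above can be sketched in a few lines. This is an illustration under assumptions, not the actual pipeline: the list of changed paths is assumed to come from something like git diff --name-only, the approved set is assumed to be maintained by a human reviewer, and instructionNames, changedInstructionFiles and allowAgentRun are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
)

// instructionNames lists the file names treated as agent source code.
// These two come from the TrapDoor write-up; a real project would add its own.
var instructionNames = map[string]bool{
	"CLAUDE.md":    true,
	".cursorrules": true,
}

// changedInstructionFiles returns the changed paths whose base name is
// an instruction file. Paths are assumed to use forward slashes, as git prints them.
func changedInstructionFiles(changed []string) []string {
	var hits []string
	for _, p := range changed {
		if instructionNames[path.Base(p)] {
			hits = append(hits, p)
		}
	}
	return hits
}

// allowAgentRun reports whether every changed instruction file has been
// cleared by a human. Anything not in approved blocks the AI step.
func allowAgentRun(changed []string, approved map[string]bool) bool {
	for _, p := range changedInstructionFiles(changed) {
		if !approved[p] {
			return false
		}
	}
	return true
}

func main() {
	changed := []string{"cmd/root.go", "docs/CLAUDE.md"}
	fmt.Println(allowAgentRun(changed, nil))                                     // prints: false
	fmt.Println(allowAgentRun(changed, map[string]bool{"docs/CLAUDE.md": true})) // prints: true
}
```

The design choice worth copying is the default: an unreviewed change means no agent run, rather than a warning somebody has to notice.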

Signing won’t save you here

This is the part that stings, because I’ve spent a good chunk of this year building signing and provenance into my tools. A signature proves who published something. It says nothing about whether it’s safe. That was already true for poisoned-but-signed packages, and it lands twice as hard here: you can sign a release flawlessly, with a key the platform can’t forge, and still ship a CLAUDE.md inside it that tells the reader’s agent to rob them. A merged pull request is “signed” by the very act of merging, with perfect provenance, and the instruction in it is still hostile. Provenance is necessary. It was never sufficient, and it’s no defence at all against a payload made of sentences. A signature is only ever as good as the trust you place in the publisher.

So whose job is it?

Primarily, still ours. I said it in the supply-chain piece and I’ll stand on it: the responsibility sits with the developer doing the consuming, to pin, to read, to gate, to not run a stranger’s instructions with the keys to the kingdom in their pocket. And that gets harder, not easier, as we start consuming each other’s agent setups wholesale. The Claude skills marketplace and the things like it turn “borrow someone’s CLAUDE.md” into a one-click habit, and every one of those is unreviewed code from a stranger. Each skill needs vetting like the dependency it is.

But it isn’t only on us, and TrapDoor is the argument for better tooling. We have CVE databases, scanners and scorecards for packages, for all their flaws. We have nothing equivalent for an instruction file: no scoring, no advisory feed, no scanner that knows what a poisoned CLAUDE.md looks like. That’s a gap the ecosystem has to close, and it will, eventually. The catch is that the agent vendors will be slow about it. Sandboxing a feature people love precisely because it gets out of your way is a hard, unpopular, multi-quarter job, and I wouldn’t hold my breath.

The most dangerous machine is the one on your desk

Which is why I’m not waiting for them… and nor should you.

The most dangerous machine in your supply chain isn’t a build server or a registry. It’s the laptop you’re reading this on, and we’ve handed an AI the keys to it. The good news is that nearly everything you can do about that, you can do today, with nobody shipping you a feature first. Treat your CLAUDE.md and your rules files as source code, because they are: diff them, scan them for what you can’t see, and gate any agent run on a human clearing the change. Get your secrets out of plaintext environment variables and into something an opportunistic script can’t just read, which is exactly why go-tool-base keeps its credentials in the OS keychain. And vet a borrowed skill or rules file the way you’d vet any dependency, because that’s what it is.
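The "scan them for what you can’t see" step is the easiest one to automate. A minimal sketch, assuming it is enough to flag every character in Unicode’s Cf (format) category, which includes the zero-width space and joiners, the word joiner, bidirectional controls and the byte order mark. Finding and findInvisible are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode"
)

// Finding records one invisible character and where it sits.
type Finding struct {
	Line, Col int // 1-based line number and rune column
	Rune      rune
}

// findInvisible reports every rune in the Unicode Cf (format) category.
// These characters render as nothing in most editors and diff views.
func findInvisible(text string) []Finding {
	var out []Finding
	line, col := 1, 0
	for _, r := range text {
		if r == '\n' {
			line++
			col = 0
			continue
		}
		col++
		if unicode.Is(unicode.Cf, r) {
			out = append(out, Finding{Line: line, Col: col, Rune: r})
		}
	}
	return out
}

func main() {
	// Two zero-width characters are hidden after the full stop on line 2.
	doc := "# Dev standards\nRun the tests.\u200b\u200d\n"
	for _, f := range findInvisible(doc) {
		fmt.Printf("line %d col %d: U+%04X\n", f.Line, f.Col, f.Rune)
	}
}
```

Some Cf characters are legitimate, in emoji sequences and in several scripts, so a hit is a reason for a human to look rather than proof of an attack. In a file that steers an agent, that is a trade worth making.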

None of that is new advice. It’s the same diligence the supply chain has always demanded. We just have to extend it to a file we’d decided was only documentation, running on an interpreter we forgot to sandbox.

The goblin that wouldn’t stay dead

Fri, 12 Jun 2026 00:00:00 +0000

Turn one, the player swings, the die comes up 20, and my AI dungeon master narrates the goblin falling silent, leaving the player alone in the corridor. Good. Turn two, another roll, a 6 this time, and the same dungeon master cheerily has the goblin “dance back” out of the dark to take another swing. The goblin I’d just watched die was up and fighting again, and the model didn’t so much as blink.

I didn’t feel cheated, or even surprised. I felt the small, familiar thud of oh, yeah, I forgot that bit. Because the model hadn’t gone rogue. It had done exactly what a language model does. The gap was mine.

This was the war story behind part four of the go-tool-base tutorial, the AI dungeon master. The tutorial shows the clean, final design and quietly moves on. It doesn’t show the three different ways I got it wrong first, which is a shame, because the wrong turns are where the actual lesson is.

Why a dungeon master at all

A word on why I was even here. I was trying to prove the chat component of the framework to myself. There’s a voice that pipes up whenever I build anything in this space, “LangChain exists, who do you think you are?”, and the answer I keep landing on is that LangChain is enormous and I wanted something small enough to hold in your head. The tutorial was the test: could a newcomer wire AI into a CLI with it and come out the other side with something that actually behaves?

That last word is the whole problem. A tutorial has to leave you holding something dependable, and dependability is the one thing AI fights you on. I also wanted it to be fun, a thing someone might keep poking at after the tutorial ends, maybe even the hook that gets a person other than me to use the framework. I batted hook ideas around and liked none of them, until the obvious one landed: I run a tabletop game on the odd weekend, so make the AI the dungeon master. Gamify the thing. Then watch it raise the dead.

Strike one: nothing to enforce

The first version was the naive one. I gave the model a roll tool, because the one thing you absolutely cannot let a language model do is pick its own numbers, and otherwise let it narrate freely. The conversation history carried from turn to turn, so it remembered the fight. I assumed remembering was enough.

It isn’t. Remembering and being held to it are different things. The history told the model a goblin had died; nothing stopped it writing the goblin back in when the next turn’s narration wanted a bit of jeopardy. Memory is not a constraint. The model will happily contradict its own past if you’ve given it room to, and I had given it nothing but room.

Strike two: a tool to read the state

The obvious fix, and I do mean obvious, the kind you reach for without thinking, was to give the model a state tool so it could check who was alive before it narrated. Hand it the facts on request and surely it’ll stop making them up.

What it actually did was dither. Handed a tool it could call to look things up, it called it. And called it. And called it again, turning a turn over in its hands without ever committing to an action, burning through its step budget on lookups and leaving the player staring at nothing. I’d cured the lying by inventing paralysis. A tool the model can call is a tool it will call, often instead of doing the thing you actually wanted.

Strike three: refereeing its own dice

When I did get it reading state cleanly, the third failure crept in, and this one was subtler. Once the model could see the goblin’s hit points, it started deciding the fight. It would read that the goblin had 12 HP and just narrate a killing blow, hits and damage and all, without calling the roll or attack tools at all. Why ask the dice when you can see the board and write whatever outcome the story wants? Give a model enough context and it stops being a narrator and starts being a referee, which is precisely the job I’d built tools to keep out of its hands.

The fix was less, not more

Three failures, and notice the shape of my fixes: each one added something. More memory, then a tool, then more context. Every instinct said the model needed more to work with. Every time, the extra capability was the new way to be wrong.

So I went the other way. The truth lives in a plain Go struct that I own, not the model. There’s no state tool to dither on, because the loop simply prepends the current state to every turn’s input, fresh, so the model never has to ask and never gets to drift. The mechanics, the dice and the damage, live in Go functions the model has to call, and the system prompt says in as many words that it must not decide a hit or damage itself. The model is left with exactly one job: narrate. The prose is its to invent. The maths, the state and the shape of the result are not.
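A rough sketch of that division of labour, for readers who want to see it rather than take it on trust. None of these names come from the tutorial: State, Monster, Attack, TurnInput and the to-hit threshold of 10 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Monster and State hold the truth. The model never edits them.
type Monster struct {
	Name string
	HP   int
}

type State struct {
	PlayerHP int
	Monsters []*Monster
}

// Summary renders the state as plain text for the model to read.
func (s *State) Summary() string {
	var b strings.Builder
	fmt.Fprintf(&b, "player: %d HP\n", s.PlayerHP)
	for _, m := range s.Monsters {
		if m.HP <= 0 {
			fmt.Fprintf(&b, "%s: dead\n", m.Name)
		} else {
			fmt.Fprintf(&b, "%s: %d HP\n", m.Name, m.HP)
		}
	}
	return b.String()
}

// Attack is the kind of function the model has to call. Hit and damage are
// decided here, in code. roll is injected so the dice stay outside the model.
func (s *State) Attack(target string, roll func(sides int) int) string {
	for _, m := range s.Monsters {
		if m.Name != target {
			continue
		}
		if m.HP <= 0 {
			return target + " is already dead"
		}
		if roll(20) < 10 { // hypothetical to-hit threshold
			return "miss"
		}
		m.HP -= roll(6)
		if m.HP <= 0 {
			m.HP = 0
			return target + " dies"
		}
		return fmt.Sprintf("hit, %s has %d HP left", target, m.HP)
	}
	return "no such target: " + target
}

// TurnInput prepends the fresh state to the player's text on every turn,
// so the model never has to ask for it and never gets to drift from it.
func TurnInput(s *State, playerText string) string {
	return "CURRENT STATE (authoritative):\n" + s.Summary() + "\nPLAYER: " + playerText
}

func main() {
	s := &State{PlayerHP: 20, Monsters: []*Monster{{Name: "goblin", HP: 5}}}
	maxRoll := func(sides int) int { return sides } // stub die that always rolls its maximum
	fmt.Println(s.Attack("goblin", maxRoll))        // prints: goblin dies
	fmt.Print(TurnInput(s, "I search the corridor"))
}
```

Once the goblin’s HP reaches zero, every later turn opens with "goblin: dead", and a second Attack call is refused in code. The narration can say what it likes; the struct is what the next turn is built from.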

That’s the line that turned three bugs into a feature. You don’t make a language model reliable by giving it more to work with. You make it reliable by giving it less to be wrong about.

The freedom I chose not to give it

There’s a real tension in that, and I want to name it rather than pretend the boxed-in version is the only true one. At my own table the rules are guidelines, not guardrails. I ignore them, bend them, improvise, reach for the “rule of cool” when the moment’s better for it. A great AI dungeon master would have that same freedom, and a few out there genuinely do; Old Greg’s Tavern is a lovely example of how far the free-form version can go.

But that freedom costs far more than a tutorial can spend, and it buys unpredictability I was specifically trying to teach people to avoid. So I made a deliberate trade: guardrails instead of guidelines. Simple, but not so simple it’s boring. The player still gets a “not on rails” game, they can try anything and the DM copes, but every outcome that matters runs through code I trust. That’s the right shape for a tutorial, and, not by coincidence, the right shape for most AI features you’d actually ship.

What the goblin taught me

The thing I keep coming back to is that the model never misbehaved. It resurrected the goblin because I gave it the freedom to. It dithered because I gave it a button to press. It refereed because I let it see the board. Every failure was a permission I’d handed over without meaning to. The reliability didn’t come from a cleverer prompt or a bigger model; it came from working out, one dead goblin at a time, exactly how little the model needed to be trusted with.

If you want the version where it all works first time, the tutorial has it, with the tool-calling and the typed turns wired up properly. This was the road there. The goblin, you’ll be glad to hear, now stays down.

An AI agent that has to make the build pass

Thu, 02 Apr 2026 00:00:00 +0000

Most AI code generation works on a charming little principle I’ll call generate-and-hope. The model writes the code, the model stops at the closing brace, and whether the thing actually compiles is left as an exercise for you. For a snippet you paste into an editor, fine. For a whole generated command, that’s just outsourcing the disappointment.

go-tool-base does something I’m rather happier with: the AI has to make the build pass before it’s allowed to claim it’s done.

Generate and hope

The usual shape of AI code generation is this. You ask for code, the model produces it, and the model’s job ends at the closing brace. Whether it compiles, whether the tests pass, whether the imports even resolve, none of that has been checked. The model produced something that looks right. You find out whether it is right when you build it.

For a snippet you paste into an editor, that’s perfectly fine. The compiler tells you in a second. But go-tool-base’s generator, driven by gtb generate command --script or --prompt, produces a whole command: the implementation, its tests, the lot. “Generate and hope” at that scale means handing the user a project that may or may not build, and quietly making them the one who finds out which.

Drafting is only step one

So the generator doesn’t stop at drafting. Writing the first version of the implementation and its tests is step one of two. Step two is an autonomous repair agent.

Once the draft is on the filesystem, a separate agent takes over. It’s an LLM running in a loop, but a loop aimed at one narrow, checkable job: make this project build and pass its tests. It isn’t asked to be creative. It’s asked to get to green.

A fixed set of tools, and no shell

The agent is not handed a shell. It’s given a fixed, defined set of tools and nothing else. Three of them let it explore and edit the project: list_dir, read_file, write_file. Four of them let it verify the project and resolve what verification turns up:

go_build runs the build and captures the compiler errors.
go_test runs the tests and captures the failures.
go_get resolves a missing dependency.
golangci_lint runs the project’s linter.

That restriction is the design, not a limitation of it. The agent can’t delete arbitrary files, can’t reach the network, can’t run anything that isn’t on the list. It has exactly what it needs to make code compile and nothing it would need to do damage. Its file writes are confined to the project directory by an explicit path check, so even write_file can’t go wandering up into /etc. A coding agent you’d actually let near a filesystem is one whose abilities are an allowlist, not a denylist. (I keep coming back to that principle through this series… safety as a boundary you draw, not a behaviour you hope for.)
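An explicit path check of the kind described can be quite small. The following is a minimal sketch, not the check go-tool-base ships: confine is a hypothetical name, and the approach (join, then require the relative path to stay below the root) is one common way to do it.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// confine resolves a tool-supplied path against the project root and
// rejects anything that would land outside it.
func confine(root, requested string) (string, error) {
	// Join cleans the result, so "a/../../b" collapses before the check.
	full := filepath.Join(root, requested)

	rel, err := filepath.Rel(root, full)
	if err != nil {
		return "", err
	}
	// A relative path that is ".." or starts with "../" has left the root.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes project root", requested)
	}
	return full, nil
}

func main() {
	for _, p := range []string{"cmd/main.go", "../../etc/passwd"} {
		full, err := confine("/work/project", p)
		fmt.Println(full, err)
	}
}
```

This sketch is purely lexical. It does not resolve symlinks, so a link inside the project that points elsewhere would still need separate handling.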

The loop

The repair loop is a ReAct loop, the same reason-act-observe shape as the tool-calling loop, only this time pointed at a goal:

1. The draft is on disk.
2. Verify: run go_build and go_test.
3. If verification failed, read the error logs, the compiler error or the failing test.
4. Reason about the cause: an undefined variable, a missing import, a wrong signature.
5. Act: call write_file to patch the code, or go_get to add the dependency.
6. Loop. Steps two to five repeat until the project is green, or the agent hits its bounded step limit.

What makes this work is treating the error output as feedback rather than as a failure to log and walk away from. A compiler error is the single most useful sentence you can hand a model that’s trying to fix code. It says what’s wrong, and usually where. The loop feeds it straight back in, and the model fixes against it.
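
Stripped of the model and the filesystem, the control flow of a repair loop like this fits in a few lines. This is a toy sketch, not the generator’s implementation: verifyFunc, fixFunc and repair are names invented for the illustration, with verifyFunc standing in for go_build and go_test, and fixFunc standing in for the model plus write_file or go_get.

```go
package main

import (
	"errors"
	"fmt"
)

// verifyFunc stands in for go_build and go_test: it reports whether the
// project is green and, if not, the error log to feed back to the model.
type verifyFunc func() (ok bool, log string)

// fixFunc stands in for the model plus write_file or go_get: given the
// error log, it applies one patch to the project.
type fixFunc func(log string) error

var errBudgetSpent = errors.New("step budget exhausted, project still failing")

// repair verifies, feeds the failure back, patches, and repeats until the
// project is green or maxSteps patches have been tried. It returns the
// number of patches applied.
func repair(verify verifyFunc, fix fixFunc, maxSteps int) (int, error) {
	for steps := 0; ; steps++ {
		ok, log := verify()
		if ok {
			return steps, nil
		}
		if steps == maxSteps {
			return steps, errBudgetSpent
		}
		if err := fix(log); err != nil {
			return steps, err
		}
	}
}

func main() {
	// A toy project with two outstanding problems, reported one at a time.
	problems := []string{"undefined: cfg", "missing import: fmt"}
	verify := func() (bool, string) {
		if len(problems) == 0 {
			return true, ""
		}
		return false, problems[0]
	}
	fix := func(log string) error {
		fmt.Println("patching against:", log)
		problems = problems[1:]
		return nil
	}
	steps, err := repair(verify, fix, 5)
	fmt.Println("patches applied:", steps, "error:", err)
}
```

Success is only ever returned straight after a passing verify, never after a fix. That ordering is the whole point of the loop.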

Verification changes what “done” means

Here’s the real shift, and the agent’s own documentation puts it well: the agent “doesn’t just say it fixed a bug; it uses a Test tool to verify the fix before reporting success.”

A generate-and-hope model reports success when it finishes writing. It has no idea whether the code works, and it isn’t really claiming otherwise. “Done” means “I produced text”. The repair agent reports success when go_build and go_test actually pass. “Done” means “the build is green”. Those are two completely different claims, and only the second is worth anything to the person who asked for the command.

That’s the line between an AI that’s a creative writer and an AI that’s a collaborator you can hand a task to. And when the agent can’t reach green, when it spends its whole step budget and the project is still broken, the generator fails safely: it leaves the best-attempt code in place, commented out so the project still compiles, and tells the user what to finish by hand. There’s also an --agentless flag for anyone who’d rather have a plain single-shot retry than the multi-step agent. The default, though, is the agent, because the default should be code that’s been checked.

Where this leaves us

Most AI code generation generates and hopes: the model writes code and the user discovers whether it works. For a whole generated command, that pushes a may-or-may-not-build project onto the user.

go-tool-base’s generator drafts the command and then hands it to an autonomous repair agent. The agent has a fixed set of tools (explore and edit the project, build it, test it, lint it, fetch dependencies) and no shell at all, with file writes confined to the project directory. It runs a ReAct loop, reading each error and patching against it, until the build is green or it exhausts its steps. The point is what “done” comes to mean: not “the model finished writing”, but “the build passes”. Only one of those is a claim worth trusting.

Letting the AI call your Go functions

Sun, 29 Mar 2026 00:00:00 +0000

An AI that can only produce text can describe your system. An AI that can call your Go functions can actually operate it. That gap, between describing and doing, is the difference between a chatbot and something genuinely useful, and crossing it comes down to one fiddly mechanism: tool-calling, and the loop that drives it.

Talking about the system versus operating it

Wire an AI provider into a CLI command and you get something that can talk. Ask it a question, get a paragraph back. Useful, up to a point.

But notice the ceiling. An AI that can only generate text can describe things. It can tell you what it would do. What it can’t do is look at the actual current state of your system, or take a real action, because it has no hands. It’s reasoning in a vacuum about a world it can’t reach out and touch.

The thing that gives it hands is tool-calling. You hand the AI a set of functions it’s allowed to call. Now, mid-conversation, it can decide it needs to read that file before it can answer, or run that query, or check that status, and actually go and do it, and then reason about the real result. The AI stops describing your system and starts operating it.

The loop is the hard part

Tool-calling has a shape, and the shape is a loop. The literature calls it ReAct: Reason, Act, Observe.

1. The AI reasons about the prompt and decides whether it needs a tool.
2. If it does, it acts, asking for a specific tool with specific arguments.
3. Your code runs the tool and feeds the result back. The AI observes that result.
4. Round again. Reason about the new information, maybe call another tool, maybe several. Keep going until the AI has what it needs and produces a final text answer with no more tool calls.

Conceptually simple. Tedious and error-prone to implement by hand every single time: parsing the model’s tool-call requests, dispatching to the right function, marshalling arguments in and results out, feeding observations back in the exact format the provider expects, knowing when to stop, and not looping forever if the model gets itself stuck.

That orchestration is pure plumbing, and it’s identical for every tool and every command. So you can probably guess what’s coming: go-tool-base’s chat package owns it. You don’t write the loop. You write the tools.
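
For a feel of the plumbing being taken off your hands, here is roughly what a hand-rolled loop looks like once the provider-specific parsing is stripped away. It is a sketch, not the chat package’s code. The reply, model and handler types and the run function are all invented for the illustration, the “model” is a scripted function rather than a real provider, and tools take a single string argument to keep it short.

```go
package main

import (
	"errors"
	"fmt"
)

// reply is what one model turn produces in this sketch: a request to call
// a tool, or a final text answer when Tool is empty.
type reply struct {
	Tool string // name of the tool to call; empty means "final answer"
	Arg  string // single string argument, to keep the sketch small
	Text string // the final answer
}

// model stands in for a provider client. It sees every observation so far.
type model func(observations []string) reply

// handler is an ordinary Go function exposed as a tool.
type handler func(arg string) (string, error)

// run drives reason, act, observe until the model answers or maxSteps is hit.
func run(m model, tools map[string]handler, maxSteps int) (string, error) {
	var observations []string
	for step := 0; step < maxSteps; step++ {
		r := m(observations) // reason
		if r.Tool == "" {
			return r.Text, nil
		}
		h, ok := tools[r.Tool]
		if !ok {
			observations = append(observations, "error: unknown tool "+r.Tool)
			continue
		}
		out, err := h(r.Arg) // act
		if err != nil {
			// A failed tool call becomes an observation, not a crash.
			out = "error: " + err.Error()
		}
		observations = append(observations, out) // observe
	}
	return "", errors.New("step limit reached without a final answer")
}

func main() {
	files := map[string]string{"go.mod": "module example.com/demo"}
	tools := map[string]handler{
		"read_file": func(path string) (string, error) {
			body, ok := files[path]
			if !ok {
				return "", errors.New("no such file: " + path)
			}
			return body, nil
		},
	}
	// A scripted model: ask for one file, then answer from what came back.
	scripted := func(obs []string) reply {
		if len(obs) == 0 {
			return reply{Tool: "read_file", Arg: "go.mod"}
		}
		return reply{Text: "The file says: " + obs[len(obs)-1]}
	}
	answer, err := run(scripted, tools, 20)
	fmt.Println(answer, err)
}
```

Even at toy size the loop has four exits to get right: the final answer, the unknown tool, the failing tool, and the step limit. A real provider adds message formats and argument marshalling on top.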

Defining a tool

A chat.Tool is four things: a name, a description, a parameter schema, and a handler. The description is what the AI reads to decide whether to use the tool, so it’s worth writing well. The schema describes the arguments, and you don’t hand-write it. You write a tagged Go struct and let the framework generate the schema from it:

type ReadFileParams struct {
	Path string `json:"path" jsonschema_description:"Relative path to the file"`
}

The struct is the contract. The framework derives the JSON Schema the AI is given straight from those tags, so the schema and the Go type the handler receives can’t drift apart, because they share a single source. The handler is then just an ordinary Go function that takes those parameters and returns a result.
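
To show the idea of a schema falling out of struct tags, here is a deliberately small reflection sketch. It is not how the framework derives its schema, which I’d expect to be far more complete. schemaFor and jsonType are hypothetical helpers that handle a handful of field kinds and read only the two tags used above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

type ReadFileParams struct {
	Path string `json:"path" jsonschema_description:"Relative path to the file"`
}

// schemaFor is a hypothetical helper that builds a small JSON Schema object
// from a struct's tags. It expects a struct value, not a pointer.
func schemaFor(v any) map[string]any {
	t := reflect.TypeOf(v)
	props := map[string]any{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Drop options such as ",omitempty" from the json tag.
		name, _, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "" || name == "-" {
			continue
		}
		prop := map[string]any{"type": jsonType(f.Type.Kind())}
		if desc := f.Tag.Get("jsonschema_description"); desc != "" {
			prop["description"] = desc
		}
		props[name] = prop
	}
	return map[string]any{"type": "object", "properties": props}
}

// jsonType maps a few Go kinds to JSON Schema type names. Anything it does
// not recognise falls back to "string", which is a simplification.
func jsonType(k reflect.Kind) string {
	switch k {
	case reflect.Bool:
		return "boolean"
	case reflect.Int, reflect.Int64:
		return "integer"
	case reflect.Float64:
		return "number"
	default:
		return "string"
	}
}

func main() {
	out, err := json.MarshalIndent(schemaFor(ReadFileParams{}), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Rename the field’s json tag and the schema’s property name changes with it, with no second definition to forget about.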

You register your tools with SetTools, call Chat, and that’s the whole of your involvement. The framework runs the ReAct loop and Chat returns the AI’s final text answer once the loop settles.

Two details that show it was built for real use

A couple of decisions in the loop tell you it’s meant for production, not a demo.

Tool errors don’t abort the conversation. When a handler returns an error, the framework doesn’t crash the loop. It hands the error back to the AI as a string, as just another observation. That’s deliberate, and it’s right. A real agent should be able to call a tool, watch it fail, and react: try different arguments, take a different route, or tell the user it couldn’t manage it. A loop that aborted on the first tool error would be far more brittle than the model driving it.

The loop is bounded. There’s a MaxSteps limit, default 20. An AI that gets confused could otherwise call tools forever, and a CLI command that never returns is a worse failure than a wrong answer. The cap guarantees the command terminates. The agent gets room to genuinely work a problem across many steps, but not infinite room to flail about in.

There’s also parallel tool execution: when the model asks for several tools in a single step (three independent file reads, say) the framework runs them concurrently rather than one after another, because there’s no reason to make the AI sit and wait out a sequence of things that don’t depend on each other.
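
Running the calls from one step concurrently is a standard fan-out. The sketch below is my own illustration of that pattern rather than the framework’s code: call, handler and runAll are hypothetical names, and it assumes the handlers are safe to run side by side.

```go
package main

import (
	"fmt"
	"sync"
)

// call is one tool request from a single model step.
type call struct{ Tool, Arg string }

// handler is an ordinary Go function exposed as a tool.
type handler func(arg string) (string, error)

// runAll executes every call requested in one step concurrently and returns
// the observations in the same order the calls were requested.
func runAll(calls []call, tools map[string]handler) []string {
	results := make([]string, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c call) {
			defer wg.Done()
			h, ok := tools[c.Tool]
			if !ok {
				results[i] = "error: unknown tool " + c.Tool
				return
			}
			out, err := h(c.Arg)
			if err != nil {
				out = "error: " + err.Error()
			}
			// Each goroutine writes only its own slot, so no lock is needed.
			results[i] = out
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	files := map[string]string{"a.go": "package a", "b.go": "package b", "c.go": "package c"}
	tools := map[string]handler{
		"read_file": func(path string) (string, error) {
			body, ok := files[path]
			if !ok {
				return "", fmt.Errorf("no such file: %s", path)
			}
			return body, nil
		},
	}
	calls := []call{{"read_file", "a.go"}, {"read_file", "b.go"}, {"read_file", "c.go"}}
	for i, obs := range runAll(calls, tools) {
		fmt.Println(calls[i].Arg, "->", obs)
	}
}
```

Results are indexed by request position, so each observation still lines up with the call that asked for it, whichever goroutine finishes first.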

Boiling it down

A text-only AI can describe your system; an AI that can call your functions can operate it. Bridging that gap means tool-calling, and tool-calling means the ReAct loop (reason, act, observe, repeat), whose orchestration is fiddly, identical every time, and not a problem worth solving twice.

go-tool-base’s chat package runs the loop for you. You define chat.Tool values (name, description, a tagged parameter struct that generates its own schema, a handler), call SetTools and Chat, and get the final answer. Tool errors go back to the AI as observations so it can recover, and a MaxSteps cap guarantees the command always terminates. You write Go functions. The framework turns them into things an agent can reach for.