<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Testing on PHP Boy Scout</title><link>https://phpboyscout.uk/tags/testing/</link><description>Recent content in Testing on PHP Boy Scout</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><copyright>Matt Cockayne</copyright><lastBuildDate>Tue, 16 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://phpboyscout.uk/tags/testing/index.xml" rel="self" type="application/rss+xml"/><item><title>When you hand the same key to every call</title><link>https://phpboyscout.uk/when-you-hand-the-same-key-to-every-call/</link><pubDate>Tue, 16 Jun 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/when-you-hand-the-same-key-to-every-call/</guid><description>&lt;img src="https://phpboyscout.uk/when-you-hand-the-same-key-to-every-call/cover-when-you-hand-the-same-key-to-every-call.png" alt="Featured image of post When you hand the same key to every call" /&gt;&lt;p&gt;I was building a tutorial, the kind where the whole point is that the reader runs
every command and it just works. So I generated a fresh project with go-tool-base,
added a command, then added a command &lt;em&gt;underneath&lt;/em&gt; that command, and hit build. It
didn&amp;rsquo;t.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pkg/cmd/hello/cmd.go: props.ChildCmd undefined
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; (type *props.Props has no field or method ChildCmd)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;My own generator, in my own framework, had just written code that referenced a
thing that didn&amp;rsquo;t exist&amp;hellip; which is a special kind of embarrassing.&lt;/p&gt;
&lt;h2 id="a-bug-with-a-two-month-alibi"&gt;A bug with a two-month alibi
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;git blame&lt;/code&gt; walked me straight to the commit that
&lt;a class="link" href="https://phpboyscout.uk/middleware-for-cli-commands-not-just-web-servers/" &gt;introduced the command middleware system&lt;/a&gt;
back in &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/commit/8974154" target="_blank" rel="noopener"
 &gt;March&lt;/a&gt;.
Middleware here is the web-style idea of wrapping a command&amp;rsquo;s run function with
cross-cutting behaviour: timing, auth, recovery, that sort of thing. To wire it
in, &lt;a class="link" href="https://phpboyscout.uk/scaffolding-that-respects-your-edits/" &gt;the generator&lt;/a&gt;
started emitting this for a nested command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddCommandWithMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;NewCmdChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ChildCmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;where, before the middleware change, the generated line had simply been:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;NewCmdChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The catch is that third argument. &lt;code&gt;props.ChildCmd&lt;/code&gt; is meant to be one of a set of
constants, but those constants are hand-declared for the framework&amp;rsquo;s &lt;em&gt;built-in&lt;/em&gt;
commands only (&lt;code&gt;UpdateCmd&lt;/code&gt;, &lt;code&gt;DocsCmd&lt;/code&gt;, and friends). The generator never declares
one for a user&amp;rsquo;s &lt;code&gt;child&lt;/code&gt; command, so the generated parent referenced a name that
nothing had ever declared. Undefined. Won&amp;rsquo;t compile.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the part that should worry you more than the bug. It shipped in March and
nobody noticed until late May. Partly because it only bites &lt;em&gt;nested&lt;/em&gt; commands, a
command under another user command; top-level commands register by a different
path and were fine. But mostly because the generator&amp;rsquo;s tests checked the generated
code as &lt;em&gt;text&lt;/em&gt;, asserting that it contained the right strings, and never once ran
&lt;code&gt;go build&lt;/code&gt; on the result. CI was green for two months on code that could not
compile. We were grading the essay without ever reading it aloud.&lt;/p&gt;
&lt;h2 id="what-the-key-actually-was"&gt;What the key actually was
&lt;/h2&gt;&lt;p&gt;Once I stopped staring at the missing name, the real problem came into focus&amp;hellip;
and it wasn&amp;rsquo;t the missing constant at all.&lt;/p&gt;
&lt;p&gt;That third argument is a middleware &lt;em&gt;lookup key&lt;/em&gt;. The framework keeps a table of
middleware registered against each key, and the key tells it which to apply. It is
not an on/off switch and it is not optional, so the generator had to supply one at
&lt;em&gt;every&lt;/em&gt; registration site. It was being asked to guess, on every call, a value it
had no reliable way to produce for a user command.&lt;/p&gt;
&lt;p&gt;And the tell was sitting right there in the same generator: everywhere else, the
idiom was &lt;code&gt;props.FeatureCmd(&amp;quot;name&amp;quot;)&lt;/code&gt;, a function that derives a key from a string.
The nested-registration path was the one place that assumed a hand-declared
constant instead. One call site out of step with all the others.&lt;/p&gt;
&lt;p&gt;That is the actual lesson, and it has nothing to do with cobra or codegen. When you
find yourself threading the same derived value through every single call site, and
getting it wrong, the value is in the wrong place. The feature key was never the
caller&amp;rsquo;s business. It belonged to the command.&lt;/p&gt;
&lt;h2 id="changing-my-mind-about-cobra"&gt;Changing my mind about cobra
&lt;/h2&gt;&lt;p&gt;This is where I had to eat a helping of my own opinion.&lt;/p&gt;
&lt;p&gt;go-tool-base is built on &lt;a class="link" href="https://github.com/spf13/cobra" target="_blank" rel="noopener"
 &gt;cobra&lt;/a&gt;, the de-facto Go
library for building command trees, and I like it a great deal. I had &lt;em&gt;deliberately&lt;/em&gt;
not wrapped it. Every abstraction over a good library is a tax the reader pays, so
my standing rule was: use cobra directly, don&amp;rsquo;t hide it behind something of mine.&lt;/p&gt;
&lt;p&gt;The trouble is the middleware pattern kept growing, and the bigger it got the more
plainly &amp;ldquo;don&amp;rsquo;t abstract cobra&amp;rdquo; was a position I was holding past its evidence. The
very thing I&amp;rsquo;d refused to build was the thing the design had come to need. It
helped that, having recently
&lt;a class="link" href="https://phpboyscout.uk/why-we-left-github-for-gitlab/" &gt;moved the project to GitLab&lt;/a&gt;,
the version had reset to a &lt;code&gt;0.x&lt;/code&gt; prerelease, which makes a breaking change cheap.
The window to stop patching and do it properly was open, and not for long.&lt;/p&gt;
&lt;p&gt;Go&amp;rsquo;s composition model made it almost painless. You can embed a pointer to one
struct inside another and the outer type picks up all the inner one&amp;rsquo;s fields and methods for
free, which is about as close to monkey-patching as a statically typed language
gets. So a &lt;code&gt;setup.Command&lt;/code&gt;
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/61844e6/pkg/setup/command.go#L22-L60" target="_blank" rel="noopener"
 &gt;became&lt;/a&gt;
a cobra command &lt;em&gt;plus&lt;/em&gt; the feature it belongs to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;cobra&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FeatureCmd&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FeatureCmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;cobra&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;...*&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;range&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;		&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;			&lt;/span&gt;&lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;		&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;		&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RunE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;			&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RunE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RunE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;		&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;		&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Because &lt;code&gt;*cobra.Command&lt;/code&gt; is embedded, a &lt;code&gt;setup.Command&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; a cobra command for
every method cobra offers; the one place you need the raw pointer, you reach for
&lt;code&gt;.Command&lt;/code&gt;. The generated command now carries its own identity:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;NewCmdChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;cobra&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Use&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;child&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;RunE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;	&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FeatureCmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;child&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;and the parent registers it with nothing threaded through the call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;NewCmdChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The feature key now lives on the command, derived from the command&amp;rsquo;s own name,
which is the one place the generator can &lt;em&gt;always&lt;/em&gt; produce it correctly. The
bug isn&amp;rsquo;t so much fixed as made unsayable: there&amp;rsquo;s no call site
left to write the wrong thing into. The wiring got cleaner on the way past, too:
each command&amp;rsquo;s run is wrapped exactly once with its own feature, instead of the old
recursive pass that re-applied the &lt;em&gt;parent&amp;rsquo;s&lt;/em&gt; feature down the whole subtree. And
the old free function stays on as a deprecated shim that just calls &lt;code&gt;Register&lt;/code&gt;, so
nothing downstream breaks before v1.0.&lt;/p&gt;
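&lt;p&gt;For anyone who wants to poke at the shape without pulling in cobra, the sketch below reproduces it with stand-ins. &lt;code&gt;Base&lt;/code&gt; plays the part of &lt;code&gt;cobra.Command&lt;/code&gt;, &lt;code&gt;FeatureKey&lt;/code&gt; the part of the feature key, and the shim&amp;rsquo;s signature is a guess made for illustration, not the framework&amp;rsquo;s real one. It shows the two claims above: each child is wrapped once with its own key, and the deprecated free function only forwards to &lt;code&gt;Register&lt;/code&gt;.&lt;/p&gt;

```go
package main

import "fmt"

// FeatureKey stands in for the framework's feature key type (assumption).
type FeatureKey string

// Base stands in for cobra.Command: a name, a run function, and children.
type Base struct {
	Use      string
	RunE     func() error
	children []*Base
}

// AddCommand attaches a child, loosely mirroring cobra's method of that name.
func (b *Base) AddCommand(c *Base) { b.children = append(b.children, c) }

// Command embeds *Base, so Base's fields and methods are promoted onto it.
type Command struct {
	*Base
	Feature FeatureKey
}

// Wrap pairs a base command with the feature key it belongs to.
func Wrap(feature FeatureKey, base *Base) *Command {
	c := new(Command)
	c.Base = base
	c.Feature = feature
	return c
}

// applied records which key each executed run function was wrapped with.
var applied []FeatureKey

// chain is a hypothetical middleware hook that only records the key it saw.
func chain(key FeatureKey, next func() error) func() error {
	return func() error {
		applied = append(applied, key)
		return next()
	}
}

// Register wraps each child once, using the child's own key, then attaches it.
func (c *Command) Register(children ...*Command) {
	for _, child := range children {
		if child == nil || child.Base == nil {
			continue
		}
		if child.RunE != nil {
			child.RunE = chain(child.Feature, child.RunE)
		}
		c.AddCommand(child.Base)
	}
}

// Deprecated: use Register. The key argument is ignored because the child
// already carries its own (hypothetical shim signature).
func AddCommandWithMiddleware(parent, child *Command, _ FeatureKey) {
	parent.Register(child)
}

// newCmd builds a wrapped command whose key is derived from its name.
func newCmd(name string) *Command {
	b := new(Base)
	b.Use = name
	b.RunE = func() error { return nil }
	return Wrap(FeatureKey(name), b)
}

func main() {
	parent := newCmd("hello")
	child := newCmd("child")

	// Even with a wrong key passed to the shim, the child's own key is used.
	AddCommandWithMiddleware(parent, child, FeatureKey("wrong"))

	if err := child.RunE(); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(applied) // [child]
}
```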
&lt;h2 id="changing-my-mind-about-the-tests"&gt;Changing my mind about the tests
&lt;/h2&gt;&lt;p&gt;The redesign was the satisfying fix. The test was the important one.&lt;/p&gt;
&lt;p&gt;The reason a non-compiling generator sailed through CI for two months is that its
tests read the generated source as text. A generator is a program that writes a
program, and we were checking that it wrote the expected words without ever asking
whether the words formed a working program. So the redesign shipped with
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/21a90e4/internal/generator/compile_integration_test.go" target="_blank" rel="noopener"
 &gt;a different kind of test&lt;/a&gt;,
one that scaffolds a real project, adds a nested command, and actually builds it.
Its own comment says the quiet part out loud:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;The previous test suite asserted file-content shapes but never tried to &lt;code&gt;go build&lt;/code&gt;
the generated module, so the nested-command path that referenced undefined
&lt;code&gt;props.&amp;lt;Name&amp;gt;Cmd&lt;/code&gt; symbols compiled cleanly in tests and broke only when downstream
users built their tools.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;It&amp;rsquo;s gated behind an integration flag, because it shells out to the Go toolchain
and that&amp;rsquo;s too heavy for every unit run, but it closes the exact gap that hid the
bug. The only &lt;a class="link" href="https://phpboyscout.uk/an-ai-agent-that-has-to-make-the-build-pass/" &gt;real test of a code generator&lt;/a&gt;
is whether its output compiles.&lt;/p&gt;
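&lt;p&gt;The gap between the two kinds of check is easy to reproduce in miniature. The sketch below is not the integration test linked above and does not shell out to &lt;code&gt;go build&lt;/code&gt;; it swaps in the standard library&amp;rsquo;s &lt;code&gt;go/parser&lt;/code&gt; and &lt;code&gt;go/types&lt;/code&gt; packages to type-check a string in-process. The &lt;code&gt;generated&lt;/code&gt; source is an invented, import-free stand-in for generator output, which is what lets the type checker run without an importer.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strings"
)

// generated is a stand-in for generator output. It references a name that
// nothing declares, in the same way the nested-command path did.
const generated = `package hello

type Props struct{}

func register(props *Props) {
	_ = props.ChildCmd
}
`

// containsExpectedText mimics a content-shape assertion: is the string there?
func containsExpectedText(src string) bool {
	return strings.Contains(src, "props.ChildCmd")
}

// typeCheck parses and type-checks one import-free source file and returns
// the first error the checker reports, or nil if the source is valid.
func typeCheck(src string) error {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "cmd.go", src, 0)
	if err != nil {
		return err
	}
	conf := types.Config{}
	_, err = conf.Check("hello", fset, []*ast.File{file}, nil)
	return err
}

func main() {
	// The text assertion is satisfied by exactly the line that cannot compile.
	fmt.Println("text check passes:", containsExpectedText(generated))
	fmt.Println("type check error:", typeCheck(generated))
}
```

&lt;p&gt;A real generator emits imports, a &lt;code&gt;go.mod&lt;/code&gt; and several packages, so an in-process check like this only goes so far; building the scaffolded module for real remains the stronger test.&lt;/p&gt;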
&lt;h2 id="what-it-comes-down-to"&gt;What it comes down to
&lt;/h2&gt;&lt;p&gt;Three times in one bug I had to change my mind. I&amp;rsquo;d decided cobra shouldn&amp;rsquo;t be
abstracted; the evidence said abstract it. I&amp;rsquo;d reached for the one-line patch; the
evidence said redesign. I&amp;rsquo;d trusted tests that read the generated code; the evidence
said build it. None of those were comfortable, and the version reset is the only
reason the timing was kind.&lt;/p&gt;
&lt;p&gt;Both technical lessons are worth keeping. When the same derived value is threaded
through every call, the abstraction is in the wrong place. And the only proof that
something which writes code actually works is to compile what it writes. But the one underneath
both, the one I apparently have to keep relearning, is simpler than either: don&amp;rsquo;t
get so attached to an implementation that you can&amp;rsquo;t change your mind when the
evidence says it doesn&amp;rsquo;t fit. The framework is better for it, and it now has a
composition seam to hang the features cobra doesn&amp;rsquo;t give us natively. A nested
command that wouldn&amp;rsquo;t build was just the thing that finally made me look.&lt;/p&gt;</description></item><item><title>From allow_failure to blocking</title><link>https://phpboyscout.uk/from-allow-failure-to-blocking/</link><pubDate>Sat, 30 May 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/from-allow-failure-to-blocking/</guid><description>&lt;img src="https://phpboyscout.uk/from-allow-failure-to-blocking/cover-from-allow-failure-to-blocking.png" alt="Featured image of post From allow_failure to blocking" /&gt;&lt;p&gt;There&amp;rsquo;s a special kind of CI job that everyone on a team quietly learns to
ignore: the one marked &lt;code&gt;allow_failure: true&lt;/code&gt;. It runs, it goes red, the
pipeline goes green anyway, and after the third time you stop looking at it. I
inherited six of those when I moved rust-tool-base&amp;rsquo;s CI to GitLab. Over a few
days I turned three of them into real gates, and the interesting part was never
the YAML. It was working out which ones had earned the right to block, and
which hadn&amp;rsquo;t.&lt;/p&gt;
&lt;h2 id="what-allow_failure-actually-buys-you"&gt;What allow_failure actually buys you
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;allow_failure: true&lt;/code&gt; is genuinely useful, and quietly corrosive. It lets a job
report a problem without stopping the pipeline, which is exactly right for a
check that&amp;rsquo;s noisy, not yet stable, or guarding against something you can&amp;rsquo;t fix
this minute. The trouble is that a warning nobody is forced to act on is a
warning nobody acts on. Leave a job advisory long enough and it becomes
scenery: red, ignored, pointless. So an advisory check is really a promise,
&amp;ldquo;I&amp;rsquo;ll make this blocking once it&amp;rsquo;s trustworthy&amp;rdquo;, and a promise you only ever
mean to keep is just a lie you haven&amp;rsquo;t noticed yet.&lt;/p&gt;
&lt;p&gt;When I &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/2213f8e/.gitlab-ci.yml" target="_blank" rel="noopener"
 &gt;migrated rust-tool-base from GitHub Actions to GitLab CI&lt;/a&gt;,
the move landed six jobs as &lt;code&gt;allow_failure: true&lt;/code&gt;: the macOS and Windows tests,
the integration tests, &lt;code&gt;cargo-audit&lt;/code&gt;, &lt;code&gt;trivy&lt;/code&gt;, and coverage. That wasn&amp;rsquo;t
laziness. A migration is the wrong moment to also be fighting flaky gates. But
it left me holding six promises to either keep or admit I wasn&amp;rsquo;t going to.&lt;/p&gt;
&lt;h2 id="a-check-earns-the-right-to-block"&gt;A check earns the right to block
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s the rule I settled on. A check earns the right to fail your build when
two things are true: it&amp;rsquo;s &lt;em&gt;meaningful&lt;/em&gt; (a red result is a real problem, not
noise) and it&amp;rsquo;s &lt;em&gt;reliable&lt;/em&gt; (it goes red only when there genuinely is a problem,
and it can actually run to completion). Flip a check to blocking before both
hold and you haven&amp;rsquo;t raised the bar, you&amp;rsquo;ve taught the team to force-merge past
red, which is worse than no gate at all, because now the red means nothing.&lt;/p&gt;
&lt;p&gt;Three of my six crossed that line within a few days. Three deliberately didn&amp;rsquo;t.
The reasons are the whole story.&lt;/p&gt;
&lt;h2 id="trivy-blocked-once-there-was-nothing-to-block-on"&gt;trivy: blocked once there was nothing to block on
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/f9cab20/.gitlab-ci.yml#L247-256" target="_blank" rel="noopener"
 &gt;&lt;code&gt;trivy&lt;/code&gt;&lt;/a&gt;
scans the dependency tree for HIGH and CRITICAL advisories. It went across as
advisory for an honest reason: the &lt;code&gt;Cargo.lock&lt;/code&gt; at migration time already
carried two known HIGH/CRITICAL advisories I hadn&amp;rsquo;t cleared yet, a
path-traversal in &lt;code&gt;gix-validate&lt;/code&gt; and a DNS-rebinding issue in &lt;code&gt;rmcp&lt;/code&gt;. Make
trivy blocking with those sitting there and the pipeline is red from day one,
over problems I already knew about and was already fixing. So it stayed
advisory until the dependency bumps cleared both, and then the &lt;code&gt;allow_failure&lt;/code&gt;
line came out. The gate never changed. The tree underneath it got clean enough
to stand on.&lt;/p&gt;
&lt;h2 id="integration-tests-blocked-once-it-could-actually-run"&gt;integration-tests: blocked once it could actually run
&lt;/h2&gt;&lt;p&gt;The &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/193f380/.gitlab-ci.yml#L200-226" target="_blank" rel="noopener"
 &gt;integration tests&lt;/a&gt;
stand up a real Gitea in a Docker-in-Docker (dind) service and talk to it. They were
advisory for a different reason: they couldn&amp;rsquo;t reliably run. dind needs a
privileged runner, and the suite was resolving the container host with a
hardcoded &lt;code&gt;127.0.0.1&lt;/code&gt; that didn&amp;rsquo;t hold everywhere. Blocking a job that fails
for infrastructure reasons rather than code reasons is the fastest way to make
people distrust the entire pipeline. So the fix wasn&amp;rsquo;t in the YAML, it was
making the thing dependable: &lt;code&gt;privileged&lt;/code&gt; set on the runner, and the host
resolved through the test library&amp;rsquo;s own &lt;code&gt;get_host()&lt;/code&gt; instead of a hardcoded
address. Once it ran the same way every time, it earned the gate.&lt;/p&gt;
&lt;h2 id="coverage-blocked-once-it-could-run-at-all-then-once-it-cleared-the-bar"&gt;coverage: blocked once it could run at all, then once it cleared the bar
&lt;/h2&gt;&lt;p&gt;Coverage is the two-step one, and my favourite, because it nearly didn&amp;rsquo;t make
it for a thoroughly undramatic reason: it ran out of memory. &lt;code&gt;cargo llvm-cov&lt;/code&gt;
instruments every test binary, and linking hundreds of instrumented object
files needs more RAM than the shared medium runner had, so the job bus-errored
on the link. I tagged it onto a larger runner, and then the shared SaaS runners
were switched off entirely, so the tag matched nothing and the job sat pending
forever.&lt;/p&gt;
&lt;p&gt;The fix was a &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/193f380/.gitlab-ci.yml#L200-226" target="_blank" rel="noopener"
 &gt;self-hosted homelab runner&lt;/a&gt;
with the RAM the instrumented link actually needs. I moved coverage there but
kept it advisory &lt;em&gt;for one run&lt;/em&gt;, to confirm the box could finish the build
before I trusted it. It did, at
&lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/1c9e589/.gitlab-ci.yml#L286-320" target="_blank" rel="noopener"
 &gt;73.22% line coverage&lt;/a&gt;,
so I set the gate to fail under 70% and made it blocking. Three points of
headroom: enough that ordinary churn won&amp;rsquo;t trip it, tight enough that a real
drop will. A coverage gate pinned to the current number is a tripwire that
fires on the very next commit; set it a touch below and it catches regressions
instead of normal life.&lt;/p&gt;
&lt;h2 id="the-three-i-left-advisory-on-purpose"&gt;The three I left advisory, on purpose
&lt;/h2&gt;&lt;p&gt;The point was never &amp;ldquo;block everything&amp;rdquo;. Three jobs are still &lt;code&gt;allow_failure&lt;/code&gt; in
&lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/d3c23fc/.gitlab-ci.yml" target="_blank" rel="noopener"
 &gt;the current pipeline&lt;/a&gt;,
deliberately. The macOS and Windows tests run on SaaS runners that bill by the
minute; they&amp;rsquo;re worth running, not worth blocking every merge of a Linux-first
project over a quota I&amp;rsquo;m choosing to ration. And &lt;code&gt;cargo-audit&lt;/code&gt; stays advisory
because &lt;code&gt;cargo-deny&lt;/code&gt; already does the blocking advisory check: cargo-audit is a
second opinion from a different database, and a second opinion that can veto
isn&amp;rsquo;t a second opinion, it&amp;rsquo;s a duplicate gate that will eventually disagree with
the first and block you on the difference.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s the same rule from the other side. Those three haven&amp;rsquo;t earned the right
to block, because blocking them would cost more than it ever caught.&lt;/p&gt;
&lt;h2 id="the-upshot"&gt;The upshot
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;allow_failure: true&lt;/code&gt; is fine as a waiting room and corrosive as a destination.
Every advisory check is a promise to make it blocking once it&amp;rsquo;s both meaningful
and reliable, and the job is to keep the promise or admit you won&amp;rsquo;t. trivy
earned its gate when the advisories cleared, the integration tests when they
ran the same way every time, coverage when it had a runner with enough memory
and a threshold set just below the current mark. The three I left advisory
earned that standing too, by costing more to block than they&amp;rsquo;d catch. The YAML
is one deleted line per job. Knowing which line to delete, and when, is the
whole skill.&lt;/p&gt;</description></item><item><title>Process isolation won't save you from the filesystem</title><link>https://phpboyscout.uk/process-isolation-wont-save-you-from-the-filesystem/</link><pubDate>Sat, 25 Apr 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/process-isolation-wont-save-you-from-the-filesystem/</guid><description>&lt;img src="https://phpboyscout.uk/process-isolation-wont-save-you-from-the-filesystem/cover-process-isolation-wont-save-you-from-the-filesystem.png" alt="Featured image of post Process isolation won't save you from the filesystem" /&gt;&lt;p&gt;A test that passed every single time I ran it on its own, and failed maybe one run in five when I ran the whole suite. The failure was always the same: the self-update test downloaded a release archive, went to extract it, and found the archive corrupt. Half-written. As if something had been scribbling in the file while it read it. Something had.&lt;/p&gt;
&lt;h2 id="the-comfort-i-was-leaning-on"&gt;The comfort I was leaning on
&lt;/h2&gt;&lt;p&gt;The self-update tests are heavier than a unit test wants to be. They stand up a fake release, download the artefact, verify its checksum, extract it, swap a binary. Real files, real I/O. So they&amp;rsquo;d been built to run as separate &lt;em&gt;processes&lt;/em&gt;, not just separate threads, each one its own little world.&lt;/p&gt;
&lt;p&gt;And I&amp;rsquo;d quietly filed that under &amp;ldquo;solved&amp;rdquo;. Separate processes don&amp;rsquo;t share an address space. One can&amp;rsquo;t reach into another&amp;rsquo;s memory and corrupt a value mid-read. That whole category of data race, the kind you reach for a mutex to fix, simply can&amp;rsquo;t happen across a process boundary. So I&amp;rsquo;d stopped thinking about concurrency in these tests at all, because I&amp;rsquo;d convinced myself the isolation was total.&lt;/p&gt;
&lt;p&gt;It wasn&amp;rsquo;t total. It was isolation of &lt;em&gt;memory&lt;/em&gt;, and I&amp;rsquo;d let myself hear it as isolation of &lt;em&gt;everything&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id="two-processes-one-path"&gt;Two processes, one path
&lt;/h2&gt;&lt;p&gt;The thing two processes very much do still share is the filesystem. And the self-update flow, sensibly, caches its download rather than re-fetching it. The default cache directory is computed from the tool&amp;rsquo;s name and the release version, in &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/9c22aa8/crates/rtb-update/src/flow.rs#L228" target="_blank" rel="noopener"
 &gt;&lt;code&gt;crates/rtb-update/src/flow.rs&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-rust" data-lang="rust"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;cache_dir_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="kt"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="kt"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt; &lt;span class="nc"&gt;PathBuf&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;directories&lt;/span&gt;::&lt;span class="n"&gt;ProjectDirs&lt;/span&gt;::&lt;span class="n"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map_or_else&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;::&lt;span class="n"&gt;env&lt;/span&gt;::&lt;span class="n"&gt;temp_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;to_path_buf&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;update&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Read that with two parallel test processes in mind. They&amp;rsquo;re testing the &lt;em&gt;same&lt;/em&gt; tool, against the &lt;em&gt;same&lt;/em&gt; fake release tag. So &lt;code&gt;tool_name&lt;/code&gt; matches and &lt;code&gt;version&lt;/code&gt; matches, which means &lt;code&gt;cache_dir_for&lt;/code&gt; hands both of them the &lt;em&gt;identical path&lt;/em&gt;. Two processes, isolated in every way that involves memory, both downloading and extracting into one shared directory on disk, at the same time. One writes the archive while the other is partway through reading it, and you get exactly the corrupt half-written file the test kept tripping over.&lt;/p&gt;
&lt;p&gt;Process isolation did nothing here, because the contention was never in memory. It was on a path string that came out the same for both of them.&lt;/p&gt;
&lt;h2 id="the-fix-is-to-stop-sharing-the-path"&gt;The fix is to stop sharing the path
&lt;/h2&gt;&lt;p&gt;Once it&amp;rsquo;s framed as &amp;ldquo;they share a path&amp;rdquo;, the fix writes itself: don&amp;rsquo;t share the path. Give each invocation its own cache directory. The updater builder already had the seam for it, and the doc comment now says exactly why it&amp;rsquo;s there, in &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/9c22aa8/crates/rtb-update/src/updater.rs#L396" target="_blank" rel="noopener"
 &gt;&lt;code&gt;crates/rtb-update/src/updater.rs&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-rust" data-lang="rust"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt;/// Tools call this when they want isolation per-invocation
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt;/// (e.g. CI runners, tests with parallel processes) or to honour
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt;/// a user-supplied `--cache-dir` flag.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;cache_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cache_dir&lt;/span&gt;: &lt;span class="nc"&gt;impl&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Into&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;PathBuf&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt; &lt;span class="nc"&gt;Self&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;into&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each test now builds its updater with &lt;code&gt;cache_dir(its_own_tempdir)&lt;/code&gt;, so two parallel processes land on two different directories and never meet. No lock, no serialisation, no clever cross-process file mutex. Just the realisation that the shared thing was a directory, and the cure for shared mutable state is usually to stop sharing it, not to guard it.&lt;/p&gt;
&lt;h2 id="the-fix-that-turned-out-to-be-a-feature"&gt;The fix that turned out to be a feature
&lt;/h2&gt;&lt;p&gt;The part I&amp;rsquo;m quietly pleased about is that this didn&amp;rsquo;t stay a test-only hack. The override I needed to isolate the tests is exactly the override a real tool wants for its own reasons. A CI runner doing self-update wants a writable cache path it controls, not wherever &lt;code&gt;directories-rs&lt;/code&gt; decides the system cache lives. A user might reasonably want to point the whole thing somewhere specific. That&amp;rsquo;s a &lt;code&gt;--cache-dir&lt;/code&gt; flag, and &lt;code&gt;cache_dir()&lt;/code&gt; is precisely the hook you&amp;rsquo;d wire it to.&lt;/p&gt;
&lt;p&gt;So the thing I added to stop a flaky test is the same thing a downstream tool reaches for to expose &lt;code&gt;--cache-dir&lt;/code&gt;. The test forced the seam to exist, and the seam was worth having anyway. I&amp;rsquo;ll take that trade every time over a fix that only the test suite benefits from.&lt;/p&gt;
&lt;h2 id="what-it-comes-down-to"&gt;What it comes down to
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;d treated &amp;ldquo;separate processes&amp;rdquo; as a synonym for &amp;ldquo;can&amp;rsquo;t race&amp;rdquo;, and it isn&amp;rsquo;t. Processes don&amp;rsquo;t share memory, so the memory races are gone. They absolutely still share the filesystem, the network, every named resource the OS will hand to anyone who asks for it by the same name. My two test processes computed the same cache path from the same tool and tag, and raced on the files in it, and no amount of address-space isolation was ever going to touch that.&lt;/p&gt;
&lt;p&gt;Shared mutable state on disk is still shared mutable state. The fix wasn&amp;rsquo;t a bigger hammer, it was giving each process its own directory and letting the isolation I thought I already had actually be true.&lt;/p&gt;</description></item><item><title>The test-mocking pattern that races</title><link>https://phpboyscout.uk/the-test-mocking-pattern-that-races/</link><pubDate>Thu, 16 Apr 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/the-test-mocking-pattern-that-races/</guid><description>&lt;img src="https://phpboyscout.uk/the-test-mocking-pattern-that-races/cover-the-test-mocking-pattern-that-races.png" alt="Featured image of post The test-mocking pattern that races" /&gt;&lt;p&gt;I&amp;rsquo;m going to tell you about a bug go-tool-base shipped, because it&amp;rsquo;s one of those bugs that&amp;rsquo;s so reasonable-looking you&amp;rsquo;ll find it in textbooks, conference talks, and an awful lot of otherwise excellent Go code. We had it too. It passed every test on my laptop, every single time, and then quietly fell over on CI while blaming an innocent bystander.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s the classic Go trick for mocking a dependency, and it races.&lt;/p&gt;
&lt;h2 id="a-pattern-that-looks-completely-reasonable"&gt;A pattern that looks completely reasonable
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s a thing you need to do constantly in Go tests: stop a function from really shelling out. It calls &lt;code&gt;exec.LookPath&lt;/code&gt; to find a binary, or &lt;code&gt;exec.Command&lt;/code&gt; to run one, and your test very much does not want it touching the real &lt;code&gt;$PATH&lt;/code&gt; or spawning a real process.&lt;/p&gt;
&lt;p&gt;The Go community has a well-worn answer. Hoist the function into a package-level variable, call &lt;em&gt;that&lt;/em&gt;, and let tests reassign it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// production code&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;execLookPath&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;LookPath&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;findTool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;execLookPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;sometool&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// test&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;TestFindTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;testing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;old&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;execLookPath&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;defer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;execLookPath&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;old&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;execLookPath&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;/fake/path&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// ...assert...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It&amp;rsquo;s tidy. No interface to thread through, no constructor to change. You&amp;rsquo;ll find it in a great deal of Go code, including some very respectable Go code indeed. go-tool-base had it too.&lt;/p&gt;
&lt;p&gt;And it works. It works on your machine, it works in code review, it works the first hundred times CI runs it. Which is precisely what makes it dangerous, because it&amp;rsquo;s wrong, and it&amp;rsquo;s just been biding its time.&lt;/p&gt;
&lt;h2 id="add-one-line-and-it-detonates"&gt;Add one line and it detonates
&lt;/h2&gt;&lt;p&gt;Go&amp;rsquo;s &lt;code&gt;t.Parallel()&lt;/code&gt; is more or less free performance. Mark your tests with it and the runner overlaps them instead of plodding through one at a time. On a package with a few hundred tests it&amp;rsquo;s a real, worthwhile speed-up, so naturally you reach for it.&lt;/p&gt;
&lt;p&gt;Now picture two tests, both using the pattern above, both marked &lt;code&gt;t.Parallel()&lt;/code&gt;. They run concurrently. Test A assigns its fake to &lt;code&gt;execLookPath&lt;/code&gt;. Test B assigns &lt;em&gt;its&lt;/em&gt; fake to &lt;code&gt;execLookPath&lt;/code&gt;. Test A reads &lt;code&gt;execLookPath&lt;/code&gt; expecting its own fake. Two goroutines, one variable, writes and reads with nothing synchronising them. That&amp;rsquo;s a textbook data race, and the textbook is right: nothing guarantees which write a given read will see. Test A might see B&amp;rsquo;s fake. The deferred restore might land in the wrong order and leave the variable pointing at a fake after both tests have finished, poisoning a third one for good measure.&lt;/p&gt;
&lt;p&gt;The truly nasty part is the &lt;em&gt;intermittency&lt;/em&gt;. Whether the race actually bites depends on goroutine scheduling, which depends on machine load and core count. Your laptop running eight tests at once might never lose the coin-toss. A CI runner under load, scheduling differently, loses it and fails a test that has nothing obviously to do with the change in the commit. You re-run the pipeline, it passes, everyone shrugs and moves on. A test suite that fails one run in twenty trains your team to ignore it, and an ignored CI failure is worse than no CI at all.&lt;/p&gt;
&lt;p&gt;I can tell you this one from direct, slightly embarrassed experience, because go-tool-base shipped exactly this bug and CI caught it the honest way: green on the laptop, red on the runner, with the failure cheerfully pointing at innocent bystander tests rather than the global that was actually the culprit. &lt;code&gt;go test -race&lt;/code&gt; will name it for you if you crank the parallelism up high enough to lose the toss reliably&amp;hellip; but you have to go looking, and you only go looking once it&amp;rsquo;s already ruined an afternoon.&lt;/p&gt;
&lt;h2 id="the-fix-isnt-synchronisation-its-structure"&gt;The fix isn&amp;rsquo;t synchronisation, it&amp;rsquo;s structure
&lt;/h2&gt;&lt;p&gt;The instinct is to slap a mutex around the variable. Resist it. A mutex makes the race &lt;em&gt;defined&lt;/em&gt;, but it doesn&amp;rsquo;t make the design any good. You&amp;rsquo;ve still got global mutable state, you&amp;rsquo;ve just queued the fight instead of cancelling it. And tests that serialise on a shared lock aren&amp;rsquo;t really parallel any more, so you&amp;rsquo;ve also handed back the speed-up you came for in the first place.&lt;/p&gt;
&lt;p&gt;The real fix is to not have a shared variable at all. The dependency was always an &lt;em&gt;input&lt;/em&gt; to the code; the package-level var was just a way of avoiding saying so out loud. So say it. Inject it.&lt;/p&gt;
&lt;p&gt;A struct field:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Finder&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;lookPath&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// defaults to exec.LookPath&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;Finder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lookPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;sometool&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Or a functional option, if you&amp;rsquo;d rather keep the zero value clean. Either way, each test constructs its &lt;em&gt;own&lt;/em&gt; &lt;code&gt;Finder&lt;/code&gt; with its &lt;em&gt;own&lt;/em&gt; fake. There&amp;rsquo;s no shared variable, so there&amp;rsquo;s no race, and &lt;code&gt;t.Parallel()&lt;/code&gt; is free again because the tests genuinely don&amp;rsquo;t touch each other.&lt;/p&gt;
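&lt;p&gt;For the functional-option flavour, a minimal sketch might look like the following. &lt;code&gt;Option&lt;/code&gt;, &lt;code&gt;WithLookPath&lt;/code&gt; and &lt;code&gt;NewFinder&lt;/code&gt; are names invented for this illustration, not go-tool-base&amp;rsquo;s actual API. The constructor is where the &lt;code&gt;exec.LookPath&lt;/code&gt; default gets applied, so production callers pass nothing and tests pass a fake.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os/exec"
)

// Finder carries its lookup dependency as a field instead of a package-level var.
type Finder struct {
	lookPath func(string) (string, error)
}

// Option is a hypothetical functional option for configuring a Finder.
type Option func(*Finder)

// WithLookPath swaps the lookup function, typically for a test double.
func WithLookPath(fn func(string) (string, error)) Option {
	return func(f *Finder) { f.lookPath = fn }
}

// NewFinder applies the real exec.LookPath by default, then any overrides.
func NewFinder(opts ...Option) *Finder {
	f := new(Finder)
	f.lookPath = exec.LookPath
	for _, opt := range opts {
		opt(f)
	}
	return f
}

// Find resolves the tool through whichever lookup this Finder was built with.
func (f *Finder) Find() (string, error) {
	return f.lookPath("sometool")
}

func main() {
	// Each caller builds its own Finder with its own fake; nothing is shared.
	fake := func(string) (string, error) { return "/fake/path", nil }
	f := NewFinder(WithLookPath(fake))

	path, err := f.Find()
	if err != nil {
		panic(err)
	}
	fmt.Println(path)
}
```

&lt;p&gt;In a test, the body of &lt;code&gt;main&lt;/code&gt; moves into a function that starts with &lt;code&gt;t.Parallel()&lt;/code&gt;. Two such tests can overlap freely, because each fake lives on a &lt;code&gt;Finder&lt;/code&gt; that only one test holds.&lt;/p&gt;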
&lt;p&gt;go-tool-base wrote this straight into its standing rules: no package-level mocking hooks, full stop. Dependencies come in through struct fields, functional options, or config fields. (The same injection discipline that makes &lt;a class="link" href="https://phpboyscout.uk/props-the-container-that-does-the-heavy-lifting/" &gt;Props&lt;/a&gt; so testable, applied one rung further down.) And to stop everyone hand-rolling the same &lt;code&gt;exec&lt;/code&gt; fakes, there&amp;rsquo;s a small internal package, &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/5c78fc9/internal/exectest/exectest.go" target="_blank" rel="noopener"
 &gt;&lt;code&gt;internal/exectest&lt;/code&gt;&lt;/a&gt;, with ready-made &lt;code&gt;LookPath&lt;/code&gt; and &lt;code&gt;CommandContext&lt;/code&gt; doubles you construct per-test. The pattern is gone, and the door it came in through is shut.&lt;/p&gt;
&lt;h2 id="the-rule-worth-taking-away"&gt;The rule worth taking away
&lt;/h2&gt;&lt;p&gt;A package-level variable that tests reassign is shared mutable state. It reads as a harmless convenience because in a single-threaded test run it behaves like one. &lt;code&gt;t.Parallel()&lt;/code&gt; is the thing that reveals it was never harmless, only unobserved.&lt;/p&gt;
&lt;p&gt;The general lesson is older than Go: &lt;strong&gt;if a value is an input to your code, make it an input.&lt;/strong&gt; Smuggling it in as a global is borrowing test-time convenience against a debt that comes due, with interest, the day someone wants their tests to run in parallel. Pay cash. Inject the dependency.&lt;/p&gt;
&lt;h2 id="worth-remembering"&gt;Worth remembering
&lt;/h2&gt;&lt;p&gt;Mocking via a reassignable package-level variable is a beloved Go shortcut and a latent data race. It survives because single-threaded test runs hide it; &lt;code&gt;t.Parallel()&lt;/code&gt; exposes it as an intermittent, bystander-blaming CI flake that&amp;rsquo;s miserable to trace. A mutex only makes the bad design &lt;em&gt;defined&lt;/em&gt;. The fix is structural: inject the dependency as a struct field or functional option, so each test owns its own double and there&amp;rsquo;s no shared state to race over. go-tool-base banned the global-hook pattern outright and ships &lt;code&gt;internal/exectest&lt;/code&gt; so nobody&amp;rsquo;s tempted back to it.&lt;/p&gt;
&lt;p&gt;If a piece of code depends on something, let it &lt;em&gt;say&lt;/em&gt; so in its signature. Your future self, staring at a CI failure that flatly refuses to reproduce, will thank you.&lt;/p&gt;</description></item><item><title>Testing code that calls an LLM: yes, you actually can</title><link>https://phpboyscout.uk/testing-code-that-calls-an-llm/</link><pubDate>Wed, 08 Apr 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/testing-code-that-calls-an-llm/</guid><description>&lt;img src="https://phpboyscout.uk/testing-code-that-calls-an-llm/cover-testing-code-that-calls-an-llm.png" alt="Featured image of post Testing code that calls an LLM: yes, you actually can" /&gt;&lt;p&gt;&amp;ldquo;You can&amp;rsquo;t test code that calls an AI.&amp;rdquo; I&amp;rsquo;ve heard it said with great confidence, and it&amp;rsquo;s half right, which is the most dangerous kind of right. You genuinely can&amp;rsquo;t assert on what a non-deterministic model says. But the model isn&amp;rsquo;t your code, and the bits sitting either side of it most certainly are.&lt;/p&gt;
&lt;h2 id="you-cant-test-ai-code"&gt;&amp;ldquo;You can&amp;rsquo;t test AI code&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;It&amp;rsquo;s a fair worry. Your command calls an LLM. The LLM returns something slightly different every run. A test that asserts &lt;code&gt;response == &amp;quot;...&amp;quot;&lt;/code&gt; is broken before you&amp;rsquo;ve finished typing it. So the conclusion arrives quickly: the AI path can&amp;rsquo;t be tested, leave it uncovered.&lt;/p&gt;
&lt;p&gt;Which is a shame, because the AI call is usually the riskiest line in the whole command.&lt;/p&gt;
&lt;p&gt;The conclusion is also wrong. It mistakes &amp;ldquo;I can&amp;rsquo;t test the model&amp;rdquo; for &amp;ldquo;I can&amp;rsquo;t test my code&amp;rdquo;. The model is not your code. Your code is the two pieces sitting on either side of it.&lt;/p&gt;
&lt;h2 id="your-code-is-a-prompt-and-a-handler"&gt;Your code is a prompt and a handler
&lt;/h2&gt;&lt;p&gt;Strip the command down to what it actually does:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It builds a prompt. It assembles a system prompt, the user&amp;rsquo;s input, perhaps some context, and sends it.&lt;/li&gt;
&lt;li&gt;The model does something. This is not your code.&lt;/li&gt;
&lt;li&gt;It takes the response and does something with it. It parses it, branches on it, prints it, stores it.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Steps one and three are entirely yours, and entirely deterministic. The same inputs build the same prompt and handle the same response the same way, every single time. That&amp;rsquo;s testable. Step two is the only part that isn&amp;rsquo;t, and step two was never yours to test in the first place.&lt;/p&gt;
&lt;p&gt;So the job is to pin step two to a known value, and then test one and three properly.&lt;/p&gt;
&lt;h2 id="test-the-prompt-snapshot-it"&gt;Test the prompt: snapshot it
&lt;/h2&gt;&lt;p&gt;Step one produces a prompt, and a prompt is just a string, which means you can pin it.&lt;/p&gt;
&lt;p&gt;Both frameworks lean on snapshot testing here. go-tool-base uses a golden-file approach: the prompt your code generates is recorded to a file, and the test re-generates it and compares against that file. rust-tool-base does the same with &lt;code&gt;insta&lt;/code&gt;, snapshotting the request body the client would send.&lt;/p&gt;
&lt;p&gt;The reason this matters is that the prompt is load-bearing and quietly easy to break. You refactor how context gets assembled. Without noticing, you&amp;rsquo;ve changed the wording, or the ordering, or dropped a line the model was leaning on. Nothing fails to compile. The behaviour just drifts, silently.&lt;/p&gt;
&lt;p&gt;A snapshot test catches exactly that. It fails, it shows you the diff between the old prompt and the new one, and it makes you stop and make a decision. Was this change intended? If yes, you accept the new snapshot and move on. If no, you&amp;rsquo;ve just caught a bug before it shipped. Either way the prompt never changes by accident, which for AI code is most of the battle.&lt;/p&gt;
&lt;h2 id="test-the-handler-mock-the-response"&gt;Test the handler: mock the response
&lt;/h2&gt;&lt;p&gt;Step three needs a response to handle, and in a unit test you don&amp;rsquo;t get that response from the real model. You supply it.&lt;/p&gt;
&lt;p&gt;go-tool-base ships &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/5c78fc9/mocks/pkg/chat/ChatClient.go" target="_blank" rel="noopener"
 &gt;generated mocks for the &lt;code&gt;ChatClient&lt;/code&gt; interface&lt;/a&gt;. A test builds a mock client, tells it &amp;ldquo;when &lt;code&gt;Ask&lt;/code&gt; is called, return &lt;em&gt;this&lt;/em&gt; canned value&amp;rdquo;, and runs the command against it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;mockClient&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;mock_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;NewMockChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;mockClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EXPECT&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Anything&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;mock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Anything&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;mock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AnythingOfType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;*main.Analysis&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;RunAndReturn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;Analysis&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Analysis&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Severity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;critical&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Because &lt;a class="link" href="https://phpboyscout.uk/an-ai-interface-that-fits-on-one-screen/" &gt;the interface is only four methods&lt;/a&gt;, that mock is trivial to set up and complete by construction. rust-tool-base takes the same idea one layer down: HTTP-bound tests use &lt;code&gt;wiremock&lt;/code&gt;, which stands up a fake server returning a canned response body. The client makes a real HTTP request; it just goes to a fake endpoint the test controls.&lt;/p&gt;
&lt;p&gt;Either way, step two is now fixed to a value you chose, which makes step three deterministic. And that unlocks the tests that actually matter: given a malformed response, does the command fail gracefully? Given a rate-limit error, an empty answer, a field missing? Those are the cases a live model almost never hands you on demand, and a mock hands you every time, on the first run.&lt;/p&gt;
&lt;p&gt;This is, incidentally, the same discipline as &lt;a class="link" href="https://phpboyscout.uk/the-test-mocking-pattern-that-races/" &gt;the test-mocking work elsewhere in the framework&lt;/a&gt;: the dependency is injected, so the test gets to decide what it does.&lt;/p&gt;
&lt;h2 id="what-you-deliberately-dont-test"&gt;What you deliberately don&amp;rsquo;t test
&lt;/h2&gt;&lt;p&gt;One boundary worth stating. None of this tests whether the model gives &lt;em&gt;good&lt;/em&gt; answers. That question is real, but it&amp;rsquo;s a different activity (evaluations, run as their own suite) and not something to mix into the unit tests.&lt;/p&gt;
&lt;p&gt;The unit suite&amp;rsquo;s job is your code: that it builds a sound prompt, and that it handles every shape of response correctly, including the ugly ones. Keep that well away from &amp;ldquo;is the model clever today&amp;rdquo;. A unit test that depends on the model being clever is a unit test that fails when the weather changes, and a flaky test just teaches people to ignore the whole suite.&lt;/p&gt;
&lt;h2 id="what-it-comes-down-to"&gt;What it comes down to
&lt;/h2&gt;&lt;p&gt;Code that calls an LLM is testable; the model is not, and those are different statements. Your code is a prompt builder and a response handler, both deterministic, with the model sat in between.&lt;/p&gt;
&lt;p&gt;go-tool-base and rust-tool-base converge on the same approach. Snapshot the prompt, with golden files or &lt;code&gt;insta&lt;/code&gt;, so a refactor can&amp;rsquo;t change what you send without a test noticing. Mock the response, with generated &lt;code&gt;ChatClient&lt;/code&gt; mocks or a &lt;code&gt;wiremock&lt;/code&gt; server, so tests run with no network and you can feed in the malformed and error cases a real model won&amp;rsquo;t reliably produce. Leave &amp;ldquo;are the answers any good&amp;rdquo; to a separate evaluation suite. Test the two halves you own, and the non-determinism in the middle stops being an excuse to leave the riskiest line uncovered.&lt;/p&gt;</description></item><item><title>BDD where it earns its place, and nowhere else</title><link>https://phpboyscout.uk/bdd-where-it-earns-its-place/</link><pubDate>Sat, 28 Mar 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/bdd-where-it-earns-its-place/</guid><description>&lt;img src="https://phpboyscout.uk/bdd-where-it-earns-its-place/cover-bdd-where-it-earns-its-place.png" alt="Featured image of post BDD where it earns its place, and nowhere else" /&gt;&lt;p&gt;I have a slightly complicated relationship with BDD. I&amp;rsquo;ve watched it turn a tangled test suite into something the whole team could read and reason about, and I&amp;rsquo;ve watched it turn a perfectly good unit test into a paragraph of ceremonial English that nobody benefits from. So when go-tool-base brought in Cucumber-style BDD, the interesting decision wasn&amp;rsquo;t adopting it. It was being ruthless about where &lt;em&gt;not&lt;/em&gt; to.&lt;/p&gt;
&lt;h2 id="two-tests-that-hurt-for-different-reasons"&gt;Two tests that hurt for different reasons
&lt;/h2&gt;&lt;p&gt;Most of go-tool-base&amp;rsquo;s tests are ordinary table-driven Go tests, and they&amp;rsquo;re absolutely fine. A function, a slice of input/expected pairs, a loop. Nobody needs Gherkin to understand a parser test.&lt;/p&gt;
&lt;p&gt;But two areas were genuinely painful, and they were painful in the same way: the &lt;em&gt;test&lt;/em&gt; had become harder to understand than the &lt;em&gt;thing it was testing&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The first was &lt;code&gt;pkg/controls&lt;/code&gt;, the &lt;a class="link" href="https://phpboyscout.uk/lifecycle-management-for-long-running-go-services/" &gt;service-lifecycle package&lt;/a&gt;. It runs a small state machine (Unknown, Running, Stopping, Stopped) with signal handling, health monitoring, restart policies and graceful shutdown all woven through it. The integration tests for graceful shutdown had grown to over three hundred lines of imperative goroutine and channel coordination. They worked. But reviewing them was a slog, and a test you can&amp;rsquo;t review with confidence is a test you can&amp;rsquo;t trust when it fails. The behaviour being checked, &amp;ldquo;when a shutdown signal arrives mid-startup, the controller stops cleanly&amp;rdquo;, was a simple sentence buried under a heap of synchronisation scaffolding.&lt;/p&gt;
&lt;p&gt;The second was the CLI itself. &lt;code&gt;init&lt;/code&gt;, &lt;code&gt;update&lt;/code&gt;, &lt;code&gt;doctor&lt;/code&gt; are user &lt;em&gt;workflows&lt;/em&gt;. &amp;ldquo;Given a config file with a custom value, when I run init, then the custom value survives the merge.&amp;rdquo; That&amp;rsquo;s already a Given/When/Then; it just happened to be written out as Go.&lt;/p&gt;
&lt;h2 id="godog-and-the-line-i-drew"&gt;Godog, and the line I drew
&lt;/h2&gt;&lt;p&gt;Godog is the official Go implementation of Cucumber. You write &lt;code&gt;.feature&lt;/code&gt; files in plain Gherkin and bind each step to a Go function. The shutdown scenario stops being three hundred lines of channels and becomes this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gherkin" data-lang="gherkin"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;Scenario:&lt;/span&gt;&lt;span class="nf"&gt; graceful shutdown completes within the deadline
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt; Given &lt;/span&gt;&lt;span class="nf"&gt;a controller with two registered services
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt; &lt;/span&gt;&lt;span class="k"&gt;When &lt;/span&gt;&lt;span class="nf"&gt;a shutdown signal is received
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt; &lt;/span&gt;&lt;span class="k"&gt;Then &lt;/span&gt;&lt;span class="nf"&gt;both services stop in registration order
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt; &lt;/span&gt;&lt;span class="k"&gt;And &lt;/span&gt;&lt;span class="nf"&gt;the controller reports a clean shutdown
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The goroutine choreography doesn&amp;rsquo;t vanish, of course. It moves into the step definitions, written once and reused. What changes is that the &lt;em&gt;scenario&lt;/em&gt; is now readable by someone who&amp;rsquo;s never opened the file before, including someone from an ops team who&amp;rsquo;ll never write a line of Go but absolutely has opinions about how shutdown should behave.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the part I want to dwell on, because it&amp;rsquo;s the part most BDD adoptions get wrong. The first design decision written down for this work was: &lt;strong&gt;strategic, not universal.&lt;/strong&gt; Use Godog &lt;em&gt;only&lt;/em&gt; where BDD adds clarity. Keep table-driven Go tests as the baseline everywhere else.&lt;/p&gt;
&lt;p&gt;That sounds obvious written down. It is not obvious in practice, because BDD has a gravitational pull. Once a team has feature files, there&amp;rsquo;s a powerful urge to express &lt;em&gt;everything&lt;/em&gt; as feature files, for consistency. And that&amp;rsquo;s how you end up with Gherkin scenarios for a pure function (&lt;code&gt;Given the number 2, When I double it, Then I get 4&lt;/code&gt;) which is pure ceremony. You&amp;rsquo;ve wrapped a one-line table test in a paragraph of English and a step-definition indirection, and made it actively worse.&lt;/p&gt;
&lt;p&gt;The test for whether BDD belongs is this: &lt;strong&gt;is this test a narrative, or is it a matrix?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A matrix is the same logic with many input/output pairs. That&amp;rsquo;s a table-driven test, that&amp;rsquo;s most unit tests, and Gherkin actively harms them. A narrative is a sequence of steps where the &lt;em&gt;ordering&lt;/em&gt; and the &lt;em&gt;state between steps&lt;/em&gt; is the thing under test, and that&amp;rsquo;s where Gherkin pays for itself. Lifecycle transitions are narratives. A user running three commands in sequence is a narrative. Doubling a number is not.&lt;/p&gt;
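&lt;p&gt;For contrast, this is roughly what the matrix shape looks like in Go. It&amp;rsquo;s a sketch with an invented &lt;code&gt;double&lt;/code&gt; function, written as a standalone program so it runs without a test harness; in a real package the loop would sit inside a &lt;code&gt;func TestDouble(t *testing.T)&lt;/code&gt; with &lt;code&gt;t.Run&lt;/code&gt; per case.&lt;/p&gt;

```go
package main

import "fmt"

// double is a deliberately trivial pure function: the matrix case.
func double(n int) int { return n * 2 }

func main() {
	// Same logic, many input/expected pairs. No ordering, no state between rows.
	cases := []struct {
		name string
		in   int
		want int
	}{
		{"zero", 0, 0},
		{"positive", 2, 4},
		{"negative", -3, -6},
	}
	for _, tc := range cases {
		got := double(tc.in)
		fmt.Printf("%s: ok=%t\n", tc.name, got == tc.want)
	}
}
```

&lt;p&gt;Adding a case is one line in the table. The equivalent Gherkin would be a whole new scenario that tells the reader nothing the row didn&amp;rsquo;t.&lt;/p&gt;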
&lt;p&gt;go-tool-base drew that line and stuck to it. Feature files live in &lt;code&gt;features/&lt;/code&gt; at the project root, where a non-Go developer can find and read them. Step definitions live in &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/5c78fc9/test/e2e/steps/controls_steps_test.go" target="_blank" rel="noopener"
 &gt;&lt;code&gt;test/e2e/&lt;/code&gt;&lt;/a&gt;, kept well away from the unit tests. And the unit tests stayed exactly what they were, because they were already the right tool.&lt;/p&gt;
&lt;h2 id="made-to-fit-not-bolted-on"&gt;Made to fit, not bolted on
&lt;/h2&gt;&lt;p&gt;A couple of smaller decisions kept the BDD layer from feeling like a foreign object.&lt;/p&gt;
&lt;p&gt;It runs under &lt;code&gt;go test&lt;/code&gt;. There&amp;rsquo;s no separate Cucumber runner to install or remember. A &lt;code&gt;godog.TestSuite&lt;/code&gt; is invoked from an ordinary &lt;code&gt;TestFeatures(t *testing.T)&lt;/code&gt;, so the BDD scenarios run in the same &lt;code&gt;go test ./...&lt;/code&gt; as everything else. CI didn&amp;rsquo;t need a new concept bolted onto it.&lt;/p&gt;
&lt;p&gt;And the CLI end-to-end tests build the &lt;code&gt;gtb&lt;/code&gt; binary &lt;em&gt;once&lt;/em&gt; and reuse it across every scenario. Compiling a binary per scenario would make the suite slow enough that people would quietly start skipping it, and a test suite people skip is just decoration. Build once, test many.&lt;/p&gt;
&lt;h2 id="stepping-back"&gt;Stepping back
&lt;/h2&gt;&lt;p&gt;go-tool-base brought in Godog for BDD, but the decision worth writing about is the restraint. BDD was applied to exactly two things: the service-lifecycle state machine, where a 300-line goroutine tangle became a four-line scenario anyone can review, and CLI workflows, which are Given/When/Then by their very nature. Everywhere else, table-driven Go tests remained the baseline, because wrapping a matrix test in Gherkin makes it worse, not better.&lt;/p&gt;
&lt;p&gt;The useful rule: BDD fits a &lt;em&gt;narrative&lt;/em&gt;, ordered steps with meaningful state in between, and fights a &lt;em&gt;matrix&lt;/em&gt;. Adopt it as a scalpel for the narratives. Resist the pull to turn it into a religion.&lt;/p&gt;</description></item><item><title>Loaded Testing</title><link>https://phpboyscout.uk/loaded-testing/</link><pubDate>Sat, 30 Jun 2012 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/loaded-testing/</guid><description>&lt;p&gt;I recently had to do some load testing for a site, which meant being able to test in excess of 100k requests in a 60-second period&amp;hellip;&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="http://jmeter.apache.org/" target="_blank" rel="noopener"
 &gt;&lt;img alt="JMeter" class="gallery-image" data-flex-basis="520px" data-flex-grow="216" data-title-escaped="jmeter-logo" height="102" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://phpboyscout.uk/loaded-testing/jmeter-logo_hu_fffa1c3fe08b4c30.webp" srcset="https://phpboyscout.uk/loaded-testing/jmeter-logo_hu_fffa1c3fe08b4c30.webp 221w" title="jmeter-logo" width="221"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;So I decided to do some testing using JMeter as it seemed like a suitable tool for doing what I needed and I had used it for some simpler testing in the past.&lt;/p&gt;
&lt;p&gt;After a little fumbling around I managed to design a test plan that would simulate 10k users actually navigating the site, adding items to a cart and so on, across a variety of interactions. It wasn&amp;rsquo;t perfect, but it would correctly simulate over 100k requests.&lt;/p&gt;
&lt;p&gt;So, feeling quite pleased with myself, I started the test from my laptop. Now I&amp;rsquo;m not a big gamer. I&amp;rsquo;m known to play a little World of Warcraft from time to time, but that&amp;rsquo;s about it. So when it comes to computing power I tend to opt for battery life over sheer grunt.&lt;/p&gt;
&lt;p&gt;Suffice to say, my laptop fell flat on its face, and even if it hadn&amp;rsquo;t, it turns out that the connection I was using just wasn&amp;rsquo;t up to the task of handling that much traffic.&lt;/p&gt;
&lt;p&gt;So plan B&amp;hellip;&lt;/p&gt;
&lt;p&gt;I quickly fired up the largest AWS instance available and got a copy of JMeter installed. A little tinkering with my test plan, some googling on how to run JMeter without a GUI, and a quick&lt;/p&gt;
&lt;p&gt;&lt;code&gt;./jmeter -n -t test-plan.jmx&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;and it appeared to be running.&lt;/p&gt;
&lt;p&gt;(Please bear in mind that I&amp;rsquo;m being overly kind&amp;hellip; it took a LOT of tinkering and twice as much googling to work out how to get the test results out so I could actually get some idea of WTF was happening during the test.)&lt;/p&gt;
&lt;p&gt;So&amp;hellip; client &amp;ldquo;happy&amp;rdquo;&amp;hellip; I decided to go and find a better way to do my load testing in the future.&lt;/p&gt;
&lt;p&gt;Sticking with JMeter I managed to find this gem of a page&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="http://jmeter.apache.org/usermanual/remote-test.html" title="Remote Testing"
 target="_blank" rel="noopener"
 &gt;http://jmeter.apache.org/usermanual/remote-test.html&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;tl;dr &amp;gt; use your local install of jmeter to trigger tests to run on one or more remote &amp;ldquo;nodes&amp;rdquo; and then have all the results sent to your local install.&lt;/p&gt;
&lt;p&gt;So I set to work!&lt;/p&gt;
&lt;h2 id="building-a-node"&gt;&lt;strong&gt;Building a Node&lt;/strong&gt;
&lt;/h2&gt;&lt;p&gt;First I need to set up an AWS instance that we can use and duplicate so I can quickly build a cluster of nodes on demand. I&amp;rsquo;m a big fan of Ubuntu so I spin up a micro instance of 12.04 server. Next I shell into the instance and install the default Java runtime from apt&lt;/p&gt;
&lt;p&gt;&lt;code&gt;apt-get install openjdk-7-jre&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Yes, I know there are other, more appropriate runtimes, but I don&amp;rsquo;t really care&amp;hellip; I just need it to work, and it does.&lt;/p&gt;
&lt;p&gt;Next I grab a copy of the latest stable release from &lt;a class="link" href="http://jmeter.apache.org/download_jmeter.cgi" title="Download Apache JMeter"
 target="_blank" rel="noopener"
 &gt;http://jmeter.apache.org/download_jmeter.cgi&lt;/a&gt; and un-tar it to &lt;code&gt;/usr/local/jmeter&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;(N.B. JMeter is available through apt, but I had issues with that version. You also need to make sure that your local install and all the nodes run the same version of JMeter.)&lt;/p&gt;
&lt;p&gt;We can now test that the install is working by running &lt;code&gt;/usr/local/jmeter/bin/jmeter-server&lt;/code&gt;, and you should get some output that looks similar to&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Created remote object: UnicastServerRef [liveRef: [endpoint:[10.???.???.???:38939](local),objID:[46522b57:138381f1023:-7fff, 2635011707874933136]]]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Which tells us that the server is running.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BUT&lt;/strong&gt; unfortunately it&amp;rsquo;s not going to work just yet. Because we are using Amazon&amp;rsquo;s EC2 we are going to be relying on their NAT for routing, and out of the box JMeter just won&amp;rsquo;t work properly: the server advertises its private 10.x address (the one in the output above), which nothing outside EC2 can reach.&lt;/p&gt;
&lt;p&gt;However, there is something we can do to combat this. We can set the parameter &lt;code&gt;RMI_HOST_DEF&lt;/code&gt;, which the &lt;code&gt;/usr/local/jmeter/bin/jmeter-server&lt;/code&gt; script will include when starting the server.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="n"&gt;RMI_HOST_DEF&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="n"&gt;Djava&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rmi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="o"&gt;=$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wget&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mf"&gt;169.254&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;169.254&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;latest&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;public&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I&amp;rsquo;ll explain what we are doing here. Amazon have been quite clever by providing a &lt;a class="link" href="http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html" target="_blank" rel="noopener"
 &gt;meta-data endpoint&lt;/a&gt; that you can poll from within your instance to get key pieces of data&amp;hellip; including the public DNS record.&lt;/p&gt;
&lt;p&gt;We can query this endpoint with wget and substitute the result into the &lt;code&gt;RMI_HOST_DEF&lt;/code&gt; param (ensuring that we prepend &lt;code&gt;-D&lt;/code&gt;), and then export that so it becomes available to the &lt;code&gt;/usr/local/jmeter/bin/jmeter-server&lt;/code&gt; script.&lt;/p&gt;
&lt;p&gt;Now to get the server to start on boot.&lt;/p&gt;
&lt;p&gt;A quick Upstart script should solve this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Upstart script to initialise jmeter-server&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;JMeter Server&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;author&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Dev in Charge &amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt; &lt;span class="n"&gt;started&lt;/span&gt; &lt;span class="n"&gt;networking&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt; &lt;span class="n"&gt;stopping&lt;/span&gt; &lt;span class="n"&gt;networking&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt; &lt;span class="n"&gt;stopping&lt;/span&gt; &lt;span class="n"&gt;shutdown&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;console&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;script&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# get the current public DNS record&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="n"&gt;RMI_HOST_DEF&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="n"&gt;Djava&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rmi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="o"&gt;=$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wget&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mf"&gt;169.254&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;169.254&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;latest&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;public&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# start jmeter in server mde&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;usr&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;local&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jmeter&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;bin&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jmeter&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Saving this to &lt;code&gt;/etc/init/jmeter-server.conf&lt;/code&gt; means that jmeter-server will auto-start on boot, and you can control the process manually using &lt;code&gt;start jmeter-server&lt;/code&gt; and &lt;code&gt;stop jmeter-server&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;And that&amp;rsquo;s it&amp;hellip; instance configured.&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="http://aws.amazon.com/" target="_blank" rel="noopener"
 &gt;&lt;img alt="Powered by AWS" class="gallery-image" data-flex-basis="590px" data-flex-grow="245" data-title-escaped="AWS_Logo_PoweredBy_300px" height="122" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://phpboyscout.uk/loaded-testing/AWS_Logo_PoweredBy_300px_hu_8d853d053dae16c9.webp" srcset="https://phpboyscout.uk/loaded-testing/AWS_Logo_PoweredBy_300px_hu_8d853d053dae16c9.webp 300w" title="AWS_Logo_PoweredBy_300px" width="300"&gt;
&lt;/a&gt;All you need to do now is save the instance as an AMI and you have an on-demand image for spinning up a cluster of remote JMeter servers for you to play with.&lt;/p&gt;
&lt;h2 id="configuring-your-local-installation"&gt;Configuring your local installation
&lt;/h2&gt;&lt;p&gt;Now that the server side is working we need to configure our local installation to allow it to connect.&lt;/p&gt;
&lt;p&gt;First things first, however: make sure you are using the same version of JMeter locally as you are running on the server.&lt;/p&gt;
&lt;p&gt;We need to edit the &lt;code&gt;jmeter.properties&lt;/code&gt; file, which can be found in the bin folder of the installation you downloaded. Look for the parameter &lt;code&gt;remote_hosts&lt;/code&gt;. This needs to be set to a comma-separated list of the public DNS names of the remote server(s) you&amp;rsquo;re connecting to, for example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;remote_hosts=ec2-176-34-164-170.eu-west-1.compute.amazonaws.com,ec2-123-34-456-789.eu-west-1.compute.amazonaws.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s your local version configured. You will now be able to tell your local version to run tests on any or all of your specified remotes.&lt;/p&gt;
&lt;p&gt;However, if you&amp;rsquo;re like me, you work behind a router/firewall. If so, this isn&amp;rsquo;t the end of the story. When you send a test plan to a remote from your local install, it also sends the IP address of your local machine so the remote knows where to send the results back to. JMeter works this out by looking up what your current hostname resolves to. In my case it resolved to &lt;code&gt;127.0.1.1&lt;/code&gt;, because my system&amp;rsquo;s hosts file had the line&lt;/p&gt;
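&lt;p&gt;If you spin remotes up and down a lot, retyping that line gets old quickly. Here is a minimal sketch in Python that builds the &lt;code&gt;remote_hosts&lt;/code&gt; line from a list of public DNS names. The function name &lt;code&gt;remote_hosts_line&lt;/code&gt; is made up for illustration, and where the host names come from is left to you.&lt;/p&gt;

```python
# Hypothetical helper (not part of JMeter): builds the remote_hosts line
# for jmeter.properties from a list of public DNS names.
def remote_hosts_line(hosts):
    """Return a 'remote_hosts=...' line, comma separated, without spaces."""
    cleaned = [host.strip() for host in hosts if host.strip()]
    if not cleaned:
        raise ValueError("at least one remote host is required")
    return "remote_hosts=" + ",".join(cleaned)


if __name__ == "__main__":
    # Host name taken from the example above; swap in your own instances.
    print(remote_hosts_line([
        "ec2-176-34-164-170.eu-west-1.compute.amazonaws.com",
    ]))
```

&lt;p&gt;Paste the printed line over the existing &lt;code&gt;remote_hosts&lt;/code&gt; entry in &lt;code&gt;jmeter.properties&lt;/code&gt;.&lt;/p&gt;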
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;127.0.1.1 devincharge.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To resolve this I had to change it to my external IP address:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;89.345.871.79 devincharge.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I also set up port forwarding from my router to my local machine for all ports from 1024 to 65535. You can, if you want, use specific ports so you don&amp;rsquo;t have to forward everything from your router, but I&amp;rsquo;ll leave that for you to look up; there are plenty of resources on how to do it, and I&amp;rsquo;ve waffled on for far too long already.&lt;/p&gt;
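&lt;p&gt;If you want to check what your own hostname resolves to before touching the hosts file, here is a minimal sketch in Python. It only approximates the kind of lookup described above using the standard &lt;code&gt;socket&lt;/code&gt; module; it is not JMeter&amp;rsquo;s own code, and the function names are made up for illustration.&lt;/p&gt;

```python
import ipaddress
import socket


def is_loopback(address):
    """True when the address is in a loopback range (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_local_hostname():
    """Resolve this machine's hostname; return (hostname, address, loopback)."""
    hostname = socket.gethostname()
    address = socket.gethostbyname(hostname)
    return hostname, address, is_loopback(address)


if __name__ == "__main__":
    try:
        hostname, address, loopback = check_local_hostname()
    except OSError as error:
        # Raised when the hostname cannot be resolved at all.
        print("could not resolve the local hostname:", error)
    else:
        print(hostname, "resolves to", address)
        if loopback:
            print("loopback address: remote servers cannot send results back to it")
```

&lt;p&gt;If it reports a loopback address such as &lt;code&gt;127.0.1.1&lt;/code&gt;, you are most likely in the same situation I was in.&lt;/p&gt;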
&lt;p&gt;Happy testing&lt;/p&gt;</description></item></channel></rss>