<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Telemetry on PHP Boy Scout</title><link>https://phpboyscout.uk/tags/telemetry/</link><description>Recent content in Telemetry on PHP Boy Scout</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><copyright>Matt Cockayne</copyright><lastBuildDate>Sat, 06 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://phpboyscout.uk/tags/telemetry/index.xml" rel="self" type="application/rss+xml"/><item><title>The consent you can't ask for</title><link>https://phpboyscout.uk/the-consent-you-cant-ask-for/</link><pubDate>Sat, 06 Jun 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/the-consent-you-cant-ask-for/</guid><description>&lt;img src="https://phpboyscout.uk/the-consent-you-cant-ask-for/cover-the-consent-you-cant-ask-for.png" alt="Featured image of post The consent you can't ask for" /&gt;&lt;p&gt;There&amp;rsquo;s a comfortable story going round about telemetry, and it goes like this.
There are two kinds. There&amp;rsquo;s the creepy kind, the usage data a vendor harvests to
work out who you are and what you do, and that kind needs your permission. And
there&amp;rsquo;s the innocent kind, the operational data a service emits so the people
running it can keep it up, and that kind is just plumbing, nobody&amp;rsquo;s business, no
permission required. Two neat boxes, and only one of them has a lock on it.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t think the boxes are that neat. And I think a fair few of the people
drawing them that way know it.&lt;/p&gt;
&lt;p&gt;Because there&amp;rsquo;s no clean line where operational data stops being personal. A web
service&amp;rsquo;s logs carry IP addresses. Its traces carry the path you walked through
the system, the ids of the things you touched, sometimes the very fields you sent.
Point at almost any of it and a GDPR lawyer will cheerfully tell you it can be
personal data, and that the law doesn&amp;rsquo;t much care whether you filed it under
&amp;ldquo;analytics&amp;rdquo; or &amp;ldquo;observability&amp;rdquo;. The word you picked to describe the data was never
the thing that decided whether it was personal. The data decided that, and a lot
of operational data is personal.&lt;/p&gt;
&lt;p&gt;So if you can&amp;rsquo;t hide behind the box marked &amp;ldquo;just plumbing&amp;rdquo;, what do you actually
do?&lt;/p&gt;
&lt;h2 id="where-im-coming-from"&gt;Where I&amp;rsquo;m coming from
&lt;/h2&gt;&lt;p&gt;I should say up front that I haven&amp;rsquo;t always been this relaxed about it. I spent a
good few years in righteous fury at every tool that phoned home, every &amp;ldquo;we collect
anonymous telemetry to improve the product&amp;rdquo; I never agreed to. Then I started
building the tools, and I needed the data myself: the kind that tells you which
features people actually use and which command falls over on first run, the kind
that lets you make the next decision with something better than a hunch. And it
softened me. Not into thinking it&amp;rsquo;s fine to take it without asking. Into
understanding why everyone wants to.&lt;/p&gt;
&lt;p&gt;What the fury left me with, the one thing I&amp;rsquo;ve never talked myself out of, is
being pro-choice. Not pro-collection, not anti. Pro-choice. Any tool I ask another
person to run will never quietly opt them into sending me a thing. It asks. On
first run it &lt;a class="link" href="https://phpboyscout.uk/telemetry-that-asks-first/" &gt;makes its case&lt;/a&gt;,
says what it wants and why, and lets them say no and mean it. I&amp;rsquo;ll try hard to win the yes, because the data is genuinely useful and a
tool gets better when people share it. But I won&amp;rsquo;t presume it. The choice is
theirs, and the prompt exists so they actually get to make it.&lt;/p&gt;
&lt;h2 id="the-trouble-with-a-service"&gt;The trouble with a service
&lt;/h2&gt;&lt;p&gt;Which is a lovely principle right up until you build a web service. Because who,
exactly, do you prompt? An API doesn&amp;rsquo;t have a first run. It has a thousand callers
a second, none of them sat at a terminal waiting to tick a box. You can&amp;rsquo;t show a
consent dialog to a webhook. The answer the industry reaches for is &amp;ldquo;consent is
implied by use&amp;rdquo;, and&amp;hellip; maybe. It&amp;rsquo;s a grey area, full stop. Implied consent is the
same hand-wave that gave us the cookie banner, the thing we all click through
without reading. I&amp;rsquo;m not going to stand here and call it clean.&lt;/p&gt;
&lt;p&gt;But there&amp;rsquo;s a version of the principle that survives the grey, and it&amp;rsquo;s the one I
&lt;a class="link" href="https://phpboyscout.uk/telemetry-that-asks-and-telemetry-that-doesnt/" &gt;built the framework around&lt;/a&gt;. Consent belongs to whoever can actually give it. For a
command-line tool, that&amp;rsquo;s the person running it, so you ask them. For a web
service, the person who can give it was never the end user at all, because you
can&amp;rsquo;t reach them. It&amp;rsquo;s the engineer who deploys the thing. They know what their
service collects, who its users are, which law they sit under, whether they owe
anyone a privacy notice. They are the one party in the whole chain who can make
the call with any of the facts in front of them. So that&amp;rsquo;s where the choice goes.&lt;/p&gt;
&lt;p&gt;Which is why, in go-tool-base, the web-service telemetry is a switch. On or off,
the engineer&amp;rsquo;s hand on it, and by default it collects only what&amp;rsquo;s needed to keep the
lights on. There&amp;rsquo;s no consent prompt, not because consent stopped mattering, but
because there&amp;rsquo;s nobody in the loop I could ask. The accountability sits with the
person who can hold it.&lt;/p&gt;
&lt;h2 id="the-part-ill-own"&gt;The part I&amp;rsquo;ll own
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;m pro-choice on telemetry, which is exactly why I built a way to switch it off
and a way to force it on. Because for a web service the person holding the choice
was never the end user but the engineer who ships it, and &amp;ldquo;pro-choice&amp;rdquo; has to
mean putting the switch in their hand, not pretending a popup would have meant
anything.&lt;/p&gt;
&lt;p&gt;That force-it-on part is the bit I&amp;rsquo;ll answer for. I built a way for a tool author
to bypass the first-run prompt entirely and bake the consent in. There&amp;rsquo;s a real
use case behind it, the enterprise tool deployed under a policy where collection
is contractual rather than optional. But I also know I&amp;rsquo;ve handed someone a way to
take the choice away, and I did it deliberately. Rightly or wrongly, I made the
framework flexible enough to do the wrong thing, and the line I care about is now
only as safe as the judgement of whoever picks it up.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s the uncomfortable place this lands, and I&amp;rsquo;ve come to think it&amp;rsquo;s the true
one. A framework can put the choice in the right hands. It cannot make the right
choice. I can build the prompt, build the switch, set the defaults to the modest
thing, and after that I have to trust the engineer on the other side to use it
justly and with some wisdom, because there is nothing further down the stack that
makes them. When the blame gets shared out, and it&amp;rsquo;s always shared, a piece of it
has my name on it, for every escape hatch I left in.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m at peace with that, mostly. Not because the grey went away, but because the
alternative, pretending there&amp;rsquo;s a clean line and that &amp;ldquo;operational&amp;rdquo; means &amp;ldquo;not
your problem&amp;rdquo;, is the real dodge. I&amp;rsquo;d rather say it plainly: this data can be
personal, the consent is real even when there&amp;rsquo;s nobody to ask, and the most a tool
can do is hand the decision to the person who can make it, and trust them with it.&lt;/p&gt;</description></item><item><title>Telemetry that asks, and telemetry that doesn't</title><link>https://phpboyscout.uk/telemetry-that-asks-and-telemetry-that-doesnt/</link><pubDate>Thu, 04 Jun 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/telemetry-that-asks-and-telemetry-that-doesnt/</guid><description>&lt;img src="https://phpboyscout.uk/telemetry-that-asks-and-telemetry-that-doesnt/cover-telemetry-that-asks-and-telemetry-that-doesnt.png" alt="Featured image of post Telemetry that asks, and telemetry that doesn't" /&gt;&lt;p&gt;go-tool-base has had a thing called telemetry for a long while now. It&amp;rsquo;s the
opt-in kind: the &lt;a class="link" href="https://phpboyscout.uk/telemetry-that-asks-first/" &gt;product analytics&lt;/a&gt;
that asks a user&amp;rsquo;s permission before it phones a single byte home, sits there as
a no-op until they say yes, and can be wiped on request. The whole package is
built around consent.&lt;/p&gt;
&lt;p&gt;Then the &lt;a class="link" href="https://phpboyscout.uk/building-a-web-service-with-go-tool-base-part-6/" &gt;web-service series&lt;/a&gt;
went and needed telemetry too. Not that telemetry. The other one, the one the
rest of the industry means when it says the word: traces, metrics and logs of a
running service. And the awkward thing about those two is that they share a name,
they want to share a package, and they pull in exactly opposite directions on the
one question that matters most.&lt;/p&gt;
&lt;p&gt;This is the story of how 0.7.x grew a second telemetry without breaking the
first, and where the line between them ended up getting drawn.&lt;/p&gt;
&lt;h2 id="why-bother-putting-it-in-the-framework-at-all"&gt;Why bother putting it in the framework at all
&lt;/h2&gt;&lt;p&gt;The starting point is that I could have left observability out. A reader could
wire up OpenTelemetry in their own service and go about their day. But the six
parts of the web-service series spent a lot of effort making the transports
first-class: a gRPC server, an HTTP server, a gateway, TLS across all of them,
each one a &lt;code&gt;Register&lt;/code&gt; call against the controller. Turning a CLI into a real
long-running service and then shrugging &amp;ldquo;observability is your problem&amp;rdquo; would
have left a hole exactly where it hurts.&lt;/p&gt;
&lt;p&gt;Because a service you can&amp;rsquo;t see into is a liability the moment it leaves your
laptop. The series ended with a macguffin service that was typed, fast and served
over TLS, and was also a black box: when it got slow, you had nowhere to look.
Metrics and traces are how you get the lights on, and they deserved the same
first-class treatment as the things they observe.&lt;/p&gt;
&lt;p&gt;The other half of the reason is that the framework already had a foot in this
world. The analytics package&amp;rsquo;s preferred backend speaks OTLP, the OpenTelemetry
wire protocol. So OpenTelemetry was already in the building. Doing observability
any other way would have meant two standards where one would do.&lt;/p&gt;
&lt;h2 id="the-catch-two-telemetries-opposite-instincts"&gt;The catch: two telemetries, opposite instincts
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s where it gets interesting, and it&amp;rsquo;s the part worth slowing down on.&lt;/p&gt;
&lt;p&gt;The analytics telemetry is about a user. It collects usage data (a hashed machine
id, which command ran, the exit code), and the entire design assumes you have to ask
first. It is off by default. The collector you get when it&amp;rsquo;s disabled is a no-op,
so nothing is recorded until the user opts in, and there&amp;rsquo;s a deletion path for
when they change their mind. That&amp;rsquo;s not an add-on, that&amp;rsquo;s by design.&lt;/p&gt;
&lt;p&gt;The observability telemetry is about a service. It emits operational data (how
long a request took, which span was slow, how many requests errored) to a collector the
operator runs. And there is no user in the loop to ask. The operator deploys the
service, points it at their collector, and that act is itself the consent. Asking
would be nonsensical: whose permission, for data about their own service, on
their own infrastructure?&lt;/p&gt;
&lt;p&gt;So you have two things called telemetry, wanting to live in one package, with
opposite defaults on consent. One is off until someone says yes; the other is on
the moment it&amp;rsquo;s configured. Get that wiring wrong and you fail in one of two ugly
ways. Gate the operational telemetry behind the user&amp;rsquo;s analytics opt-in, and a
service&amp;rsquo;s tracing silently does nothing because nobody ticked a box meant for
something else. Or loosen the analytics gate to make observability flow, and you
start leaking usage data the user never agreed to share. Neither is acceptable,
and &amp;ldquo;just use two packages&amp;rdquo; throws away everything the two genuinely have in
common.&lt;/p&gt;
&lt;h2 id="what-they-actually-share"&gt;What they actually share
&lt;/h2&gt;&lt;p&gt;Quite a lot, as it turns out, and all of it below the consent line.&lt;/p&gt;
&lt;p&gt;Both ship their data over OTLP to a collector. Both need to describe who is
emitting: the service name and version, the resource in OpenTelemetry&amp;rsquo;s terms.
Both parse an endpoint, attach headers, decide whether the connection is
plaintext. None of that has the faintest thing to do with consent. It&amp;rsquo;s just the
plumbing of getting bytes to a collector, and the analytics backend already had
all of it, written inline.&lt;/p&gt;
&lt;p&gt;So the shape of the solution fell out of the problem. Lift the shared plumbing
into one place, let both telemetries stand on it, and keep the consent decision
firmly out of that shared layer. The structure under &lt;code&gt;pkg/telemetry&lt;/code&gt; ended up
like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pkg/telemetry/
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  telemetry.go       the analytics Collector (consent-gated)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  backend_otel.go    its OTLP backend
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  posthog/ datadog/  vendor analytics backends
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  otelcore/          shared: OTLP endpoint, resource, telemetry.* config
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  tracing/           observability signal
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  metrics/           observability signal
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  logs/              observability signal
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  observability.go   Setup: builds the enabled signals (implied consent)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The new &lt;code&gt;otelcore&lt;/code&gt; is the keystone. It holds the three things both sides need and
nothing they don&amp;rsquo;t:
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/otelcore/endpoint.go#L22" target="_blank" rel="noopener"
 &gt;&lt;code&gt;ParseEndpoint&lt;/code&gt;&lt;/a&gt;
for the OTLP URL,
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/otelcore/resource.go#L11" target="_blank" rel="noopener"
 &gt;&lt;code&gt;Resource&lt;/code&gt;&lt;/a&gt;
for the service identity, and
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/otelcore/config.go#L33" target="_blank" rel="noopener"
 &gt;&lt;code&gt;Resolve&lt;/code&gt;&lt;/a&gt;
for reading the shared &lt;code&gt;telemetry.*&lt;/code&gt; config (a base endpoint, plus per-signal
overrides, in the same cascade as the TLS config). It imports no signal exporter
and knows nothing about traces, metrics, logs or analytics. It is deliberately
dumb plumbing.&lt;/p&gt;
&lt;h2 id="the-refactor-making-the-old-telemetry-stand-on-the-new-core"&gt;The refactor: making the old telemetry stand on the new core
&lt;/h2&gt;&lt;p&gt;This next part is where the old telemetry and the new one become a single thing.
The analytics OTLP backend was the first user of OTLP in the framework, and it had
grown its own copy of all that
plumbing: a function that parsed the endpoint URL, split out the host and path,
worked out the insecure flag, and built the resource from a service name. Exactly
the code the three new signals were about to need.&lt;/p&gt;
&lt;p&gt;So rather than write it a second time and let the two drift, the analytics
backend was refactored onto &lt;code&gt;otelcore&lt;/code&gt;. Its exporter builder,
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/backend_otel.go#L134" target="_blank" rel="noopener"
 &gt;&lt;code&gt;buildOTelExporterOpts&lt;/code&gt;&lt;/a&gt;,
now calls &lt;code&gt;otelcore.ParseEndpoint&lt;/code&gt;, the same function &lt;code&gt;tracing&lt;/code&gt;, &lt;code&gt;metrics&lt;/code&gt; and
&lt;code&gt;logs&lt;/code&gt; call, and the resource comes from &lt;code&gt;otelcore.Resource&lt;/code&gt;, the same one they
use. One implementation of &amp;ldquo;talk OTLP to a collector&amp;rdquo;, four callers: the
analytics backend and the three observability signals. Change how the framework
forms an OTLP endpoint, and every signal moves together.&lt;/p&gt;
&lt;p&gt;The reassuring part was that the analytics tests didn&amp;rsquo;t budge. The refactor moved
code without changing behaviour, and the consent machinery, the opt-in, the
no-op-when-disabled, the deletion path, never came near &lt;code&gt;otelcore&lt;/code&gt;. Which is
exactly the point.&lt;/p&gt;
&lt;h2 id="where-the-line-is"&gt;Where the line is
&lt;/h2&gt;&lt;p&gt;Because the shared core is the easy half. The half that earns its keep is the bit
that isn&amp;rsquo;t shared, and it&amp;rsquo;s a single, deliberate line.&lt;/p&gt;
&lt;p&gt;The analytics collector keeps its gate. The constructor,
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/telemetry.go#L84" target="_blank" rel="noopener"
 &gt;&lt;code&gt;NewCollector&lt;/code&gt;&lt;/a&gt;,
still returns a no-op the moment telemetry is disabled, so a user who hasn&amp;rsquo;t opted
in gets a collector that silently discards everything. Informed consent, untouched.&lt;/p&gt;
&lt;p&gt;Observability gets a different door entirely.
&lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/f627270/pkg/telemetry/observability.go#L47" target="_blank" rel="noopener"
 &gt;&lt;code&gt;Setup&lt;/code&gt;&lt;/a&gt;
builds whichever signals the operator has switched on, and it is gated only by
&lt;code&gt;telemetry.tracing.enabled&lt;/code&gt; and its siblings, which the operator sets. It never
consults the analytics opt-in. Turning on tracing doesn&amp;rsquo;t turn on analytics;
disabling analytics doesn&amp;rsquo;t silence tracing. The two enable flags live under the
same &lt;code&gt;telemetry.*&lt;/code&gt; config root, sit next to each other, and never read each
other.&lt;/p&gt;
&lt;p&gt;So that&amp;rsquo;s the whole architecture in a sentence: one package, one OTLP export core,
two consent models that share everything except the answer to &amp;ldquo;do we need to
ask&amp;rdquo;. The principle underneath, the one that decided every one of these calls, is
that the &lt;em&gt;kind of data&lt;/em&gt; sets the consent model. Usage data about a person needs
informed consent. Operational data about a service runs on implied consent. The
CLI and the web service are just where each kind tends to live.&lt;/p&gt;
&lt;h2 id="where-this-leaves-the-framework"&gt;Where this leaves the framework
&lt;/h2&gt;&lt;p&gt;0.7.x came out the other side with both telemetries: the one that asks first,
exactly as it was, and a new one that doesn&amp;rsquo;t, because it has nobody to ask. They
share an export core, a config root and a name, and they part company on the only
thing they were ever going to disagree about.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve been careful here to describe how the two consent models are kept apart, not
to argue why they have to be. That argument, that &amp;ldquo;the kind of data decides
the consent model&amp;rdquo; is a line worth holding rather than a convenient bit of
engineering, is a piece of its own, and it&amp;rsquo;s the one I&amp;rsquo;m writing next.&lt;/p&gt;</description></item><item><title>Two telemetry events, one mangled line</title><link>https://phpboyscout.uk/two-events-one-mangled-line/</link><pubDate>Sun, 03 May 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/two-events-one-mangled-line/</guid><description>&lt;img src="https://phpboyscout.uk/two-events-one-mangled-line/cover-two-events-one-mangled-line.png" alt="Featured image of post Two telemetry events, one mangled line" /&gt;&lt;p&gt;A line in a log file that no parser would touch. Not a wrong value, not a missing field. Half of one telemetry event spliced into the middle of another, like two people typing into the same text box at once. Which, it turns out, is pretty much exactly what had happened.&lt;/p&gt;
&lt;h2 id="a-format-with-exactly-one-rule"&gt;A format with exactly one rule
&lt;/h2&gt;&lt;p&gt;rust-tool-base writes its telemetry to a file as JSONL: one JSON object per line, newline at the end, next object on the next line. It&amp;rsquo;s a lovely format to work with precisely because it has one rule, and the rule is simple. Every line is a complete object. Honour that and you can &lt;code&gt;tail&lt;/code&gt; it, &lt;code&gt;grep&lt;/code&gt; it, stream it into anything. Break it once and the whole file is suspect, because now a reader can&amp;rsquo;t trust that a line is a line.&lt;/p&gt;
&lt;p&gt;So the one job the file sink has, beyond writing the right bytes, is to never let two events end up sharing a line.&lt;/p&gt;
&lt;h2 id="appending-is-atomic-though-isnt-it"&gt;&amp;ldquo;Appending is atomic, though, isn&amp;rsquo;t it?&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;The mental model I started with, and I suspect I&amp;rsquo;m not alone, was this: open the file with &lt;code&gt;O_APPEND&lt;/code&gt;, write the serialised event, and the operating system tacks it onto the end atomically. Two writers can&amp;rsquo;t tread on each other because each &lt;code&gt;write&lt;/code&gt; goes to wherever the end currently is, no questions asked. I&amp;rsquo;d half-remembered &lt;code&gt;O_APPEND&lt;/code&gt; as the thing that makes concurrent appending safe, full stop.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s half true, and the half that&amp;rsquo;s missing is the half that bit me.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;O_APPEND&lt;/code&gt; does guarantee one thing: the seek-to-end and the write happen as a unit, so you never get the classic lost-update where two writers compute the same offset and clobber each other. Good. What it does &lt;em&gt;not&lt;/em&gt; guarantee, in practice, is that a single &lt;code&gt;write()&lt;/code&gt; of arbitrary size is atomic with respect to other writers. That atomicity has a ceiling, and the ceiling people reach for is &lt;code&gt;PIPE_BUF&lt;/code&gt;: 4096 bytes on Linux. Strictly, POSIX defines that limit for pipes and FIFOs rather than regular files, so for a log file it&amp;rsquo;s a working rule of thumb, not a promise. Under it, a write is in practice all-or-nothing against other writes to the same file. Over it, the kernel is entirely within its rights to split your write into chunks, and another writer&amp;rsquo;s bytes can land in the gap between them.&lt;/p&gt;
&lt;h2 id="the-fat-event-that-went-over-the-edge"&gt;The fat event that went over the edge
&lt;/h2&gt;&lt;p&gt;For a long time nothing went wrong, which is the most dangerous way for a bug like this to behave. A typical event, a command name, a duration, a status, an attribute or two, serialises to a few hundred bytes. Comfortably under four kilobytes, so comfortably inside the atomic window. Hundreds of them a day, never a problem.&lt;/p&gt;
&lt;p&gt;Then an event came along with a lot of attributes on it, and its serialised form sailed past 4 KiB. Two of &lt;em&gt;those&lt;/em&gt; emitted at roughly the same moment, both over the line, and &lt;code&gt;O_APPEND&lt;/code&gt; did the only thing it had ever promised: it put each write at the end. It said nothing about not interleaving the bytes on the way, because past &lt;code&gt;PIPE_BUF&lt;/code&gt; that was never on offer. One spliced line, one file a parser would now choke on.&lt;/p&gt;
&lt;h2 id="the-fix-isnt-a-bigger-write-its-a-smaller-gate"&gt;The fix isn&amp;rsquo;t a bigger write, it&amp;rsquo;s a smaller gate
&lt;/h2&gt;&lt;p&gt;You can&amp;rsquo;t buy your way out of this with a bigger buffer, because there&amp;rsquo;s no buffer size that&amp;rsquo;s reliably atomic above &lt;code&gt;PIPE_BUF&lt;/code&gt;. The fix is to stop relying on the kernel for mutual exclusion you can do yourself: serialise the events through a lock, so only one &lt;code&gt;write&lt;/code&gt; is ever in flight at a time. The &lt;code&gt;FileSink&lt;/code&gt; carries a mutex for exactly that, and the doc comment on it is the whole post in a paragraph, from &lt;a class="link" href="https://gitlab.com/phpboyscout/rust-tool-base/-/blob/9c22aa8/crates/rtb-telemetry/src/sink.rs#L99" target="_blank" rel="noopener"
 &gt;&lt;code&gt;crates/rtb-telemetry/src/sink.rs&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-rust" data-lang="rust"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;FileSink&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;: &lt;span class="nc"&gt;PathBuf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// Serialises concurrent `emit` calls. Shared across `Clone`s of
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// the same `FileSink` so multiple handles to the same path also
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// serialise correctly.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;gate&lt;/span&gt;: &lt;span class="nc"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;tokio&lt;/span&gt;::&lt;span class="n"&gt;sync&lt;/span&gt;::&lt;span class="n"&gt;Mutex&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you don&amp;rsquo;t write Rust day to day (the &lt;a class="link" href="https://phpboyscout.uk/just-enough-rust-to-follow-along/" &gt;primer&lt;/a&gt; has the rest of the basics): &lt;code&gt;tokio::sync::Mutex&lt;/code&gt; is an async-aware lock, &lt;code&gt;.await&lt;/code&gt; is where a task waits its turn for that lock without blocking the whole thread, and the &lt;code&gt;Arc&lt;/code&gt; wrapper is shared ownership. That &lt;code&gt;Arc&lt;/code&gt; is the load-bearing bit: it means every clone of the &lt;code&gt;FileSink&lt;/code&gt; points at the &lt;em&gt;same&lt;/em&gt; gate, rather than each getting its own lock that guards nothing.&lt;/p&gt;
&lt;p&gt;The detail I like is &lt;em&gt;where&lt;/em&gt; the lock sits. The event is serialised to a string first, outside the critical section, because turning an event into JSON is the expensive part and there&amp;rsquo;s no reason to hold the gate while you do it. Only then does &lt;code&gt;emit&lt;/code&gt; take the lock, and it holds it across the whole open-write-flush, so no other emit can interleave a single byte:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-rust" data-lang="rust"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Serialise the line outside the critical section.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;serde_json&lt;/span&gt;::&lt;span class="n"&gt;to_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;redacted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;&amp;#39;\n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;_guard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// ...create parent dir, open with append(true)...
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;as_bytes&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;write_all&lt;/code&gt; makes sure the whole line goes out as one logical write from our side, and the gate makes sure ours is the only one happening. The 4 KiB cliff is still there in the kernel. We just never walk near it any more, because we&amp;rsquo;ve serialised the writers ourselves rather than hoping the OS would.&lt;/p&gt;
&lt;h2 id="the-bit-even-the-lock-cant-fix"&gt;The bit even the lock can&amp;rsquo;t fix
&lt;/h2&gt;&lt;p&gt;There is, however, a genuine limit, and the comment is upfront about it. The mutex lives in the process. Two &lt;code&gt;FileSink&lt;/code&gt;s in two &lt;em&gt;different&lt;/em&gt; processes, both pointed at the same file, are back to relying on &lt;code&gt;O_APPEND&lt;/code&gt; alone, and back under the 4 KiB ceiling. The lock can&amp;rsquo;t reach across a process boundary, so it doesn&amp;rsquo;t pretend to. The guidance there is the older, duller, correct one: give each process its own file and aggregate them somewhere else. Don&amp;rsquo;t have two processes fighting over one log file and expect the filesystem to referee.&lt;/p&gt;
&lt;h2 id="what-it-comes-down-to"&gt;What it comes down to
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;O_APPEND&lt;/code&gt; is a real guarantee, just a much smaller one than its name talks you into. It keeps your write at the end of the file, and it keeps concurrent writes from interleaving only while they stay under &lt;code&gt;PIPE_BUF&lt;/code&gt;, which on Linux is 4096 bytes. A fat JSON event slides straight over that and takes your file&amp;rsquo;s one rule with it.&lt;/p&gt;
&lt;p&gt;The fix was never exotic. Serialise the line, take a mutex, do the write under it, and the interleave can&amp;rsquo;t happen because there&amp;rsquo;s only ever one writer at a time. The POSIX manual had all of this written down long before I went and learned it the interesting way, which is, I&amp;rsquo;m told, how most people meet &lt;code&gt;PIPE_BUF&lt;/code&gt; too.&lt;/p&gt;</description></item><item><title>Telemetry that asks first</title><link>https://phpboyscout.uk/telemetry-that-asks-first/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://phpboyscout.uk/telemetry-that-asks-first/</guid><description>&lt;img src="https://phpboyscout.uk/telemetry-that-asks-first/cover-telemetry-that-asks-first.png" alt="Featured image of post Telemetry that asks first" /&gt;&lt;p&gt;Usage telemetry is genuinely useful. Knowing which commands people actually run, where the errors cluster, whether anyone ever touched the feature you spent a fortnight on&amp;hellip; that&amp;rsquo;s the stuff that makes you a better maintainer. Wanting it is completely legitimate.&lt;/p&gt;
&lt;p&gt;The trouble is that the &lt;em&gt;usual&lt;/em&gt; way of getting it, on by default and quietly hoovering up everything, is a small betrayal of the people who installed your tool to get a job done. I wasn&amp;rsquo;t willing to build that, so go-tool-base&amp;rsquo;s telemetry starts from a different question.&lt;/p&gt;
&lt;h2 id="the-data-you-want-and-the-line-you-shouldnt-cross"&gt;The data you want, and the line you shouldn&amp;rsquo;t cross
&lt;/h2&gt;&lt;p&gt;If you maintain a tool, you want to know how it&amp;rsquo;s actually used. Which commands matter and which are dead weight. Where the error rate spikes. Whether anyone touched the feature you spent that fortnight on. That information makes you a better maintainer, and, to say it again, wanting it is completely legitimate.&lt;/p&gt;
&lt;p&gt;The trouble is the standard way of getting it. Telemetry on by default. An opt-out buried three levels down in a settings file nobody reads. And once it&amp;rsquo;s running, it quietly collects far more than it ever admitted to: the arguments people passed, the paths they were working in, an IP address for good measure.&lt;/p&gt;
&lt;p&gt;Every one of those is a small betrayal of someone who installed your tool to get a job done, not to become a data point. And the cost when users notice isn&amp;rsquo;t a slap on the wrist. It&amp;rsquo;s trust, and trust in a developer tool does not grow back quickly. A tool that surprises you once with what it was quietly collecting is a tool you uninstall and warn your colleagues about.&lt;/p&gt;
&lt;p&gt;So go-tool-base&amp;rsquo;s telemetry started from a different question. Not &amp;ldquo;how do we collect the most data&amp;rdquo; but &amp;ldquo;how do we collect &lt;em&gt;useful&lt;/em&gt; data without ever putting the user in a position they didn&amp;rsquo;t choose&amp;rdquo;.&lt;/p&gt;
&lt;h2 id="rule-one-it-is-off-until-you-say-otherwise"&gt;Rule one: it is off until you say otherwise
&lt;/h2&gt;&lt;p&gt;The foundation is the simplest possible rule, and it&amp;rsquo;s absolute. Telemetry is &lt;strong&gt;never enabled by default.&lt;/strong&gt; A freshly installed tool built on go-tool-base sends nothing. Not a heartbeat, not a ping, nothing at all.&lt;/p&gt;
&lt;p&gt;It only starts collecting when the user makes an explicit, visible choice to let it. Three honest doors: they run &lt;code&gt;telemetry enable&lt;/code&gt;, they say yes to a clear prompt during &lt;code&gt;init&lt;/code&gt;, or they set &lt;code&gt;TELEMETRY_ENABLED&lt;/code&gt; themselves. All three are deliberate acts. None of them is a pre-ticked box or a default they have to discover and then undo.&lt;/p&gt;
&lt;p&gt;This is opt-&lt;em&gt;in&lt;/em&gt;, and the distinction from a well-hidden opt-&lt;em&gt;out&lt;/em&gt; is the entire point. Opt-out telemetry treats consent as something to be assumed and grudgingly reversed. Opt-in treats it as something that has to be &lt;em&gt;given&lt;/em&gt;. Only one of those is actually consent.&lt;/p&gt;
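&lt;p&gt;The rule is small enough to sketch. This is a minimal illustration, not go-tool-base&amp;rsquo;s actual code: the &lt;code&gt;consentInputs&lt;/code&gt; type and &lt;code&gt;telemetryEnabled&lt;/code&gt; function are hypothetical names, and I&amp;rsquo;m assuming here that &lt;code&gt;TELEMETRY_ENABLED&lt;/code&gt; is read as an environment variable and wins over the stored choice when both are present.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-go" data-lang="go"&gt;package main

import (
	"fmt"
	"strconv"
)

// consentInputs gathers the explicit signals a user may have given.
// Hypothetical type for illustration, not part of go-tool-base.
type consentInputs struct {
	envValue string // raw value of TELEMETRY_ENABLED, "" if unset
	envSet   bool   // whether the variable was present at all
	stored   bool   // true only after `telemetry enable` or a yes during `init`
}

// telemetryEnabled reports true only when the user has explicitly opted in.
// The zero value of consentInputs yields false, so "off" is the default.
func telemetryEnabled(in consentInputs) bool {
	if in.envSet {
		on, err := strconv.ParseBool(in.envValue)
		if err != nil {
			// An unparseable value never counts as a yes.
			return false
		}
		return on
	}
	return in.stored
}

func main() {
	fmt.Println(telemetryEnabled(consentInputs{}))                               // fresh install
	fmt.Println(telemetryEnabled(consentInputs{stored: true}))                   // opted in via command or prompt
	fmt.Println(telemetryEnabled(consentInputs{envSet: true, envValue: "true"})) // opted in via the variable
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The property worth noticing is that every path to &lt;code&gt;true&lt;/code&gt; needs something the user did. Nothing in the zero value switches collection on.&lt;/p&gt;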
&lt;h2 id="rule-two-no-personally-identifiable-information-full-stop"&gt;Rule two: no personally identifiable information, full stop
&lt;/h2&gt;&lt;p&gt;Consent to &amp;ldquo;some telemetry&amp;rdquo; is not consent to &amp;ldquo;any telemetry&amp;rdquo;, so the second rule constrains what can ever be collected, even from a user who&amp;rsquo;s opted in.&lt;/p&gt;
&lt;p&gt;No personally identifiable information. The framework does not record command arguments (they routinely contain paths, hostnames, the occasional secret someone&amp;rsquo;s pasted in). It does not record file contents. It does not record IP addresses.&lt;/p&gt;
&lt;p&gt;It does need &lt;em&gt;some&lt;/em&gt; notion of &amp;ldquo;distinct installations&amp;rdquo; for the numbers to mean anything, so it derives a machine ID from a handful of system signals and runs it through &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/5c78fc9/pkg/telemetry/machine.go#L12" target="_blank" rel="noopener"
 &gt;SHA-256&lt;/a&gt;. What leaves the machine is a hash. It tells you &amp;ldquo;this is the same install as last week&amp;rdquo; and tells you precisely nothing about whose install it is, and because SHA-256 is a one-way function the hash can&amp;rsquo;t practically be walked backwards into the signals it came from.&lt;/p&gt;
&lt;p&gt;The events themselves are deliberately thin. Which command ran, roughly how long it took, whether it errored. The shape of usage, not a transcript of it.&lt;/p&gt;
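&lt;p&gt;The machine ID step can be sketched in a few lines of standard-library Go. Treat it as an illustration under assumptions rather than the linked implementation: &lt;code&gt;machineID&lt;/code&gt; is a hypothetical helper, and the signals passed to it in &lt;code&gt;main&lt;/code&gt; are placeholders, not the ones go-tool-base actually reads.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-go" data-lang="go"&gt;package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// machineID hashes a set of system signals into an opaque identifier.
// Hypothetical helper: the real signal set is not shown here.
func machineID(signals ...string) string {
	// Join with a NUL separator so ("ab", "c") and ("a", "bc")
	// do not collapse into the same input.
	sum := sha256.Sum256([]byte(strings.Join(signals, "\x00")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Placeholder signals, chosen only for the example.
	id := machineID("example-host", "linux", "amd64")
	fmt.Println(id)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only the hex digest would ever be attached to an event. The same inputs always give the same 64-character string, which is all that counting distinct installs needs.&lt;/p&gt;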
&lt;h2 id="rule-three-the-author-picks-the-destination"&gt;Rule three: the author picks the destination
&lt;/h2&gt;&lt;p&gt;Even with consent given and PII excluded, there&amp;rsquo;s a third question: where does the data actually &lt;em&gt;go&lt;/em&gt;? go-tool-base doesn&amp;rsquo;t answer that for you, because it can&amp;rsquo;t. A corporate internal tool, an open-source CLI and an air-gapped utility have completely different right answers.&lt;/p&gt;
&lt;p&gt;So the backend is the tool author&amp;rsquo;s choice. The framework ships several (a noop backend, stdout, a file, plain HTTP, and OpenTelemetry over OTLP) and supports custom ones. The noop backend matters more than it looks: it lets a tool wire up the whole telemetry surface, commands and all, while sending data precisely nowhere. A perfectly reasonable, fully supported configuration.&lt;/p&gt;
&lt;p&gt;Pluggable backends also mean the data never has to touch any infrastructure I run. It goes where the tool&amp;rsquo;s author decides, on their terms. The framework provides the plumbing and stays well out of the destination.&lt;/p&gt;
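&lt;p&gt;To make the shape of that concrete, here is a rough sketch of a pluggable backend with a noop and a writer-based implementation. The &lt;code&gt;Backend&lt;/code&gt; interface, the &lt;code&gt;Event&lt;/code&gt; fields and both backend types are assumptions made for the example. go-tool-base&amp;rsquo;s real interface may well look different.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-go" data-lang="go"&gt;package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// Event is deliberately thin: what ran, roughly how long, whether it failed.
// Hypothetical field names for illustration.
type Event struct {
	Command    string `json:"command"`
	DurationMS int64  `json:"duration_ms"`
	Errored    bool   `json:"errored"`
}

// Backend is a hypothetical destination for events.
type Backend interface {
	Send(e Event) error
}

// NoopBackend accepts every event and sends it nowhere.
type NoopBackend struct{}

func (NoopBackend) Send(Event) error { return nil }

// WriterBackend writes one JSON object per line to any io.Writer,
// which covers stdout and an already-opened file.
type WriterBackend struct {
	W io.Writer
}

func (b WriterBackend) Send(e Event) error {
	return json.NewEncoder(b.W).Encode(e)
}

func main() {
	e := Event{Command: "init", DurationMS: 42, Errored: false}
	for _, b := range []Backend{NoopBackend{}, WriterBackend{W: os.Stdout}} {
		if err := b.Send(e); err != nil {
			fmt.Fprintln(os.Stderr, "telemetry send failed:", err)
		}
	}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The calling code never learns which backend it has. Swapping the noop in is a one-line change for the tool&amp;rsquo;s author, and nothing upstream has to care.&lt;/p&gt;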
&lt;h2 id="and-a-way-back-out"&gt;And a way back out
&lt;/h2&gt;&lt;p&gt;One last thing, because it&amp;rsquo;s the part that makes the opt-in real rather than decorative. A user who opted in can opt straight back out, and the package includes a &lt;a class="link" href="https://gitlab.com/phpboyscout/go-tool-base/-/blob/5c78fc9/pkg/telemetry/deletion.go#L24" target="_blank" rel="noopener"
 &gt;GDPR-aligned deletion path&lt;/a&gt;, so &amp;ldquo;stop, and remove what you have&amp;rdquo; is an actual supported request rather than a polite fiction.&lt;/p&gt;
&lt;p&gt;Consent you can&amp;rsquo;t withdraw isn&amp;rsquo;t consent. It&amp;rsquo;s a one-way door with a friendly sign on it. The deletion path is what keeps the front door an actual door.&lt;/p&gt;
&lt;h2 id="the-bottom-line"&gt;The bottom line
&lt;/h2&gt;&lt;p&gt;Telemetry is genuinely useful to a maintainer and genuinely dangerous to the trust of the people running the tool, and the usual implementation (on by default, opt-out buried, collecting everything) spends that trust recklessly. go-tool-base&amp;rsquo;s telemetry holds three lines: never enabled without an explicit user action, never collecting personally identifiable information even once enabled, and always sending data to a destination the tool&amp;rsquo;s author chose, up to and including nowhere. A real deletion path makes the opt-in something you can take back.&lt;/p&gt;
&lt;p&gt;You can have your usage numbers. You just have to ask for them, the way you would for anything else that wasn&amp;rsquo;t yours to begin with.&lt;/p&gt;</description></item></channel></rss>