
A signature the platform can't forge

A self-updating tool has a chicken-and-egg problem baked into it. The thing doing the updating is the thing being updated, so when it reaches out and pulls down a newer version of itself, it’s the one that has to decide whether to trust what just landed. No human in the loop, nobody to ask. I’ve been closing that gap in go-tool-base’s self-updater in two phases. The first gave it a checksum: download the new binary, hash it, compare it against the release’s checksums.txt. That catches the accidents, the truncated download, the flipped bit on a dodgy mirror. And I said at the time, plainly, that it does nothing about a determined attacker who owns the release platform… the checksums file sits right next to the binary, so whoever can swap one can swap both. I left that as an IOU. This second phase is me paying it.

The thing a checksum can’t do

A checksum is a promise that the bytes you got match the manifest. It says nothing about who wrote the manifest. So if GitLab, or my account, or a leaked CI token gets compromised, the attacker rewrites the binary and the checksums.txt in the same breath, and the hash matches perfectly, because they’re the one who computed it. It’s the same wall I keep walking into whenever I think about supply-chain trust: a checksum is only ever as good as whatever’s standing behind it, and the thing standing behind a checksum is the very platform that just handed you the file. Same hands, both times.
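To make that concrete, here is a minimal sketch using nothing but the standard library. The function names and the sample byte strings are mine, purely for illustration; this is not go-tool-base's code. It shows a checksum catching an accidental change, and then passing a forgery where the binary and the manifest were rewritten together.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf returns the hex SHA-256 of the given bytes. Anyone with write
// access to the release can run this over their own binary.
func digestOf(binary []byte) string {
	sum := sha256.Sum256(binary)
	return hex.EncodeToString(sum[:])
}

// checksumOK reports whether the binary matches the digest listed in the
// manifest. It checks integrity only; it says nothing about who wrote
// the manifest.
func checksumOK(binary []byte, manifestDigest string) bool {
	return digestOf(binary) == manifestDigest
}

func main() {
	genuine := []byte("genuine release bytes")
	manifest := digestOf(genuine)

	// Accident: the bytes change in transit, the manifest does not.
	corrupted := []byte("genuine release bytez")
	fmt.Println("corrupted download accepted:", checksumOK(corrupted, manifest))

	// Attack: binary and manifest are replaced in the same breath.
	forged := []byte("attacker's bytes")
	forgedManifest := digestOf(forged)
	fmt.Println("forged release accepted:", checksumOK(forged, forgedManifest))
}
```

The first check comes back false and the second comes back true, which is the whole problem in two lines.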

To get past that, you need a signature whose root of trust lives somewhere the platform can’t reach.

The crypto is the easy part

Here’s the bit that caught me slightly off guard while I was building this: the cryptography is the easy part. Verifying a detached OpenPGP signature is a library call, and go-tool-base’s TrustSet wraps it up in one method:

func (t *TrustSet) VerifyManifestSignature(manifest, signature []byte) error {
	// ...
	signer, err := openpgp.CheckArmoredDetachedSignature(
		t.entities, bytes.NewReader(manifest), bytes.NewReader(signature), nil)
	if err != nil {
		return errors.Wrap(ErrSignatureInvalid, err.Error())
	}
	if signer == nil {
		return errors.Wrap(ErrSignatureInvalid, "no signer in trust set matched")
	}
	return nil
}

Hand it the manifest, the detached signature, and a set of trusted public keys (the entities), and it tells you whether any one of them signed it. That’s the whole of the cryptography, and it’s genuinely not where the hard work lives.

The hard work is that set of trusted public keys. Where do they come from? Because if the answer is “we ship them right next to the binary”, well… you’re straight back to the checksum problem. Whoever can swap the binary can swap the key too, sign with their own, and the check waves it through none the wiser.
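A small sketch of that failure, under a loud assumption: it uses Ed25519 from Go's standard library rather than OpenPGP, so that it runs without any third-party packages. The release struct and the function names are hypothetical. The point carries over unchanged: if the verifying key arrives with the release, an attacker's self-signed release verifies perfectly.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// release is a hypothetical bundle in which everything, including the
// public key, is served from the release host.
type release struct {
	manifest  []byte
	signature []byte
	publicKey ed25519.PublicKey // the mistake: the key ships with the release
}

// signRelease generates a fresh key pair and signs the manifest with it.
// Both the maintainer and an attacker can do exactly this.
func signRelease(manifest []byte) (release, error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand
	if err != nil {
		return release{}, err
	}
	return release{
		manifest:  manifest,
		signature: ed25519.Sign(priv, manifest),
		publicKey: pub,
	}, nil
}

// verifyWithShippedKey trusts whatever key arrived alongside the manifest.
func verifyWithShippedKey(r release) bool {
	return ed25519.Verify(r.publicKey, r.manifest, r.signature)
}

// verifyWithPinnedKey ignores the shipped key and uses one obtained elsewhere.
func verifyWithPinnedKey(pinned ed25519.PublicKey, r release) bool {
	return ed25519.Verify(pinned, r.manifest, r.signature)
}

func main() {
	genuine, err := signRelease([]byte("genuine checksums"))
	if err != nil {
		panic(err)
	}
	forged, err := signRelease([]byte("attacker checksums"))
	if err != nil {
		panic(err)
	}
	pinned := genuine.publicKey // held somewhere the release host can't reach

	fmt.Println("forged, shipped key:", verifyWithShippedKey(forged))
	fmt.Println("forged, pinned key: ", verifyWithPinnedKey(pinned, forged))
	fmt.Println("genuine, pinned key:", verifyWithPinnedKey(pinned, genuine))
}
```

The signature check itself is identical in both functions. The only thing that changes the outcome is where the key came from.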

Pulling the two questions apart

So the design splits along exactly that seam. The verification half is fixed, and deliberately boring (the method above). The trust anchor, the actual keys, comes from a swappable KeyResolver:

// The interface separates "where the trust anchor comes from" from "how a
// signature is verified against it", so SelfUpdater can be wired with
// whichever resolver chain a tool needs without changing verification logic.
type KeyResolver interface {
	Name() string
	Resolve(ctx context.Context) (*TrustSet, error)
}

That little seam is really the whole game. Everything interesting about standing up to a compromised platform comes down to which resolver you hand the updater, and the verification code never has to know the difference.
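Here is roughly what that seam looks like from the calling side. This is a sketch, not the real thing: TrustSet below is a simplified stand-in holding Ed25519 keys instead of OpenPGP entities, and staticResolver, checkUpdate and demo are hypothetical names I've made up for the example. Only the KeyResolver interface shape is taken from above.

```go
package main

import (
	"context"
	"crypto/ed25519"
	"errors"
	"fmt"
)

// TrustSet is a simplified stand-in for the real type: a list of Ed25519
// public keys rather than OpenPGP entities.
type TrustSet struct{ keys []ed25519.PublicKey }

// Verify succeeds if any trusted key produced the signature.
func (t *TrustSet) Verify(manifest, sig []byte) error {
	for _, k := range t.keys {
		if ed25519.Verify(k, manifest, sig) {
			return nil
		}
	}
	return errors.New("no trusted key matched the signature")
}

// KeyResolver has the same shape as the interface in the article.
type KeyResolver interface {
	Name() string
	Resolve(ctx context.Context) (*TrustSet, error)
}

// staticResolver is a hypothetical resolver that returns one fixed key.
type staticResolver struct {
	name string
	key  ed25519.PublicKey
}

func (s staticResolver) Name() string { return s.name }

func (s staticResolver) Resolve(context.Context) (*TrustSet, error) {
	return &TrustSet{keys: []ed25519.PublicKey{s.key}}, nil
}

// checkUpdate is the fixed half: it never asks where the keys came from.
func checkUpdate(ctx context.Context, r KeyResolver, manifest, sig []byte) error {
	ts, err := r.Resolve(ctx)
	if err != nil {
		return fmt.Errorf("resolver %s: %w", r.Name(), err)
	}
	return ts.Verify(manifest, sig)
}

// demo signs a manifest, then checks the genuine and a tampered copy.
func demo() (genuine, tampered error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand
	if err != nil {
		return err, err
	}
	manifest := []byte("checksums.txt contents")
	sig := ed25519.Sign(priv, manifest)
	r := staticResolver{name: "static", key: pub}
	ctx := context.Background()
	return checkUpdate(ctx, r, manifest, sig),
		checkUpdate(ctx, r, []byte("tampered contents"), sig)
}

func main() {
	genuine, tampered := demo()
	fmt.Println("genuine manifest: ", genuine)
	fmt.Println("tampered manifest:", tampered)
}
```

Swapping staticResolver for anything else that satisfies KeyResolver leaves checkUpdate untouched, which is the property the interface is there to protect.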

Three answers to “where does the key live”

The first option is to embed it. Bake the public key straight into the binary at build time (NewEmbeddedResolver), so it rides along inside a release you already trusted enough to run. Tidy and self-contained. The catch is that a future malicious release could embed a different key, so on its own, embedding really just trusts whoever cut the most recent binary.

The second is WKD, the Web Key Directory. Fetch the key over HTTPS from a well-known path on a domain you control (NewWKDResolver), nothing to do with where the release itself is hosted. Now the key isn’t in the binary at all, so poisoning a release doesn’t touch it. You haven’t made the problem disappear, mind… you’ve moved the trust onto your domain’s host and its DNS. A different blast radius, but a blast radius all the same.
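For a feel of what "a well-known path" means in practice, here is a sketch of how a client derives the lookup URL under WKD's "advanced" method, the one that uses an openpgpkey subdomain. It follows my reading of the OpenPGP Web Key Directory draft: SHA-1 the lowercased local part of the address, encode it as z-base-32, and slot it into a fixed path. The function names and the example address are hypothetical, and this is not how NewWKDResolver is necessarily implemented.

```go
package main

import (
	"crypto/sha1"
	"encoding/base32"
	"fmt"
	"net/url"
	"strings"
)

// zBase32 is the z-base-32 alphabet WKD uses for the hashed local part.
var zBase32 = base32.NewEncoding("ybndrfg8ejkmcpqxot1uwisza345h769").
	WithPadding(base32.NoPadding)

// wkdHash returns the z-base-32 encoding of the SHA-1 of the lowercased
// local part. SHA-1 is 160 bits, so the result is always 32 characters.
func wkdHash(localPart string) string {
	sum := sha1.Sum([]byte(strings.ToLower(localPart)))
	return zBase32.EncodeToString(sum[:])
}

// wkdAdvancedURL builds the advanced-method lookup URL for an address.
func wkdAdvancedURL(email string) (string, error) {
	local, domain, ok := strings.Cut(email, "@")
	if !ok || local == "" || domain == "" {
		return "", fmt.Errorf("not an email address: %q", email)
	}
	domain = strings.ToLower(domain)
	return fmt.Sprintf(
		"https://openpgpkey.%s/.well-known/openpgpkey/%s/hu/%s?l=%s",
		domain, domain, wkdHash(local), url.QueryEscape(local)), nil
}

func main() {
	// Hypothetical signing identity, for illustration only.
	u, err := wkdAdvancedURL("release@example.org")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Nothing in that URL depends on the release host, which is the property that matters here. Everything in it does depend on the domain, which is where the trust has moved to.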

The third option is to do both. Run the embedded and WKD resolvers side by side, and insist they hand back the same key:

func (c *CompositeResolver) Resolve(ctx context.Context) (*TrustSet, error) {
	// ... run each child resolver concurrently ...
	if err := checkAgreement(successes); err != nil {
		return nil, err // ErrKeyResolverMismatch
	}
	return successes[0].ts, nil
}

Think of it as the two-key rule on a safe deposit box, or two witnesses who’ve never met telling you the same story. One source on its own you might quietly doubt. But if the key baked into the binary and the key sitting on my domain hand back the same fingerprint, that agreement is worth a great deal more than either of them alone. And if they ever come back different, that’s not a maybe, that’s an alarm: ErrKeyResolverMismatch. Poison one source and the mismatch is the thing that gives the game away.
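The agreement check itself can be very small. What follows is my guess at the shape of such a helper, not the real checkAgreement: the result type, the fingerprint normalisation and the isMismatch helper are all assumptions, and the fingerprints are shortened placeholders. Only the ErrKeyResolverMismatch name comes from the code above.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// ErrKeyResolverMismatch mirrors the sentinel named in the article.
var ErrKeyResolverMismatch = errors.New("key resolvers disagree")

// result is a hypothetical stand-in for one resolver's output: its name
// and the fingerprints of the keys it returned.
type result struct {
	name         string
	fingerprints []string
}

// canonical normalises a fingerprint list so that spacing, letter case
// and ordering don't cause false alarms.
func canonical(fps []string) string {
	out := make([]string, len(fps))
	for i, fp := range fps {
		out[i] = strings.ToUpper(strings.ReplaceAll(fp, " ", ""))
	}
	sort.Strings(out)
	return strings.Join(out, ",")
}

// checkAgreement fails unless every resolver returned the same key set.
func checkAgreement(results []result) error {
	if len(results) == 0 {
		return errors.New("no resolver produced a trust set")
	}
	want := canonical(results[0].fingerprints)
	for _, r := range results[1:] {
		if canonical(r.fingerprints) != want {
			return fmt.Errorf("%w: %s and %s returned different keys",
				ErrKeyResolverMismatch, results[0].name, r.name)
		}
	}
	return nil
}

// isMismatch reports whether err is, or wraps, ErrKeyResolverMismatch.
func isMismatch(err error) bool {
	return errors.Is(err, ErrKeyResolverMismatch)
}

func main() {
	embedded := result{name: "embedded", fingerprints: []string{"AB12 CD34"}}
	wkd := result{name: "wkd", fingerprints: []string{"ab12cd34"}}
	poisoned := result{name: "wkd", fingerprints: []string{"FFFF0000"}}

	fmt.Println("both agree:  ", checkAgreement([]result{embedded, wkd}))
	fmt.Println("one poisoned:", checkAgreement([]result{embedded, poisoned}))
}
```

Note the deliberate absence of any "pick the more trustworthy one" logic. A disagreement has no safe resolution, so the only sensible outcome is to stop.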

That composite is the real answer, and it’s why the interface exists at all. There’s nothing a single attacker can get their hands on that holds the whole thing up by itself. The key is baked into a release you trusted, and fetched from a domain well off the release platform, and the two have to match before a single byte of the update is allowed through.

The separation is the whole point

It’s easy to nod along at “two sources” and miss the part that actually does the work. The agreement between the embedded key and the WKD key is only worth something if an attacker can’t reach both of them from the same place. If the key I bake into the binary and the key I serve over WKD both came out of the same release pipeline, whoever owns that pipeline swaps the pair of them, the fingerprints still match, and the cross-check happily waves the forgery through. Same hands, both times. Again.

So they don’t share a pipeline, and that’s the entire design, not an accident of how things ended up. The binary, and the key embedded in it, are built and signed in GitLab CI, which federates into AWS KMS to do the signing itself. The WKD key lives somewhere else completely: a Cloudflare Pages site serving openpgpkey.phpboyscout.uk, deployed by hand at rotation time with the Wrangler CLI and a token allowed to do nothing but edit that one Pages project. No Git integration, no webhook, nothing that lets a push to the repo or a run of the release pipeline so much as touch it. The Cloudflare account is even administered under a different email and a different second factor from the GitLab and AWS ones, so the two sides really are independent rather than just feeling that way.

Which is what makes them fail independently, and that independence is the only thing that makes the agreement worth checking. To forge a release that survives the cross-check, an attacker doesn’t have to beat one system, they have to beat two unrelated ones, on different platforms, behind different credentials, in the same window, without either of them noticing.

There’s a quieter benefit in the cadence, too. Releases go out constantly and automatically; the WKD key changes rarely, and only ever by hand. So the busy, automated path, the one an attacker is most likely to prise open, is exactly the one with no power to rewrite the key everyone checks against.

Requiring it, without breaking everyone

Now, a check nobody ever switches on is just theatre. But switch it on before the keys are actually out there in people’s installs, and you’ve handed everyone a self-inflicted outage instead. So the default is deliberately timid. The framework ships DefaultRequireSignature = false: a tool built on go-tool-base doesn’t suddenly start rejecting its own updates the day its author bumps the framework version.

The tool author flips it to true in main(), but only after they’ve shipped a release that embeds the key, so every install out there already holds the trust anchor before the first release that insists on one. Ship the key, then turn the lock: the same leave-yourself-a-way-back discipline as any migration you’d like to still have a job after. And the end user still gets an override (update.require_signature, or an env var) for the day it all goes sideways and they need out.
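One way to picture how those three layers might stack up is the sketch below. The precedence (environment, then config, then the compiled-in default) is my assumption, as are the function names; the article names the config key but not the env var, so the example takes the env value as a plain string rather than inventing a variable name.

```go
package main

import (
	"fmt"
	"strconv"
)

// DefaultRequireSignature mirrors the framework default: off until the
// tool author has shipped a release that embeds the key.
var DefaultRequireSignature = false

// boolPtr is a small helper for optional config values.
func boolPtr(b bool) *bool { return &b }

// requireSignature resolves the effective setting. Assumed precedence:
// a parseable env value wins, then the config value (for example
// update.require_signature) if set, then the compiled-in default.
func requireSignature(def bool, configVal *bool, envVal string) bool {
	if envVal != "" {
		if b, err := strconv.ParseBool(envVal); err == nil {
			return b
		}
		// An unparseable env value is ignored rather than guessed at.
	}
	if configVal != nil {
		return *configVal
	}
	return def
}

func main() {
	// Phase one: framework default, nothing overridden.
	fmt.Println(requireSignature(DefaultRequireSignature, nil, ""))

	// Phase two: the author flips the default after the key has shipped.
	DefaultRequireSignature = true
	fmt.Println(requireSignature(DefaultRequireSignature, nil, ""))

	// Escape hatch: an end user switches it off in config.
	fmt.Println(requireSignature(DefaultRequireSignature, boolPtr(false), ""))
}
```

The ordering of those three calls is the rollout in miniature: ship the key while the check is off, turn the check on, and leave the user a way out.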

What it actually buys

The first phase stopped accidents. This one stops the platform. And not because the cryptography is clever (OpenPGP checks the signature in a single call), but because the trust anchor is arranged so that nothing the attacker can actually reach holds the whole thing up on its own. A signature only ever proves who signed the bytes, never that the bytes are good. All of this is really about making “the signer” something a compromised release host can’t quietly fake its way into being.

Which leaves one last thread dangling. The verifying key gets fetched from somewhere, fine… but the signing key, the private half that actually produces these signatures, has to live somewhere the platform can’t reach either, or none of the rest holds up. That’s the capstone, and where this series ends: where that key lives, and why it never leaves the box it’s born in.
