Redacting the secret you didn't know was in the string

Dammit! How did that get there?

A log line that should never have existed. Not a password I’d carelessly printed, nothing as obvious as that. An upstream API handed me back an error, and it had quoted my own bearer token inside the message, and that error went straight into the logs the way errors do. I didn’t put the secret there. The error did. And I’d never have caught it by being careful, because being careful only protects you from the secrets you know you’re handling.

The easy half of redaction

Hiding the secrets you know about is the part everyone does. You’ve got an API key field, a password flag, so you mask them at the point you print them. key=****. Done, and it feels like you’ve solved redaction, when really you’ve solved the half that was never going to bite you.

The half that bites

The secrets that escape are the ones that arrive inside strings you don’t control. An upstream service echoes your token back in a 401 body. A connection string with the password in the userinfo, https://user:pass@host, lands in a debug line. A library stringifies a whole request, headers and all, for a “helpful” trace. You cannot field-mask a secret you didn’t know was in the string, because you never watched it go in.
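To make the gap concrete, here is a minimal sketch of both halves side by side. Everything in it is hypothetical and for illustration only: maskField stands in for whatever field-level masking you already do, callUpstream stands in for an API client whose error quotes the token back, and the token is a placeholder rather than a real key.

```go
package main

import (
	"errors"
	"fmt"
)

// maskField is a hypothetical field-level mask: it prints the field name
// and never the value. The value is deliberately ignored.
func maskField(name, value string) string {
	return name + "=****"
}

// callUpstream is a hypothetical stand-in for an upstream API that echoes
// the caller's bearer token back inside its error message.
func callUpstream(token string) error {
	return errors.New("401 Unauthorized: token " + token + " is not valid for this resource")
}

func main() {
	token := "sk-example0000000000000000" // placeholder, not a real key

	// The easy half: we know this field is secret, so we mask it.
	fmt.Println(maskField("key", token))

	// The half that bites: the same token arrives inside a string we did
	// not build, and ordinary error logging prints it verbatim.
	if err := callUpstream(token); err != nil {
		fmt.Println("request failed:", err)
	}
}
```

The mask does its job on the first line and has no say over the second, because the token went into that error somewhere the mask never looked.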

You can’t register a value you never had, so match the shape

This is the bit I got wrong in my own head at first. I assumed redaction meant handing the redactor the secrets I was holding so it could watch for them. But the dangerous secrets are exactly the ones I’m not holding a copy of. So pkg/redact doesn’t keep a registry of your values at all. It knows what secrets look like.

pkg/redact/redact.go carries a set of RE2 patterns: a credential in URL userinfo, an Authorization: header sitting in free text, query-string credentials, and the well-known provider prefixes:

prefixPatterns = []*regexp.Regexp{
	regexp.MustCompile(`sk-[A-Za-z0-9_\-]{16,}`),       // OpenAI / Anthropic-style
	regexp.MustCompile(`ghp_[A-Za-z0-9]{30,}`),         // GitHub PAT classic
	regexp.MustCompile(`github_pat_[A-Za-z0-9_]{30,}`), // GitHub fine-grained PAT
	regexp.MustCompile(`xox[baprs]-[A-Za-z0-9-]{10,}`), // Slack
	regexp.MustCompile(`AIza[A-Za-z0-9_\-]{30,}`),      // Google API key
	regexp.MustCompile(`AKIA[A-Z0-9]{16}`),             // AWS access key ID
}

Run any string through redact.String and an OpenAI key, a GitHub token or an AWS access key ID gets caught wherever it’s hiding, whether in an error you didn’t write, in a URL or in a stack trace, because each has a recognisable shape. For the secrets that don’t announce themselves with a prefix there’s a fuzzy fallback: any opaque alphanumeric run of 41 characters or more. The 41 is chosen on purpose, to clear UUIDs (36 characters), hex MD5 digests (32) and git SHA-1 hashes (40) without flagging them, while accepting that a hex SHA-256 digest (64) will trip it. A deliberate, documented trade rather than a magic number.
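A rough sketch of how the shape-matching and the 41-character fallback fit together, assuming nothing about the real internals of pkg/redact. The function name redactString, the [REDACTED] mask, the ordering of the passes, and the userinfo and Authorization patterns are all my own illustrative guesses; only the three prefix patterns are copied from the list above.

```go
package main

import (
	"fmt"
	"regexp"
)

const mask = "[REDACTED]"

var (
	// Assumed pattern: password in URL userinfo, scheme://user:password@host.
	userinfo = regexp.MustCompile(`([a-zA-Z][a-zA-Z0-9+.\-]*://[^\s:/@]+:)[^\s@/]+@`)
	// Assumed pattern: an Authorization header quoted in free text.
	authHeader = regexp.MustCompile(`(?i)(authorization:\s*(?:bearer|basic)\s+)\S+`)
	// Three of the provider prefixes from the list above.
	prefixes = []*regexp.Regexp{
		regexp.MustCompile(`sk-[A-Za-z0-9_\-]{16,}`),
		regexp.MustCompile(`ghp_[A-Za-z0-9]{30,}`),
		regexp.MustCompile(`AKIA[A-Z0-9]{16}`),
	}
	// Fuzzy fallback: any alphanumeric run of 41 characters or more.
	opaque = regexp.MustCompile(`[A-Za-z0-9]{41,}`)
)

// redactString is a hypothetical stand-in for redact.String: specific
// shapes first, the fuzzy fallback last.
func redactString(s string) string {
	s = userinfo.ReplaceAllString(s, "${1}"+mask+"@")
	s = authHeader.ReplaceAllString(s, "${1}"+mask)
	for _, re := range prefixes {
		s = re.ReplaceAllString(s, mask)
	}
	return opaque.ReplaceAllString(s, mask)
}

func main() {
	samples := []string{
		"dial postgres://app:hunter2@db.internal:5432/app failed",
		"sent Authorization: Bearer abc.def.ghi upstream",
		"commit da39a3ee5e6b4b0d3255bfef95601890afd80709",                                 // SHA-1, 40 chars
		"digest e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", // SHA-256, 64 chars
	}
	for _, s := range samples {
		fmt.Println(redactString(s))
	}
}
```

In this sketch the 40-character commit hash comes out untouched and the 64-character digest is masked, which is the trade described above: a few harmless long hashes get scrubbed so that unprefixed tokens don't get through.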

Where it runs

At the boundary where a string leaves for somewhere you can’t reach back into. The telemetry backend runs every event argument and error message through redact.String before it emits anything (pkg/telemetry/telemetry.go), and both telemetry and HTTP logging drop the value of any header that redact flags as sensitive. It doesn’t matter which code path produced the string, or whether you even wrote that path; everything goes through the same gate and gets the same scrub.

rust-tool-base’s rtb-redact crate takes the same shape-matching approach: regex patterns, the same family of well-known provider prefixes, and an is_sensitive_header check for header values.

A realistic limit

It isn’t a force field. A secret with no recognisable shape that is also shorter than the fallback threshold will sail through. You cannot redact what you cannot recognise. But the leak that actually keeps happening isn’t some exotic unknown; it’s a well-known token turning up in a place you didn’t expect, and a shape-matcher sitting at the edge catches exactly that, including secrets you never told it about. Which is the one thing registering your own values could never have done. Storing the key safely is a separate job, a question of where the CLI keeps it; this is about making sure that, once stored, the key doesn’t quietly fall out through a log.
