Sign your own binaries with go-tool-base, part 2: a signing key in AWS KMS

Part 1 left you with a working signing loop and one glaring weakness: the private key was a .pem on your laptop, and files get copied. This part fixes that. You’ll generate the production signing key inside AWS KMS, where it’s created and never comes out, and stand up a role that can sign with it. The key and its signer role are all we build here; wiring CI in over OIDC is Part 3.

The big idea is the same one the deep-dive spends its whole length on, so I’ll keep it short: you never hold the private key and you never sign with it. You ask KMS to sign on your behalf with kms:Sign, and the private half stays inside the HSM for its entire life. There is no export, no download, no “just this once” copy onto a runner. An attacker who owns your CI still can’t walk away with the key, because the key was never on the runner to begin with.

What you’ll need first

This part uses OpenTofu (or Terraform; the module works with either). You’ll need:

An AWS account you can apply infrastructure into.
An IAM OIDC identity provider already registered in that account. KMS doesn’t need it, but the signer role we create trusts it, so it has to exist. If you haven’t got one, the sibling module terraform-aws-bootstrap provisions it (it’s from the same family as the terraform-aws-security-baseline module). Its oidc_provider_arn output is the value the signing module wants; the example below looks the same ARN up with a data source instead.

You don’t need a CLI, a pipeline or a public key yet. This is just the vault and the key that lives in it.

The module

The key, its alias, the signer role and the full key policy come from one public module, terraform-aws-signing-kms. Here’s the whole consumer block:

data "aws_iam_openid_connect_provider" "gitlab" {
  url = "https://gitlab.com"
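}

# Look up the role your infra apply pipeline assumes. The name below is a
# placeholder, not a real role: substitute your own.
data "aws_iam_role" "automation" {
  name = "REPLACE-WITH-AUTOMATION-ROLE-NAME"
}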

module "signing_kms" {
  source  = "gitlab.com/phpboyscout/signing-kms/aws"
  version = "0.1.2"

  name        = "acme-release-signing-v1"
  description = "Acme release binary signing"

  oidc_provider_arn  = data.aws_iam_openid_connect_provider.gitlab.arn
  ci_subject_filters = ["project_path:acme/acme-cli:ref_type:tag:ref:v*"]

  key_administrator_arns = [ /* operator role + account root */ ]
  automation_role_arn    = data.aws_iam_role.automation.arn
}

# Carry these forward: Part 3 wires the signer role into CI, Part 4 mints
# the public key from the alias.
output "signer_role_arn" {
  value = module.signing_kms.signer_role_arn
}

output "signing_key_alias" {
  value = module.signing_kms.key_alias_name
}

A few of those values are doing more than they look, so let’s walk them.

name is acme-release-signing-v1, with the v1 on the end deliberately. The name derives the role (<name>-signer) and the alias (alias/<name>), and both of those want to outlive the key. When you rotate to a new key in Part 7 you’ll mint a -v2 and repoint things, so bake the version in now rather than wishing you had.

ci_subject_filters is in GitLab’s OIDC sub format here, and it’s the line that says which pipeline is allowed to assume the signer role: tag pipelines for any v* ref on acme/acme-cli, and nothing else (no branch builds, no merge requests). It’s the heart of Part 3, so I’ll leave the full explanation there. For now, know it’s not optional: an empty list would trust every token from the issuer, and the module refuses to let you do that.
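
To make the “empty list is refused” point concrete, here is a minimal sketch of how a module could turn ci_subject_filters into a role trust policy. It is not the module’s actual source: the variable validation, the resource names and the choice of StringLike on gitlab.com:sub are assumptions for illustration, based on how GitLab OIDC trust is commonly expressed in IAM.

```hcl
# Hypothetical sketch, NOT the module's real source.
variable "name" {
  type = string
}

variable "oidc_provider_arn" {
  type = string
}

variable "ci_subject_filters" {
  type = list(string)

  # Refuse an empty list: with no subject filter, any token from the
  # issuer could assume the role.
  validation {
    condition     = length(var.ci_subject_filters) > 0
    error_message = "ci_subject_filters must contain at least one subject pattern."
  }
}

data "aws_iam_policy_document" "signer_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    # StringLike so that the trailing v* in the filter acts as a wildcard.
    condition {
      test     = "StringLike"
      variable = "gitlab.com:sub"
      values   = var.ci_subject_filters
    }
  }
}

# The role name follows the <name>-signer convention described above.
resource "aws_iam_role" "signer" {
  name               = "${var.name}-signer"
  assume_role_policy = data.aws_iam_policy_document.signer_trust.json
}
```

With the filter from the consumer block, a token whose sub is project_path:acme/acme-cli:ref_type:tag:ref:v1.2.3 would match, while one for a branch pipeline on the same project would not.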

key_administrator_arns and automation_role_arn name the principals that manage the key, and the distinction between them matters. More on that in a moment. With the block in place, apply it:

tofu init
tofu apply

What you just built

A single asymmetric KMS key, plus the IAM scaffolding around it. The key is created with key_usage = SIGN_VERIFY and the spec RSA_4096, which is the module’s default and the one you want.

The obvious question, if you’ve signed things before, is why RSA-4096 and not Ed25519, which is smaller and faster. Two reasons, and neither is preference. The first is that AWS KMS simply doesn’t offer Ed25519 for asymmetric signing, so it’s off the table the moment you decide the key lives in KMS. The second is that OpenPGP, the format your signatures end up in, ties its packet encoding to the signing algorithm: the algorithm isn’t a detail you can swap underneath, it’s written into the bytes. RSA-4096 is the spec that satisfies both constraints, so it’s the secure default and you shouldn’t need to touch it.

Two more things to note about the key itself:

enable_key_rotation is off, and that’s intentional. AWS’s automatic yearly rotation only works on symmetric keys; an asymmetric SIGN_VERIFY key can’t be auto-rotated, because a new key would mean a new public half and every embedded trust anchor breaking at once. Rotation for signing keys is a deliberate, staged operation (mint a new key, publish it, repoint the alias), which is its own part later in the series.
The deletion window defaults to 30 days, the longest AWS allows. For a key this important, the longest possible “oops, undo” window is the safe choice.

The module also creates a stable alias, alias/acme-release-signing-v1. Always reference the key through its alias, never the raw key ID. The alias is what survives rotation: when v2 arrives, the alias gets repointed and everything calling through it keeps working.
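
For orientation, the settings described above map onto the AWS provider roughly like this. It is a sketch of equivalent resources, assuming the standard aws_kms_key and aws_kms_alias resources; the resource labels are made up and the module’s real internals may differ.

```hcl
# Illustrative equivalent only; resource labels are assumptions.
resource "aws_kms_key" "signing" {
  description              = "Acme release binary signing"
  key_usage                = "SIGN_VERIFY"
  customer_master_key_spec = "RSA_4096"

  # Automatic rotation is not available for asymmetric keys, so it stays off.
  enable_key_rotation = false

  # 30 days is the longest waiting period AWS allows before deletion.
  deletion_window_in_days = 30
}

# Callers reference the alias, never the raw key ID.
resource "aws_kms_alias" "signing" {
  name          = "alias/acme-release-signing-v1"
  target_key_id = aws_kms_key.signing.key_id
}
```

You don’t write any of this yourself when you use the module; it is only here to show which knobs the prose is talking about.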

Four principals, one key policy

This is the part worth slowing down for. A KMS key is governed by its key policy, and this module writes one policy that names four classes of principal, each with deliberately different reach:

The account root keeps kms:*. That’s AWS’s recommended break-glass: if every other path is locked out, the account owner can still recover.
The key administrators (key_administrator_arns, typically your operator role plus root) can administer the key, schedule its deletion, that sort of thing. They are not signers, and they can’t sign.
The automation role (automation_role_arn, the role your infra apply pipeline assumes) can manage the key as a Terraform resource, read it, tag it, even change its policy. What it deliberately cannot do is kms:Sign. Think about why: the role that applies your infrastructure runs on every change, so if owning that role let an attacker mint signatures, you’d have handed the whole point of the exercise back. Managing the key and using the key are two different powers, and only the signer gets the second one.
The signer role can call kms:Sign, kms:GetPublicKey and kms:DescribeKey, on this one key, and that’s the entire list. It can’t read other keys, can’t administer this one, can’t delete anything.

Here’s the detail that catches people out: the signer role has no attached IAM policy at all. Its permissions live entirely in the key policy. That’s not an oversight, it’s the design. One document, the key policy, is the single source of truth for who can do what to this key, so there’s no second place to check and no way for an attached role policy to drift out of sync with the key policy and quietly grant something nobody intended. If you want to know who can sign, you read one file.
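
The four-principal split can be sketched as a single policy document. This is an illustration rather than the module’s actual policy: only the signer’s three actions and root’s kms:* come from the description above, while the action lists for the administrators and the automation role are plausible guesses, and the variable names are hypothetical.

```hcl
# Hypothetical sketch of a four-principal key policy.
data "aws_caller_identity" "current" {}

variable "key_administrator_arns" {
  type = list(string)
}

variable "automation_role_arn" {
  type = string
}

variable "signer_role_arn" {
  type = string
}

data "aws_iam_policy_document" "key_policy" {
  # 1. Account root: break-glass access. Assumes the standard "aws" partition.
  statement {
    sid       = "RootBreakGlass"
    actions   = ["kms:*"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # 2. Key administrators: manage and delete, but no kms:Sign (assumed list).
  statement {
    sid       = "KeyAdministration"
    actions   = ["kms:DescribeKey", "kms:EnableKey", "kms:DisableKey", "kms:ScheduleKeyDeletion", "kms:CancelKeyDeletion"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = var.key_administrator_arns
    }
  }

  # 3. Automation role: manage the key as a resource, no kms:Sign (assumed list).
  statement {
    sid       = "AutomationManage"
    actions   = ["kms:DescribeKey", "kms:GetKeyPolicy", "kms:PutKeyPolicy", "kms:ListResourceTags", "kms:TagResource", "kms:UntagResource"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = [var.automation_role_arn]
    }
  }

  # 4. Signer role: the only statement that carries kms:Sign.
  statement {
    sid       = "SignerUse"
    actions   = ["kms:Sign", "kms:GetPublicKey", "kms:DescribeKey"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = [var.signer_role_arn]
    }
  }
}
```

In a key policy, a resource of "*" means “this key”, which is why the signer’s reach stops at the one key even though no ARN is spelled out. And because the policy names the signer role directly, a role in the same account needs no attached IAM policy to use it.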

The signer role is assumable via OIDC (those ci_subject_filters again), which is what lets a CI job step into it without any stored credential. That federation is Part 3’s whole job.

The outputs you’ll carry forward

The module hands back everything the later parts consume. The ones you’ll actually use:

key_id and key_arn, the bare ID and full ARN of the key.
key_alias_name and key_alias_arn, the stable alias. key_alias_name is the one you’ll pass to gtb as the KMS key reference, because it survives rotation.
signer_role_arn and signer_role_name, the role CI assumes to sign. signer_role_arn becomes an environment variable in your pipeline next part.

Keep these to hand. Part 3 needs signer_role_arn; Part 4 needs key_alias_name to mint the public key out of KMS.

What this costs

Worth a quick word, because this is the one part of the series that puts a line on an AWS bill. An asymmetric KMS key runs about a dollar a month, plus a tiny per-signature charge on the kms:Sign calls. For release signing, where you sign a handful of checksums files a month, the per-signature cost rounds to nothing. A dollar a month for a key that can’t be stolen off a laptop is the cheapest security control in this entire series.

Where this leaves you

The key exists, it lives somewhere it can never leave, and there’s a role that can sign with it (and a separate role that pointedly can’t). What there isn’t yet is any way for your release pipeline to become that role without a long-lived credential sitting in CI.

That’s the gap Part 3 closes: federating GitLab and GitHub into the signer role over OIDC, so a tagged release can assume it for the length of one job and nothing is stored anywhere. The ci_subject_filters line we glossed over here is where it starts.