Identity & signing
How agents prove who they are, how telemetry is signed end-to-end, and how policy bundles are verified inside the WASM core.
Every Checkrd agent has a cryptographic identity. That identity signs every telemetry batch, authenticates the agent to the control plane, and gates which policy bundles the agent will accept. This page explains the primitives, the trust boundaries, and what you need to manage operationally.
The design uses only openly specified primitives: Ed25519 (RFC 8032), HTTP Message Signatures (RFC 9421), Content-Digest (RFC 9530), and the DSSE envelope spec for policy bundles. Any conformant implementation can therefore verify or interoperate with the SDK.
Agent identities
When the SDK initializes, it loads an Ed25519 keypair. The private key signs outbound telemetry; the public key is registered with the control plane on first use and looked up server-side to verify subsequent batches.
There are three ways to give an agent its identity:
| Source | When to use | Storage |
|---|---|---|
| Local file (default) | Single-process agents, dev | ~/.checkrd/identity.key, mode 0600 |
| `from_bytes()` | Containers, deterministic identity | Mounted secret, env var (base64) |
| External signer (KMS) | Production fleets, key rotation policies | KMS — key never leaves HSM |
The local-file path is the path of least resistance. For production fleets, prefer an external signer: the SDK calls into your KMS or HSM for each signature and the private key never appears in process memory.
```python
from checkrd import Checkrd, LocalIdentity

# Default: load from ~/.checkrd/identity.key, generate if absent
client = Checkrd(api_key="ck_live_...")

# Deterministic: identity bytes from a mounted secret.
# Open in binary mode so from_bytes() receives bytes, not str.
with open("/var/run/secrets/identity", "rb") as f:
    identity = LocalIdentity.from_bytes(f.read())
client = Checkrd(api_key="ck_live_...", identity=identity)
```

Once the engine is bound, the wrapper's copy of the private key is zeroized, so only the WASM core holds it. That bounds the blast radius if the host process is later compromised.
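When the identity arrives through an environment variable, it is base64 text, not raw bytes. The sketch below decodes and length-checks such a value before it would be handed to `LocalIdentity.from_bytes()`. The helper name, the `CHECKRD_IDENTITY_B64` variable, and the 32-byte length (the Ed25519 seed size from RFC 8032) are assumptions for illustration; the SDK may expect a different encoding.

```python
import base64
import binascii

# Assumption: from_bytes() takes the 32-byte Ed25519 seed defined in RFC 8032.
ED25519_SEED_LEN = 32


def decode_identity_seed(encoded: str) -> bytes:
    """Decode a base64 identity secret and check its length before use."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(encoded.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError("identity secret is not valid base64") from exc
    if len(raw) != ED25519_SEED_LEN:
        raise ValueError(f"expected {ED25519_SEED_LEN} bytes, got {len(raw)}")
    return raw


# Hypothetical usage (the env var name is illustrative, not an SDK convention):
#   seed = decode_identity_seed(os.environ["CHECKRD_IDENTITY_B64"])
#   identity = LocalIdentity.from_bytes(seed)
```

Failing fast here turns a truncated or mis-pasted secret into a startup error, so it is less likely to show up later as a confusing signature failure.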
How telemetry is signed
Every batch the SDK sends to the control plane carries three headers:
| Header | RFC | Purpose |
|---|---|---|
| `Signature-Input` | 9421 | Which fields are signed and the signing parameters |
| `Signature` | 9421 | The Ed25519 signature itself |
| `Content-Digest` | 9530 | SHA-256 of the body, bound into the signature |
The signature covers the HTTP method, the target URI, the
`Content-Digest`, the signing agent's ID, the algorithm
(`ed25519`), and the `created` / `expires` parameters. The `expires`
window is 5 minutes; to bound replay, the control plane rejects any
batch whose `expires` time is already behind its wall clock.
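To make the covered fields concrete, here is a minimal sketch of an RFC 9421 signature base for such a request. It is not the SDK's code: the choice of covered components, carrying the agent ID in `keyid`, the example URL, and the function names are all assumptions. Only the line format (`"name": value`, with `@signature-params` last and no trailing newline) comes from the RFC.

```python
# Assumed set of covered components; the real list lives in Signature-Input.
COVERED = ("@method", "@target-uri", "content-digest")
EXPIRES_WINDOW_S = 5 * 60  # the 5-minute window described above


def signature_params(created: int, keyid: str, alg: str = "ed25519") -> str:
    """Serialize the covered-component list plus signing parameters."""
    inner = " ".join(f'"{name}"' for name in COVERED)
    expires = created + EXPIRES_WINDOW_S
    return f'({inner});created={created};expires={expires};keyid="{keyid}";alg="{alg}"'


def signature_base(method: str, target_uri: str, content_digest: str, params: str) -> bytes:
    """Build the byte string that the Ed25519 key would sign."""
    lines = [
        f'"@method": {method.upper()}',
        f'"@target-uri": {target_uri}',
        f'"content-digest": {content_digest}',
        f'"@signature-params": {params}',  # always the last line
    ]
    return "\n".join(lines).encode("ascii")


def is_expired(expires: int, now: int) -> bool:
    """Verifier-side replay bound: reject once the wall clock passes expires."""
    return now > expires
```

Because `created` and `expires` are inside the signed bytes, an attacker cannot extend the window without invalidating the signature.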
The signing happens inside the WASM core via the
sign_telemetry_batch FFI export. Wrappers do the I/O; the bytes
that get signed are exactly the bytes that go on the wire.
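The "exact bytes" property matters because re-serializing JSON after computing the digest can change whitespace or key order and break verification. A small sketch of the idea, assuming a JSON body and hypothetical helper names (the real signing path is the WASM export named above): serialize once, digest those bytes, and send those same bytes.

```python
import base64
import hashlib
import json


def content_digest(body: bytes) -> str:
    """RFC 9530 Content-Digest value for SHA-256: sha-256=:<base64>:"""
    digest = hashlib.sha256(body).digest()
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


def prepare_batch(events: list) -> tuple:
    """Serialize once and return (body, headers); never re-encode the body."""
    body = json.dumps(events, separators=(",", ":"), sort_keys=True).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Digest": content_digest(body),
    }
    return body, headers
```

Any transport layer that receives `body` as bytes, not as a Python object, preserves the digest.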
TELEMETRY_SIGNATURE_MODE controls server enforcement:
- `off`: server accepts unsigned batches
- `warn` (default): server accepts but logs unsigned batches
- `required`: server rejects unsigned batches with HTTP 401
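The three modes amount to a small decision table for unsigned batches. The sketch below models that table; the `Decision` type and `decide_unsigned` function are hypothetical names for illustration and do not describe the control plane's actual code.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    accept: bool
    log_warning: bool
    reject_status: Optional[int] = None  # HTTP status when rejected


# What happens to an unsigned batch under each mode
_UNSIGNED = {
    "off": Decision(accept=True, log_warning=False),
    "warn": Decision(accept=True, log_warning=True),
    "required": Decision(accept=False, log_warning=False, reject_status=401),
}


def decide_unsigned(mode: Optional[str] = None) -> Decision:
    """Resolve the mode (defaulting to warn) and look up the outcome."""
    resolved = (mode or os.environ.get("TELEMETRY_SIGNATURE_MODE", "warn")).lower()
    try:
        return _UNSIGNED[resolved]
    except KeyError:
        raise ValueError(f"unknown TELEMETRY_SIGNATURE_MODE: {resolved!r}") from None
```

Rejecting unknown values outright avoids silently treating a typo such as `requierd` as a permissive mode.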
Anonymous identities (KMS without a registered pubkey) fall back to unsigned with a one-shot warning.
How policy bundles are verified
When the dashboard publishes a new policy, the control plane signs
the bundle with the policy-signing key (held in AWS Secrets Manager
as checkrd/prod/policy-signing-key) and pushes it to every
connected agent over a long-lived SSE stream.
The agent doesn't trust the wire delivery. Every bundle goes through
the WASM core's reload_policy_signed entry point, which verifies
in-WASM against a trust list pinned in the SDK at compile time:
- `wrappers/python/src/checkrd/_trust.py`
- `wrappers/javascript/src/_trust.ts`
The trust list contains the public keys the SDK considers
authoritative for policy bundles. A bundle signed by any other key
is rejected and the previous policy stays in place. Pre-publish CI
guards (policy trust-status for Python, verify-trust-roots.mjs
for JavaScript) block tag-driven publishes if the embedded trust
list is empty against api.checkrd.io.
The verification path:
- DSSE envelope parse: the bundle arrives as a DSSE envelope, which wraps the policy YAML in a typed, payload-bound structure per the DSSE spec.
- Signature verification: Ed25519 against the trust list, in constant time.
- Monotonic version check: `last_policy_version` must strictly increase. Rollback to an older signed bundle is rejected even if the signature is valid.
- Freshness check: the bundle's `created` timestamp must be within 24 hours by default.
- Cross-type binding: the DSSE `payloadType` field is bound into the signature, so a telemetry batch signature can't be replayed as a policy.
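The cross-type binding comes from DSSE's pre-authentication encoding (PAE): the signature is computed over the payload type and its length together with the payload, not over the payload alone. A minimal sketch of PAE as defined in the DSSE spec follows. The two Checkrd payload type strings in the usage note are invented for illustration.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding.

    PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    where LEN is the ASCII decimal byte length.
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes),
        type_bytes,
        len(payload),
        payload,
    )
```

For the same payload bytes, `pae("application/vnd.example.policy+yaml", body)` and `pae("application/vnd.example.telemetry+json", body)` differ, so a signature over one never verifies against the other.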
On any failure the previous policy stays installed and a structured
warning fires (PolicySignatureError.code carries the stable reason
label for dashboard grouping).
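The fail-closed behaviour can be sketched as a state holder that changes only after every check has passed. This is an illustration under stated assumptions and not the WASM core's logic: signature verification is taken as already done, `PolicyRejected` is a stand-in for the SDK's error type, and the reason labels are invented.

```python
import time
from dataclasses import dataclass
from typing import Optional

MAX_AGE_S = 24 * 60 * 60  # default freshness window of 24 hours


class PolicyRejected(Exception):
    """Stand-in error carrying a stable reason code (labels are hypothetical)."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class PolicyState:
    version: int = 0
    policy_yaml: str = ""

    def try_install(self, version: int, created: float, policy_yaml: str,
                    now: Optional[float] = None) -> None:
        """Install a verified bundle, or raise and leave the state untouched."""
        now = time.time() if now is None else now
        if version <= self.version:
            raise PolicyRejected("version_not_monotonic")
        if now - created > MAX_AGE_S:
            raise PolicyRejected("bundle_stale")
        # Mutate only after all checks pass, so failures keep the old policy.
        self.version, self.policy_yaml = version, policy_yaml
```

Because both checks run before the state is changed, a rejected bundle cannot leave the agent with a half-applied policy.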
Key rotation
The policy-signing key rotates on an overlap-window pattern:
1. Generate the new key.
2. Add the new public key to the SDK trust lists.
3. Cut a new SDK release that ships both old and new pubkeys.
4. Wait for the rollout window (typically 30 days).
5. Switch the control plane to sign with the new private key.
6. After all agents have updated, drop the old pubkey in a follow-up release.
Until step 6 ships, both old and new keys are accepted, so a botched
rotation never strands customers. The full operator runbook lives in
KEY-CUSTODY.md (internal).
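The reason for the ordering can be checked with a toy model: each SDK generation ships a trust list, and a bundle is accepted only if its signing key is on that list. The key labels and generation names below are placeholders, not real identifiers.

```python
OLD, NEW = "old-pubkey", "new-pubkey"

# Trust list embedded in each SDK generation (placeholder labels)
SDK_TRUST = {
    "pre-rotation": {OLD},
    "overlap": {OLD, NEW},     # the release that ships both pubkeys
    "post-rotation": {NEW},    # the follow-up release that drops the old key
}


def accepts(sdk: str, signing_key: str) -> bool:
    """True if a bundle signed by signing_key passes this SDK's trust list."""
    return signing_key in SDK_TRUST[sdk]


def stranded(signing_key: str, deployed: list) -> list:
    """SDK generations in the fleet that would reject the current signer."""
    return [sdk for sdk in deployed if not accepts(sdk, signing_key)]
```

In this model, `stranded(NEW, ["pre-rotation", "overlap"])` returns `["pre-rotation"]`, which is why the signer switches only after the rollout window. Once the fleet is on the overlap release, both `stranded(OLD, ["overlap"])` and `stranded(NEW, ["overlap"])` are empty.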
Agent identities (the per-agent keypairs) are independent and rotate on their own schedule — a new keypair is generated whenever the SDK runs without an existing key file, and registered with the control plane on first telemetry post.
What's verifiable independently
Both SDKs ship with reproducible test corpora so you can audit the crypto surface without trusting Checkrd's CI:
- RFC 8032 §7.1 Ed25519 reference vectors, plus the full Project Wycheproof v1 Ed25519 set (150 vectors, 0 failures).
- RFC 9421 §B.2.6 worked example, byte-for-byte.
- DSSE spec compliance against the `secure-systems-lab/dsse` fixture set.
- Mutation tests (`cargo-mutants`) on the verification primitives achieve a 100% kill rate. Any silent weakening of a signature check fails the test suite.
The WASM core itself ships with an integrity SHA-256 checked at
import time, and PEP 740 attestations / npm provenance bind the
binary to the GitHub workflow that built it. See
WASM-CORE.md
in the SDK repo for the verification recipe.
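For readers who want to see what an import-time integrity check involves, here is a generic sketch: hash the binary and compare it with a pinned digest before loading it. The function name and the file path in the usage note are assumptions. The SDK's actual check and the attestation verification are described in WASM-CORE.md and are not reproduced here.

```python
import hashlib
import hmac
from pathlib import Path


def verify_wasm_integrity(path: str, expected_sha256_hex: str) -> bytes:
    """Return the module bytes only if their SHA-256 matches the pinned value."""
    data = Path(path).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    if not hmac.compare_digest(actual, expected_sha256_hex.lower()):
        raise ImportError(f"WASM core integrity check failed for {path}")
    return data
```

A caller would pass the path of the bundled binary (for example a hypothetical `core.wasm`) and the digest pinned at build time, and would instantiate the module only from the returned bytes.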
See also
- Error handling — how policy denies surface in your code
- API authentication — JWT and API key flows for the REST API
- SECURITY.md (Python) — disclosure process and supply-chain controls
- THREAT-MODEL.md (Python) — assets, actors, and what is in scope