Checkrd is not an observability tool, a gateway, or a policy library. It’s the one layer you’d otherwise stitch from all three.
Three honest comparisons — what overlaps, what does not, and when each alternative is the right call.
vs. LLM observability tools
Langfuse · Helicone · LangSmith · Phoenix · Braintrust
LLM observability answers "what did it do?" Checkrd answers "what can it do?"
| Capability | Observability tools | Checkrd |
|---|---|---|
| Trace every call | Yes | Yes |
| Store prompts and completions for replay | Yes | No, by design |
| Evals, datasets, prompt management | Yes | No |
| Block requests before they go out | No | Yes |
| Kill switch, sub-second | No | Yes |
| Policy as code (8 body + 6 header operators) | No | Yes |
| Rate limit by body field (per model, per user) | Partial | Yes |
| Cryptographically signed audit trail | No | Yes |
| Org-level unremovable guardrails | No | Yes |
| In-process (no hosted proxy in the critical path) | No | Yes |
| Air-gap deployment | No | Yes |
When an observability tool is the right call
If your primary job is iterating on prompts, comparing models, or building eval suites, a tool like Langfuse or Braintrust is purpose-built for it. Checkrd is not an evaluator. Many teams run both: Checkrd enforces policy and audits decisions; the observability tool analyzes trace content where retention is acceptable.
vs. DIY policy code
A middleware you would write on a Friday afternoon
You could build this. Here is what you would reinvent — and where most teams cut corners.
| Capability | DIY | Checkrd |
|---|---|---|
| A single allowlist rule | Yes | Yes |
| URL pattern matching with ** globs | Hours | Yes |
| Rate limit by body field (e.g. per model) | Days | Yes |
| Kill switch with SSE + file-watcher fallback | Weeks | Yes |
| Ed25519 + RFC 9421 + DSSE signed telemetry | Months | Yes |
| 150 Wycheproof vectors verified in CI | Rarely done | Yes |
| Client-side PII parameterization | Often skipped | Yes |
| Compile-time PII allowlist enforcement | Difficult | Yes |
| Policy version history, diff, and test runner | Months | Yes |
| Seven vendor auto-instrumentors | Each built separately | Yes |
| Multi-tenant control plane with RBAC and audit log | Months | Yes |
| IAM SCP–style org-level policy merge | Unusual to build | Yes |
When DIY is the right call
For a single, non-regulated team with a static scope, DIY is fine. If the surface grows into a rate limiter, an audit trail, a multi-team policy repo, and cryptographic signing, you have built half of Checkrd, usually without the crypto rigor. The real question: is this infrastructure strategic enough to own as a product, or cheaper as a dependency?
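To make the "hours" and "days" rows of the table concrete, here is a minimal sketch of two pieces a DIY middleware tends to need: URL matching with `**` globs and a rate limit keyed by a body field. Everything in it is an assumption for illustration: the names `glob_to_regex` and `BodyFieldRateLimiter`, the glob semantics (`*` stays within one path segment, `**` crosses segments), the sliding-window algorithm, and the choice to fail closed on unparseable bodies. It is not Checkrd's implementation.

```python
import json
import re
import time
from collections import defaultdict, deque


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a URL glob into a regex.

    Assumed semantics: `*` matches within one path segment,
    `**` matches across segments. Use `.fullmatch()` on the result.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


class BodyFieldRateLimiter:
    """Sliding-window limiter keyed by one JSON body field (for example `model`)."""

    def __init__(self, field, limit, window_seconds, clock=time.monotonic):
        self.field = field
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests do not need to sleep
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, body: bytes) -> bool:
        try:
            payload = json.loads(body)
        except ValueError:
            return False  # design choice: fail closed on unparseable bodies
        if isinstance(payload, dict):
            key = str(payload.get(self.field, "<missing>"))
        else:
            key = "<missing>"
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Even this sketch leaves out what usually takes the remaining time: sharing counters across processes, bounding memory for high-cardinality keys, and deciding what happens when the body is streamed rather than buffered.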
vs. AI gateways
Cloudflare AI Gateway · Portkey Gateway · Kong AI plugins
API gateways proxy requests. Checkrd enforces governance.
| Capability | AI gateways | Checkrd |
|---|---|---|
| Route to multiple LLM providers | Yes | No |
| Cache responses | Yes | No |
| Rate limit by caller | Yes | Yes |
| Rate limit by body field (model, user_id) | Rarely | Yes |
| Policy matching on body content (8 operators) | No | Yes |
| Gateway sees your prompts in transit | Yes | Never |
| In-process proxy (no external network hop) | No (external) | Yes |
| Open-source core | Varies | Yes |
| Kill switch for agents (sub-second) | No | Yes |
| Cryptographically signed audit trail | No | Yes |
| Works fully air-gapped | No | Yes |
When an AI gateway is the right call
If you want provider routing, response caching, or caller-level rate limiting at the edge, Cloudflare AI Gateway or Kong are purpose-built. Checkrd is a different shape — an in-process governance layer, not a routing fabric. Setups that use both are common: the gateway handles routing and caching; Checkrd handles policy, audit, and the kill switch.
When Checkrd isn’t the right fit.
A shorter list than most vendor sites publish. Worth reading before you install.
Single-agent prototypes
If you are experimenting with one agent and have not shipped to production, a simple allowlist in code is enough. Come back when you have two agents, two teams, or one compliance question.
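As a rough idea of what "a simple allowlist in code" can look like, here is a minimal sketch using only the Python standard library. The hosts, paths, and the names `ALLOWED`, `is_allowed`, `check_request`, and `BlockedRequest` are hypothetical and chosen for illustration; prefix matching on the path is one possible rule, not a recommendation.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: (HTTP method, host, path prefix) triples.
ALLOWED = {
    ("POST", "llm.example.com", "/v1/chat/completions"),
    ("GET", "orders.example.com", "/v1/orders"),
}


class BlockedRequest(Exception):
    """Raised before a disallowed request leaves the process."""


def is_allowed(method: str, url: str) -> bool:
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False  # assumption: plain HTTP is never allowed
    for allowed_method, host, prefix in ALLOWED:
        # Match the exact path or anything below it, but not a sibling
        # such as /v1/orders-export.
        path_ok = parts.path == prefix or parts.path.startswith(prefix + "/")
        if method.upper() == allowed_method and parts.hostname == host and path_ok:
            return True
    return False


def check_request(method: str, url: str) -> None:
    """Call this right before sending; raises instead of sending."""
    if not is_allowed(method, url):
        raise BlockedRequest(f"{method} {url} is not on the allowlist")
```

With httpx, a function like `check_request` could be called from a request event hook, so that the exception stops the call before it is sent.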
Pure prompt-iteration workflows
If your daily work is writing, diffing, and evaluating prompts, an observability platform is the right tool. Checkrd does not replace eval tooling.
Zero compliance surface
If you have no CISO, no auditor, no regulated customers, and no plans for any of the three, you might be paying for governance you do not need.
Streaming WebSocket enforcement
v1 enforces on HTTP/1.1 and HTTP/2 via Python httpx. WebSocket-native agents and pure gRPC streams are on the roadmap but not shipped.
Still the right fit?
Install in an afternoon. Talk to us if your deployment needs VPC, air-gap, or a vendor-review packet.