checkrd

LangChain (Python)

Add policy enforcement, a kill switch, and observability to LangChain and LangGraph chains.

Checkrd ships a BaseCallbackHandler that hooks into every LLM call, tool call, retriever call, and chain invocation in LangChain and LangGraph. Denials surface as CheckrdPolicyDenied exceptions on the calling stack; LangChain propagates them naturally, so your existing error handling just works.

Install

bash
pip install 'checkrd[langchain]'

This pulls in langchain-core>=0.3,<1, and is compatible with LangChain itself (langchain), LangGraph, and any third-party Runnable.

Quickstart

python
from checkrd import Checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler
from langchain_openai import ChatOpenAI

with Checkrd(policy="policy.yaml") as client:
    handler = CheckrdCallbackHandler.from_checkrd(client)

    llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
    print(llm.invoke("Tell me a joke"))

CheckrdCallbackHandler.from_checkrd() pulls the engine, agent_id, sink, and enforce mode from the client's runtime, so the handler matches every other Checkrd instrumentor in the same process.
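
When a rule denies a call, the exception reaches your code through the normal call stack, so a plain try/except is all the handling you need. A minimal sketch (assuming CheckrdPolicyDenied is importable from the top-level checkrd package; adjust the import to wherever your version exposes it):

python
from checkrd import Checkrd, CheckrdPolicyDenied  # exception import path assumed
from checkrd.integrations.langchain import CheckrdCallbackHandler
from langchain_openai import ChatOpenAI

with Checkrd(policy="policy.yaml") as client:
    handler = CheckrdCallbackHandler.from_checkrd(client)
    llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

    try:
        print(llm.invoke("Tell me a joke"))
    except CheckrdPolicyDenied as exc:
        # A policy rule matched this call; LangChain propagated the
        # denial up the call stack unchanged.
        print(f"blocked by policy: {exc}")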

Per-call attach

If you don't want to register the handler on the LLM itself, attach it per-call via RunnableConfig:

python
chain.invoke(
    {"question": "x"},
    config={"callbacks": [handler]},
)

This pattern is preferred when one process serves multiple agents — each invocation gets its own handler bound to the right agent_id.
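
For example, a process hosting several agents might keep one handler per agent and select it at invocation time. A sketch using the explicit constructor shown under "Constructing without Checkrd" below (the chain, engine, sink, and agent names are placeholders):

python
# One handler per agent, each bound to its own agent_id.
handlers = {
    agent_id: CheckrdCallbackHandler(
        engine=my_engine,
        agent_id=agent_id,
        sink=my_sink,
        enforce=True,
    )
    for agent_id in ("research", "support")
}

def run(agent_id: str, question: str):
    # Attach the right agent's handler to this invocation only.
    return chain.invoke(
        {"question": question},
        config={"callbacks": [handlers[agent_id]]},
    )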

Async chains

The handler subclasses BaseCallbackHandler. LangChain's dispatcher automatically runs sync callbacks via asyncio.to_thread for .ainvoke() paths, so the same handler instance works for both sync and async chains. The WASM engine's evaluate() is a sub-millisecond synchronous call; the thread-pool overhead is negligible.

python
result = await chain.ainvoke(input, config={"callbacks": [handler]})
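
A fuller sketch of the async path, reusing the handler from the quickstart (model name and prompt are placeholders):

python
import asyncio

from langchain_openai import ChatOpenAI

async def main():
    llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
    # Same handler instance as the sync path; LangChain runs its
    # sync callbacks on a worker thread during ainvoke().
    print(await llm.ainvoke("Tell me a joke"))

asyncio.run(main())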

What gets enforced

Every event LangChain emits is policy-evaluated through a synthetic URL the policy engine matches against:

| LangChain event | Synthetic URL | Body |
| --- | --- | --- |
| on_llm_start | https://langchain.local/llm/{model} | {"prompts": [...]} |
| on_chat_model_start | https://langchain.local/chat_model/{model} | {"messages": [[...]]} |
| on_tool_start | https://langchain.local/tool/{tool_name} | {"input_str": ..., "inputs": ...} |
| on_retriever_start | https://langchain.local/retriever/{name} | {"query": ...} |
| on_chain_start | https://langchain.local/chain/{name} | {"inputs": ...} |

Write rules against these URLs the same way you write any other Checkrd rule:

yaml
agent: research-agent
default: allow

rules:
  - name: deny-shell-tools
    deny:
      url: "langchain.local/tool/shell*"

  - name: limit-llm-calls
    rate_limit:
      url: "langchain.local/chat_model/*"
      limit: 100
      window_secs: 3600

Observation mode

Set enforce=False (or pass enforce_override=False to Checkrd(...)) to log denials without aborting:

python
handler = CheckrdCallbackHandler(
    engine=engine,
    agent_id="research",
    sink=sink,
    enforce=False,    # observation mode — log only
)

Useful for rolling out a new policy in shadow mode before flipping enforcement on.

Constructing without Checkrd

If your application doesn't hold a Checkrd client, bind the handler to the globally initialized runtime:

python
import checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler

with checkrd.init(policy="policy.yaml"):
    handler = CheckrdCallbackHandler.from_global()
    # ... use with LangChain ...

Or fully explicit:

python
handler = CheckrdCallbackHandler(
    engine=my_engine,
    agent_id="my-agent",
    sink=my_sink,
    enforce=True,
    dashboard_url="https://app.checkrd.io",
)

Telemetry

When a TelemetrySink is configured, every event emits a structured record after completion (or on error). Fields include event_type (e.g. langchain_llm, langchain_tool), agent_id, request_id (matches LangChain's run_id), latency_ms, target (model or tool name), outcome (ok / error), and — for LLM events with usage data — input_tokens, output_tokens, and finish_reason.

run_id is preserved end-to-end: a single LangGraph workflow shows up in the dashboard as a single trace.
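
An illustrative record for a single LLM call, shown as a Python dict. Field names are those documented above; the values and the exact serialization are hypothetical and depend on the sink:

python
# Hypothetical telemetry record; shape depends on the TelemetrySink.
{
    "event_type": "langchain_llm",
    "agent_id": "research",
    "request_id": "c0ffee00-0000-4000-8000-000000000000",  # LangChain run_id
    "latency_ms": 842,
    "target": "gpt-4o",
    "outcome": "ok",
    # Present only when the provider reports usage data:
    "input_tokens": 412,
    "output_tokens": 96,
    "finish_reason": "stop",
}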

Caveats

  • raise_error=True is required. The handler sets this on itself; do not override. Without it, LangChain swallows handler exceptions and the deny decision is lost.
  • Token counts depend on the LLM provider. Anthropic and OpenAI populate them reliably; some local models do not.
  • Streaming: on_llm_new_token is not currently policy-evaluated (per-token gating would 100x the eval call rate). The first / last token boundaries are gated via on_llm_start / on_llm_end.