# LangChain / LangGraph (Python)

Add policy enforcement, kill switch, and observability to LangChain and LangGraph chains.
Checkrd ships a `BaseCallbackHandler` subclass that hooks into every LLM call, tool call, retriever call, and chain invocation in LangChain and LangGraph. Denials surface as `CheckrdPolicyDenied` exceptions on the calling stack; LangChain propagates them naturally, so your existing error handling just works.
## Install
```bash
pip install 'checkrd[langchain]'
```

This pulls in `langchain-core>=0.3,<1`. Compatible with LangChain itself (`langchain`), LangGraph, and any third-party Runnable.
## Quickstart
```python
from checkrd import Checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler
from langchain_openai import ChatOpenAI

with Checkrd(policy="policy.yaml") as client:
    handler = CheckrdCallbackHandler.from_checkrd(client)
    llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
    print(llm.invoke("Tell me a joke"))
```

`CheckrdCallbackHandler.from_checkrd()` pulls the engine, `agent_id`, sink, and enforce mode from the client's runtime, so the handler matches every other Checkrd instrumentor in the same process.
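When a rule denies a call, the exception surfaces right at the call site. A minimal sketch (the import path for `CheckrdPolicyDenied` is an assumption about the module layout):

```python
from checkrd import CheckrdPolicyDenied  # exception import path assumed

try:
    llm.invoke("Tell me a joke")
except CheckrdPolicyDenied as denied:
    # The deny decision propagates up the calling stack unchanged,
    # so ordinary error handling around .invoke() catches it.
    print(f"blocked by policy: {denied}")
```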
## Per-call attach
If you don't want to register the handler on the LLM itself, attach it per call via `RunnableConfig`:
```python
chain.invoke(
    {"question": "x"},
    config={"callbacks": [handler]},
)
```

This pattern is preferred when one process serves multiple agents: each invocation gets its own handler bound to the right `agent_id`, as in the sketch below.
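A sketch of the multi-agent case, reusing the explicit constructor shown later on this page (shared `engine` and `sink` objects are assumed to already exist):

```python
# One handler per agent, all sharing the same engine and sink.
research_handler = CheckrdCallbackHandler(
    engine=engine, agent_id="research-agent", sink=sink, enforce=True
)
billing_handler = CheckrdCallbackHandler(
    engine=engine, agent_id="billing-agent", sink=sink, enforce=True
)

# Each invocation carries the handler for the agent it runs on behalf of.
chain.invoke({"question": "x"}, config={"callbacks": [research_handler]})
chain.invoke({"question": "y"}, config={"callbacks": [billing_handler]})
```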
## Async chains
The handler subclasses `BaseCallbackHandler`. LangChain's dispatcher automatically runs sync callbacks via `asyncio.to_thread` for `.ainvoke()` paths, so the same handler instance works for both sync and async chains. The WASM engine's `evaluate()` is a sub-millisecond synchronous call; the thread-pool overhead is negligible.
```python
result = await chain.ainvoke(input, config={"callbacks": [handler]})
```

## What gets enforced
Every event LangChain emits is policy-evaluated through a synthetic URL the policy engine matches against:
| LangChain event | Synthetic URL | Body |
|---|---|---|
| `on_llm_start` | `https://langchain.local/llm/{model}` | `{"prompts": [...]}` |
| `on_chat_model_start` | `https://langchain.local/chat_model/{model}` | `{"messages": [[...]]}` |
| `on_tool_start` | `https://langchain.local/tool/{tool_name}` | `{"input_str": ..., "inputs": ...}` |
| `on_retriever_start` | `https://langchain.local/retriever/{name}` | `{"query": ...}` |
| `on_chain_start` | `https://langchain.local/chain/{name}` | `{"inputs": ...}` |
Write rules against these URLs the same way you write any other Checkrd rule:
```yaml
agent: research-agent
default: allow
rules:
  - name: deny-shell-tools
    deny:
      url: "langchain.local/tool/shell*"
  - name: limit-llm-calls
    rate_limit:
      url: "langchain.local/chat_model/*"
      limit: 100
      window_secs: 3600
```

## Observation mode
Set `enforce=False` (or pass `enforce_override=False` to `Checkrd(...)`) to log denials without aborting:
```python
handler = CheckrdCallbackHandler(
    engine=engine,
    agent_id="research",
    sink=sink,
    enforce=False,  # observation mode: log only
)
```

Useful for rolling out a new policy in shadow mode before flipping enforcement on.
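One way to stage that rollout is to drive the flag from the environment, so the same deployment can flip from shadow to enforcing without a code change. A sketch; the `CHECKRD_ENFORCE` variable name is made up for illustration:

```python
import os

# Enforce only when explicitly enabled; otherwise observe and log.
enforce = os.environ.get("CHECKRD_ENFORCE", "0") == "1"

handler = CheckrdCallbackHandler(
    engine=engine,
    agent_id="research",
    sink=sink,
    enforce=enforce,
)
```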
## Constructing without Checkrd
If your application has its own engine lifecycle, construct directly:
```python
import checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler

with checkrd.init(policy="policy.yaml"):
    handler = CheckrdCallbackHandler.from_global()
    # ... use with LangChain ...
```

Or fully explicit:
```python
handler = CheckrdCallbackHandler(
    engine=my_engine,
    agent_id="my-agent",
    sink=my_sink,
    enforce=True,
    dashboard_url="https://app.checkrd.io",
)
```

## Telemetry
When a `TelemetrySink` is configured, every event emits a structured record after completion (or on error). Fields include `event_type` (e.g. `langchain_llm`, `langchain_tool`), `agent_id`, `request_id` (matches LangChain's `run_id`), `latency_ms`, `target` (model or tool name), `outcome` (`ok` / `error`), and, for LLM events with usage data, `input_tokens`, `output_tokens`, and `finish_reason`.

`run_id` is preserved end-to-end: a single LangGraph workflow shows up in the dashboard as a single trace.
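The sink interface isn't specified on this page; as a rough sketch, assume a sink is any object exposing an `emit(record)` method that receives the fields above as a dict (both the method name and the record shape are assumptions):

```python
import json

class StdoutSink:
    """Hypothetical sink: writes each telemetry record as one JSON line."""

    def emit(self, record: dict) -> None:
        # record would carry event_type, agent_id, request_id, latency_ms,
        # target, outcome, and token/finish fields when available.
        print(json.dumps(record))

handler = CheckrdCallbackHandler(
    engine=my_engine,
    agent_id="my-agent",
    sink=StdoutSink(),
    enforce=True,
)
```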
## Caveats
- `raise_error=True` is required. The handler sets this on itself; do not override it. Without it, LangChain swallows handler exceptions and the deny decision is lost.
- Token counts depend on the LLM provider. Anthropic and OpenAI populate them reliably; some local models do not.
- Streaming: `on_llm_new_token` is not currently policy-evaluated (per-token gating would 100x the eval call rate). The first and last token boundaries are gated via `on_llm_start` / `on_llm_end` (see the sketch after this list).
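In practice that means a streamed response is still gated at its boundaries. A quick sketch, reusing the quickstart handler:

```python
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# The start event (on_chat_model_start for a chat model) is policy-evaluated
# before the first token is requested; individual tokens then stream through
# without per-token evaluation, and the end event fires when the stream
# completes.
for chunk in llm.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)
```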