Overview
A TraceCtrl guardrail is a declarative check that runs an LLM judge over an agent’s input or output. Each evaluation emits a tracectrl.guardrail.evaluation OTEL span; failed evaluations land in the Alerts page (/alerts), and the agent’s registered guardrails appear in the Guardrails registry (/guardrails).
The in-SDK guardrails work today with Strands agents. The core Guardrail class is framework-agnostic; the Strands binding lives in tracectrl.guardrails.strands_hook.
The Guardrail class
| Field | Type | Notes |
|---|---|---|
| name | str | Stable identifier surfaced in the registry and Alerts UI. |
| description | str | Free-form description shown on the guardrail detail page. |
| judge_prompt | str | Template containing {output} (for post_output) or {input} (for pre_input). |
| judge_llm | Any | A Strands BedrockModel-compatible object. The SDK calls Bedrock converse directly via boto3 using the model’s model_id and region_name. |
| on_violation | "log" | Only "log" is supported in v1. Passing "block" raises NotImplementedError. |
| timing | "post_output" \| "pre_input" | When to run the judge. Defaults to "post_output". |
| severity | "low" \| "medium" \| "high" \| "critical" | Defaults to "medium". |
Placeholder convention
The judge prompt is formatted with simple string replacement (not str.format), so literal braces elsewhere in the template are left alone. Use {output} for post-output guardrails and {input} for pre-input guardrails. Both placeholders are scanned, and whichever is present in the template is replaced with the evaluated text.
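The difference from str.format matters as soon as a template contains literal braces, for example a JSON snippet. The following is a minimal sketch of the replacement idea, not the SDK's own code; `render_judge_prompt` is a hypothetical helper name, and the single-pass regex is one way to avoid re-scanning the substituted text.

```python
import re

# Matches the two placeholders named in the convention above.
_PLACEHOLDER = re.compile(r"\{(?:output|input)\}")


def render_judge_prompt(template: str, text: str) -> str:
    """Hypothetical helper: literally replace {output}/{input} with the evaluated text."""
    # A function replacement keeps backslashes in `text` from being interpreted,
    # and a single pass means placeholders inside `text` are not expanded again.
    return _PLACEHOLDER.sub(lambda _match: text, template)


template = 'Reply with JSON like {"decision": "pass"}.\nAgent output: {output}'
rendered = render_judge_prompt(template, "The refund was issued.")

# str.format would parse the JSON braces as a replacement field and raise.
try:
    template.format(output="The refund was issued.")
    format_failed = False
except (KeyError, ValueError, IndexError):
    format_failed = True
```

Here `rendered` keeps the JSON snippet intact, while `format_failed` ends up `True` because str.format cannot resolve the `"decision"` field.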
Wiring guardrails to an agent
wrap_agent_with_guardrails(agent, guardrails) swaps the agent’s class for a generated subclass whose __call__ runs pre-input guardrails before, and post-output guardrails after, the wrapped invocation. The agent instance is returned for chaining, and re-wrapping is idempotent.
register_guardrails(agent, guardrails) only emits the registration spans — it does not wrap the agent. Use this when wrapping happens elsewhere or you want guardrails to appear in the registry without runtime evaluation.
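The class-swapping approach used by wrap_agent_with_guardrails can be illustrated with plain Python. This is a sketch under stated assumptions rather than the SDK's implementation: `Agent` is a stand-in class (not the Strands one), `wrap_with_checks` and the `_guarded` marker are hypothetical names, and refreshing the check lists on re-wrap is a choice made here for the sake of the example.

```python
class Agent:
    """Stand-in for a callable agent; not the Strands Agent class."""

    def __call__(self, prompt: str) -> str:
        return f"echo: {prompt}"


def wrap_with_checks(agent, pre_checks, post_checks):
    """Hypothetical helper: swap agent.__class__ for a generated subclass."""
    cls = type(agent)
    if getattr(cls, "_guarded", False):
        # Already wrapped: refresh the checks instead of stacking subclasses.
        agent._pre_checks, agent._post_checks = list(pre_checks), list(post_checks)
        return agent

    class Guarded(cls):
        _guarded = True

        def __call__(self, prompt, *args, **kwargs):
            for check in self._pre_checks:   # pre-input checks see the prompt
                check(prompt)
            result = super().__call__(prompt, *args, **kwargs)
            for check in self._post_checks:  # post-output checks see the result
                check(result)
            return result

    Guarded.__name__ = f"Guarded{cls.__name__}"
    agent._pre_checks, agent._post_checks = list(pre_checks), list(post_checks)
    agent.__class__ = Guarded
    return agent


seen = []
agent = wrap_with_checks(
    Agent(),
    pre_checks=[lambda text: seen.append(("pre", text))],
    post_checks=[lambda text: seen.append(("post", text))],
)
reply = agent("hi")
```

Swapping the class rather than assigning `agent.__call__` is needed because Python looks up `__call__` on the type, not on the instance, when the object is invoked.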
End-to-end example
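A minimal sketch of the full flow, assuming several details this page does not state: the import path `tracectrl.guardrails` for Guardrail, that wrap_agent_with_guardrails is importable from tracectrl.guardrails.strands_hook, that Guardrail accepts its fields as keyword arguments, and that the tracing exporter is configured elsewhere. The guardrail name, prompt text, model ID, and region are placeholders.

```python
# Field values for one post-output guardrail (illustrative content only).
GUARDRAIL_FIELDS = {
    "name": "no-pii-in-output",
    "description": "Flags responses that reveal personal data.",
    "judge_prompt": "Does the following agent output contain personal data?\n\n{output}",
    "on_violation": "log",      # the only mode supported in v1
    "timing": "post_output",
    "severity": "high",
}


def main() -> None:
    # Imports are deferred so this module loads even without the SDKs installed.
    from strands import Agent
    from strands.models import BedrockModel
    # ASSUMPTION: these import paths are guesses based on the module named above.
    from tracectrl.guardrails import Guardrail
    from tracectrl.guardrails.strands_hook import wrap_agent_with_guardrails

    # Placeholder model ID and region; substitute values valid for your account.
    judge = BedrockModel(model_id="<judge-model-id>", region_name="us-east-1")
    guardrail = Guardrail(judge_llm=judge, **GUARDRAIL_FIELDS)

    # The wrapped agent is returned, so construction and wrapping can be chained.
    agent = wrap_agent_with_guardrails(Agent(model=judge), [guardrail])
    agent("Summarize the latest support ticket.")
```

Calling `main()` requires the tracectrl and strands packages plus AWS credentials with Bedrock access; each invocation should then produce one tracectrl.guardrail.evaluation span for the guardrail.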
Span attributes emitted
Every Guardrail.evaluate() call emits a tracectrl.guardrail.evaluation span as a child of the active agent span:
| Attribute | Description |
|---|---|
| tracectrl.guardrail.name | The name field of the Guardrail. |
| tracectrl.guardrail.judge_model | Resolved from judge_llm.model_id / .model / config. |
| tracectrl.guardrail.severity | "low" / "medium" / "high" / "critical". |
| tracectrl.guardrail.timing | "post_output" or "pre_input". |
| tracectrl.guardrail.decision | "pass", "fail", or "error" (judge invocation failed). |
| tracectrl.guardrail.reason | One-sentence explanation from the judge. |
| tracectrl.guardrail.evidence | Verbatim snippet that triggered a fail; empty on pass. |
| tracectrl.guardrail.evaluated_at | UTC ISO timestamp. |
| tracectrl.agent.id | Stamped so the engine can join the violation back to the agent. |
| tracectrl.agent.name | Human-readable agent name. |
wrap_agent_with_guardrails additionally emits one tracectrl.guardrail.registered span per guardrail when the agent is wrapped:
| Attribute | Description |
|---|---|
| tracectrl.guardrail.mode | "monitoring" for on_violation="log". |
| tracectrl.guardrail.description | The description field of the Guardrail. |
| tracectrl.guardrail.judge_prompt | Full prompt template, truncated to 16 KB. |
| tracectrl.guardrail.registered_at | UTC ISO timestamp. |
| tracectrl.guardrail.health | "active" on registration; the engine flips it to "error" when a subsequent evaluation reports decision="error". |
How the dashboard renders this
- The Guardrails registry at /guardrails lists every guardrail whose registration span has been ingested, grouped by agent.
- The Alerts page at /alerts lists every evaluation where decision="fail" or "error", with the reason and evidence rendered inline and a link back to the parent trace.
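The Alerts selection rule can be expressed as a filter over evaluation span attributes. This is only an illustration of the rule, assuming spans are available as plain dictionaries keyed by the attribute names listed earlier; `select_alerts` is a hypothetical helper and says nothing about how the engine stores or queries spans.

```python
ALERT_DECISIONS = {"fail", "error"}


def select_alerts(evaluation_spans):
    """Hypothetical helper: keep evaluations whose decision is "fail" or "error"."""
    return [
        span
        for span in evaluation_spans
        if span.get("tracectrl.guardrail.decision") in ALERT_DECISIONS
    ]


# Illustrative span attributes, not real trace data.
spans = [
    {"tracectrl.guardrail.name": "no-pii", "tracectrl.guardrail.decision": "pass"},
    {"tracectrl.guardrail.name": "tone", "tracectrl.guardrail.decision": "fail"},
    {"tracectrl.guardrail.name": "policy", "tracectrl.guardrail.decision": "error"},
]
alert_names = [span["tracectrl.guardrail.name"] for span in select_alerts(spans)]
```

With the sample data, `alert_names` contains the "tone" and "policy" guardrails and skips the passing one.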
Error handling
The judge is called via Bedrock’s converse API with a forced tool-call schema. If the judge fails or returns invalid JSON twice in a row, the SDK conservatively treats the evaluation as a pass, because a broken judge must not spam false-positive violations. The span still carries decision="error" so health monitoring catches the regression.
Exceptions raised inside Guardrail.evaluate() never bubble up to the agent.
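The retry-then-degrade behaviour described above can be sketched as follows. This is not the SDK's code: `evaluate_safely`, `Evaluation`, and `call_judge` are hypothetical names, and the judge is modelled as a plain callable returning JSON text rather than a Bedrock converse call with a tool schema.

```python
import json
from dataclasses import dataclass


@dataclass
class Evaluation:
    decision: str  # "pass", "fail", or "error"
    reason: str
    evidence: str = ""

    @property
    def is_violation(self) -> bool:
        # Only an explicit "fail" counts; "error" is recorded but not a violation.
        return self.decision == "fail"


def evaluate_safely(call_judge, prompt: str, attempts: int = 2) -> Evaluation:
    """Hypothetical helper: try the judge up to `attempts` times, never raising."""
    last_error = "judge was not called"
    for _ in range(attempts):
        try:
            verdict = json.loads(call_judge(prompt))
            decision = verdict["decision"]
            if decision not in ("pass", "fail"):
                raise ValueError(f"unexpected decision: {decision!r}")
            return Evaluation(
                decision,
                str(verdict.get("reason", "")),
                str(verdict.get("evidence", "")),
            )
        except Exception as exc:  # a judge failure must never reach the agent
            last_error = f"{type(exc).__name__}: {exc}"
    return Evaluation("error", last_error)


# First reply is malformed, second is valid: the retry recovers.
replies = iter(
    ["not json", '{"decision": "fail", "reason": "Leaks an email.", "evidence": "bob@example.com"}']
)
recovered = evaluate_safely(lambda _prompt: next(replies), "prompt")

# Two malformed replies in a row: reported as "error", not as a violation.
broken = evaluate_safely(lambda _prompt: "not json", "prompt")
```

Keeping `decision="error"` separate from the violation flag lets monitoring see the broken judge without treating the agent's output as a failure.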
