Technology

Guardrails

Configurable policy constraints that define the boundaries of acceptable AI agent behavior and automatically enforce limits.

Full Definition

Guardrails are configurable safety boundaries and policy constraints that define what an AI agent is and is not allowed to do. They operate as a real-time enforcement layer that automatically prevents agents from taking prohibited actions, generating harmful content, or violating organizational policies.

Guardrails can be implemented at multiple levels:

- Input guardrails: filter or reject problematic user inputs before the agent processes them.
- Process guardrails: constrain which tools and data the agent can access while it works.
- Output guardrails: block or modify inappropriate responses before they reach users.

Common guardrails include content filtering (toxicity, hate speech, PII exposure), action limits (transaction amount caps, rate limits), scope constraints (preventing the agent from operating outside its designated domain), and compliance checks (verifying regulatory requirements). Effective guardrails balance safety with usability: overly restrictive guardrails make agents unhelpful, while insufficient guardrails expose organizations to risk.
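The three enforcement levels described above can be sketched in a few small checks. This is a minimal, illustrative example, not a production policy engine: the patterns, the `payments` tool name, the $500 cap, and the banned-term list are all assumptions chosen for the sketch.

```python
import re

# Hypothetical PII-like patterns for the input guardrail (illustrative only).
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-shaped number
    re.compile(r"\b\d{16}\b"),             # bare 16-digit card-shaped number
]

# Assumed action limit for the process guardrail.
MAX_TRANSACTION_USD = 500.0


def check_input(user_text: str) -> tuple[bool, str]:
    """Input guardrail: reject prompts containing PII-like patterns."""
    for pattern in PII_PATTERNS:
        if pattern.search(user_text):
            return False, "blocked: input contains PII-like data"
    return True, "ok"


def check_action(tool_name: str, amount_usd: float) -> tuple[bool, str]:
    """Process guardrail: cap transaction amounts for a payment tool."""
    if tool_name == "payments" and amount_usd > MAX_TRANSACTION_USD:
        return False, f"blocked: amount exceeds ${MAX_TRANSACTION_USD:.0f} cap"
    return True, "ok"


def check_output(agent_text: str, banned_terms=("password",)) -> str:
    """Output guardrail: redact banned terms before the response ships."""
    for term in banned_terms:
        agent_text = re.sub(re.escape(term), "[REDACTED]",
                            agent_text, flags=re.IGNORECASE)
    return agent_text


if __name__ == "__main__":
    print(check_input("My SSN is 123-45-6789"))   # input blocked
    print(check_action("payments", 900.0))        # action blocked
    print(check_output("Your password is safe"))  # term redacted
```

In a real deployment these checks would typically run as middleware around every agent turn, with blocked events logged for audit; the safety/usability tradeoff shows up directly in how tight the patterns and caps are set.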