Safety & Risk

Privacy Leakage

The inadvertent or unauthorized exposure of personally identifiable information or sensitive data by an AI agent during reasoning or output generation.

Full Definition

Privacy Leakage occurs when an AI agent surfaces, transmits, or includes personally identifiable information (PII), confidential business data, or other sensitive content in its outputs or tool calls without authorization. In autonomous agents, leakage can happen through multiple paths: the agent retrieves documents containing PII during retrieval-augmented generation (RAG) and includes it verbatim in its response; the agent's reasoning trace exposes sensitive context received from another system; the agent passes PII to an external API as a tool-call parameter; or the agent generates outputs that inadvertently identify individuals from aggregate data.

Unlike deliberate data breaches, privacy leakage is typically unintentional: a consequence of the agent's general-purpose reasoning operating on sensitive data without adequate guardrails. GDPR, HIPAA, and CCPA all impose liability for inadvertent PII disclosure regardless of intent.

Prevention requires PII detection at both the output level (scanning generated text before delivery) and the tool-call level (scanning parameters before external transmission).
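The two-level detection described above can be sketched as follows. This is a minimal illustration, not a production detector: the regex patterns, function names, and `[REDACTED:...]` marker format are all hypothetical, and a real deployment would use a dedicated PII-detection service with far broader coverage.

```python
import re

# Hypothetical, deliberately simple PII patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def find_pii(text):
    """Return a list of (kind, match) pairs found in `text`."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((kind, match))
    return hits

def redact(text):
    """Output-level guardrail: replace each detected PII span with a
    [REDACTED:<kind>] marker before the response is delivered."""
    for kind, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text

def scan_tool_call(params):
    """Tool-call-level guardrail: scan every string parameter before
    external transmission; return findings so the caller can block
    or redact the call."""
    findings = {}
    for key, value in params.items():
        if isinstance(value, str):
            hits = find_pii(value)
            if hits:
                findings[key] = hits
    return findings
```

Applying `redact` to generated text and `scan_tool_call` to outgoing parameters covers both exposure paths named above; the design choice of scanning at the boundary (rather than inside the model's reasoning) means the guardrail works regardless of how the sensitive data entered the agent's context.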