Blocking Agent -

: The blocking agent needs access to the current "state" (conversation history) to identify context-specific risks that might not be apparent in a single message.

: A blocking agent must return deterministic results (e.g., "Pass" or "Fail"). For example, a "ContentFilterMiddleware" might check for banned keywords and return a jump_to: "end" signal to skip further processing if a violation occurs. blocking agent

: This is the "brain" that analyzes incoming data against your rules. In production systems, this often involves a smaller, faster model (like GPT-4o-mini or Claude Haiku) optimized specifically for classification and risk detection. : The blocking agent needs access to the

: Explicitly list what the agent is not allowed to do. This might include blocking the output of API keys, preventing the execution of destructive commands (like rm -rf ), or filtering toxic language. : This is the "brain" that analyzes incoming

: The blocking logic should be decoupled from the primary agent. This allows you to update security policies or "constitutions" without having to retrain or reconfigure the main task-oriented agent. Step-by-Step Development Process

and every week there is a new fire ship video dropping something new where you're like "Oh shit do we now also need to know this?" YouTube·Dave Ebbelaar

Scroll to Top