A recurring flaw in agents with tool-calling capabilities is the lack of enforcement on proposed actions. This becomes dangerous the moment agents control real-world side effects: APIs, infrastructure, payments, and workflows.
In a typical setup, the model proposes an action, which is validated and then executed. The problem is that the model still indirectly controls execution, so a bad proposal can slip through. The alternative is to put an explicit authorization decision in the middle: proposal -> (policy + state) -> ALLOW / DENY -> execution. The key constraint: no authorization means no execution path. A denied action never reaches the tool.
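That pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual API: `Proposal`, `authorize`, and the `policy` mapping are hypothetical names, and the policy here is just a per-tool predicate over the proposal's arguments and current state.

```python
from dataclasses import dataclass

ALLOW, DENY = "ALLOW", "DENY"

@dataclass(frozen=True)
class Proposal:
    """What the model emits: a tool name plus arguments. Never executed directly."""
    tool: str
    args: dict

def authorize(proposal, policy, state):
    """The gate: (policy + state) -> ALLOW / DENY, decided before any execution."""
    rule = policy.get(proposal.tool)
    if rule is None:
        return DENY  # default-deny: unknown tools have no execution path at all
    return ALLOW if rule(proposal.args, state) else DENY

def execute(proposal, policy, state, tools):
    """Only an ALLOW decision creates a call into the real tool."""
    if authorize(proposal, policy, state) == DENY:
        # Denied proposals never reach the tool; we only record the refusal.
        return {"status": "denied", "tool": proposal.tool}
    return {"status": "ok", "result": tools[proposal.tool](**proposal.args)}

# Usage: a payments tool gated by a spending limit held in state.
policy = {"pay": lambda args, state: args["amount"] <= state["limit"]}
tools = {"pay": lambda amount: f"paid {amount}"}
state = {"limit": 100}

print(execute(Proposal("pay", {"amount": 50}), policy, state, tools))   # within limit
print(execute(Proposal("pay", {"amount": 500}), policy, state, tools))  # over limit: denied
print(execute(Proposal("rm_rf", {}), policy, state, tools))             # unknown tool: denied
```

Note the design choice: the gate is default-deny, so forgetting to write a policy rule fails closed rather than open, and the tool registry is only consulted after an ALLOW.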
A demo of this approach is on GitHub. As LLM agents move from thinking to acting, the risk shifts from what the model outputs to what it does, and most systems today draw no clear boundary between the two.
How do you handle this in your own systems: do you gate execution before tool calls, or rely on retries and monitoring after the fact?
