Agent tool expansion and system controllability
More tools mean stronger actions. But what really determines whether the system stays controllable is state convergence, permission boundaries, and failure fallback.
A common scenario: when evaluating an Agent system, the first thing people look at is “how many tools it can call.”
It can query databases, send messages, modify work orders, run scripts, and operate browsers. That certainly looks more like a system doing real work than a chat-only model. So teams easily head in this direction: connect a few more tools, grant more permissions, make the chain more automatic, and the system gets stronger.
The problem is, being strong does not mean being controllable.
My judgment is: **The controllability of an Agent system does not depend on the number of tools, but on state convergence, permission boundaries, and failure fallback. As tool calls multiply, contexts grow longer, and actions carry more side effects, a system without clear constraints and convergence mechanisms generally becomes harder to predict.**
More tools solve for operable range; controllability solves for whether the results can be contained
What tool access solves is that the Agent no longer merely gives suggestions but can directly affect real systems.
That is a genuine capability expansion, not a pseudo-problem.
But once the system upgrades from “answering questions” to “performing actions,” the engineering focus shifts. The question is no longer whether the output sounds natural, but:
- What exactly did it do this time?
- Why did it take this action rather than another, safer one?
- How wide is the blast radius if it gets something wrong?
- When it fails midway, does the system stop, retry, or leave behind half a set of side effects?
- The next time a similar request comes, will it take a completely different path?
In other words, as the number of tools increases, the complexity shifts from “text correctness issues” to “system behavior predictability issues.”
These two types of problems are not of the same magnitude.
When a model that can only answer questions errs, the result is usually cognitive noise; when an Agent that can call a dozen-plus tools errs without constraints, the same mistake becomes action noise, and action noise propagates into system noise.
What really gets out of control first is often the state.
When an Agent misbehaves, many teams’ first reaction is that the model is unstable. In fact, many of the problems lie outside the model.
For example, a common process:
- The Agent first queries the work order system for pending tasks;
- then searches the knowledge base for how similar cases were handled;
- then runs a database query;
- then sends a message to the on-call group;
- and finally appends a handling record.
Every step in this chain is “reasonable,” but as long as the intermediate states are not clearly defined, the system quickly runs into problems like these:
- The work order status was changed to Processing, but the notification never went out;
- the database query executed, but its results never made it into the final record;
- the message went out to the group, the subsequent steps failed, yet the outside world believes the matter was handled;
- after the context window is truncated, the second round of execution cannot tell which actions were already performed.
These are all symptoms of a system that lacks a state convergence mechanism.
An Agent that performs actions without a clear task state machine is essentially stringing multi-step side effects together on top of natural-language inference.
That looks great in a demo but is hard to run in production.
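A minimal sketch of such a state machine (the state names and the `Task` class are illustrative assumptions, not taken from any particular framework): states are enumerated, transitions are whitelisted, and every change is persisted outside the model:

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    AWAITING_CONFIRMATION = "awaiting_confirmation"
    DONE = "done"
    FAILED = "failed"


# Legal transitions: anything not listed is rejected, so execution
# cannot drift into an undefined state no matter what the model infers.
TRANSITIONS = {
    TaskState.PENDING: {TaskState.PROCESSING},
    TaskState.PROCESSING: {TaskState.AWAITING_CONFIRMATION,
                           TaskState.DONE, TaskState.FAILED},
    TaskState.AWAITING_CONFIRMATION: {TaskState.DONE, TaskState.FAILED},
}


class Task:
    def __init__(self, task_id: str, store: dict):
        self.task_id = task_id
        self.state = TaskState.PENDING
        self.store = store  # external persistence, not model memory

    def transition(self, new_state: TaskState) -> None:
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state
        # Persist immediately: after a crash or context truncation,
        # recovery reads this record instead of asking the model.
        self.store[self.task_id] = new_state.value
```

With something like this in place, “which actions are legal right now” is a property of the persisted state, not of whatever the model happens to remember.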
When permission boundaries are unclear, the Agent easily slides from “able to do things” to “doing too much”
Another common misunderstanding is to treat tool access as a capability checklist to be maximized.
- Connect a browser, and it can reach any admin console;
- connect a shell, and most commands run by default;
- connect the messaging system, and it can proactively notify any group by default;
- connect the database, and it gets mixed read-write permissions.

On the surface this makes the Agent more versatile; in fact it bets the system’s controllability on a single inference never going wrong.
This is dangerous because the Agent’s characteristic risk is calling a high-side-effect tool in a context that is locally reasonable but globally wrong.
For example:
- It should have only checked a status, but executed the repair script instead;
- it should have replied only to the current user, but sent the notification to the whole group;
- it should have only read data, but called the update interface;
- it should have waited for human review, but turned a “suggested action” directly into an “executed action.”
Therefore, the focus of permission boundary design is to separate high-side-effect actions from high-uncertainty reasoning.
If a single wrong execution of an action causes real damage, that action should not sit in the same automation layer as ordinary query actions.
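One way to encode that separation, as a hedged sketch (the tool names and the `requires_approval` flag are assumptions for illustration): register every tool with an explicit side-effect level, and gate execution on the registration rather than on the model’s judgment:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[..., object]
    side_effect: str          # "read" | "write" | "destructive"
    requires_approval: bool   # True -> never auto-executed


REGISTRY = {
    "query_ticket": Tool("query_ticket",
                         lambda tid: {"id": tid, "status": "open"},
                         side_effect="read", requires_approval=False),
    "update_ticket": Tool("update_ticket",
                          lambda tid, fields: None,
                          side_effect="write", requires_approval=True),
}


def execute(tool_name: str, approved: bool = False, **kwargs):
    tool = REGISTRY[tool_name]  # unknown tool -> KeyError: fail closed
    if tool.requires_approval and not approved:
        # The model may *propose* this call; only a human can release it.
        raise PermissionError(f"{tool_name} requires human approval")
    return tool.run(**kwargs)
```

The point is not these specific flags but that high-side-effect calls are refused by construction, not by prompt wording.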
With more tools chained together, failure is no longer just an “error” but a half-completed state
Ordinary software systems fail too, of course, but Agent-system failures carry an extra trouble: they are often half-completed failures that cut across tools, systems, and semantic boundaries.
For example:
- First create a handling task in Jira;
- then send a notification in Slack;
- then call an internal API to pull logs;
- finally write the summary back to the knowledge base.
What should the system do if step three fails?
- Roll back the Jira task?
- Delete the notification that was just sent?
- Keep the task but flag the processing as interrupted?
- Let another Agent take over?
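One common way to give these questions a defined answer is saga-style compensation: each side-effecting step is registered together with its undo. The sketch below uses hypothetical stand-ins for the real Jira/Slack/API calls:

```python
def create_jira_task():  print("jira: task created")
def delete_jira_task():  print("jira: task deleted")
def notify_slack():      print("slack: group notified")
def retract_slack():     print("slack: retraction posted")
def pull_logs():         raise RuntimeError("log API timed out")


# Each step is paired with its compensation; read-only steps have none.
SAGA = [
    (create_jira_task, delete_jira_task),
    (notify_slack, retract_slack),
    (pull_logs, None),
]


def run_saga():
    done = []
    try:
        for step, undo in SAGA:
            step()
            done.append(undo)
    except Exception:
        # Compensate in reverse order. A failed undo must be escalated
        # to a human, not silently swallowed.
        for undo in reversed(done):
            if undo is not None:
                undo()
        raise
```

Whether “delete the notification” is even an acceptable compensation is a product decision; the point is that it is decided in advance, not improvised mid-incident.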
The most dangerous move here is to treat “failure handling” as simply asking the model again.
By the time many failures surface, the side effects have already happened. What is really needed is a definition of:
- Which steps can be retried;
- Which steps must be idempotent;
- Which steps can only be continued after manual confirmation;
- Which external actions must leave an audit trail;
- After a chain is interrupted, where the next run picks up.
If these are not defined, the Agent looks like automation but is actually manufacturing manual cleanup work.
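A hedged sketch of what “retryable, idempotent, and audited” can look like (the step executor and the shape of `audit_log` are assumptions for illustration): every attempt is recorded before and after it runs, and an external evidence check makes the step safe to re-run:

```python
import time
from typing import Callable

audit_log: list[dict] = []  # in production: durable, append-only storage


def run_step(step_name: str,
             action: Callable[[], object],
             already_done: Callable[[], bool],
             max_retries: int = 2):
    """Run one side-effecting step with retries and an audit trail."""
    # Idempotency: check the *external* system first, so a resumed run
    # skips work that actually happened even if local state was lost.
    if already_done():
        audit_log.append({"step": step_name, "outcome": "skipped",
                          "ts": time.time()})
        return None
    for attempt in range(1, max_retries + 1):
        audit_log.append({"step": step_name, "outcome": "attempt",
                          "n": attempt, "ts": time.time()})
        try:
            result = action()
            audit_log.append({"step": step_name, "outcome": "ok",
                              "ts": time.time()})
            return result
        except Exception as exc:
            audit_log.append({"step": step_name, "outcome": "error",
                              "error": str(exc), "ts": time.time()})
    # Out of retries: halt and escalate rather than improvise.
    raise RuntimeError(f"{step_name} failed after {max_retries} attempts")
```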
The core of controllability lies in “will it converge?”
I am increasingly inclined to regard an Agent system as a workflow system with reasoning capability, rather than an all-purpose portal that happens to chat.
This means that when designing it, the first thing to answer is:
- Whether the task has clearly defined states: started, processing, pending confirmation, completed, failed;
- which tools may be called in each state;
- which results can be committed directly and which must be reviewed;
- whether, after the context is lost, the system can recover from external state instead of relying on model recall;
- whether every action leaves accountable records of its inputs, outputs, and execution.
These things do not sound sexy, but they determine whether the Agent is a system that can be scaled up gradually or a toy that can only be demonstrated in low-risk corners.
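The recovery question in particular has a concrete shape. A minimal sketch (the evidence checks are hypothetical stand-ins for real Jira/Slack/knowledge-base API calls): after a crash or context truncation, progress is reconstructed from the systems that actually changed, not from what the model remembers:

```python
# Hypothetical external evidence; in reality each entry would be an API
# query against Jira, Slack, or the knowledge base.
_external_evidence = {
    "create_jira_task": True,
    "notify_slack": False,
    "write_summary": False,
}

PLAN = ["create_jira_task", "notify_slack", "write_summary"]


def evidence_for(step: str, task_id: str) -> bool:
    return _external_evidence.get(step, False)


def resume_point(task_id: str) -> str | None:
    """Return the first step whose external evidence is missing;
    that is where a restored run resumes, regardless of model memory."""
    for step in PLAN:
        if not evidence_for(step, task_id):
            return step
    return None  # every step verifiably done: the task has converged


print(resume_point("T-1024"))  # -> "notify_slack"
```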
A very simple litmus test: **Temporarily swap in a weaker model, and the system only becomes less efficient; remove the state machine, permission boundaries, and rollback mechanism, and the system can no longer go to production at all.**
That is what shows where the real foundation lies: in the system’s ability to converge.
A common counterexample: treating Agent as a universal coordinator
Many internal platforms eventually grow into something very like an “AI middle platform”:
- connected to every system;
- accepting any request;
- trying to complete every action automatically;
- assuming that slightly more detailed prompts will keep the risk suppressed.
The biggest problem with this route is that its complexity scales terribly at the margin.
The more request types there are, the more complex the tool semantics become, and the more combinations of success and failure paths there are. A task that began as “check the release records” gradually becomes:
- Check the records;
- judge whether anything is abnormal;
- decide whether to roll back;
- send notifications;
- change statuses;
- generate a retrospective;
- update the knowledge base.
It looks like a complete automated closed loop. In fact, every additional step adds another layer of side-effect consistency cost and permission-interpretation cost.
In the end, what really eats the team’s time is often:
- figuring out why a step executed on its own;
- why the same problem took different paths today and yesterday;
- external systems that changed while internal records did not keep up;
- reconstructing, after an incident, what the Agent was relying on at the time.
This is what happens when too many high-uncertainty actions are handed to a reasoning process that lacks boundaries.
A more stable approach is to layer automation rather than stacking tools flat.
If you really want to make the Agent system controllable, I recommend layering it according to risks and side effects:
1. Low risk layer: query and summary
First let the Agent do reading, retrieval, summary, and drafting.
Even when its judgment is imperfect, these actions usually do not directly change external state, so this is the right layer to scale up first.
2. Medium risk layer: single-step action with constraints
For example: it may change a field only when a work order is in a specific status, may reply only to the current session, and may perform only operations on an explicit allowlist.
The key is to compress the action space until the cost of an error is acceptable.
3. High risk layer: explicit approval and rollback execution
Whenever data deletion, batch operations, cross-system writes, outbound notifications, or production-environment script execution are involved, human review, auditing, and rollback mechanisms should be put up front, not left to after-the-fact remediation.
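To make the layering concrete, a minimal dispatcher sketch (the tier names, tool assignments, and allowlist are illustrative assumptions): every proposed action is routed through its risk tier, and anything unknown fails closed:

```python
from enum import Enum


class Risk(Enum):
    LOW = 1     # read-only: execute automatically
    MEDIUM = 2  # constrained single-step writes: execute if allowlisted
    HIGH = 3    # destructive / cross-system: suggest only, human approves


TOOL_RISK = {
    "search_kb": Risk.LOW,
    "update_ticket_field": Risk.MEDIUM,
    "run_prod_script": Risk.HIGH,
}

MEDIUM_ALLOWLIST = {"update_ticket_field"}


def dispatch(tool: str, human_approved: bool = False) -> str:
    risk = TOOL_RISK.get(tool, Risk.HIGH)  # unknown tools fail closed
    if risk is Risk.LOW:
        return "execute"
    if risk is Risk.MEDIUM and tool in MEDIUM_ALLOWLIST:
        return "execute"
    if risk is Risk.HIGH and human_approved:
        return "execute_with_audit_and_rollback_plan"
    return "suggest_only"  # default: the Agent proposes, it does not act
```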
A truly mature Agent knows what should be done automatically, what can only be suggested, and what should never be done directly.
Applicable boundaries
This article mainly discusses:
- Internal Agents connected to multiple tools;
- process-oriented Agents with real external side effects;
- orchestration scenarios that span messages, work orders, databases, scripts, browsers, and other cross-system operations.
If the Agent is still confined to low-side-effect tasks such as “helping users summarize web pages” or “helping customer service draft replies,” more tools will not necessarily cause immediate loss of control, because most errors stay at the text layer.
The real problems begin when the system starts directly changing external state. At that point you are already facing distributed-systems-style constraint design.
Summary
Letting an Agent call more tools does indeed make the system more useful.
But “more useful” and “more controllable” do not point in the same direction.
When the complexity of actions, side effects, and context grows together, what really sets the system’s ceiling is whether task state converges, whether permission boundaries are clear, and whether execution can stop and recover stably after failure.
Otherwise, the more tools you connect, the more the system resembles an executor that is highly capable but hard to predict.