The Container Is Not the Contract

Sutha Kamal’s The Container Was Always There is a useful essay because it identifies a real confusion in the current agent discourse: people talk about “security” when what they often mean is “governance.” That distinction matters.

Security, in the article’s framing, is about whether an input is malicious, whether an output leaks sensitive information, and whether a given payload should be routed to one model provider or another. Governance is different. Governance is about what the system is allowed to touch in the first place: which files, which databases, which credentials, which network endpoints, which subprocesses, and under what approval path. Kamal’s key argument is that governance must be enforced outside the agent, because an agent that is compromised cannot be trusted to police itself. (Sutha's Substack)

That is the strongest part of the essay, and it is correct.

The article also makes a needed correction to shallow “we run it in Docker, so it’s fine” thinking. Kamal describes Cortex as already containerized, with filesystem and network boundaries, but admits that those limits were implicit and accidental rather than explicit, auditable, and policy-driven. In that sense, the “container” existed, but the governance model did not. That distinction is important. A boundary that happens to exist is not the same thing as a boundary that has been deliberately specified, reviewed, logged, and enforced as policy. (Sutha's Substack)
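
To see the difference in miniature, compare an implicit boundary with one that is written down as a versioned artifact that every decision cites. The sketch below is my own illustration, not Cortex's or OpenShell's actual schema; every field name and value in it is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    version: str                        # the policy is a versioned artifact, not ambient
    allowed_paths: tuple[str, ...]      # filesystem ceiling, written down
    allowed_hosts: tuple[str, ...]      # network ceiling, written down
    requires_approval: tuple[str, ...]  # actions that always need a human


POLICY = Policy(
    version="2025-01-rev3",
    allowed_paths=("/workspace",),
    allowed_hosts=("api.provider-a.example",),
    requires_approval=("spawn_subprocess", "write_credentials"),
)


def check(action: str, target: str) -> str:
    """Decide from the written policy, and say which policy version decided."""
    if action in POLICY.requires_approval:
        return f"escalate: {action} requires approval under {POLICY.version}"
    if action == "read_file" and any(target.startswith(p) for p in POLICY.allowed_paths):
        return f"allow: {action} {target} under {POLICY.version}"
    if action == "connect" and target in POLICY.allowed_hosts:
        return f"allow: {action} {target} under {POLICY.version}"
    return f"deny: {action} {target} not granted by {POLICY.version}"
```

The point is not the five lines of logic; it is that the ceiling exists as a reviewable object with a version string, so "why was this allowed" has an answer other than "that's how the container happened to be configured."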

The essay becomes more interesting when it introduces the split between structural governance and semantic governance. Structural governance controls capability at the infrastructure layer. Semantic governance classifies the content itself: whether the payload contains health data, whether it should go to one provider and not another, whether a request is suspicious. Kamal’s argument is that both are necessary. Structural controls without semantic awareness are blind, while semantic controls living inside the agent can be bypassed if the agent is compromised. (Sutha's Substack)
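
A toy sketch makes the complementarity visible. The routing rules and the keyword classifier below are stand-ins of my own invention, not anything from the essay; the thing to notice is that each check is blind to what the other one sees.

```python
# Structural rules see capability, not content; semantic rules see content,
# not capability. Neither alone is sufficient.
STRUCTURAL_ALLOW = {("send", "provider-a.example"), ("send", "provider-b.example")}
SEMANTIC_ROUTES = {
    "health": {"provider-a.example"},
    "general": {"provider-a.example", "provider-b.example"},
}


def classify(payload: str) -> str:
    """Stand-in classifier; the essay argues this must run outside the agent."""
    return "health" if "diagnosis" in payload.lower() else "general"


def decide(action: str, endpoint: str, payload: str) -> bool:
    structurally_ok = (action, endpoint) in STRUCTURAL_ALLOW          # blind to content
    semantically_ok = endpoint in SEMANTIC_ROUTES[classify(payload)]  # blind to capability
    return structurally_ok and semantically_ok
```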

This is a good framing, and probably the most useful concept in the piece.

Still, the essay stops short of a full governance architecture.

Its main weakness is that it treats out-of-process enforcement as if that gets us most of the way there. It does not. Out-of-process enforcement is necessary, but it is only one layer. A serious governance model also needs identity, signed decisions, approval provenance, policy versioning, attestation of the enforcement point, and evidence that can later answer a hard question: who or what authorized this action, under which policy, with what inputs, and where is the record that proves it?
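
Here is roughly what one piece of that looks like: a decision record signed by the enforcement point and pinned to a policy version. The field names and the shared-key HMAC signing are illustrative assumptions on my part; a real deployment would likely use asymmetric keys and a proper key store.

```python
import hashlib
import hmac
import json
import time

ENFORCER_KEY = b"demo-key-held-only-by-the-enforcement-point"  # assumption: not a real scheme


def record_decision(request: dict, policy_version: str, verdict: str) -> dict:
    """Evidence that can later answer: what was decided, by whom, under what policy."""
    body = {
        "request_hash": hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()).hexdigest(),
        "policy_version": policy_version,   # which policy, pinned
        "verdict": verdict,                 # allow / deny / escalate
        "decided_by": "policy-engine",      # identity of the deciding component
        "decided_at": time.time(),
    }
    sig = hmac.new(ENFORCER_KEY, json.dumps(body, sort_keys=True).encode(),
                   "sha256").hexdigest()
    return {**body, "signature": sig}       # a verifier can check this later
```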

The article gestures toward audit trails and human approval, which is good, but mostly at the level of intention rather than mechanism. It says an agent should be able to request elevated permissions with a human approval step, and that policy files and audit trails should replace implicit Docker defaults. That is right as far as it goes. But once you move from architectural sentiment to production reality, the hard parts begin. (Sutha's Substack)

Who signs the approval?

What exactly is approved: a class of actions, a one-time exception, a time-boxed lease, a capability token?

How is the approval bound to a specific request so it cannot be replayed?

How are denials, overrides, and emergency breaks represented?

How do you prove later that the sidecar or policy engine made the decision, rather than a compromised component forging the record?

These are not secondary questions. They are the difference between “a secure-looking system” and a governable one.
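
To pick two of them, binding and replay: one common pattern, sketched below with stdlib primitives and invented field names, ties the approval to the hash of the exact request, a single-use nonce, and an expiry. That makes it a time-boxed lease rather than standing power. This is my illustration of the pattern, not a prescription.

```python
import hashlib
import hmac
import json
import secrets
import time

APPROVER_KEY = b"demo-key-held-only-by-the-approval-service"  # assumption: real systems would use asymmetric keys


def _digest(obj: dict) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def approve(request: dict, ttl_seconds: int = 300) -> dict:
    """An approval bound to one exact request, one nonce, one time window."""
    body = {
        "request_hash": _digest(request),          # binds to this request only
        "nonce": secrets.token_hex(16),            # single use, tracked by the enforcer
        "expires_at": time.time() + ttl_seconds,   # a lease, not standing power
    }
    sig = hmac.new(APPROVER_KEY, json.dumps(body, sort_keys=True).encode(),
                   "sha256").hexdigest()
    return {**body, "signature": sig}


def verify(request: dict, token: dict, seen_nonces: set) -> bool:
    """The enforcement point rejects forgeries, mismatches, expiries, and replays."""
    body = {k: token[k] for k in ("request_hash", "nonce", "expires_at")}
    sig = hmac.new(APPROVER_KEY, json.dumps(body, sort_keys=True).encode(),
                   "sha256").hexdigest()
    return (hmac.compare_digest(sig, token["signature"])
            and body["request_hash"] == _digest(request)
            and time.time() < body["expires_at"]
            and body["nonce"] not in seen_nonces)
```

Notice how much of the hard question list is answered by the shape of the token itself: what is approved (one request hash), for how long (the expiry), and how replay is prevented (the nonce the enforcer remembers).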

The essay’s enthusiasm for OpenShell also deserves a more measured reading. Kamal praises it as a strong attempt to make agent governance explicit and describes its sandbox, policy engine, and privacy router as a serious improvement over ad hoc Docker boundaries. That may all be true in his testing, but it still makes OpenShell an implementation candidate, not the governance model itself. The model is the set of contracts, controls, and proofs around authority. OpenShell may be one enforcement substrate for that model. It is not the model. (Sutha's Substack)

This matters because the article risks encouraging a subtle misconception: that if enforcement has moved outside the agent, governance has been solved. It has not. The location of enforcement is not the same as the completeness of governance.

The “semantic sidecar” proposal is where the essay points toward something stronger. Kamal recognizes that if the content classifier stays inside the agent, a compromised agent can lie about the sensitivity of its own payload and thereby trick the structural layer into allowing exfiltration. His proposed fix is to route payload inspection through an isolated sidecar that the agent cannot tamper with, allowing the governance layer to combine structural policy with semantic classification. (Sutha's Substack)
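
The failure mode, reduced to a few lines with names invented for illustration: the naive enforcer trusts the agent's self-reported label, while the sidecar version obtains the label from a component the agent cannot touch.

```python
HEALTH_APPROVED = {"provider-a.example"}   # illustrative routing rule, an assumption


def enforce_trusting_agent(agent_label: str, endpoint: str) -> bool:
    """BAD: the agent self-certifies; a compromised agent just says 'general'."""
    return agent_label != "health" or endpoint in HEALTH_APPROVED


def enforce_with_sidecar(payload: str, endpoint: str, sidecar_classify) -> bool:
    """The governance layer obtains the label itself, from an isolated sidecar."""
    label = sidecar_classify(payload)       # the agent never handles the label
    return label != "health" or endpoint in HEALTH_APPROVED
```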

That is the right direction, but the deeper principle is not just “the guard can read.” The deeper principle is separation of duties.

One component should propose actions.

Another should classify data.

Another should enforce ceilings.

Another should log and attest the decision.

Another, where required, should represent human approval.

And none of these components should be able to silently rewrite the authority of the others.
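
In code, separation of duties reduces to something like this: one key per role, and an enforcer that checks not only whether a claim is signed but whether the signer is entitled to make that kind of claim. The scheme below is a deliberately minimal sketch with invented names and a toy shared-key setup.

```python
import hmac
import json

KEYS = {"classifier": b"key-a", "approver": b"key-b"}   # one key per duty (assumption)


def sign(role: str, claim: dict) -> dict:
    mac = hmac.new(KEYS[role], json.dumps(claim, sort_keys=True).encode(),
                   "sha256").hexdigest()
    return {"role": role, "claim": claim, "mac": mac}


def trusted(stmt: dict, expected_role: str) -> bool:
    """Check the signature AND whether the signer may make this kind of claim."""
    if stmt["role"] != expected_role:        # the agent cannot speak as the classifier
        return False
    mac = hmac.new(KEYS[stmt["role"]], json.dumps(stmt["claim"], sort_keys=True).encode(),
                   "sha256").hexdigest()
    return hmac.compare_digest(mac, stmt["mac"])


# The agent proposes; the enforcer acts only on claims from the right roles.
classification = sign("classifier", {"payload_id": "p1", "label": "health"})
approval = sign("approver", {"payload_id": "p1", "action": "send"})
proceed = trusted(classification, "classifier") and trusted(approval, "approver")
```

The design choice that matters is the role check: even a validly signed statement is rejected if it comes from a component that has no authority to make it.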

That is the missing language in much of the agent-security discussion. The problem is not merely one of sandboxing. It is a problem of authority architecture.

A well-governed agent system should therefore be designed more like a high-assurance workflow than a clever chatbot with tools. The agent should not possess broad standing power. It should operate under narrow, inspectable capability leases. It should not self-certify the sensitivity of its own data. It should not silently gain new tool access because a developer changed a config file on a Friday night. It should not be able to produce an action log that no external system can verify.

In that respect, Kamal’s essay is valuable precisely because it reveals the next layer of engineering work that must happen. It is not enough to say, “the agent lives in a container.” It is not enough to say, “the proxy is outside the process.” It is not enough even to say, “the sidecar reads the payload.” A mature agent stack needs explicit contracts between components, signed policy artifacts, stable evidence formats, replay-resistant approvals, and independent verification. Otherwise the system remains interpretive and trust-heavy, even if its walls look impressive.
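
For instance, a "stable evidence format" can be as simple as a hash-chained log that an outside verifier can re-check without trusting the component that wrote it. A minimal sketch, with invented field names:

```python
import hashlib
import json


def append_evidence(log: list, entry: dict) -> None:
    """Each record commits to its predecessor, so edits and deletions are detectable."""
    body = {"prev": log[-1]["entry_hash"] if log else "genesis", **entry}
    body["entry_hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append(body)


def verify_chain(log: list) -> bool:
    """Anyone holding the log can re-check it without trusting the writer."""
    prev = "genesis"
    for record in log:
        body = {k: v for k, v in record.items() if k != "entry_hash"}
        if body.get("prev") != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() \
                != record["entry_hash"]:
            return False
        prev = record["entry_hash"]
    return True
```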

So the fairest judgment is this:

The Container Was Always There is directionally right and worth reading. It correctly identifies that capability boundaries must be enforced outside the agent, and that semantic policy living only inside the agent is structurally weak. It offers a practical bridge between naive tool-using agents and a more serious governance posture. But it is still an intermediate step. The container is not the contract. The sidecar is not the audit trail. The policy engine is not the proof.

Those are the layers the next generation of agent systems will have to add.

And once they do, we may finally be able to stop asking whether the model is “aligned” and start asking a better question:

What, exactly, is this system authorized to do, and how can it prove it stayed within that authority?