(Update 2026-06-02) Between W1 going out and this W2 piece landing, Jensen put the same conclusion on the Computex 2026 main stage: "the real blocker in enterprise AI agent adoption isn't model capability. It's the security review." NVIDIA's answer is OpenShell: a sandboxed agent runtime where enterprises declare what an agent can do, what it can't touch, and what needs human sign-off. ASUS Ascent GX10 announced OpenShell sandbox support the same week.
This piece comes at the same conclusion from the other direction — from a working engineer's day, five agents relaying. Same three-layer architecture (Audit / Permission / Sovereignty) Jensen put on the main stage, different framing.
W1 covered the cornerstone — historical rhythm + six attack entries. Side note: the W1 polish I mentioned in the opening of W1 was actually done by 5 AI agents while I was at the hospital with my family.
This W2 piece steps into another moment from those 2 hours.
13:10 — DB ground truth correction
Hange (Qwen3) reported that the W1 cornerstone DB still had 10 U+FFFD characters. Erwin queried the Supabase REST API directly and got 0. Hange was reading a stale local snapshot, not the production DB.
In that moment I realized: 5 agents are doing work, and there isn't a single agent I trust 100%. But they can correct each other.
That's the essence of trust — not "trust a single agent," but "trust the protocol that lets agents correct each other." Below: the three unanswered questions I keep watching teams stall on.
It's 2026. Your AI agent can write code, fix bugs, and open PRs. But would touching production actually get approved? From what I've been seeing across teams these past few months, most are still hesitating.
The bottleneck for enterprise AI agent adoption isn't tech. It's that nobody can answer these three questions:
Who "ordered" this line of code? Who's accountable when it breaks?
How do you control what the agent can touch? Can it modify production config?
Does my proprietary codebase end up training someone else's model?
In my view, these three questions are the real bottleneck for enterprise AI agent adoption in 2026. It's not about model capability. It's about trust.
Development cost hits zero, but so does your moat
Every AI coding agent saves you time. Claude Code, Codex, Cursor, Gemini, Grok — they all claim to be the fastest. But "writing faster" was never an enterprise moat.
Real moats are:
What data do I have that nobody else does?
How secure is my deployment pipeline?
When my agent gets prompt injected, who stops it?
When everyone's using AI to accelerate, acceleration itself stops being an advantage. Safe acceleration is.
Last month Anthropic suspended 60+ accounts overnight. Teams that gave their agents full permissions went dark. If your entire dev workflow depends on one vendor, you're betting their policy won't change.
Three layers — why these are what let me put an agent into the codebase
OpenAB's approach to enterprise AI agent adoption isn't about which model is strongest. It's about three security layers:
Layer 1: Audit Trail — every line of code has a signature
Traditional dev: engineer commits → git blame finds the person.
AI agent dev: LLM generates code → who's accountable?
OpenAB injects sender_context into every prompt — who gave the instruction, which channel, what time, which agent executed it. When something goes wrong, it's not a black box. It's a complete trace.
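A minimal sketch of what such a trace record could look like. The field names (sender, channel, agent, issued_at) and the helpers wrap_prompt and audit_line are illustrative assumptions, not OpenAB's actual schema. The point is only that the context travels with the prompt and also lands in a log as one line per instruction.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class SenderContext:
    """Hypothetical shape of the context attached to each prompt."""
    sender: str      # who gave the instruction
    channel: str     # where the instruction came from
    agent: str       # which agent executed it
    issued_at: str   # ISO 8601 timestamp, UTC


def wrap_prompt(prompt: str, ctx: SenderContext) -> str:
    """Prefix the prompt with a machine-readable context header line."""
    header = json.dumps({"sender_context": asdict(ctx)}, sort_keys=True)
    return f"{header}\n{prompt}"


def audit_line(prompt: str, ctx: SenderContext) -> str:
    """Serialize one instruction as a single JSON line for an audit log."""
    return json.dumps({**asdict(ctx), "prompt": prompt}, sort_keys=True)


ctx = SenderContext(
    sender="alice",
    channel="discord:#dev",
    agent="claude-code",
    issued_at=datetime(2026, 6, 2, 13, 10, tzinfo=timezone.utc).isoformat(),
)
wrapped = wrap_prompt("fix the failing test", ctx)
record = json.loads(audit_line("fix the failing test", ctx))
```

With a record like this, "who ordered this line of code" becomes a log query rather than a guess.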
0.8.4-beta.6 added one more layer — [hooks.pre_boot] and [hooks.pre_shutdown] push agent boot and shutdown events into the audit log too. Not just "what did the agent do," but "when did it come up, when did it go down, who started it."
Layer 2: Permission Control — least privilege for agents
One prompt can make an agent read ten files, write three files, run a script. This is the biggest authority amplification risk in the AI era.
The fix isn't zero permissions. It's per-tool allowlists + capability-based authorization:
Write a unit test → auto-approve
Modify production config → require human-in-the-loop
Read DB credentials → blocked
ACP's session/request_permission mechanism was designed for this — the agent asks before using a tool, and you set policies for what gets auto-approved vs. what needs a human.
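The three outcomes above can be sketched as a small policy table. This is an assumption-laden illustration, not ACP's or OpenAB's real policy format: the rule tuples, the path globs, and the decide function are all hypothetical. Unlisted actions fall back to asking a human, which is one reasonable reading of least privilege.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ASK_HUMAN = "ask_human"
    BLOCK = "block"


# Hypothetical rules: (tool, path glob, decision). First match wins.
RULES = [
    ("read", "secrets/*", Decision.BLOCK),
    ("write", "config/production/*", Decision.ASK_HUMAN),
    ("write", "tests/*", Decision.AUTO_APPROVE),
]


def decide(tool: str, path: str) -> Decision:
    """Return the decision for a tool call, checking rules in order."""
    for rule_tool, pattern, decision in RULES:
        if tool == rule_tool and fnmatchcase(path, pattern):
            return decision
    # Anything not explicitly listed goes to a human instead of being allowed.
    return Decision.ASK_HUMAN
```

For example, decide("write", "tests/test_api.py") auto-approves, while decide("read", "secrets/db_credentials.env") is blocked outright.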
Concrete instance: openabdev's recently open-sourced ghpool (GitHub API Proxy) is this layer's logic applied to another surface — multi-PAT pooling, automatic allocation by remaining rate limit, mutations go through the client's own token so identity attribution stays clean. Agents don't hold GitHub tokens directly, only session credentials; the real tokens live in the secrets manager.
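A rough sketch of the allocation idea, not ghpool's actual code: the PooledToken type, the pick_token function, and the rule that only GET and HEAD count as reads are assumptions made for illustration. Reads draw from whichever pooled token has the most quota left; mutations keep the caller's own identity.

```python
from dataclasses import dataclass


@dataclass
class PooledToken:
    label: str       # a label only; the real secret stays in the secrets manager
    remaining: int   # requests left in the current rate-limit window


def pick_for_read(pool: list) -> PooledToken:
    """Pick the pooled token with the most remaining quota and spend one request."""
    candidates = [t for t in pool if t.remaining > 0]
    if not candidates:
        raise RuntimeError("all pooled tokens are rate limited")
    chosen = max(candidates, key=lambda t: t.remaining)
    chosen.remaining -= 1
    return chosen


def pick_token(method: str, pool: list, client_token_label: str) -> str:
    """Reads draw from the pool; mutations use the caller's own token."""
    if method.upper() in {"GET", "HEAD"}:
        return pick_for_read(pool).label
    return client_token_label
```

Keeping mutations on the client's token is what keeps attribution clean: a pooled token never shows up as the author of a write.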
Since v0.8.4 there's one more layer — NVIDIA OpenShell pushes this from config down to the OS.
The original working_dir + env whitelist is a config-level constraint: a prompt-injected agent can still try to read other paths. Those attempts fail, but even the error messages can be useful intelligence to an attacker.
Under OpenShell mode the agent runs inside an OpenShell sandbox (compute drivers include Docker / K8s / Podman / MicroVM); filesystem / process / network are cut apart at the OS layer:
- the agent only sees the filesystem inside the sandbox
- credentials are exposed to the sandbox process as opaque placeholders by the OpenShell Gateway — the real secret is substituted by the proxy at egress request time, never visible to the agent
- if it tries to reach an endpoint that's not on the allowlist, the sandbox proxy blocks it directly (network policy enforcement)
From "policy says it shouldn't read this" to "the sandbox can't read this." Same Permission Control principle, one more layer of enforcement.
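The placeholder idea from the list above can be illustrated in a few lines. This is a toy model under stated assumptions, not how the OpenShell Gateway is implemented: the EgressProxy class, the <<NAME>> placeholder format, and the header-only substitution are all invented for the example.

```python
class EgressProxy:
    """Toy proxy that lives outside the sandbox and holds the real secrets."""

    def __init__(self, secrets: dict):
        self._secrets = dict(secrets)

    def placeholder_for(self, name: str) -> str:
        """Return the opaque value the sandboxed agent sees instead of the secret."""
        if name not in self._secrets:
            raise KeyError(name)
        return f"<<{name}>>"

    def resolve_headers(self, headers: dict) -> dict:
        """Swap placeholders for real secrets just before the request leaves."""
        resolved = {}
        for key, value in headers.items():
            for name, secret in self._secrets.items():
                value = value.replace(f"<<{name}>>", secret)
            resolved[key] = value
        return resolved


proxy = EgressProxy({"GITHUB_TOKEN": "ghp_real_value"})
# What the agent's environment contains: a placeholder, never the secret.
agent_env = {"GITHUB_TOKEN": proxy.placeholder_for("GITHUB_TOKEN")}
# What actually goes out on the wire, resolved outside the sandbox.
sent = proxy.resolve_headers({"Authorization": f"Bearer {agent_env['GITHUB_TOKEN']}"})
```

Even a fully compromised agent can only leak the string <<GITHUB_TOKEN>>, which is useless outside the proxy.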
Layer 3: Data Sovereignty — your code doesn't train their models
This is the biggest fear in data-sensitive industries — feeding the entire codebase to OpenAI or Anthropic, not knowing if your IP becomes someone else's model capability.
The fix is an abstraction layer:
[Chat Platform] → OpenAB (ACP) → [AI Agent Backend]
OpenAB isn't locked to any vendor. You can:
General tasks → Claude / GPT / Grok (cheap, fast)
Sensitive code → local LLM (data never leaves your network)
Or hybrid — swap per task
Switching takes one line in config.toml. No code changes, no MCP server changes, no Discord channel changes.
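The per-task swap can be pictured as a routing decision. The sketch below is hypothetical: the path prefixes, the backend labels "hosted" and "local", and the pick_backend function are assumptions for illustration, and OpenAB's real switch is a config.toml setting rather than code like this.

```python
# Hypothetical path prefixes that mark code as sensitive.
SENSITIVE_PREFIXES = ("billing/", "crypto/")


def pick_backend(touched_paths, hosted="hosted", local="local"):
    """Route to the local model if any touched path is marked sensitive."""
    if any(path.startswith(SENSITIVE_PREFIXES) for path in touched_paths):
        return local
    return hosted
```

A task touching only docs/readme.md would go to the hosted backend; add billing/invoice.rs to the same task and the whole task stays on the local model.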
Since v0.8.4 this layer also leveled up.
The abstraction layer answers "which model" — but once the agent runs, it may still try to reach unknown endpoints. You wouldn't know if some third-party MCP server is quietly pinging out.
OpenShell switches this to default-deny egress. The sandbox's network policy is declared on the host side with openshell policy update --add-endpoint — only endpoints on the allowlist can be reached (discord.com, chatgpt.com, api.anthropic.com, etc.).
Even if an agent is prompt-injected and tries to exfiltrate the codebase to an attacker's server, the policy-enforced egress stops it cold. From "trust the abstraction" to "trust the sandbox boundary + policy enforcement."
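Default-deny egress reduces to a membership check on the destination host. The sketch below only illustrates the logic; it is not OpenShell's enforcement code, and the egress_allowed function is a made-up name. The hosts are the examples mentioned above.

```python
from urllib.parse import urlsplit

# Example allowlist using the hosts named in the text.
ALLOWED_HOSTS = frozenset({"discord.com", "chatgpt.com", "api.anthropic.com"})


def egress_allowed(url: str, allowed=ALLOWED_HOSTS) -> bool:
    """Allow a request only if its exact hostname is on the allowlist."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if absent
    return host is not None and host in allowed
```

Matching the exact hostname matters: a lookalike such as discord.com.evil.example is rejected because it is a different host, not a suffix match.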
It's not about how strong your AI is — it's whether your agent can be injected
This is the pain point Erwin and I zeroed in on during our architecture discussion: traditional security blocks network intrusions; the AI-era attack surface is the "human + AI tool" interface.
Real incidents from 2025–2026:
EchoLeak (CVE-2025-32711): One malicious email triggered zero-click prompt injection on M365 Copilot — remote unauthorized data exfiltration
Salesforce ForcedLeak (CVSS 9.4): Hidden instructions in a Web-to-Lead form → AgentForce exfiltrated CRM data with legitimate credentials
Claude / Gemini / Copilot GH Actions hijack: Researchers prompt-injected three major AI agents to steal API keys and access tokens
These aren't theoretical. They're being exploited in the wild — so shrinking the attack surface itself is worth doing sooner rather than later.
That's why OpenAB's core design is no HTTP port — agent communication doesn't go over HTTP (no exposed surface to attack). It uses stdio JSON-RPC with process group isolation. Sessions are destroyed after use.
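To make the "no HTTP port" point concrete, here is a small sketch of one JSON-RPC exchange over stdio with the child in its own process group. It is not OpenAB's implementation: the CHILD echo script stands in for a real agent backend, the "ping" method and call_once helper are invented, and start_new_session only has an effect on POSIX systems.

```python
import json
import subprocess
import sys

# Stand-in child: answers each JSON-RPC request line from stdin on stdout.
CHILD = r"""
import json, sys
while True:
    line = sys.stdin.readline()
    if not line:
        break
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


def call_once(method: str, params: dict) -> dict:
    """Spawn the child, send one request over stdin, read one reply, tear down."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # POSIX: own session and process group
    )
    try:
        request = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        # Closing stdin signals EOF; the child exits and the session is gone.
        proc.stdin.close()
        proc.wait(timeout=5)
        proc.stdout.close()
```

Nothing here listens on a socket, so there is no port to scan; the only way in is through the parent that owns the pipes.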
Your company might not be a household name. But your trade secrets are just as valuable.
United we stand, divided we scale
This is OpenAB's architectural philosophy.
Divided for too long → united. Claude Code, Codex, Gemini, Cursor, Grok — each has its own protocol. ACP consolidates them into one standard. No matter which backend you use, OpenAB speaks the same interface.
United for too long → divided. After unification, use cases diverge. Quick tasks → Grok 4 Fast. Deep reasoning → Grok 4.20 Reasoning. Sensitive data → local LLM. You can switch agents dynamically in the same thread. No new window, no config change.
That's why OpenAB doesn't build agents. It builds the agent broker. Even Elon Musk is buying AI, but what you need isn't the strongest AI. It's an architecture that lets you use all of them safely.
I'm an engineer building open source in the Rust ecosystem. OpenAB is the project I contribute to — an ACP broker that bridges Discord/Slack/Telegram to AI coding agents. If you're thinking about how to let your team use AI agents safely, continue with the rest of the series:
→ wchung.tw/blog/openab-series
#openab #aiagent #enterprise #security #acp #mcp