MCP Is "Working as Designed": Builders Have to Sandbox It Themselves

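Lab-confirmed: this piece is mainly a roundup of builder-side defenses plus a reference design. It is not a productized MCP sandbox that I (Maki) have run end to end. The code snippets are reference designs, not the version actually running anywhere.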

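If the protocol won't block it for you, it's your homework

But this is also exactly where builders should be worried.

Standardizing the socket does not make the current safe.

It is not isolation.

It is not a sandbox.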

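If an MCP server talks to the host over STDIO, and that server runs on your local machine or CI box, then in essence you have only wrapped command execution in a nicer interface.

What makes it worse is that tool output flows back into the model's context.

That means an attacker is not limited to getting a tool to do something it should not do.

They can also hide malicious instructions in the output and wait for the agent to act on them in its next turn.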

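So when a vendor answers you with "working as designed,"

you had better translate it into engineering terms.

It means the protocol layer does not intend to patch this for you.

Then it is the builder's turn to patch it.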

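MCP Boundary: What You Install Is Not a Tool, It Is Executable Capability

The risk boundary of MCP is not about whether it can list tools.

It is about who draws the OS-level limits around those tools.

Case: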

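Primary: On 2026-04-15, OX Security's The Mother of All AI Supply Chains spelled out the root problem: the STDIO launch model in the official MCP SDK carries a configuration-to-command execution design risk. Follow-up coverage from The Hacker News said the issue threatens more than 150 million downloads and more than 7,000 publicly visible servers; Tom's Hardware put the total exposure at roughly 200,000 server instances. More importantly, the risk does not stop at one demo repo. It propagates into the ecosystems builders actually use: LiteLLM, LangChain, LangFlow, and Flowise.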

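Secondary: If you remember the 2025 Q3 wave of CVEs as "the Git MCP server got popped," it is worth correcting that to specific dates and specific components. The public identifiers CVE-2025-53110 and CVE-2025-53109 entered NVD on 2025-07-02 and correspond to prefix-match and symlink boundary bypasses in the official Filesystem reference server, not the Git server. The correction matters because it makes the point more directly: even an official reference server can draw the "allowed directory" wrong, so builders have even less reason to treat a reference implementation as a production sandbox.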
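To make "prefix-match and symlink boundary bypass" concrete, here is a minimal sketch of that general class of mistake and one common way to close it. It is not the Filesystem server's actual code; the function names and directory layout are hypothetical, and the demo assumes a POSIX-like system where creating symlinks is allowed.

```python
import os
import tempfile


def naive_is_allowed(path: str, allowed_dir: str) -> bool:
    # Flawed on purpose: plain string prefix match, symlinks never resolved.
    return os.path.abspath(path).startswith(allowed_dir)


def is_allowed(path: str, allowed_dir: str) -> bool:
    # Resolve symlinks on both sides, then compare whole path components.
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(allowed_dir)
    return os.path.commonpath([real_path, real_root]) == real_root


def demo() -> dict:
    base = os.path.realpath(tempfile.mkdtemp())
    allowed = os.path.join(base, "data")
    sibling = os.path.join(base, "data_private")  # shares the "data" prefix
    os.makedirs(allowed)
    os.makedirs(sibling)
    secret = os.path.join(sibling, "secret.txt")
    with open(secret, "w") as f:
        f.write("x")
    link = os.path.join(allowed, "link")
    os.symlink(sibling, link)  # symlink inside the allowed dir that points out of it
    via_link = os.path.join(link, "secret.txt")
    return {
        "naive_sibling": naive_is_allowed(secret, allowed),
        "naive_symlink": naive_is_allowed(via_link, allowed),
        "strict_sibling": is_allowed(secret, allowed),
        "strict_symlink": is_allowed(via_link, allowed),
        "strict_inside": is_allowed(os.path.join(allowed, "ok.txt"), allowed),
    }
```

The naive check accepts both the sibling directory and the symlink route, while the resolved, component-wise check rejects both and still accepts a file that really lives inside the allowed directory. A check like this narrows the gap but does not replace an OS-level sandbox, since a path can change between the check and the actual file access.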

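Third layer: The published PoisonedRAG research showed long ago that as few as five malicious texts are enough to steer a targeted answer with a success rate above 90% in a knowledge base with millions of entries. Security community discussion on 2026-04-26 pushed this one step further: if poisoned retrieval affects not just answers but also tool configuration, server recommendations, or even the dynamic onboarding of new MCP servers, the attack path escalates from answer hijack to host action. I treat this as a research-backed inference about builder risk, not as a single, fully disclosed CVE.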

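Core question: who draws the sandbox for the MCP server?

Builder must-do 1: audit before install. Keep a trust list, and keep a deny list too.

Builder must-do 2: runtime sandbox. Docker, nsjail, or firejail all beat running bare on the host.

Builder must-do 3: tool call allowlist. Do not let the LLM assemble arbitrary tool names and args.

Builder must-do 4: output sanitization. Mark tool output as untrusted before it reaches the LLM.

Builder must-do 5: bind to 127.0.0.1. Without a clear need for remote access, do not expose the server externally.

Rule: MCP makes integration easy, but it does not make isolation appear on its own.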
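For must-do 1, a pre-install audit can start as a check on which executable a server entry would launch. The sketch below assumes the `mcpServers` / `command` / `args` JSON layout that several MCP clients use for STDIO servers; the trust list, the deny list, and the three verdict names are hypothetical placeholders, not a recommended policy.

```python
import json
import os

# Hypothetical lists for illustration; a real audit would also review args and source.
TRUSTED_COMMANDS = {"docker"}
DENIED_COMMANDS = {"sh", "bash", "curl", "wget", "powershell"}


def audit_mcp_config(config_text: str) -> dict[str, str]:
    """Return a verdict per server name: "allow", "deny", or "quarantine"."""
    servers = json.loads(config_text).get("mcpServers", {})
    verdicts = {}
    for name, entry in servers.items():
        # Compare on the executable name so "/bin/bash" and "bash" match the same rule.
        command = os.path.basename(str(entry.get("command", "")))
        if command in DENIED_COMMANDS:
            verdicts[name] = "deny"
        elif command in TRUSTED_COMMANDS:
            verdicts[name] = "allow"
        else:
            verdicts[name] = "quarantine"  # unknown launcher: hold for manual review
    return verdicts
```

Anything that is neither trusted nor denied lands in quarantine by default, so a new launcher never runs just because nobody has listed it yet.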

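MCP attack surface	Mitigation	Builder action
Unreviewed MCP installed from a marketplace	`audit-skill` / manual source review	Quarantine before install
Tool output injected into the next prompt	Wrap tool output in `<untrusted>` tags	Always sanitize before it reaches the LLM
Dynamic tool onboarding out of control	Tool allowlist + capability table	Every agent explicitly declares `required_caps`
MCP server exposed externally	Default to `127.0.0.1`; open remote access only over Tailscale	`docker compose` publishes on loopback only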
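The "capability table" row can be sketched as a lookup from tool name to the capability it needs, checked against what each agent declared up front. Only `required_caps` comes from the table above; the tool names, capability strings, and the `AgentProfile` class are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical capability table: the capability each known tool requires.
TOOL_CAPS = {
    "read_file": "fs.read",
    "write_file": "fs.write",
    "http_get": "net.outbound",
}


@dataclass
class AgentProfile:
    name: str
    required_caps: set[str] = field(default_factory=set)


def authorize(agent: AgentProfile, tool: str) -> bool:
    cap = TOOL_CAPS.get(tool)
    # A tool with no table entry (for example, one onboarded dynamically) is refused.
    return cap is not None and cap in agent.required_caps
```

For example, an agent declared as `AgentProfile("summarizer", {"fs.read"})` may call `read_file`, but not `write_file`, and not any tool that never made it into the table.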

Reference implementation (illustrative, not production code; adapt it to your environment):

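import json
import subprocess
import sys


def wrap_content(content, tool: str):
    """Wrap tool output in <untrusted> tags (MCP result content is a list of blocks)."""
    tag = '<untrusted source="mcp:{}">{}</untrusted>'
    if isinstance(content, str):
        return tag.format(tool, content)
    for block in content:
        if isinstance(block, dict) and block.get("type") == "text":
            block["text"] = tag.format(tool, block.get("text", ""))
    return content


def mcp_proxy(mcp_server_cmd: list[str], allowed_tools: set[str]):
    # Start the server inside a container with networking disabled.
    proc = subprocess.Popen(
        ["docker", "run", "--rm", "--network=none", "-i", "mcp-sandbox", *mcp_server_cmd],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        for line in sys.stdin:  # one JSON-RPC message per line; ends cleanly on EOF
            line = line.strip()
            if not line:
                continue
            req = json.loads(line)
            current_tool = None
            if req.get("method") == "tools/call":
                current_tool = req["params"]["name"]
                if current_tool not in allowed_tools:
                    raise PermissionError(f"tool not allowed: {current_tool}")
            proc.stdin.write(line + "\n")
            proc.stdin.flush()
            if "id" not in req:
                continue  # notifications get no reply, so do not block on readline
            # Simplification: assumes the next line from the server is the reply.
            raw = proc.stdout.readline()
            if not raw:
                break  # server exited
            resp = json.loads(raw)
            if current_tool and isinstance(resp.get("result"), dict):
                resp["result"]["content"] = wrap_content(
                    resp["result"].get("content", []), current_tool
                )
            print(json.dumps(resp), flush=True)
    finally:
        proc.stdin.close()
        proc.terminate()  # ask docker to stop the container when the proxy stops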

The purpose of this code is simple.

The server goes into a container first.

Every tool call goes through the allowlist first.

Output gets marked as an untrusted source first.
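One gap in the tagging step is worth spelling out. The reference proxy interpolates tool output straight into the `<untrusted>` wrapper, so output that itself contains `</untrusted>` could close the tag early. A minimal sketch of one fix, assuming HTML-style entity escaping is acceptable for your prompt format (the function name `tag_untrusted` is hypothetical):

```python
from html import escape


def tag_untrusted(text: str, tool: str) -> str:
    # Escape &, <, > in the payload so it cannot close the wrapper early,
    # and escape quotes in the tool name because it sits inside an attribute.
    return '<untrusted source="mcp:{}">{}</untrusted>'.format(
        escape(tool, quote=True), escape(text, quote=False)
    )
```

Escaping only keeps the wrapper intact. Whether the model actually treats tagged content as data rather than instructions still depends on the system prompt and the rest of the defenses above.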

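OWASP mapping: which entries to read first for this piece

MCP05 Command Injection & Execution: when a STDIO server can, in essence, launch local capabilities, the command boundary cannot rest on goodwill alone.
ASI02 Tool Misuse: many incidents are not about whether the model can answer, but about which tools it is allowed to touch.
LLM03:2025 Supply Chain: MCP servers, registries, and reference servers are all part of the agent's supply chain now.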

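Want to build it? The concepts in this piece continue in: Attack Surface Boundaries

If you want to turn MCP sandboxing, allowlists, and approval gates into a complete builder workflow, go straight to: Attack Surface Boundaries 101 →

To extend this into permission design, read next: Permission Boundaries 101 →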