Attack Surface Boundary: You Have Opened More Holes Than You Think

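Lab-confirmed: the reference implementations in this article come from the personal PKI I (Maki) run at home (mk-brain, ERIKA Bot, Inbox Bot, and others), not from an enterprise production case study. The code snippets are reference designs, not the versions actually running.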

The periphery around the model is often where things break first

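When builders talk about AI security, the focus naturally lands on prompts and outputs.

But what actually strings an incident together is often the surrounding toolchain.

It is not what the model said.

It is what a package exposed, where localhost was bound, why a tool API trusted its caller, and which publish route the model was somehow allowed to choose for itself.

That is what makes the attack surface boundary troublesome.

Unlike prompt injection, it has no single, clearly malicious string.

It is more like a pile of holes that each look small on their own but chain into an exploit when put together.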

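The Claude Code CLI source map leak (npm package @anthropic-ai/claude-code v2.1.88, found by a researcher in 2026-03) is a reminder that AI features often do not fail at the model itself.

They fail at the metadata, front-end assets, and debug information you assumed were "just supporting files." Mistakenly bundling cli.js.map into the npm package handed over roughly 500,000 lines of TypeScript source, unreleased feature flags, the system prompt, and the memory implementation all at once.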
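One cheap guard against this kind of leak is to scan the directory you are about to publish for source maps and fail the release if any turn up. The sketch below is a hypothetical pre-publish check, not something taken from the incident write-ups; the function name `find_source_maps` and the idea of wiring it into a release script are assumptions.

```python
from pathlib import Path


def find_source_maps(package_dir: str) -> list[str]:
    """Return the relative paths of every *.map file under package_dir.

    Hypothetical helper: call it on the directory that will be packed and
    published, and abort the release if the list is not empty.
    """
    root = Path(package_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.map")
        if path.is_file()
    )


if __name__ == "__main__":
    import sys

    leaked = find_source_maps(sys.argv[1] if len(sys.argv) > 1 else ".")
    for name in leaked:
        print(f"source map would be published: {name}")
    # A non-zero exit code lets a CI step block the publish.
    sys.exit(1 if leaked else 0)
```

A check like this only catches files that match `*.map`; it does not replace reviewing the full file list of the package before release.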

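The case of the Taipei Metro AI customer service bot (built on Azure OpenAI), which was turned into a code generator in November 2024, is even more direct.

After the vendor upgraded the service with AI, users found they could ask it programming questions; another prompt pulled out the system prompt as well.

The root cause is not complicated.

The prompt scope had no limits.

There was no positive list saying "only answer transit queries."

Out-of-scope questions were not filtered.

This was not RCE and not a localhost auth problem. A customer service AI was opened up as a general chatbot, and nobody drew its scope.

The vendor took the service offline the same day, and several security media outlets (CyCraft, Trend Micro, iThome, 天下未來城市) cited the incident as a representative case of prompt injection and scope control.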
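The two missing controls, a positive list of topics and a filter for everything else, can be sketched as a gate that runs before the model is ever called. Everything here is hypothetical: the topic keywords, the refusal text, and the `call_model` callback are assumptions for illustration, and plain keyword matching is a crude stand-in for whatever classifier a real deployment would use. It also does nothing about system prompt extraction, which needs its own output check.

```python
from typing import Callable, Optional

# Hypothetical positive list: only these transit topics are in scope.
ALLOWED_TOPICS = {
    "fare": ("fare", "ticket"),
    "schedule": ("timetable", "first train", "last train"),
    "route": ("route", "transfer"),
}

REFUSAL = "I can only help with transit questions."


def classify_topic(question: str) -> Optional[str]:
    """Return the first allowed topic whose keyword appears, else None."""
    text = question.lower()
    for topic, keywords in ALLOWED_TOPICS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def answer(question: str, call_model: Callable[[str], str]) -> str:
    """Refuse out-of-scope questions without ever reaching the model."""
    if classify_topic(question) is None:
        return REFUSAL
    return call_model(question)
```

The design point is the order of operations: the deterministic scope check decides whether the model is consulted at all, instead of asking the model to police its own scope.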

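For engineers who do not specialize in security, the most valuable thing about the attack surface boundary is that it is concrete.

You do not need to become a red team first.

You only need to list what is exposed, then fix the default at each entry point.

Often the risk does not come from the AI being too capable. It comes from treating too many local or internal capabilities as if intranet trust were a given.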

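Attack Surface Boundary: What Can Actually Be Reached from Outside

The attack surface boundary is not an abstract concept. It is every port, header, tool route, and default bind address.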

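Case:

Primary: the Claude Code CLI source map leak (npm package @anthropic-ai/claude-code v2.1.88, found by a researcher on 2026-03-31). Mistakenly bundling cli.js.map into the npm package leaked about 512,000 lines of TypeScript source, the agent loop, multi-agent coordination, 44 unreleased feature flags, the system prompt, and the persistent memory implementation. The attack surface shifted from the LLM itself to the toolchain around the LLM, which is exactly the class of risk that MCP04 (Software Supply Chain) in the OWASP MCP Top 10 targets.

Secondary: the Taipei Metro AI customer service bot turned into a code generator (built on Azure OpenAI, 2024-11). The problem was not localhost or tool auth but an unbounded prompt scope: there was no positive list saying "only answer transit queries," and out-of-scope questions were not blocked. The vendor took the service offline the same day, and CyCraft, Trend Micro, iThome, and 天下未來城市 have since repeatedly cited the incident as a representative case of prompt injection and scope control.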

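Core question: how much autonomy do you give the model, and what is the minimum auth you should set?

Rule: every local agent service binds to 127.0.0.1 by default, not 0.0.0.0. Every LLM tool API must require HMAC plus a replay window (the OneUptime 2026-01 canonical pattern).

Shrink the reachable surface first, then talk about whether the model should be smarter. The order cannot be reversed.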
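As a minimal sketch of the binding rule, the server factory below defaults to the loopback address and refuses any other host unless the caller explicitly opts in. The names `make_server`, `allow_lan`, and `HealthHandler` are hypothetical, and the standard-library `http.server` is used only to keep the example self-contained; the same default applies to whatever framework actually hosts the agent service.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

DEFAULT_BIND_HOST = "127.0.0.1"
LOOPBACK_HOSTS = {"127.0.0.1", "localhost"}


class HealthHandler(BaseHTTPRequestHandler):
    """Placeholder handler standing in for a local agent service."""

    def do_GET(self) -> None:
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")


def make_server(
    port: int,
    host: str = DEFAULT_BIND_HOST,
    allow_lan: bool = False,
) -> ThreadingHTTPServer:
    """Create a server that stays on loopback unless LAN exposure is declared."""
    if host not in LOOPBACK_HOSTS and not allow_lan:
        raise ValueError(
            f"refusing to bind to {host!r}; pass allow_lan=True to expose it"
        )
    return ThreadingHTTPServer((host, port), HealthHandler)


if __name__ == "__main__":
    # Loopback only: reachable from this machine, not from the LAN.
    make_server(8080).serve_forever()
```

Making exposure an explicit argument means a 0.0.0.0 binding shows up in code review as a deliberate choice, not as a copied default.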

Threat	Mitigation	Validated in (reference)
Localhost service accidentally bound to 0.0.0.0	Every service defaults to `127.0.0.1`; LAN exposure must be declared explicitly and restricted through nginx/Tailscale	OpenClaw / memory-hall / mk-brain RAG
Tool API called through prompt injection	HMAC `sha256=<hex>` + `X-Timestamp` + `|now - ts| < 300s` replay window; constant-time comparison with `hmac.compare_digest`	memory-hall API
Tool Misuse (ASI02) / Command Injection (MCP05)	Output passes a deterministic policy gate (allowlist for `route_to`; the LLM never decides the publish target); critical actions go through second-agent approval	mk-brain agent council
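The policy gate in the last row can be sketched as a pure function: the model may propose a `route_to` value, but plain code decides whether it is honored. The route names and the `approved_by_second_agent` flag below are hypothetical placeholders, not the mk-brain implementation.

```python
from dataclasses import dataclass

# Hypothetical routes: anything not listed here is rejected outright.
ALLOWED_ROUTES = frozenset({"draft_queue", "internal_notes"})
# Hypothetical critical routes: allowed only with a second approval.
CRITICAL_ROUTES = frozenset({"public_blog"})


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def gate(route_to: str, approved_by_second_agent: bool = False) -> Decision:
    """Deterministically decide whether a model-proposed route may be used."""
    if route_to in ALLOWED_ROUTES:
        return Decision(True, "allowlisted route")
    if route_to in CRITICAL_ROUTES:
        if approved_by_second_agent:
            return Decision(True, "critical route with second approval")
        return Decision(False, "critical route needs second approval")
    return Decision(False, "route not on allowlist")
```

Because the gate has no model in the loop, the same input always yields the same decision, which makes it straightforward to unit test and audit.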

Reference implementation (illustrative, not production code; adapt it to your environment):

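# HMAC + replay window: minimal auth for an LLM tool API
import hashlib
import hmac
import time

REPLAY_WINDOW_SECONDS = 300


def verify_request(
    body: bytes,
    signature_header: str,
    timestamp_header: str,
    key: bytes,
) -> bool:
    # 1. Replay window: reject missing, malformed, or stale timestamps
    try:
        ts = int(timestamp_header)
    except (TypeError, ValueError):
        return False
    if abs(time.time() - ts) > REPLAY_WINDOW_SECONDS:
        return False

    # 2. Recompute the HMAC over "<ts>." + raw body bytes
    #    (no decode step, so a non-UTF-8 body cannot raise)
    expected = hmac.new(
        key,
        f"{ts}.".encode() + body,
        hashlib.sha256,
    ).hexdigest()

    # 3. Require the "sha256=" prefix, then compare in constant time
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    received = signature_header[len(prefix):]
    return hmac.compare_digest(expected.encode(), received.encode())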
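For completeness, here is a sketch of the caller side that produces headers `verify_request` would accept: it signs the same `"<ts>." + body` message with the shared key. The header name `X-Signature` and the `now` parameter are assumptions for illustration; the article only names `X-Timestamp` and the `sha256=<hex>` format.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_request(body: bytes, key: bytes, now: Optional[int] = None) -> dict:
    """Build the timestamp and signature headers for one tool API call.

    `now` exists only so tests can pin the timestamp; normal callers omit it.
    """
    ts = int(time.time()) if now is None else int(now)
    digest = hmac.new(key, f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
    return {
        "X-Timestamp": str(ts),
        # Hypothetical header name; use whatever your server reads.
        "X-Signature": f"sha256={digest}",
    }
```

On the server, the two header values and the raw request body go straight into `verify_request`. Note that the replay window only bounds how long a captured request stays valid; rejecting a second use inside those 300 seconds would need an additional nonce or request-ID cache.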

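Mapping to OWASP: which items to read first for this article

MCP05 Command Injection & Execution: once a tool route can be steered by the model or by external input, command injection stops being theory and becomes an implementation problem.
MCP01 Token Mismanagement: minimal auth mechanisms such as signatures, short-lived tokens, and replay windows determine whether your tool API is really a gate at all.
ASI02 Tool Misuse: many incidents happen not because the model cannot answer, but because it is too easily allowed to touch the wrong tool.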

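Want to try it hands-on? The concepts in this article have a matching five-lesson course: Attack Surface Boundary 101 →

You can also go back to the start of the series: Trust Boundary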