Abstract
Large language models (LLMs) expose new attack surfaces (prompt and indirect injection, data poisoning, model inversion, and provenance opacity) that current web-era controls do not adequately address. We propose IC-Guard, a whitepaper-style architecture that uses the Internet Computer (ICP) as a cryptographically anchored verification, authentication, and data-integrity substrate for AI systems. IC canisters issue attestations, certify data and outputs, enforce origin-bound identity and access, and coordinate hybrid confidential execution for high-sensitivity inference. We specify threat models, protocol flows, data structures, and implementation details leveraging ICP primitives (chain-key cryptography, threshold ECDSA, certified data/HTTP certification, Internet Identity, HTTPS outcalls) and confidential computing attestation. The result is a layered trust backbone for AI stacks, one that is verifiable end-to-end and auditable at byte-level granularity across prompts, RAG inputs, model versions, and responses. (OWASP Cheat Sheet Series)
1) Motivation & Threat Model
Threats addressed
- Prompt & indirect injection: adversarial instructions embedded in user input or external content (e.g., web pages, emails, RAG corpora) hijack model behavior. The need for mitigation is now widely documented (OWASP LLM01; Microsoft guidance on indirect injection). (OWASP Cheat Sheet Series)
- Data poisoning: tainted training or RAG data biases outputs; demonstrated in medical/biomedical settings and reviewed systematically in 2025. (Nature)
- Opaque provenance & audit gaps: users and auditors cannot cryptographically prove which sources, versions, or prompts influenced an answer. NIST’s AI RMF/GenAI guidance calls for governance, traceability, and verifiability. (NIST Publications)
- Stealth exfiltration: new attacks (e.g., “Imprompter”) hide instructions to extract PII from chats, underscoring the need for cryptographic controls tied to data flow rather than heuristic filters alone. (WIRED)
Security goals
- Source integrity & lineage: verifiable content digests and attestations for every RAG item, prompt component, and model artifact.
- Execution integrity: proof that authorized code ran in an expected environment (on-chain or in attested confidential compute), with policy guardrails evaluated by verifiers.
- Origin-bound identity & access: per-canister cryptographic authentication and scoped delegations; signed responses tied to canister identities.
- Tamper-evident outputs: client-verifiable certificates on result payloads, preventing downgrade, replay, and on-path manipulation.
- Audit & governance: immutable logs and community/DAO control where applicable (e.g., SNS governance). (Internet Computer)
2) ICP Primitives for an AI Trust Backbone
- Chain-key cryptography & threshold signatures: subnets jointly hold key shares; no single node holds a private key. Enables one-shot signatures, fast finality, and cross-chain signing. (Learn Internet Computer)
- Threshold ECDSA (tECDSA): canisters sign messages/transactions without ever materializing private keys; keys derivable per canister and path. Useful for attestation tokens, content digests, and cross-chain proofs. (Internet Computer)
- Certified data & HTTP certification: canisters commit a 32-byte certified root; clients verify query responses against a certificate chain (subnet key → IC root key) and Merkle proofs, enabling CDN-speed queries with cryptographic authenticity. (Internet Computer)
- Internet Identity (WebAuthn-based): origin-bound, device-anchored passkeys for user auth; supports delegation chains scoped to canisters. (Internet Computer)
- HTTPS outcalls: canisters fetch off-chain data without trusted oracles; costs paid in cycles; responses can be re-hashed and certified on-chain. (Internet Computer)
- Boundary nodes & certified content: edge layer routes, rate-limits, and caches cryptographically verified responses; clients verify certificates. (Internet Computer)
- State & upgrades: stable memory and orthogonal persistence allow large, versioned state with upgrade hooks; certification ties state commitments to outputs. (Limits vary by docs/subnet; design for chunked authenticated structures.) (Internet Computer)
3) Architecture Overview (IC-Guard)
Components
- Provenance Registry Canister (PRC)
  - Stores authenticated digests for sources (RAG docs, datasets), prompts, model versions, policies.
  - Maintains Merkle trees / authenticated maps; commits root in certified_data. (Internet Computer)
- Policy & Risk Canister (PRC-P)
  - Encodes allow/deny and taint-propagation policies for inputs (e.g., “external HTML with untrusted origin cannot set instructions”).
  - Uses OWASP LLM01/NIST control mappings for explainable audit. (OWASP Cheat Sheet Series)
- Attestation & Keys Canister (AKC)
  - Issues short-lived attestation tokens (tECDSA-signed JWT-like or COSE_Sign1 structures) binding {prompt_hash, source_set_root, model_hash, policy_version, nonce, time}.
  - Verifies confidential compute quotes (SGX/TDX/SEV) during hybrid inference and records results. (Internet Computer)
- Inference Gateway Canister (IGC)
  - Orchestrates inference: checks policy, obtains attestation token from AKC, sends HTTPS outcall to an Attested Inference Service (AIS) or routes to an On-chain Model Canister when feasible. (Internet Computer)
- Attested Inference Service (AIS, off-chain)
  - Runs model inside a confidential VM/TEE; emits remote-attestation evidence and output record {attestation, output_hash, token} back to AKC/IGC. (Confidential Computing Consortium)
High-level properties
- All critical artifacts are hashed and bound into certified data structures (Merkle roots) whose root is certified by the subnet's chain-key signature; attestation tokens are signed separately via tECDSA. End users receive outputs with verifiable certificate chains. (Learn Internet Computer)
- Identity & access: Internet Identity (WebAuthn) binds user sessions to canister origins; short-lived delegations restrict capability scope. (Internet Computer)
4) Protocol Flows
4.1 Source Registration (RAG corpus or training shard)
- Client uploads document D.
- PRC computes hD = H(D), updates authenticated map, recomputes root_RAG.
- PRC sets certified_data = root_RAG (32 bytes); stores versioned map shards off-heap as needed.
- PRC returns (doc_id, hD, proofD, root_RAG, cert_chain). Client can verify via certificate chain and Merkle proof. (Internet Computer)
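The registration steps above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the PRC implementation: `Registry`, `toy_hash`, and `root_and_proof` are hypothetical names, `toy_hash` is a non-cryptographic stand-in for SHA-256 so the sketch needs only the standard library, and a plain binary Merkle tree (duplicating the last node on odd levels) is swapped in for the sparse tree named in Section 5. Setting certified_data and returning the certificate chain are left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

type Digest = [u8; 32];

/// Stand-in for SHA-256 (assumption): NOT cryptographic, std-only for the sketch.
fn toy_hash(data: &[u8]) -> Digest {
    let mut out = [0u8; 32];
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        h.write_u8(i as u8);
        h.write(data);
        chunk.copy_from_slice(&h.finish().to_be_bytes());
    }
    out
}

/// N = H(left_child || right_child)
fn node(left: &Digest, right: &Digest) -> Digest {
    let mut buf = [0u8; 64];
    buf[..32].copy_from_slice(left);
    buf[32..].copy_from_slice(right);
    toy_hash(&buf)
}

/// Hypothetical in-memory registry: doc_id -> hD, ordered by doc_id.
#[derive(Default)]
struct Registry {
    docs: BTreeMap<String, Digest>,
}

impl Registry {
    /// Hash the document and update the map. Returns hD.
    fn register(&mut self, doc_id: &str, doc: &[u8]) -> Digest {
        let hd = toy_hash(doc);
        self.docs.insert(doc_id.to_string(), hd);
        hd
    }

    /// Recompute root_RAG and, if `doc_id` is registered, its proofD.
    /// Each proof entry is (sibling digest, sibling_is_left).
    fn root_and_proof(&self, doc_id: &str) -> (Digest, Option<Vec<(Digest, bool)>>) {
        let mut level: Vec<Digest> = self.docs.values().copied().collect();
        if level.is_empty() {
            return ([0u8; 32], None);
        }
        let mut idx = self.docs.keys().position(|k| k.as_str() == doc_id);
        let mut proof = Vec::new();
        while level.len() > 1 {
            if level.len() % 2 == 1 {
                let last = *level.last().unwrap();
                level.push(last); // pad odd levels by repeating the last node
            }
            if let Some(i) = idx {
                let sib = i ^ 1;
                proof.push((level[sib], sib < i));
                idx = Some(i / 2);
            }
            level = level.chunks(2).map(|p| node(&p[0], &p[1])).collect();
        }
        (level[0], idx.map(|_| proof))
    }
}
```

A caller would register a document, then hand the returned root to the certification step and the proof to the client.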
4.2 Prompt Preparation & Policy Check
- Client submits prompt P plus selected sources {D_i}.
- PRC computes hP, constructs source set digest root_src.
- PRC-P evaluates policies (e.g., disallow HTML-embedded instructions from untrusted domains; strip patterns flagged by OWASP LLM01), emits policy_version and a taint summary. (OWASP Cheat Sheet Series)
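One way to picture the policy step is a small rule table over context items. The sketch below is illustrative only: the `Origin` and `Role` categories, the `Decision` shape, and the rule set are assumptions chosen to mirror the example rule that untrusted external content cannot set instructions. It does not attempt pattern stripping.

```rust
/// Hypothetical classification of where a context item came from.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Origin {
    User,          // typed by the authenticated user
    RegisteredRag, // digest present in the provenance registry
    UntrustedWeb,  // fetched from a domain outside the allowlist
}

/// What the item is asking to do in the assembled prompt.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Role {
    Instruction, // may steer the model
    Data,        // may only be quoted or summarized
}

struct ContextItem {
    origin: Origin,
    requested_role: Role,
}

#[derive(Debug, PartialEq, Eq)]
struct Decision {
    policy_version: u64,
    allowed: bool,
    /// Taint summary: indices of items that were demoted or flagged.
    tainted: Vec<usize>,
}

const POLICY_VERSION: u64 = 1; // illustrative value

fn evaluate(items: &[ContextItem]) -> Decision {
    let mut tainted = Vec::new();
    let mut allowed = true;
    for (i, item) in items.iter().enumerate() {
        match (item.origin, item.requested_role) {
            // Only the user may set instructions.
            (Origin::User, _) => {}
            // Registered sources are data-only; an instruction request is demoted.
            (Origin::RegisteredRag, Role::Instruction) => tainted.push(i),
            (Origin::RegisteredRag, Role::Data) => {}
            // Untrusted content claiming instruction authority fails the check.
            (Origin::UntrustedWeb, Role::Instruction) => allowed = false,
            // Untrusted data is admitted but recorded in the taint summary.
            (Origin::UntrustedWeb, Role::Data) => tainted.push(i),
        }
    }
    Decision { policy_version: POLICY_VERSION, allowed, tainted }
}
```

The returned `policy_version` is the value that would later be bound into the attestation token, so a verifier can tell which rule set admitted the context.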
4.3 Attestation Token Issuance
- Client (or IGC) requests token from AKC with {hP, root_src, model_hash, policy_version}.
- AKC issues signed token τ = Sign_tECDSA(hP || root_src || model_hash || policy_version || nonce || ts) with derived per-canister key. (Internet Computer)
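The concatenation inside the signature only binds the fields if every signer and verifier serializes them identically. A minimal sketch of one such fixed-width encoding follows; the field widths, the `DOMAIN_TAG` label, and the names `TokenFields` and `token_preimage` are assumptions for illustration, and the resulting bytes would still be hashed before being passed to the threshold signing call.

```rust
/// Fields bound into the token. Widths are assumptions made for this sketch.
struct TokenFields {
    hp: [u8; 32],
    root_src: [u8; 32],
    model_hash: [u8; 32],
    policy_version: u64,
    nonce: [u8; 16],
    ts_nanos: u64,
}

/// Hypothetical domain-separation label so these bytes cannot be confused
/// with any other message the same key might sign.
const DOMAIN_TAG: &[u8] = b"ic-guard/token/v1";

/// Serialize hP || root_src || model_hash || policy_version || nonce || ts
/// with fixed widths and big-endian integers.
fn token_preimage(f: &TokenFields) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + DOMAIN_TAG.len() + 32 * 3 + 8 + 16 + 8);
    out.push(DOMAIN_TAG.len() as u8);
    out.extend_from_slice(DOMAIN_TAG);
    out.extend_from_slice(&f.hp);
    out.extend_from_slice(&f.root_src);
    out.extend_from_slice(&f.model_hash);
    out.extend_from_slice(&f.policy_version.to_be_bytes());
    out.extend_from_slice(&f.nonce);
    out.extend_from_slice(&f.ts_nanos.to_be_bytes());
    out
}
```

Because every field has a fixed width, no two distinct field tuples produce the same byte string, which is the property plain `||` concatenation of variable-length values would not guarantee.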
4.4 Inference (Hybrid)
- On-chain (small model or verifier): IGC calls Model Canister → result R, logs (hP, root_src, model_hash), updates certified_data = root_out.
- Off-chain attested: IGC makes HTTPS outcall to AIS with τ; AIS verifies τ and runs in TEE; returns {R, output_hash, attestation_quote, τ}. AKC verifies quote (reference measurement), records acceptance, and returns a verification receipt σ_ver. (Internet Computer)
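Beyond checking the signature on τ, the AIS and AKC each need a few acceptance checks. The sketch below shows plausible ones under stated assumptions: signature and quote-signature verification are assumed to have already succeeded and are not shown, and `Verifier`, `Reject`, `ttl_secs`, `skew_secs`, and `measurement_ok` are hypothetical names rather than part of any ICP or TEE vendor API.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
enum Reject {
    FromFuture,
    Expired,
    WrongModel,
    Replayed,
}

/// The token fields these checks need (a subset of the full payload).
struct Token {
    model_hash: [u8; 32],
    nonce: [u8; 16],
    ts_secs: u64,
}

/// AIS-side state for accepting tokens.
struct Verifier {
    ttl_secs: u64,
    skew_secs: u64,
    seen_nonces: HashSet<[u8; 16]>,
    loaded_model: [u8; 32],
}

impl Verifier {
    /// Freshness, model binding, and replay checks for an already-authenticated token.
    fn accept(&mut self, t: &Token, now_secs: u64) -> Result<(), Reject> {
        if t.ts_secs > now_secs + self.skew_secs {
            return Err(Reject::FromFuture);
        }
        if now_secs.saturating_sub(t.ts_secs) > self.ttl_secs {
            return Err(Reject::Expired);
        }
        if t.model_hash != self.loaded_model {
            return Err(Reject::WrongModel);
        }
        // Record the nonce last, so rejected tokens do not consume it.
        if !self.seen_nonces.insert(t.nonce) {
            return Err(Reject::Replayed);
        }
        Ok(())
    }
}

/// AKC side: a quote's measurement must appear on the reference list.
fn measurement_ok(measurement: &[u8], reference: &[Vec<u8>]) -> bool {
    reference.iter().any(|r| r.as_slice() == measurement)
}
```

Keeping the nonce set bounded (for example by dropping entries older than the TTL) would be needed in practice and is omitted here.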
4.5 Certified Delivery
- IGC anchors (hP, root_src, output_hash, σ_ver) into a Merkle accumulator; updates certified_data.
- Client fetches the result as a certified query (or via certified HTTP asset). Client verifies: IC root → subnet key → canister certificate → Merkle proof for output_hash. (Learn Internet Computer)
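The last link in that verification chain, the Merkle proof for output_hash, can be sketched as follows. This is an illustration of the document's own N = H(left || right) scheme, not the Internet Computer's certificate or hash-tree encoding; the hash function is passed in as a parameter (SHA-256 would be the natural choice, which is an assumption here), and `Step` and `verify_inclusion` are hypothetical names. Verifying the certificate signatures up to the IC root key is not shown.

```rust
type Digest = [u8; 32];

/// One proof step: the sibling digest and whether it sits to the left.
struct Step {
    sibling: Digest,
    sibling_is_left: bool,
}

/// Fold the leaf up the sibling path and compare with the certified root.
fn verify_inclusion<H>(hash: H, leaf: Digest, path: &[Step], certified_root: &Digest) -> bool
where
    H: Fn(&[u8]) -> Digest,
{
    let mut acc = leaf;
    for step in path {
        let (left, right) = if step.sibling_is_left {
            (&step.sibling, &acc)
        } else {
            (&acc, &step.sibling)
        };
        let mut buf = [0u8; 64];
        buf[..32].copy_from_slice(left);
        buf[32..].copy_from_slice(right);
        acc = hash(&buf);
    }
    &acc == certified_root
}
```

A client would call this only after the certificate carrying `certified_root` has itself been verified; a valid proof against an unverified root proves nothing.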
5) Data Structures
Authenticated map (sparse Merkle)
Leaf: L = H(namespace || key || value_hash)
Node: N = H(left_child || right_child)
Root: R = H(…)
- namespace ∈ {RAG, PROMPT, MODEL, POLICY, OUTPUT} for multi-lattice provenance.
- value_hash can be SHA-256 of canonicalized bytes; large artifacts chunked with rolling hashes.
- The 32-byte certified_data holds R; arbitrary volumes are certified via proofs. (Internet Computer)
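The leaf formula concatenates three fields, so the byte layout matters: if namespace and key were both variable-length strings, ("RAG", "x") and ("RA", "Gx") would yield the same bytes. A minimal sketch of one unambiguous layout follows, assuming a one-byte namespace tag and a 32-byte value_hash so the key boundaries are implied. The tag values and the names `leaf_preimage` and `split_leaf` are illustrative assumptions.

```rust
/// One-byte namespace tags (values are an assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Namespace {
    Rag = 1,
    Prompt = 2,
    Model = 3,
    Policy = 4,
    Output = 5,
}

/// Bytes fed to H for a leaf: namespace tag || key || value_hash.
fn leaf_preimage(ns: Namespace, key: &[u8], value_hash: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + key.len() + 32);
    out.push(ns as u8);
    out.extend_from_slice(key);
    out.extend_from_slice(value_hash);
    out
}

/// Recover (tag, key, value_hash) from a leaf preimage, showing that the
/// layout can be parsed back in exactly one way.
fn split_leaf(bytes: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    if bytes.len() < 33 {
        return None;
    }
    let (head, value_hash) = bytes.split_at(bytes.len() - 32);
    Some((head[0], &head[1..], value_hash))
}
```

Keeping the namespace inside the hashed bytes means a digest registered as a RAG source cannot later be presented as, say, a MODEL or OUTPUT entry with the same key.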
Attestation token (COSE_Sign1-like)
payload = {
"p": hP, "s": root_src, "m": model_hash,
"pv": policy_version, "n": nonce, "ts": time
}
sig = tECDSA_sign(H(payload))
Keys derived per canister with ICP’s threshold ECDSA API. (Internet Computer)
6) Reference Pseudocode (Rust canister snippets)
Issue attestation token
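use ic_cdk::api::management_canister::ecdsa::{
    sign_with_ecdsa, EcdsaCurve, EcdsaKeyId, SignWithEcdsaArgument,
};

// Token, canonical_cbor, nonce, and hash (assumed SHA-256) are application-defined helpers, not shown.
async fn issue_token(hp: [u8; 32], src: [u8; 32], mh: [u8; 32], pv: u64) -> Token {
    let payload = canonical_cbor(hp, src, mh, pv, nonce(), ic_cdk::api::time());
    let arg = SignWithEcdsaArgument {
        message_hash: hash(&payload).to_vec(), // the API signs a 32-byte digest, not the raw payload
        derivation_path: vec![],               // empty path; per-purpose paths can be added
        key_id: EcdsaKeyId { curve: EcdsaCurve::Secp256k1, name: "dfx_test_key".into() }, // local replica key name
    };
    let (resp,) = sign_with_ecdsa(arg).await.expect("tECDSA signing failed");
    Token { payload, sig: resp.signature }
}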
Set certified root for outputs
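#[ic_cdk::update]
fn commit_output_root(root: [u8; 32]) {
    // Restrict callers in practice; any caller able to reach this method can replace the certified root.
    ic_cdk::api::set_certified_data(&root);
}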
Serve certified HTTP response
- Implement http_request to return the body plus an IC-Certificate header carrying the witness; the client verifies the chain and Merkle proof. (Internet Computer)
7) Security Analysis
- Prompt/indirect injection: Policies pre-filter inputs and bind permissible context to τ. AIS must verify τ before inference, preventing “surreptitious context switching” by injected instructions. (OWASP LLM01; Microsoft MCP guidance.) (OWASP Cheat Sheet Series)
- Data poisoning: Only source sets registered in PRC (with signed digests) are used for training/RAG; auditors can reconstruct lineage and compare to registered roots (Nature Med. threat assessment; 2025 survey). (Nature)
- Execution integrity: Off-chain AIS must present a valid TEE quote; AKC verifies reference measurement before accepting outputs. (Intel/Azure attestation docs; CCC primer.) (Microsoft Learn)
- Tamper-evident delivery: Clients verify output proofs against canister certificates and IC chain key; boundary nodes may cache but cannot forge. (Learn Internet Computer)
- Key custody: tECDSA avoids hot private keys; signing happens via subnet shares. (Internet Computer)
8) Performance & Operational Considerations
- Latency model: Update calls (state-changing, consensus) → seconds-scale finality; query calls (read-only) → ms-scale, verified via certificates. (Certification explains CDN-speed verified reads.) (Learn Internet Computer)
- State & scale: Use chunked authenticated structures to avoid single-object limits; stable memory enables large indices (design for migration/versioning). (Docs cite stable memory usage/limits; plan conservatively.) (Internet Computer)
- Costing: HTTPS outcalls consume cycles; set max_response_bytes, cap retries, and cache digests to reduce spend. (Internet Computer)
- Edge hardening: Leverage boundary nodes’ throttling and geo-routing; nevertheless, rely on client-side certificate verification for authenticity. (Internet Computer)
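The costing advice (cap retries, cache digests) can be modeled without any canister API. The sketch below is a plain-Rust illustration under stated assumptions: `OutcallBudget` and `digest_for` are hypothetical names, the `fetch` closure stands in for a real HTTPS outcall plus hashing, and the counter stands in for cycles spent.

```rust
use std::collections::HashMap;

struct CacheEntry {
    digest: [u8; 32],
    fetched_at_secs: u64,
}

/// Tracks cached source digests and how many outcalls were actually made.
struct OutcallBudget {
    max_retries: u32,
    ttl_secs: u64,
    cache: HashMap<String, CacheEntry>,
    outcalls_made: u32,
}

impl OutcallBudget {
    /// Return a fresh cached digest if one exists; otherwise call `fetch`
    /// at most 1 + max_retries times and cache the first success.
    fn digest_for<F>(&mut self, url: &str, now_secs: u64, mut fetch: F) -> Option<[u8; 32]>
    where
        F: FnMut(&str) -> Option<[u8; 32]>,
    {
        if let Some(entry) = self.cache.get(url) {
            if now_secs.saturating_sub(entry.fetched_at_secs) <= self.ttl_secs {
                return Some(entry.digest);
            }
        }
        for _ in 0..=self.max_retries {
            self.outcalls_made += 1;
            if let Some(digest) = fetch(url) {
                let entry = CacheEntry { digest, fetched_at_secs: now_secs };
                self.cache.insert(url.to_string(), entry);
                return Some(digest);
            }
        }
        None
    }
}
```

With a cap in place, a persistently failing source costs a bounded number of outcalls per request instead of an open-ended retry loop.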
9) Governance, Compliance, and Audit
- SNS governance for public accountability of policy updates and model rollout; proposals, votes, and version pinning become immutable records. (Internet Computer)
- NIST AI RMF alignment: IC-Guard maps to the Govern/Map/Measure/Manage functions with explicit technical controls for provenance, integrity, and access, supporting the GenAI profile's recommendations for verifiable traceability. (NIST Publications)
10) Government & National-Security Posture
- Hybrid mode: keep verifiers, registries, and audit logs on-chain; perform sensitive inference inside confidential VMs/TEEs with remote attestation fed back to canisters; deliver outputs with certified HTTP. (Suitable where models are too large for on-chain execution.) (Microsoft Learn)
- Policy hardening: strict provenance whitelists; mandatory enclave attestation; deny unregistered sources; signed human approvals for elevated operations.
11) Deployment Blueprint (Phased)
Phase 0 — Foundations: implement PRC (auth map + certified root), PRC-P (policy language), AKC (tECDSA keying), and basic verification SDK. (Internet Computer)
Phase 1 — RAG Integrity: wrap existing LLM with IGC; every inference requires a τ; add client-side certificate verification helpers (Web, Node). (Internet Computer)
Phase 2 — Confidential Inference: integrate AIS with SGX/TDX/SEV; AKC quote verification; attested result receipts. (Microsoft Learn)
Phase 3 — Governance & Analytics: SNS for policy/model lifecycle; risk dashboards (policy violations, taint paths, source aging). (Internet Computer)
12) Example: End-to-End Verification Trace
- User auth via Internet Identity → scoped delegation to IGC. (Internet Computer)
- Prompt hashed (hP); sources selected and verified (root_src, proofs). (Learn Internet Computer)
- Policy check passes; AKC issues τ (tECDSA). (Internet Computer)
- Inference in AIS (TEE); quote verified by AKC. (Microsoft Learn)
- Output R stored as hash in authenticated map; certified_data updated; client fetches via certified query and verifies certificate chain + Merkle proof. (Learn Internet Computer)
13) Discussion: Why this is strictly stronger than app-level “filters”
- Filters are heuristic and brittle (see indirect injection incidents and “Imprompter”). IC-Guard cryptographically binds what inputs may influence outputs and makes deviation detectable. (Microsoft)
- Data poisoning concerns remain real in open-web training; authenticated source registries with audit logs materially raise the bar for undetected taint. (Nature)
14) Limitations & Future Work
- On-chain model size: Today’s frontier LLMs are too large for pure on-chain execution; hybrid attestation is pragmatic. (HTTPS outcalls + TEE quotes.) (Internet Computer)
- Policy language: Formalizing taint lattices for natural-language context is ongoing research; start with conservative whitelists and structured RAG schemas.
- Supply-chain attestation: Extend to container images, dataset pipelines, tokenizer versions; anchor SBOM digests on-chain.
15) Conclusion
The Internet Computer’s combination of chain-key cryptography, threshold ECDSA, certified data/HTTP certification, origin-bound identity, and first-class HTTPS outcalls makes it uniquely suited to be the verification, authentication, and integrity layer AI systems are missing. Paired with confidential computing attestation, IC-Guard offers verifiable provenance, execution integrity, and tamper-evident delivery—turning today’s “giant vacuum” AI into a system with provable boundaries around who can influence what, when, and how. (Internet Computer)
Selected References
- OWASP LLM Prompt Injection cheat sheet; LLM01 risk category. (OWASP Cheat Sheet Series)
- NIST AI RMF & Generative AI Profile. (NIST Publications)
- ICP: Threshold ECDSA; threshold signatures; chain-key crypto; certified data; HTTP certification; HTTPS outcalls; Internet Identity; boundary nodes. (Internet Computer)
- Confidential Computing & Attestation (Intel/Azure; CCC). (Microsoft Learn)
- Data poisoning evidence & surveys. (Nature)
- Recent incidents (indirect injection, Imprompter). (Microsoft)