IC-Guard: A Cryptographic Verification & Integrity Layer for AI Systems on the Internet Computer

Abstract

Large language models (LLMs) expose new attack surfaces (prompt and indirect injection, data poisoning, model inversion, and provenance opacity) that current web-era controls do not adequately address. We propose IC-Guard, an architecture that uses the Internet Computer (ICP) as a cryptographically anchored verification, authentication, and data-integrity substrate for AI systems. IC canisters issue attestations, certify data and outputs, enforce origin-bound identity and access, and coordinate hybrid confidential execution for high-sensitivity inference. We specify threat models, protocol flows, data structures, and implementation details that build on ICP primitives (chain-key cryptography, threshold ECDSA, certified data/HTTP certification, Internet Identity, HTTPS outcalls) and confidential computing attestation. The result is a layered trust backbone for AI stacks, one that is verifiable end-to-end and auditable at byte-level granularity across prompts, RAG inputs, model versions, and responses. (OWASP Cheat Sheet Series)


1) Motivation & Threat Model

Threats addressed

  • Prompt & indirect injection: adversarial instructions embedded in user input or external content (e.g., web pages, emails, RAG corpora) hijack model behavior. The need for mitigation is now widely documented (OWASP LLM01; Microsoft guidance on indirect injection). (OWASP Cheat Sheet Series)

  • Data poisoning: tainted training or RAG data biases outputs; demonstrated in medical/biomedical settings and reviewed systematically in 2025. (Nature)

  • Opaque provenance & audit gaps: users and auditors cannot cryptographically prove which sources, versions, or prompts influenced an answer. NIST’s AI RMF/GenAI guidance calls for governance, traceability, and verifiability. (NIST Publications)

  • Stealth exfiltration: new attacks (e.g., “Imprompter”) hide instructions to extract PII from chats—underscoring the need for cryptographic controls tied to data flow rather than heuristic filters alone. (WIRED)

Security goals

  1. Source integrity & lineage: verifiable content digests and attestations for every RAG item, prompt component, and model artifact.

  2. Execution integrity: proof that authorized code ran in an expected environment (on-chain or in attested confidential compute), with policy guardrails evaluated by verifiers.

  3. Origin-bound identity & access: per-canister cryptographic authentication and scoped delegations; signed responses tied to canister identities.

  4. Tamper-evident outputs: client-verifiable certificates on result payloads, preventing downgrade, replay, and on-path manipulation.

  5. Audit & governance: immutable logs and community/DAO control where applicable (e.g., SNS governance). (Internet Computer)


2) ICP Primitives for an AI Trust Backbone

  • Chain-key cryptography & threshold signatures: subnets jointly hold key shares; no single node holds a private key. Enables one-shot signatures, fast finality, and cross-chain signing. (Learn Internet Computer)

  • Threshold ECDSA (tECDSA): canisters sign messages/transactions without ever materializing private keys; keys derivable per canister and path. Useful for attestation tokens, content digests, and cross-chain proofs. (Internet Computer)

  • Certified data & HTTP certification: canisters commit a 32-byte certified root; clients verify query responses against a certificate chain (subnet key → IC root key) and Merkle proofs—enabling CDN-speed queries with cryptographic authenticity. (Internet Computer)

  • Internet Identity (WebAuthn-based): origin-bound, device-anchored passkeys for user auth; supports delegation chains scoped to canisters. (Internet Computer)

  • HTTPS outcalls: canisters fetch off-chain data without trusted oracles; costs paid in cycles; responses can be re-hashed and certified on-chain. (Internet Computer)

  • Boundary nodes & certified content: edge layer routes, rate-limits, and caches cryptographically verified responses; clients verify certificates. (Internet Computer)

  • State & upgrades: stable memory and orthogonal persistence allow large, versioned state with upgrade hooks; certification ties state commitments to outputs. (Stable-memory limits have changed over time and may differ by subnet; design for chunked authenticated structures.) (Internet Computer)


3) Architecture Overview (IC-Guard)

Components

  1. Provenance Registry Canister (PRC)

    • Stores authenticated digests for sources (RAG docs, datasets), prompts, model versions, policies.

    • Maintains Merkle trees / authenticated maps; commits root in certified_data. (Internet Computer)

  2. Policy & Risk Canister (PRC-P)

    • Encodes allow/deny and taint-propagation policies for inputs (e.g., “external HTML with untrusted origin cannot set instructions”).

    • Uses OWASP LLM01/NIST control mappings for explainable audit. (OWASP Cheat Sheet Series)

  3. Attestation & Keys Canister (AKC)

    • Issues short-lived attestation tokens (tECDSA-signed JWT-like or COSE_Sign1 structures) binding {prompt_hash, source_set_root, model_hash, policy_version, nonce, time}.

    • Verifies confidential compute quotes (SGX/TDX/SEV) during hybrid inference and records results. (Internet Computer)

  4. Inference Gateway Canister (IGC)

    • Orchestrates inference: checks policy, obtains attestation token from AKC, sends HTTPS outcall to an Attested Inference Service (AIS) or routes to an On-chain Model Canister when feasible. (Internet Computer)

  5. Attested Inference Service (off-chain)

    • Runs model inside a confidential VM/TEE; emits remote-attestation evidence and output record {attestation, output_hash, token} back to AKC/IGC. (Confidential Computing Consortium)

High-level properties

  • All critical artifacts are hashed and bound into certified data structures (Merkle roots), then signed via tECDSA; end-users receive outputs with verifiable certificate chains. (Learn Internet Computer)

  • Identity & access: Internet Identity (WebAuthn) binds user sessions to canister origins; short-lived delegations restrict capability scope. (Internet Computer)
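
The delegation scoping described above can be illustrated with a small check. This is a simplified sketch, not the Internet Identity implementation: the Delegation struct, permits, and chain_permits are hypothetical names, canister IDs are plain strings, and signature verification of each delegation is assumed to happen elsewhere. It only models the two scoping rules: an expiry time and an optional list of target canisters.

```rust
/// Simplified view of one delegation in a chain.
/// Assumption: public keys and signatures are verified elsewhere.
struct Delegation {
    /// Expiry as nanoseconds since the Unix epoch.
    expiration_ns: u64,
    /// If present, the delegation is only valid for these canister IDs.
    targets: Option<Vec<String>>,
}

/// True if this single delegation is unexpired and covers `canister_id`.
fn permits(d: &Delegation, canister_id: &str, now_ns: u64) -> bool {
    let unexpired = now_ns <= d.expiration_ns;
    let in_scope = match &d.targets {
        None => true, // no target list means the delegation is not canister-scoped
        Some(ids) => ids.iter().any(|id| id.as_str() == canister_id),
    };
    unexpired && in_scope
}

/// A chain is usable only if it is non-empty and every link permits the call.
fn chain_permits(chain: &[Delegation], canister_id: &str, now_ns: u64) -> bool {
    !chain.is_empty() && chain.iter().all(|d| permits(d, canister_id, now_ns))
}
```

Keeping expiration_ns short and targets narrow (for example, only the IGC) limits what a leaked session key can do.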


4) Protocol Flows

4.1 Source Registration (RAG corpus or training shard)

  1. Client uploads document D.

  2. PRC computes hD = H(D), updates authenticated map, recomputes root_RAG.

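  3. PRC sets certified_data = root_RAG (32 bytes); stores versioned map shards in stable memory as needed.

  4. PRC returns (doc_id, hD). The client then fetches (proofD, root_RAG, cert_chain) through a query call, because the certificate covering certified_data is only available to query calls, and verifies the certificate chain and the Merkle proof. (Internet Computer)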
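
The client-side Merkle check in step 4 can be sketched as follows. This is a minimal, dependency-free illustration rather than IC-Guard's actual proof format: Digest, Step, leaf_hash, node_hash, and verify_inclusion are hypothetical names, and the 64-bit DefaultHasher digest is only a stand-in for SHA-256 (it is not collision-resistant). Verifying the IC certificate chain itself is out of scope here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest (assumption): a real deployment would use SHA-256.
/// DefaultHasher is NOT cryptographically secure.
type Digest = u64;

/// Hash a domain tag plus length-prefixed parts, so that the inputs
/// ("ab", "c") and ("a", "bc") do not produce the same byte stream.
fn h(tag: u8, parts: &[&[u8]]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(&[tag]);
    for p in parts {
        hasher.write(&(p.len() as u64).to_be_bytes());
        hasher.write(p);
    }
    hasher.finish()
}

/// Leaf digest for a document (tag 0 separates leaves from inner nodes).
fn leaf_hash(doc: &[u8]) -> Digest {
    h(0, &[doc])
}

/// Inner-node digest over the two child digests (tag 1).
fn node_hash(left: Digest, right: Digest) -> Digest {
    let (l, r) = (left.to_be_bytes(), right.to_be_bytes());
    h(1, &[&l[..], &r[..]])
}

/// One step of a Merkle path: the sibling digest and which side it is on.
struct Step {
    sibling: Digest,
    sibling_is_left: bool,
}

/// Recompute the root from the document and its path, then compare it
/// with the root the client obtained from the certified response.
fn verify_inclusion(doc: &[u8], proof: &[Step], expected_root: Digest) -> bool {
    let mut acc = leaf_hash(doc);
    for step in proof {
        acc = if step.sibling_is_left {
            node_hash(step.sibling, acc)
        } else {
            node_hash(acc, step.sibling)
        };
    }
    acc == expected_root
}
```

The proof only shows that D is included under root_RAG; trust in root_RAG itself comes from the certificate over certified_data.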

4.2 Prompt Preparation & Policy Check

  1. Client submits prompt P + selected sources {D_i}.

  2. PRC computes hP, constructs source set digest root_src.

  3. PRC-P evaluates policies (e.g., disallow HTML-embedded instructions from untrusted domains; strip patterns flagged by OWASP LLM01), emits policy_version and a taint summary. (OWASP Cheat Sheet Series)
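
A policy check of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the PRC-P policy language: Origin, Role, ContextItem, Decision, evaluate, and POLICY_VERSION are all hypothetical, and only two rules are modeled (untrusted content may not act as instructions, and non-user content must be registered in the PRC).

```rust
/// Where a piece of context came from (illustrative categories).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    User,
    TrustedDomain,
    UntrustedDomain,
}

/// How the context item asks to be used in the final prompt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Instruction,
    Data,
}

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny(&'static str),
}

struct ContextItem {
    origin: Origin,
    role: Role,
    /// Whether the item's digest is present in the provenance registry.
    registered: bool,
}

/// Hypothetical version counter that would be bound into the token.
const POLICY_VERSION: u64 = 1;

/// Apply the two example rules to a single context item.
fn evaluate(item: &ContextItem) -> Decision {
    if item.origin == Origin::UntrustedDomain && item.role == Role::Instruction {
        return Decision::Deny("untrusted content cannot set instructions");
    }
    if item.origin != Origin::User && !item.registered {
        return Decision::Deny("source is not registered");
    }
    Decision::Allow
}

/// Evaluate every item and return the policy version with the decisions,
/// which together form a minimal taint summary.
fn evaluate_all(items: &[ContextItem]) -> (u64, Vec<Decision>) {
    (POLICY_VERSION, items.iter().map(evaluate).collect())
}
```

Returning the version alongside the decisions lets the token in the next step commit to exactly the rule set that was applied.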

4.3 Attestation Token Issuance

  1. Client (or IGC) requests token from AKC with {hP, root_src, model_hash, policy_version}.

  2. AKC issues signed token τ = Sign_tECDSA(hP || root_src || model_hash || policy_version || nonce || ts) with derived per-canister key. (Internet Computer)
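
Because the signature covers a concatenation, the byte layout has to be unambiguous. One possible fixed-width layout is sketched below; the field widths (a 64-bit nonce and a nanosecond timestamp), the big-endian encoding, and the names TokenFields and signing_preimage are assumptions made for illustration. In a canister, the digest of these bytes would be what is passed to the threshold ECDSA API.

```rust
/// Fields bound into the attestation token, mirroring the tuple in step 2.
struct TokenFields {
    prompt_hash: [u8; 32],
    source_root: [u8; 32],
    model_hash: [u8; 32],
    policy_version: u64,
    nonce: u64,        // assumption: 64-bit nonce
    timestamp_ns: u64, // assumption: nanoseconds since the Unix epoch
}

/// Three 32-byte digests followed by three 8-byte integers.
const PREIMAGE_LEN: usize = 32 * 3 + 8 * 3;

/// Serialize the fields at fixed offsets so that no two distinct
/// field tuples can produce the same byte string.
fn signing_preimage(f: &TokenFields) -> [u8; PREIMAGE_LEN] {
    let mut out = [0u8; PREIMAGE_LEN];
    out[0..32].copy_from_slice(&f.prompt_hash);
    out[32..64].copy_from_slice(&f.source_root);
    out[64..96].copy_from_slice(&f.model_hash);
    out[96..104].copy_from_slice(&f.policy_version.to_be_bytes());
    out[104..112].copy_from_slice(&f.nonce.to_be_bytes());
    out[112..120].copy_from_slice(&f.timestamp_ns.to_be_bytes());
    out
}
```

A self-describing encoding such as canonical CBOR would serve the same purpose; fixed offsets are simply the easiest form to audit.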

4.4 Inference (Hybrid)

  • On-chain (small model or verifier): IGC calls Model Canister → result R, logs (hP, root_src, model_hash), updates certified_data = root_out.

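  • Off-chain attested: IGC makes an HTTPS outcall to AIS with τ; AIS verifies τ, runs the model in a TEE, and returns {R, output_hash, attestation_quote, τ}. With replicated HTTPS outcalls, each replica in the subnet issues the request and the responses must agree, so the AIS endpoint should be idempotent and return byte-identical responses for the same τ. AKC verifies the quote against a reference measurement, records acceptance, and returns a verification receipt σ_ver. (Internet Computer)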
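
The two acceptance checks in the off-chain path can be sketched as below. This is a simplified model, not a real attestation verifier: Token, Request, Reject, check_token, and measurement_accepted are hypothetical names, the token signature is assumed to have been verified already, and quote verification is reduced to comparing a reported measurement against an allowlist (real SGX/TDX/SEV quotes also require checking the vendor certificate chain).

```rust
use std::collections::HashSet;

/// Values the token commits to (a subset of the signed fields).
struct Token {
    prompt_hash: [u8; 32],
    source_root: [u8; 32],
    model_hash: [u8; 32],
    issued_at_ns: u64,
}

/// What the inference request actually contains, after hashing.
struct Request {
    prompt_hash: [u8; 32],
    source_root: [u8; 32],
    model_hash: [u8; 32],
}

#[derive(Debug, PartialEq, Eq)]
enum Reject {
    BindingMismatch,
    Expired,
}

/// AIS side: run inference only if the token binds exactly this request
/// and is still fresh. Assumption: the signature was checked beforehand.
fn check_token(t: &Token, req: &Request, now_ns: u64, max_age_ns: u64) -> Result<(), Reject> {
    if t.prompt_hash != req.prompt_hash
        || t.source_root != req.source_root
        || t.model_hash != req.model_hash
    {
        return Err(Reject::BindingMismatch);
    }
    if now_ns.saturating_sub(t.issued_at_ns) > max_age_ns {
        return Err(Reject::Expired);
    }
    Ok(())
}

/// AKC side: accept an output only if the measurement reported in the
/// quote is one of the registered reference measurements.
fn measurement_accepted(reference: &HashSet<Vec<u8>>, reported: &[u8]) -> bool {
    reference.contains(reported)
}
```

The binding check is what ties the inference to the approved prompt, source set, and model; the freshness window bounds how long a captured token stays useful.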

4.5 Certified Delivery

  • IGC anchors (hP, root_src, output_hash, σ_ver) into a Merkle accumulator; updates certified_data.

  • Client fetches the result as a certified query (or via certified HTTP asset). Client verifies: IC root → subnet key → canister certificate → Merkle proof for output_hash. (Learn Internet Computer)


5) Data Structures

Authenticated map (sparse Merkle)

Leaf:  L = H(namespace || key || value_hash)
Node:  N = H(left_child || right_child)
Root:  R = H(…)

  • namespace ∈ {RAG, PROMPT, MODEL, POLICY, OUTPUT} for multi-lattice provenance.

  • value_hash can be SHA-256 of canonicalized bytes; large artifacts chunked with rolling hashes.

  • The 32-byte certified_data holds R; arbitrary volumes are certified via proofs. (Internet Computer)
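
To make the leaf and node definitions concrete, the sketch below builds a root over namespaced leaves. Several things here are assumptions for illustration: it uses a plain binary Merkle tree rather than a sparse one, an unpaired node is promoted unchanged to the next level, the 64-bit DefaultHasher digest stands in for SHA-256 (it is not secure), and the names Namespace, leaf, node, and merkle_root are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest (assumption): replace with SHA-256 in practice.
type Digest = u64;

/// The provenance namespaces listed above.
#[derive(Clone, Copy)]
enum Namespace {
    Rag = 0,
    Prompt = 1,
    Model = 2,
    Policy = 3,
    Output = 4,
}

/// Tagged, length-prefixed hashing to avoid concatenation ambiguity.
fn h(tag: u8, parts: &[&[u8]]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(&[tag]);
    for p in parts {
        hasher.write(&(p.len() as u64).to_be_bytes());
        hasher.write(p);
    }
    hasher.finish()
}

/// L = H(namespace || key || value_hash)
fn leaf(ns: Namespace, key: &[u8], value_hash: Digest) -> Digest {
    let ns_byte = [ns as u8];
    let v = value_hash.to_be_bytes();
    h(0, &[&ns_byte[..], key, &v[..]])
}

/// N = H(left_child || right_child)
fn node(left: Digest, right: Digest) -> Digest {
    let (l, r) = (left.to_be_bytes(), right.to_be_bytes());
    h(1, &[&l[..], &r[..]])
}

/// Fold the leaves pairwise until a single root remains.
/// Returns None for an empty tree.
fn merkle_root(mut level: Vec<Digest>) -> Option<Digest> {
    if level.is_empty() {
        return None;
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [l, r] => node(*l, *r),
                [l] => *l, // unpaired node is promoted as-is (sketch choice)
                _ => unreachable!(),
            })
            .collect();
    }
    Some(level[0])
}
```

Including the namespace in the leaf preimage means the same key and value registered as RAG and as PROMPT yield different leaves, so a proof for one cannot be replayed for the other.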

Attestation token (COSE_Sign1-like)

payload = {
  "p": hP, "s": root_src, "m": model_hash,
  "pv": policy_version, "n": nonce, "ts": time
}
sig = tECDSA_sign(SHA-256(canonical_encoding(payload)))

Keys derived per canister with ICP’s threshold ECDSA API. (Internet Computer)


6) Reference Pseudocode (Rust canister snippets)

Issue attestation token

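use ic_cdk::api::management_canister::ecdsa::{
    sign_with_ecdsa, EcdsaCurve, EcdsaKeyId, SignWithEcdsaArgument,
};

// canonical_cbor, nonce, hash, and Token are placeholder helpers (not shown).
async fn issue_token(hp: [u8; 32], src: [u8; 32], mh: [u8; 32], pv: u64) -> Token {
    let payload = canonical_cbor(hp, src, mh, pv, nonce(), ic_cdk::api::time());
    let arg = SignWithEcdsaArgument {
        message_hash: hash(&payload).to_vec(), // the API signs a 32-byte digest
        derivation_path: vec![],               // add path segments for per-purpose keys
        key_id: EcdsaKeyId { curve: EcdsaCurve::Secp256k1, name: "dfx_test_key".into() },
    };
    let (resp,) = sign_with_ecdsa(arg).await.expect("tECDSA signing failed");
    Token { payload, sig: resp.signature }
}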

(Internet Computer)

Set certified root for outputs

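use ic_cdk::update;

#[update]
fn commit_output_root(root: [u8; 32]) {
    // In practice, restrict this method to an authorized caller (e.g., the IGC).
    ic_cdk::api::set_certified_data(&root); // certified data is limited to 32 bytes
}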

(Internet Computer)

Serve certified HTTP response

  • Implement http_request to return body + IC-Certificate header carrying witness; client verifies chain & Merkle proof. (Internet Computer)

7) Security Analysis

  • Prompt/indirect injection: Policies pre-filter inputs and bind permissible context to τ. AIS must verify τ before inference, so injected instructions cannot swap in context that was not approved (“surreptitious context switching”). Instructions hidden inside approved sources are still caught only by the policy filter. (OWASP LLM01; Microsoft MCP guidance.) (OWASP Cheat Sheet Series)

  • Data poisoning: Only source sets registered in PRC (with signed digests) are used for training/RAG; auditors can reconstruct lineage and compare to registered roots (Nature Med. threat assessment; 2025 survey). (Nature)

  • Execution integrity: Off-chain AIS must present a valid TEE quote; AKC verifies reference measurement before accepting outputs. (Intel/Azure attestation docs; CCC primer.) (Microsoft Learn)

  • Tamper-evident delivery: Clients verify output proofs against canister certificates and IC chain key; boundary nodes may cache but cannot forge. (Learn Internet Computer)

  • Key custody: tECDSA avoids hot private keys; signing happens via subnet shares. (Internet Computer)


8) Performance & Operational Considerations

  • Latency model:

    • Update calls (state-changing, consensus) → seconds-scale finality; query calls (read-only) → ms-scale, verified via certificates. (Certification explains CDN-speed verified reads.) (Learn Internet Computer)

  • State & scale: Use chunked authenticated structures to avoid single-object limits; stable memory enables large indices (design for migration/versioning). (Docs cite stable memory usage/limits; plan conservatively.) (Internet Computer)

  • Costing: HTTPS outcalls consume cycles; set max_response_bytes, cap retries, and cache digests to reduce spend. (Internet Computer)

  • Edge hardening: Leverage boundary nodes’ throttling and geo-routing; nevertheless, rely on client-side certificate verification for authenticity. (Internet Computer)


9) Governance, Compliance, and Audit

  • SNS governance for public accountability of policy updates and model rollout; proposals, votes, and version pinning become immutable records. (Internet Computer)

  • NIST AI RMF alignment: IC-Guard maps to the Govern/Map/Measure/Manage functions with explicit technical controls for provenance, integrity, and access, supporting the GenAI profile's recommendations for verifiable traceability. (NIST Publications)


10) Government & National-Security Posture

  • Hybrid mode: keep verifiers, registries, and audit logs on-chain; perform sensitive inference inside confidential VMs/TEEs with remote attestation fed back to canisters; deliver outputs with certified HTTP. (Suitable where models are too large for on-chain execution.) (Microsoft Learn)

  • Policy hardening: strict provenance whitelists; mandatory enclave attestation; deny unregistered sources; signed human approvals for elevated operations.


11) Deployment Blueprint (Phased)

Phase 0 — Foundations: implement PRC (auth map + certified root), PRC-P (policy language), AKC (tECDSA keying), and basic verification SDK. (Internet Computer)
Phase 1 — RAG Integrity: wrap existing LLM with IGC; every inference requires a τ; add client-side certificate verification helpers (Web, Node). (Internet Computer)
Phase 2 — Confidential Inference: integrate AIS with SGX/TDX/SEV; AKC quote verification; attested result receipts. (Microsoft Learn)
Phase 3 — Governance & Analytics: SNS for policy/model lifecycle; risk dashboards (policy violations, taint paths, source aging). (Internet Computer)


12) Example: End-to-End Verification Trace

  1. User auth via Internet Identity → scoped delegation to IGC. (Internet Computer)

  2. Prompt hashed (hP); sources selected and verified (root_src, proofs). (Learn Internet Computer)

  3. Policy check passes; AKC issues τ (tECDSA). (Internet Computer)

  4. Inference in AIS (TEE); quote verified by AKC. (Microsoft Learn)

  5. Output R stored as hash in authenticated map; certified_data updated; client fetches via certified query and verifies certificate chain + Merkle proof. (Learn Internet Computer)


13) Discussion: Why this is strictly stronger than app-level “filters”

  • Filters are heuristic and brittle (see indirect injection incidents and “Imprompter”). IC-Guard cryptographically binds what inputs may influence outputs and makes deviation detectable. (Microsoft)

  • Data poisoning concerns remain real in open-web training; authenticated source registries with audit logs materially raise the bar for undetected taint. (Nature)


14) Limitations & Future Work

  • On-chain model size: Today’s frontier LLMs are too large for pure on-chain execution; hybrid attestation is pragmatic. (HTTPS outcalls + TEE quotes.) (Internet Computer)

  • Policy language: Formalizing taint lattices for natural-language context is ongoing research; start with conservative whitelists and structured RAG schemas.

  • Supply-chain attestation: Extend to container images, dataset pipelines, tokenizer versions; anchor SBOM digests on-chain.


15) Conclusion

The Internet Computer’s combination of chain-key cryptography, threshold ECDSA, certified data/HTTP certification, origin-bound identity, and first-class HTTPS outcalls makes it uniquely suited to be the verification, authentication, and integrity layer AI systems are missing. Paired with confidential computing attestation, IC-Guard offers verifiable provenance, execution integrity, and tamper-evident delivery—turning today’s “giant vacuum” AI into a system with provable boundaries around who can influence what, when, and how. (Internet Computer)


Selected References

  • OWASP LLM Prompt Injection cheat sheet; LLM01 risk category. (OWASP Cheat Sheet Series)

  • NIST AI RMF & Generative AI Profile. (NIST Publications)

  • ICP: Threshold ECDSA; threshold signatures; chain-key crypto; certified data; HTTP certification; HTTPS outcalls; Internet Identity; boundary nodes. (Internet Computer)

  • Confidential Computing & Attestation (Intel/Azure; CCC). (Microsoft Learn)

  • Data poisoning evidence & surveys. (Nature)

  • Recent incidents (indirect injection, Imprompter). (Microsoft)
