Come join the DeAI Working Group's next meeting on May 8th. Topic: The Future of Decentralized AI: Zero-Knowledge Machine Learning on ICP.

Come hear about the state of the art in zkML. *ICP is the global orchestration layer for DeAI.

@kinic_app

Hey!

I’ve been deep in the zkML rabbit hole and wanted to share some exciting developments that showcase why the Internet Computer is uniquely positioned to lead the DeAI revolution. This isn’t just theoretical - the technology is advancing rapidly, and ICP’s architecture gives us distinct advantages that no other blockchain can match.

Zero-Knowledge Machine Learning

Zero-knowledge machine learning represents a paradigm shift in how we think about AI verification. It allows you to:

  1. Run AI models locally on your own hardware
  2. Generate mathematical proofs that you ran the model correctly
  3. Let anyone verify these proofs with absolute certainty - no trust required

This solves multiple critical problems simultaneously: privacy concerns, computational sovereignty, and trustless verification.
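To make those three steps concrete, here's a structural sketch in Rust. The prover and verifier bodies are stubs - a real deployment would plug in an actual zkML proving stack - and all names here are illustrative, not a real API:

```rust
// Structural sketch of the zkML flow; `zk_prove`/`zk_verify` are stubs.

struct Model;          // stand-in for model weights
struct Proof(Vec<u8>); // stand-in for a succinct proof

impl Model {
    // Step 1: run the model locally on private input (toy "model": ReLU).
    fn infer(&self, input: &[f32]) -> Vec<f32> {
        input.iter().map(|x| x.max(0.0)).collect()
    }
}

// Step 2: prove that `output = model(input)` (stub; a real prover
// emits a succinct argument here).
fn zk_prove(_model: &Model, _input: &[f32], _output: &[f32]) -> Proof {
    Proof(vec![0u8; 32])
}

// Step 3: verify against public data only - the model commitment and
// the claimed output - without ever seeing the private input (stub).
fn zk_verify(_model_commitment: &[u8], _output: &[f32], proof: &Proof) -> bool {
    !proof.0.is_empty()
}

fn main() {
    let model = Model;
    let private_input = vec![-1.0, 2.0, 3.0];
    let output = model.infer(&private_input);
    let proof = zk_prove(&model, &private_input, &output);
    assert!(zk_verify(&[0u8; 32], &output, &proof));
    println!("verified output: {output:?}");
}
```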

Why ICP Has the Edge

Several technical factors position ICP at the forefront of this revolution:

:magnifying_glass_tilted_left: Direct ZKP Verification

Unlike other chains, ICP can host ZKP verification directly on-chain without complicated workarounds. This is a massive architectural advantage that simplifies the entire verification flow.

:counterclockwise_arrows_button: No Proof Composition Required

Most blockchains force developers to use “proof composition” - a complex process of wrapping ZKPs to make them compatible with on-chain verification. These wrapping layers often rely on Groth16 with trusted setups (introducing security assumptions we’d rather avoid). ICP sidesteps this entirely.

:bridge_at_night: Chain Fusion Capabilities

ICP’s threshold cryptography enables fluid interactions with other chains, creating powerful workflows:

  • Run your model locally (keeping data private)
  • Verify execution on ICP
  • Trigger actions across BTC, ETH, and other networks

This cross-chain capability will be essential for AI agents operating across the blockchain ecosystem. AI agents won’t transact among themselves in fiat - they’ll use crypto.
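As a rough sketch of what that workflow could look like from a Rust canister - assuming ic-cdk's management-canister threshold-ECDSA bindings, with `verify_zk_proof` as a hypothetical placeholder for an on-chain zkML verifier:

```rust
use ic_cdk::api::management_canister::ecdsa::{
    sign_with_ecdsa, EcdsaCurve, EcdsaKeyId, SignWithEcdsaArgument,
};

// Hypothetical on-chain verifier: checks a zkML proof against a public
// model commitment and claimed output.
fn verify_zk_proof(proof: &[u8], model_commitment: &[u8], output: &[u8]) -> bool {
    // Real verification logic (e.g. sum-check / lookup argument) goes here.
    !proof.is_empty() && !model_commitment.is_empty() && !output.is_empty()
}

#[ic_cdk::update]
async fn verify_and_sign(
    proof: Vec<u8>,
    model_commitment: Vec<u8>,
    output: Vec<u8>,
    tx_hash: Vec<u8>, // 32-byte hash of e.g. an Ethereum transaction
) -> Vec<u8> {
    // 1. Verify the locally generated proof on-chain.
    assert!(
        verify_zk_proof(&proof, &model_commitment, &output),
        "invalid proof"
    );
    assert_eq!(tx_hash.len(), 32, "message hash must be 32 bytes");
    // 2. Sign the cross-chain action with threshold ECDSA (on mainnet
    // this call also needs cycles attached for the signing fee).
    let (response,) = sign_with_ecdsa(SignWithEcdsaArgument {
        message_hash: tx_hash,
        derivation_path: vec![],
        key_id: EcdsaKeyId {
            curve: EcdsaCurve::Secp256k1,
            name: "key_1".to_string(),
        },
    })
    .await
    .expect("threshold ECDSA call failed");
    response.signature
}
```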

:brain: Vector Databases On-Chain

ICP smart contracts can natively host vector databases - the specialized data structures that power modern AI. Better yet, they can maintain privacy for sensitive data. This is perfect for:

  • Personal data (email, documents, health records)
  • Proprietary datasets you want to monetize
  • Collaborative AI training with privacy guarantees
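For a flavor of how directly this maps onto a canister, here's a minimal brute-force vector store sketch in Rust with ic-cdk - illustrative only; a production canister would use stable memory to survive upgrades and a proper ANN index instead of a linear scan:

```rust
use std::cell::RefCell;

thread_local! {
    // (embedding, document) pairs in heap state; use stable memory in prod.
    static STORE: RefCell<Vec<(Vec<f32>, String)>> = RefCell::new(Vec::new());
}

// Cosine similarity between two embeddings.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

#[ic_cdk::update]
fn insert(embedding: Vec<f32>, document: String) {
    STORE.with(|s| s.borrow_mut().push((embedding, document)));
}

#[ic_cdk::query]
fn search(query: Vec<f32>, k: usize) -> Vec<String> {
    STORE.with(|s| {
        let store = s.borrow();
        let mut scored: Vec<(f32, &String)> = store
            .iter()
            .map(|(e, doc)| (cosine(&query, e), doc))
            .collect();
        // Highest similarity first.
        scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
        scored.into_iter().take(k).map(|(_, d)| d.clone()).collect()
    })
}
```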

Technical Advancements Driving This Forward

The zkML field is evolving at breakneck speed, with proving speeds improving approximately 100x annually. Our team at Kinic is pushing the state of the art.

The most exciting development is JOLT (“Just One Lookup Table”), a ZKP scheme that leverages lookup arguments for dramatically faster proving. What makes this particularly powerful for AI is how well lookup arguments handle non-linear functions like ReLU - the backbone of modern neural networks.
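To see why, note that over quantized activations ReLU collapses to a small table. The plain-Rust sketch below (illustrative only, not JOLT code) replaces the per-element branch with a 256-entry lookup - exactly the kind of relation a lookup argument can attest to cheaply, in contrast to arithmetizing a comparison:

```rust
// ReLU over quantized i8 activations as a pure table lookup.
fn main() {
    // Precompute ReLU over the whole i8 domain (256 entries).
    let table: Vec<i8> = (i8::MIN..=i8::MAX).map(|x| x.max(0)).collect();

    // "Evaluate" ReLU via lookups instead of control flow; a lookup
    // argument proves each (input, output) pair appears in the table.
    let activations: [i8; 5] = [-120, -1, 0, 3, 88];
    let relu: Vec<i8> = activations
        .iter()
        .map(|&x| table[(x as i16 - i8::MIN as i16) as usize])
        .collect();

    assert_eq!(relu, vec![0, 0, 0, 3, 88]);
    println!("{relu:?}");
}
```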

We’re currently modifying JOLT to extend to AI-specific opcodes with specialized lookups and precompiles, which will dramatically outperform previous zkML approaches in raw proving speed.

Join the Revolution

If you’re as excited about this potential as we are, here’s how to get involved:

  1. Check out the JOLT paper: Jolt: SNARKs for Virtual Machines via Lookups
  2. Join our upcoming developer session: Technical Working Group DeAI - #289 by patnorris
  3. Share your use cases in the comments - we’re particularly interested in hearing what AI applications you’d imagine if there were no technical boundaries.

Here’s my modified version of Sutton’s principle: “An AI system can only truly create and maintain knowledge to the extent that it can verify that knowledge, on its own, succinctly and reliably.” With ICP and zkML, we’re building exactly that future.

:victory_hand:

10 Likes

I’m very interested in this topic and actively working on something at the moment. What time is the meeting?

It depends on your timezone - here is the event link

2 Likes

Thanks! That’s 2am in my timezone so I don’t think I’ll make it unfortunately. Will there be a recording of the session?

By the way, the project I’m working on is intended to train a publicly visible AI model on a private dataset, and to use a zk-STARK to show that the training was authentic. This is verifiable training as distinct from verifiable inference, and I’ve been using the Winterfell Rust crate. Here are some links in case anyone would like to take a look:

1 Like

You and me both! AEST timezone for me.
Little-known fact: the “De” in DeAI stands for “dedication” and “delirium”, not just “decentralisation” :grin:

The working group is open to shifting the time of the meeting once a month to make it easier for some of us in the Asia-Pacific timezones; if there are a few of us, not just me, it might be worth doing.

1 Like

This looks like really interesting work - would you be open to sharing it in a future DeAI group meeting?

I am leading the May 22 meeting, which has an AI hardware focus, but if we shifted the start time by a few hours, would you be interested in giving a short presentation of your work?
@patnorris ping!

1 Like

Hey! That is super cool.

We can discuss more in the Discord or here.
STARKs are battle-tested for ‘scaling blockchains’, but they easily explode in memory usage for large problem sets. Some DeFi companies are using them for small problems and models (Giza.xyz).

TL;DR: it will not scale to larger training sets unless you have many BIG machines working on it.

In my talk I will discuss this a bit, along with all previous attempts at zkML.
*It might be recorded? @patnorris

Sum-check-based approaches, such as GKR, are very promising.

This is the currently claimed SoTA :one:

Before that there was Modulus Labs (acquired by Worldcoin), which also used GKR.

Prior to that: Halo2-based work and other non-specialized zkML approaches that were often super slow.

We are experts in the space and will 10x that :one:.

This is done with lookup arguments for non-linearities, a folding scheme, and of course lots of sum-check protocol. Don’t want to give toooo much away as we will likely release a research paper on it. :sweat_smile:

REF:

1 Like

I’m not clear how this is possible in a way that cannot be spoofed. I claim I’ve run a model (y) offline and that the output is x. How do you prove/disprove that? (both x and y)

Please take a look at a few of the papers listed. ZKP for verifiable computation is a very well-researched field. This can get a bit mathy :slight_smile: If you are interested in learning ZK, I can post some material! The zkhack Discord and the Thaler book are great starting points.

Gist: the program is turned into polynomials, which the verifier queries using the power of randomness.

Longer Gist:

Zero-Knowledge Proofs (ZKPs) allow one party (the prover) to convince another party (the verifier) that a statement is true without revealing any additional information beyond the validity of the statement itself.

To address your concern about spoofing: with proper ZKP systems, it’s not just a claim that ‘I ran model y and got output x’ - the prover is bound by cryptographic commitments, and the verification process is provably sound.

Here’s a clearer explanation of the sum-check protocol:

  1. The computation (running model y to get output x) is represented as a polynomial function.

  2. The prover converts this program into a multilinear extension (MLE) polynomial representation and commits to it using a cryptographic polynomial commitment scheme.

  3. The verifier generates random challenges without needing access to the full polynomial.

  4. Through an interactive protocol, the prover provides evaluations of the polynomial at points determined by the verifier’s random challenges, along with cryptographic proofs that these evaluations are consistent with the original commitment.

  5. The verifier runs multiple rounds of verification using the sum-check protocol, where:

  • The prover gradually reduces a high-dimensional claim to a one-dimensional claim
  • Each round, the complexity decreases as the verifier picks random challenges
  • The mathematical properties of polynomials ensure that incorrect computations will be detected with high probability
  6. The entire process is built on hardness assumptions and mathematical properties that make it computationally infeasible to forge proofs for incorrect computations.

This approach handles both concerns: it proves the prover knows output x AND that it was indeed produced by running model y, without revealing the internal details of the computation.
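If you want to see the core mechanic in code, here's a toy sum-check in Rust over a small prime field. A multilinear polynomial is given by its evaluation table on the boolean hypercube, and the final check stands in for a polynomial-commitment opening - illustrative only; real systems use cryptographic randomness and commitments:

```rust
// Toy sum-check: prover convinces verifier of the sum of a multilinear
// polynomial f over {0,1}^3, given f's evaluation table.
const P: u64 = (1 << 31) - 1; // Mersenne prime 2^31 - 1 (toy field)

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

// Fold the table on the current variable at challenge r:
// f'(x2..xn) = (1 - r) * f(0, x2..xn) + r * f(1, x2..xn).
fn fold(evals: &[u64], r: u64) -> Vec<u64> {
    let half = evals.len() / 2;
    (0..half)
        .map(|i| add(mul((P + 1 - r) % P, evals[i]), mul(r, evals[half + i])))
        .collect()
}

fn main() {
    // f on {0,1}^3, first variable indexed by the high bit.
    let mut evals: Vec<u64> = vec![3, 1, 4, 1, 5, 9, 2, 6];
    let mut claim = evals.iter().fold(0, |a, &b| add(a, b));

    // Verifier challenges (fixed here; real ones must be unpredictable).
    for &r in &[7u64, 11, 13] {
        let half = evals.len() / 2;
        // Prover sends g(0) and g(1); g is linear in the current
        // variable, so these two values determine it completely.
        let g0 = evals[..half].iter().fold(0, |a, &b| add(a, b));
        let g1 = evals[half..].iter().fold(0, |a, &b| add(a, b));
        // Verifier checks consistency with the running claim...
        assert_eq!(add(g0, g1), claim, "prover caught cheating");
        // ...then reduces to the claim g(r); both sides fold the table.
        claim = add(mul((P + 1 - r) % P, g0), mul(r, g1));
        evals = fold(&evals, r);
    }

    // One value remains: f(r1, r2, r3). A real verifier checks this
    // final claim via a polynomial-commitment opening instead.
    assert_eq!(evals[0], claim);
    println!("sum-check accepted");
}
```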

1 Like

I’m also AEST/AEDT so if some of the meetings were a few hours earlier I’d certainly try to make it along. I’d be interested to present something but I’d probably like to push it back a bit further. At this stage I’m still doing a lot of figuring out and not sure if the project I’m building will end up working or if I’ll need to try a different approach altogether.

1 Like

Thanks! Happy to discuss here, and hopefully it might help other people working on similar things.

I had a quick look at those links and I’ll go through them much more thoroughly over the next few days or so. I gather scaling might be a problem for trying to do this with STARKs, so I’ve started by using a very small model and dataset. I’ve gone into a fair bit of detail in the GitHub issue I’ve posted (the “some hurdles” link in my previous post). From what I understand so far, my hunch is that it should be possible to make this work with Winterfell, but perhaps my logic is a bit messed up.

For using GKR, I’ve come across the Expander Rust library (briefly outlined here) but I haven’t explored it in much detail yet. So far I’ve been focusing mainly on the high-level details and gradually chipping away at all the maths, so I’ve been looking for a good Rust example of using this for neural networks or other machine learning models but haven’t yet found anything. Ultimately I’m hoping to build a basic verifiable training tool by one means or another and then use it as the basis for a federated learning system or something similar as a larger project.

Some of the best writing I’ve seen on this topic is from Vitalik Buterin. This one of his on zk-STARKs is particularly good. See also this selection.

2 Likes

Sounds good!

Good to know: ZKP people are touchy about the naming of STARKs vs. SNARKs. Technically, STARKs are a transparent-setup form of SNARK, and many modern SNARKs are STARKs :sweat_smile:

In general, when someone says STARK, they mean something that uses hashes (FRI) at its core rather than elliptic-curve or lattice-based techniques.

It’s interesting as most STARKs are actually not privacy-preserving; i.e. they have no ZK.

History note: the STARK paper came from a founder of StarkWare - STARKs are often still used in the context of memory-hungry proving to scale blockchains.

1 Like