DeAI.chat: Decentralized AI chat on ICP
DeAI.chat is a project focused on running AI models on ICP to build private models and AI agents, and to provide a tool for building and customizing models.
We’ve recently completed a phase of intense research, which you can read more about below, and as a result, we’ve developed advanced proof of concepts. We are currently working on finishing a more advanced MVP that will bring significant changes to running large AI models on ICP.
The outcome of our efforts is a website that allows simple interaction with the chat. Unfortunately, since the platform is currently connected to our PoC, the chat only responds with pre-learned stories.
Links
- Website & demo: https://deai.chat/
- Twitter / X: https://x.com/deai_chat
- Demo video: 2025-02-25 12-12-45.mkv - Google Drive
- LLM canister: https://dashboard.internetcomputer.org/canister/5fbuf-eyaaa-aaaag-qkazq-cai
Our goal became the efficient execution of AI models, which led us to run model inference through query methods. We also wanted to gather information about the principals participating in the tests, so we implemented a payment gateway.
Currently, the payment gateway does not collect any funds; it only records which users have interacted with the model. Payment collection can be enabled at any time.
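The gateway's current behavior can be sketched in plain Rust. This is a hypothetical illustration, not the project's code: on ICP the state would live inside a canister behind an update method (via `ic_cdk`), and principals would be real `Principal` values rather than strings.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory registry of principals that have used the model.
/// On ICP this state would live in canister memory; plain Rust is used here
/// purely for illustration.
#[derive(Default)]
struct PaymentGateway {
    /// principal (as text) -> number of recorded interactions
    interactions: HashMap<String, u64>,
}

impl PaymentGateway {
    /// Record an interaction without charging anything, mirroring the current
    /// gateway. A fee check could be added at this point later.
    fn record_interaction(&mut self, principal: &str) {
        *self.interactions.entry(principal.to_string()).or_insert(0) += 1;
    }

    /// How many times a given principal has interacted with the model.
    fn interaction_count(&self, principal: &str) -> u64 {
        self.interactions.get(principal).copied().unwrap_or(0)
    }
}
```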
Our journey
For those who have never heard of DeAI.chat:
DeAI.chat is a project that started as research into discovering the possibilities, blockers, and boundaries of AI on ICP. It’s one of the first AI projects on ICP that has deeply explored the question:
Does AI on ICP make sense?
Through testing, research, and ideation, we’ve been tirelessly working to find the perfect balance and purpose of it all.
Does AI on ICP Make Sense?
To answer that question: yes, AI on ICP absolutely makes sense! We’ve identified key points of connection that align perfectly with this concept, and we’re excited to share them with you.
Let’s Start with How People Use Blockchain in the Context of AI
Blockchain is typically used in the context of AI for things like creating data markets, managing federated learning infrastructure, certifying responses, or even certifying models themselves. In the case of ICP and our research, we’ve pushed the boundaries further by deploying an AI model on the Internet Computer and running it.
The first idea that came to mind was pretty obvious: “Let’s just put the model on the blockchain!” And that’s exactly what we did – we uploaded the AI model to the blockchain piece by piece into a prepared canister and then ran it using the `candle` library, developed by Hugging Face in Rust.
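The piece-by-piece upload can be sketched as chunking the weight file on the client and reassembling it inside the canister. The chunk size and function names below are assumptions for illustration; the real limit comes from ICP's ingress message size, and the project's actual chunking may differ.

```rust
/// Split model weights into message-sized chunks. The chunk size is passed in
/// because the exact value (bounded by ICP's ingress message size limit) is an
/// assumption here, not the project's real figure.
fn chunk_model(weights: &[u8], chunk_size: usize) -> Vec<Vec<u8>> {
    weights.chunks(chunk_size).map(|c| c.to_vec()).collect()
}

/// Canister-side sketch: append each received chunk to a growing buffer until
/// the full model is reassembled.
fn append_chunk(buffer: &mut Vec<u8>, chunk: &[u8]) {
    buffer.extend_from_slice(chunk);
}
```

In a real canister, `append_chunk` would be an update method and the buffer would live in canister state, but the reassembly logic is the same.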
However, the process wasn’t as simple as it sounds here. We encountered several challenges:
- Hugging Face libraries weren’t adapted to work with the WASM version used by ICP.
- Instruction limits.
We managed to overcome each of these challenges in one way or another, and you can read more about it in “What Limits Does ICP Have?”. Of course, we manually adjusted the Hugging Face libraries by forking them. But the problem of the “instruction limit” still lingered, which we solved by adjusting the number of returned tokens.
That’s how we managed to run our first model on the blockchain. But waiting 3-5 seconds for 1-2 words wasn’t satisfying. We knew something could be done better. That’s when the idea of using Query Calls was born.
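The "adjust the number of returned tokens" workaround amounts to capping the generation loop with a token budget so a single call stays under the instruction limit. A minimal sketch, where the closure stands in for one forward pass of the model (the real project uses candle; this placeholder is an assumption):

```rust
/// Token id used to signal end of generation in this sketch.
const EOS: u32 = 0;

/// Generate up to `max_tokens` tokens, stopping early at EOS. Capping
/// `max_tokens` is what keeps a single call within the instruction limit.
/// The `step` closure stands in for one model forward pass.
fn generate<F: FnMut() -> u32>(mut step: F, max_tokens: usize) -> Vec<u32> {
    let mut out = Vec::new();
    for _ in 0..max_tokens {
        let token = step();
        if token == EOS {
            break;
        }
        out.push(token);
    }
    out
}
```

With a small budget the model returns only 1–2 words per call, which matches the 3–5 second experience described above.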
Query Calls vs. Update Calls
Query calls have much higher limits than update calls. Unfortunately, alongside the problems mentioned earlier, we also ran into stricter instruction limits and a heap-size issue when we decided to run an even larger model.
What Limits Does ICP Have?
During our research, we uncovered the following limits and solutions:
- Heap Size: Running the model layer by layer. In a typical AI runtime, models are loaded all at once into RAM. In our approach, we load only the layer being processed, significantly reducing the RAM required for the heap size.
- Instruction Limit: The solution here is to reduce the number of generated tokens.
- ICP Computes on CPU: Again, the solution is to reduce the number of generated tokens.
- Library Adaptations: We had to replace all libraries written in C with their Rust equivalents and provide randomness and other system parameters in a way that’s compatible with ICP.
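The layer-by-layer workaround for the heap limit can be sketched as follows. This is a toy illustration under stated assumptions: `load_layer` stands in for reading one layer's weights from canister storage, and the "layers" are simple additive offsets rather than real tensors.

```rust
/// Toy stand-in for one model layer; a real layer would hold weight tensors.
struct Layer {
    bias: i64,
}

impl Layer {
    fn apply(&self, x: i64) -> i64 {
        x + self.bias
    }
}

/// Hypothetical loader: in the real system this would read the layer's
/// weights from canister storage on demand.
fn load_layer(id: i64) -> Layer {
    Layer { bias: id }
}

/// Run inference one layer at a time: load a layer, apply it, and drop it
/// before loading the next, so only one layer is ever resident in the heap.
fn run_layer_by_layer(input: i64, layer_ids: &[i64]) -> i64 {
    let mut activation = input;
    for &id in layer_ids {
        let layer = load_layer(id); // only this layer occupies heap memory
        activation = layer.apply(activation);
        // `layer` is dropped at the end of this iteration, freeing its memory
    }
    activation
}
```

The trade-off is extra loading work per layer in exchange for a peak heap footprint of one layer instead of the whole model.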
What Does Our Solution Provide?
We use ICP as a computational cloud, similar to serverless functions, to run AI. This is incredible because the infrastructure needed for AI is expensive, and to deploy your own model, which you fully own, currently requires paying large monthly sums. ICP, as a blockchain, operates continuously, allowing us to store many independent models on it. These models remain idle until they’re “woken up” and used by someone.
This makes ICP the perfect infrastructure for running AI agents and private AI models. Especially since running models using Query Functions guarantees privacy at the protocol level.