Introducing Cipher AI Vault - A fully sandboxed AI demo w/ memory

Project Overview

Cipher AI Vault is a proof of concept that integrates an in-memory VectorDB and LLM within a canister on the Internet Computer. Designed for developers and researchers working at the intersection of AI and blockchain technology, our project addresses the growing demand for secure, scalable AI tools by operating within a fully sandboxed environment.

Key Features:

  • In-memory VectorDB for efficient data retrieval
  • In-memory LLM for natural language processing
  • Secure asset and data storage
  • Dual sandboxing for enhanced security
  • Integration with multiple wallet options

Web3 Advantages

Cipher AI Vault stands out from traditional Web2 AI platforms by leveraging the power of blockchain technology:

  1. Sandboxed Environment: Our solution operates within the secure confines of both the browser and the canister, providing strong data isolation.
  2. Decentralized Processing: Unlike centralized AI solutions, all data processing occurs within these sandboxed environments, eliminating the risks associated with external data centers.
  3. Tamper-Proof Data: The blockchain-based infrastructure ensures that your data remains intact and unaltered.
  4. High Availability: The decentralized approach avoids single-provider downtime and keeps the application continuously accessible.

Technical Architecture

Built using the Azle framework, Cipher AI Vault enables TypeScript-based AI development for the Internet Computer. Our architecture emphasizes in-memory operations within a secure, sandboxed environment:

  1. Frontend Canister: The main entry point for user interactions.
  2. In-memory VectorDB: Manages embeddings for lightning-fast retrieval.
  3. In-memory LLM: Processes natural language queries and interacts seamlessly with the VectorDB.
  4. Secure Asset Storage: A dedicated module leveraging the Internet Computer’s asset layer.
  5. Secure Data Store: An Azle-based canister for robust data management in stable memory (see the sketch after this list).
  6. Cycles Distro Canister: Efficiently manages cycles and top-ups.
  7. ic-auth: Handles authentication with various wallets (Plug, Stoic, NFID, and Internet Identity).
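
To make the Secure Data Store concrete, here is a minimal sketch of an Azle canister that persists key/value entries in stable memory. This is illustrative only, not Cipher AI Vault's actual code, and it assumes Azle's functional Canister API (circa v0.20); constructor and method signatures vary between Azle releases.

```typescript
// Minimal sketch of an Azle data-store canister using stable memory.
// Not the project's real code; Azle's API differs between versions.
import { Canister, Opt, query, StableBTreeMap, text, update, Void } from 'azle';

// Memory id 0 identifies this map in stable memory, so entries
// survive canister upgrades.
const files = StableBTreeMap<text, text>(0);

export default Canister({
    // Store or overwrite a named data file.
    addFile: update([text, text], Void, (name, contents) => {
        files.insert(name, contents);
    }),
    // Look up a file by name; returns None when absent.
    getFile: query([text], Opt(text), (name) => {
        return files.get(name);
    }),
});
```

Once deployed, addFile and getFile are exposed as Candid methods, and the map's contents persist across upgrades because they live in stable memory rather than the heap.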

Internet Computer Superpowers

Cipher AI Vault harnesses the unique capabilities of the Internet Computer:

  • Dual Sandboxing: Combines browser-based and canister sandboxing for highly secure, isolated AI operations.
  • Secure Asset & Data Storage: Protects all information from tampering and ensures continuous availability.
  • Multi-Wallet Authentication: Implements the ic-auth module for flexible and secure login options.
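
ic-auth's exact API isn't reproduced here, but as a rough illustration of the kind of flow it wraps, an Internet Identity login with the standard @dfinity/auth-client package looks like this; treat the details as assumptions rather than the module's own interface:

```typescript
// Illustrative Internet Identity login using @dfinity/auth-client.
// ic-auth wraps flows like this for Plug, Stoic, NFID, and Internet
// Identity; its actual API may differ from this sketch.
import { AuthClient } from '@dfinity/auth-client';

export async function loginWithInternetIdentity(): Promise<void> {
    const authClient = await AuthClient.create();
    await new Promise<void>((resolve, reject) =>
        authClient.login({
            identityProvider: 'https://identity.ic0.app',
            onSuccess: resolve,
            onError: reject,
        })
    );
    // The authenticated identity can now be handed to an agent/actor.
    const principal = authClient.getIdentity().getPrincipal().toText();
    console.log(`Logged in as ${principal}`);
}
```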

Project Status and Achievements

We’re proud to announce that Cipher AI Vault has reached a fully functional proof-of-concept stage, with several key milestones:

  • Operational in-memory VectorDB and LLM within the canister environment
  • Secure asset storage utilizing the Internet Computer’s asset layer
  • Robust data storage using stable memory
  • Integration and open-sourcing of the ic-auth npm module
  • Efficient cycle management with a developer-friendly open-source module

Future Roadmap

We’re committed to continuous improvement and expansion. Our future plans include:

  1. Data Store backup canister
  2. Edit functionality for Data Store file entries
  3. Support for multiple in-memory LLMs
  4. Model storage in asset canisters
  5. Embeddings backup in stable memory
  6. Document-to-Data File generation using in-memory LLM
  7. In-memory Stable Diffusion for image generation and storage

Get Involved

We invite developers, researchers, and blockchain enthusiasts to explore Cipher AI Vault and join us on this journey:

  1. Developers: Contribute to our open-source projects and help shape the future of AI on the Internet Computer.
  2. Researchers: Leverage our platform for your AI experiments in a secure, decentralized environment.
  3. Blockchain Enthusiasts: Explore the intersection of AI and blockchain technology through our innovative solution.

Let’s revolutionize AI together with Cipher AI Vault, where security meets innovation on the Internet Computer!


Have questions or want to collaborate? Drop a comment below or reach out to us directly. We’re eager to hear your thoughts and ideas!

This is cool! You should connect with @patnorris to see if you can present at the DeAI Technical Working Group!

A few follow-up questions:

  1. What was it like working with Phi-3-mini-4k-instruct?
  2. It looks like you are using your own vector DB implementation. Is there a reason for that as opposed to using the existing vector DBs on ICP?

Sure! I’ll definitely reach out and explore the Working Group opportunity.

  1. Working with Phi-3-mini-4k-instruct has been a positive experience. I’m impressed with the progress of in-memory models, particularly those utilizing WebGPU. Having worked with these models for a while, I’ve found that they’ve recently reached a point where they’re genuinely useful. The project is designed to support easy model swapping, and we plan to let users choose from various models in the future.

  2. Yes, we’re using a custom version of client-vector-search. We chose this over existing ICP vector DBs because our solution is fully in-memory, implemented in TypeScript with HNSW, and doesn’t rely on external bindings. It runs entirely on the client side, is lightweight, and developer-friendly. This approach enables us to maintain a completely sandboxed environment.
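
For anyone curious, the upstream client-vector-search package exposes a small embed-and-search API along these lines; our version is a customized fork, so details may differ:

```typescript
// Basic usage of the upstream client-vector-search package.
// Cipher AI Vault uses a customized fork, so its API may differ.
import { getEmbedding, EmbeddingIndex } from 'client-vector-search';

async function demo() {
    // Embeddings are computed client-side in the browser.
    const initialObjects = [
        { id: 1, name: 'apple', embedding: await getEmbedding('apple') },
        { id: 2, name: 'banana', embedding: await getEmbedding('banana') },
    ];
    const index = new EmbeddingIndex(initialObjects);

    // Embed the query string, then retrieve the nearest neighbors.
    const queryEmbedding = await getEmbedding('fruit');
    const results = await index.search(queryEmbedding, { topK: 2 });
    console.log(results);
}
```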

Thank you for the follow-up.

  1. I am glad that Phi-3-mini-4k-instruct is working well! How’s the overall performance? Did you hit the instruction limit at any point?
  2. Cool, I’ll look more into client-vector-search. I’ve been trying to learn more about how Vector DBs work!

cool stuff :+1: if you like, we’ve got a weekly DeAI working group call for ICP in the Discord

this is the event for this week: ICP Developer Community

The Phi-3-mini-4k-instruct model has been working surprisingly well, delivering fast and relevant responses. So far, we haven’t hit instruction limits in our tests, and the setup has been efficient for most use cases. The use of client-vector-search helps by pulling relevant context from an embedding space before the LLM responds, which lightens the load on the model and helps avoid those limits, especially with more complex queries.

We’re also exploring several other models, like Phi-3-mini-128k-instruct-onnx, to handle larger prompts and datasets, which could further enhance scalability and performance.
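
In sketch form, that retrieve-then-generate flow looks roughly like this; embed, searchIndex, and generate are hypothetical placeholders standing in for the project's actual helpers:

```typescript
// Hedged sketch of the retrieve-then-generate flow described above.
// These declarations are placeholders, not the project's real functions.
declare function embed(text: string): Promise<number[]>;
declare function searchIndex(
    query: number[],
    opts: { topK: number }
): Promise<{ object: { text: string } }[]>;
declare function generate(prompt: string): Promise<string>;

export async function answer(question: string): Promise<string> {
    // Pull only the most relevant chunks from the embedding index.
    const hits = await searchIndex(await embed(question), { topK: 3 });
    const context = hits.map((h) => h.object.text).join('\n');
    // Sending just the retrieved context keeps the prompt well under the
    // model's window (e.g. 4k tokens for Phi-3-mini-4k-instruct).
    return generate(`Context:\n${context}\n\nQuestion: ${question}`);
}
```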

Thank you!

I’ll definitely check out the group! I’m very interested in joining and will likely make time for it soon.

Isn’t the app doing the inference on the user’s side, that is, in the user’s browser, loading the LLM models and running them in the frontend using WebGPU?

How are you going to hit any instruction limits with that? :thinking:

Edit: Yeah, it’s Cipher-AI-Vault/frontend/frontend/hooks/modelManager/llm.js (main branch of supaIC/Cipher-AI-Vault on GitHub).

You’re right, the inference happens client-side using WebGPU, so canister instruction limits don’t apply; the relevant constraint is the model’s own context window, which is fixed regardless of the execution environment. Each model can only process a set number of tokens in one pass (4k tokens for Phi-3-mini-4k-instruct, for example), so even in the browser, larger prompts can hit that ceiling. That’s why we’re considering models like Phi-3-mini-128k-instruct-onnx for handling larger datasets.
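
For context, in-browser WebGPU inference with a Phi-3 model can look like the following @mlc-ai/web-llm snippet; our llm.js may use a different runtime or model build, so take this as an assumption-laden sketch rather than our actual code:

```typescript
// Illustrative in-browser inference over WebGPU via @mlc-ai/web-llm.
// Cipher AI Vault's llm.js may use a different runtime or model build.
import { CreateMLCEngine } from '@mlc-ai/web-llm';

async function run() {
    // Downloads and compiles the model for WebGPU on first use.
    const engine = await CreateMLCEngine('Phi-3-mini-4k-instruct-q4f16_1-MLC', {
        initProgressCallback: (p) => console.log(p.text),
    });
    const reply = await engine.chat.completions.create({
        messages: [{ role: 'user', content: 'Summarize my vault contents.' }],
    });
    console.log(reply.choices[0].message.content);
}
```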

Awesome work on Cipher AI Vault!
