Blueband

tinybird · August 6, 2024, 12:30pm

Blueband is a vector database on ICP, built on the core of Vectra and its local db principles.

Blueband persists data like a traditional db and saves its embeddings on ICP’s stable memory. This setup can be ideal for use-cases involving small, mostly static datasets where quick comparison is needed.

Loading data into memory: The index, which contains metadata and vectors, is loaded from persistent storage (a collection’s canister) into the system’s memory.
Querying: Once in memory (initialized), the index can be queried to calculate and rank the similarity between saved vectors and external prompts.
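
As a rough illustration of the querying step, here is a minimal sketch of cosine-similarity ranking over an index that has already been loaded into memory. The `IndexItem` shape and the function names are assumptions made for this example; they are not Blueband's actual API.

```typescript
// Illustrative only: names and shapes here are assumptions, not Blueband's API.
interface IndexItem {
  id: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every item against the query vector and return the best `topK`, highest first.
function rankBySimilarity(
  index: IndexItem[],
  queryVector: number[],
  topK: number
): { id: string; score: number }[] {
  return index
    .map((item) => ({ id: item.id, score: cosineSimilarity(item.vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

In practice the query vector would come from the same embedding model used for the stored documents, so that the dimensions match.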

Getting Started

Prerequisites
To use Blueband, deploy a blueband_db_provider canister by adding the prebuilt canister to your dfx.json:

{
  "blueband_db_provider": {
    "type": "custom",
    "candid": "https://github.com/acgodson/blueband-db/releases/download/v0.0.9/blueband-db-backend.did",
    "wasm": "https://github.com/acgodson/blueband-db/releases/download/v0.0.9/blueband-db-backend.wasm.gz"
  }
}

You can point your backend canister to the blueband_db_provider’s canister to make storage calls from your backend.

ic-use-blueband-db is a simple React library for interacting with your db on the frontend. It exports functions to load indexes into the system’s memory, save new items, and compare similarities between saved documents and external prompts using in-memory operations.

Usage

1.	Initializing

Connect actor and initialize index:

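import { useEffect } from "react";
import { actor } from "./provider_actor_path";
import { useBlueband } from "ic-use-blueband";

const ReactComponent = () => {
  const { initializeIndex } = useBlueband();

  useEffect(() => {
    const collectionId = "unique collection_id";
    const config = {
      collection: collectionId,
      api_key: OPENAI_KEY,
      /* chunk options */
    };
    // initializeIndex is async, so call it from an effect rather than the render body
    initializeIndex(actor, config).catch(console.error);
  }, []);

  return null;
};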

2.	Add Items

Add documents to the index:

const { AddItem, Query } = useBlueband();

const title = "Document Title or Url";
const content = "Document content...";

await AddItem(title, content);

3.	Query Items

Query the index to find documents similar to a given prompt:

const { Query } = useBlueband();

const results = await Query("query text");

//Results are ranked by similarity scores:

// [
//   {
//     "title": "Document Title",
//     "id": "document_id",
//     "score": 0.951178544877223,
//     "chunks": 1,
//     "sections": [
//       /*...*/
//     ],
//     "tokens": 156
//   },
//   {
//     "title": "Document Title",
//     "id": "document_id",
//     "score": 0.726565512777365,
//     "chunks": 4,
//     "sections": [
//       /*...*/
//     ],
//     "tokens": 500
//   }
// ]
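
If you want to post-process these results in TypeScript, something like the sketch below may help. The `QueryResult` interface is inferred from the sample output above and is not an official type from the library; `filterByScore` is a hypothetical helper.

```typescript
// Shape inferred from the sample output; not an official type.
interface QueryResult {
  title: string;
  id: string;
  score: number;
  chunks: number;
  sections: unknown[]; // contents are elided in the sample, so left untyped
  tokens: number;
}

// Keep only results at or above `minScore`, preserving the ranked order.
function filterByScore(results: QueryResult[], minScore: number): QueryResult[] {
  return results.filter((r) => r.score >= minScore);
}
```

With the sample above, `filterByScore(results, 0.8)` would keep only the first document.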

Links

Demo

kpeacock · August 6, 2024, 3:58pm

This looks like a really clean interface! Looking forward to checking it out

tinybird · August 6, 2024, 8:15pm

Thanks @kpeacock

made a quick demo here https://6fsnc-oaaaa-aaaag-aliwa-cai.icp0.io

It's slower than ideal at the moment because the OpenAI embedding proxy is running on my local machine

laska189345938458347 · August 7, 2024, 5:55am

@laughtt look for this case

domwoe · August 13, 2024, 7:09am

Great work! Please consider making a PR to dfinity/awesome-internet-computer, a curated list of awesome projects and resources relating to the Internet Computer Protocol.

tinybird · June 5, 2025, 1:34pm

Blueband Update – Rust Canister

Blueband has moved from a hybrid in-browser model to a fully on-chain vector db. The goal remains the same: enable semantic document search via vector similarity. Now, though, the heavy lifting is done inside the Rust canister.

Key Improvements

1. On-Chain Cosine Similarity

The previous architecture offloaded similarity search to the frontend/browser. Documents were embedded and stored persistently on-chain, but at query time, the index had to be loaded in-browser for ranking.

Now, we’ve migrated vector distance computations fully into the backend canister. The cosine similarity between query and document vectors is calculated on-chain, and results are ranked and returned as scored matches.

Example:

// Actor call to canister's demo_vector_similarity
const result = await actor.demo_vector_similarity(docs, query, EmbeddingProxyUrl, [1], []);
// Canister returns top-scoring [1] document with similarity score

2. Canister Logic

Blueband’s full-cycle document storage, embedding and computation logic is powered by a single backend canister. The logic to:

create collection refs and store documents,
embed documents,
store chunked vectors,
compute similarities, and
return ranked results

…is implemented in Rust and exposed via Candid.
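
To give a feel for the "store chunked vectors" step, here is a minimal TypeScript sketch of splitting a document into overlapping word windows before embedding. This is an assumption about how chunking could work in general; the canister's real chunking strategy and options live in the Rust code and may differ. `chunkText`, `maxWords` and `overlap` are hypothetical names.

```typescript
// Hypothetical helper: split text into chunks of at most `maxWords` words,
// with `overlap` words shared between consecutive chunks.
function chunkText(text: string, maxWords: number, overlap: number): string[] {
  if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
    throw new Error("require maxWords > 0 and 0 <= overlap < maxWords");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = maxWords - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxWords).join(" "));
    if (start + maxWords >= words.length) break; // this window reached the end of the text
  }
  return chunks;
}
```

Each chunk would then be embedded separately, and a document's score at query time can be derived from its best-matching chunks.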

You can see this in action in our SimpleTest class, where a query like “Which sport is most popular?” returns a cosine match to “Soccer is the most popular sport…”, directly from our on-chain vector engine.


const docs = [
  "Pizza is a delicious Italian food with cheese and tomatoes",
  "Soccer is the most popular sport in the world", 
  "JavaScript is a programming language for web development",
];

const query = "Which sport is most popular?";

const results = await actor.demo_vector_similarity(
  docs,
  query,
  "<openai-embedding-proxy-url>",
  [1], // Only return top result
  []
);

console.log(results);
// Returns: [{ score: 0.91, text: "Soccer is the most popular sport in the world", ... }]
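
A note on the `[1]` and `[]` arguments: the JavaScript agent represents Candid `opt` values as either an empty array (none) or a one-element array (some). Assuming the last two parameters of `demo_vector_similarity` are `opt` types (the Candid file is the authority here), small helpers like the hypothetical ones below can make call sites easier to read.

```typescript
// Candid `opt T` values cross the JS agent boundary as [] (none) or [value] (some).
type Opt<T> = [] | [T];

// Wrap a possibly-undefined value into the agent's opt representation.
function toOpt<T>(value: T | undefined): Opt<T> {
  return value === undefined ? [] : [value];
}

// Unwrap an opt value returned by the canister; undefined means "none".
function fromOpt<T>(opt: Opt<T>): T | undefined {
  return opt.length === 1 ? opt[0] : undefined;
}
```

For example, `toOpt(1)` produces `[1]` and `toOpt<number>(undefined)` produces `[]`, matching the two forms used in the call above.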

3. 500 GiB Stable Memory Expansion

Earlier implementations were constrained by Motoko limits. The db can now access up to 500 GiB of stable memory for storage.

Why This Matters

This evolution gives our projects:

100% on-chain inference for semantic search
Scalability and admin control for collections
Simpler yet performant client-side SDKs (just query + result)

Blueband is one of the simplest vector dbs on ICP for static document storage and queries.

	Blueband	KinicDAO	ArcMind	Elna	Vectune
Algorithm	Hybrid (cosine + K-means)	Vamana	k-d tree	HNSW	FreshVamana
Best For	Static document search

GitHub and Documentation page. I'm excited to continue pushing this forward, and I welcome feedback and contributions.

cryptoschindler · June 5, 2025, 2:54pm

I think this is worth sharing with the DeAI WG in case you haven’t planned already
