CanDB, the first scalable NoSQL Database on the Internet Computer

icme · June 22, 2022, 4:45pm

It is with great joy that I bring you CanDB, the first flexible and truly horizontally scalable NoSQL database built for the Internet Computer.

(Supernova Devpost submission link)

I built CanDB to enable the next generation of massively scalable applications on the Internet Computer.

I built CanDB for teams that are currently pushing their application canister storage limits, and developers with ambitions to build out the next viral application and scale it to millions of users and hundreds of gigabytes of data.

Here is a list of features that CanDB currently provides:

1. Performant and rich CRUD + scan APIs (not just the scans shown in the demo). Here is the endpoint that was used to backfill millions of comments into the demo application via a CanDB data structure on the Comment Actor.

2. Canister cluster management features:

a. Support for rolling upgrades (code from the demo application deployed to mainnet)

b. Support for targeted canister deletion by partition key (code from the demo application deployed to mainnet)

3. Abstracted and easy-to-set-up auto-scaling:

a. Set user-defined auto-scaling limits for your canisters, but don’t fear the responsibility: if you set your auto-scaling limits too high, CanDB will still auto-scale for you at a pre-defined internal threshold.

b. Use the createAdditionalCanisterForPK() hook in the canister responsible for auto-scaling to scale out specific canister actors when they hit their scaling limits.

4. Stable and persistent data through upgrades - CanDB keeps you safe by providing a flexible range of stable data types to store as attributes.

5. An easy-to-use TypeScript client SDK - Set up client interactions with your Index Canister and your Actor Canisters, and then perform query and targeted update calls against the specific canister where the update should take place.
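To make "CRUD + scan" a bit more concrete, here is a minimal in-memory sketch of what a single partition could look like. This is not the CanDB API: `PartitionStore`, `Entity`, the `sk` sort key, and every method name below are assumptions made up for illustration.

```typescript
// Illustrative in-memory model only; none of these names are CanDB's real API.
type AttributeValue = string | number | boolean | string[];

interface Entity {
  sk: string; // hypothetical sort key that orders entities within a partition
  attributes: Record<string, AttributeValue>;
}

class PartitionStore {
  readonly pk: string;
  // Entities are kept sorted by sk so that a range scan is a forward walk.
  private entities: Entity[] = [];

  constructor(pk: string) {
    this.pk = pk;
  }

  // Create or replace the entity stored under entity.sk.
  put(entity: Entity): void {
    const i = this.lowerBound(entity.sk);
    if (i < this.entities.length && this.entities[i].sk === entity.sk) {
      this.entities[i] = entity;
    } else {
      this.entities.splice(i, 0, entity);
    }
  }

  get(sk: string): Entity | undefined {
    const i = this.lowerBound(sk);
    const hit = this.entities[i];
    return hit !== undefined && hit.sk === sk ? hit : undefined;
  }

  delete(sk: string): boolean {
    const i = this.lowerBound(sk);
    if (i < this.entities.length && this.entities[i].sk === sk) {
      this.entities.splice(i, 1);
      return true;
    }
    return false;
  }

  // Return up to `limit` entities with skLower <= sk <= skUpper, ascending.
  scan(skLower: string, skUpper: string, limit: number): Entity[] {
    const out: Entity[] = [];
    for (let i = this.lowerBound(skLower); i < this.entities.length && out.length < limit; i++) {
      if (this.entities[i].sk > skUpper) break;
      out.push(this.entities[i]);
    }
    return out;
  }

  // Binary search: index of the first entity whose sk is >= the given sk.
  private lowerBound(sk: string): number {
    let lo = 0;
    let hi = this.entities.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.entities[mid].sk < sk) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }
}
```

Keeping entities ordered by a sort key is one common way to make range scans cheap; whether CanDB does it this way internally is not something this post specifies.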

My devpost video demo contains a special announcement at the end. For those who make it that far and feel you qualify, please reach out to me via DM here, or at any of my contact handles listed in Supernova submission.

JaMarco · June 22, 2022, 8:54pm

What was the reason to develop this in Motoko as opposed to Rust?

icme · June 22, 2022, 10:41pm

Great question. First, I want to say that there is no need for CanDB to be written explicitly in Motoko, and I do have plans to eventually either build or collaborate with the IC Rust community to build a Rust version of CanDB.

Rust is an extremely powerful and mature language, but I decided to start writing CanDB in Motoko because its feature set fits the IC exactly, whereas Rust offers a superset of what the IC supports. This means that if I’m developing in Motoko, I’m far less likely to run into unexpected errors or dead ends, because the language won’t let me make those mistakes.

This follows the philosophy that I’m trying to transfer to CanDB, in letting developers focus on building their application logic in a way that feels “cloud native” to the IC without worrying that something they’re incorporating into their app will not be supported and result in “down-the-rabbit-hole” type of errors.

For example, CanDB is able to guarantee data stability and persistence because the APIs it provides only allow you to insert stable data types, protecting all of us developers from our worst enemy when it comes to bugs: ourselves.

CanDB also sets fixed upper limits on how much data you can insert into a canister before it auto-scales, as well as before it tells you, “Hey, you hit your canister data storage limit. Did you not set up that auto-scaling hook thing I told you about? Well then, I’m going to protect you from data loss, and I’m not going to let you turn your canister into a zombie by inserting more data into it until you set that up!”
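Roughly, the two kinds of limits could be modeled like this. This is only a sketch under my own assumptions: `Limits`, `LimitedCanister`, the byte counts, and the `requestScaleOut` callback are hypothetical names, not CanDB's actual thresholds or API.

```typescript
// Hypothetical sketch; thresholds, names, and the callback are assumptions.
type InsertResult = "ok" | "rejected";

interface Limits {
  userScalingLimit: number;     // user-defined point at which to ask for a new canister
  internalScalingLimit: number; // library-defined fallback if the user limit is too high
  hardLimit: number;            // beyond this, inserts are refused outright
}

class LimitedCanister {
  private usedBytes = 0;
  private scaleRequested = false;
  private readonly limits: Limits;
  private readonly requestScaleOut: () => void;

  constructor(limits: Limits, requestScaleOut: () => void) {
    this.limits = limits;
    this.requestScaleOut = requestScaleOut;
  }

  insert(sizeBytes: number): InsertResult {
    // Refuse the write rather than let the canister fill up completely.
    if (this.usedBytes + sizeBytes > this.limits.hardLimit) return "rejected";
    this.usedBytes += sizeBytes;

    // Scale out at whichever comes first: the user's limit or the internal one.
    const threshold = Math.min(this.limits.userScalingLimit, this.limits.internalScalingLimit);
    if (!this.scaleRequested && this.usedBytes >= threshold) {
      this.scaleRequested = true; // only ask once per canister
      this.requestScaleOut();
    }
    return "ok";
  }
}
```

Taking the minimum of the two scaling limits is what models "CanDB will auto-scale for you if you set your limit too high", and the hard limit is what models the refusal to accept more data.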

I don’t want to build CanDB and all of these abstractions in a silo; that would make for a very one-sided product that all of you could hate working with.

That’s why I’m aiming to work with 10-20 active and dedicated projects (in a closed alpha) over the next several months to refine these abstractions, as well as to further improve performance and gather both performance and cost metrics so that developers know what to expect when deciding to use CanDB in their applications.

apotheosis · June 23, 2022, 12:21am

Hey! I am from ICME.io, a no-code tool for the IC. It would be cool to get this as a module so that users can press a button to deploy a CanDB instance and manage it all from the ICME UI.

cyberowl · June 23, 2022, 6:45am

I can’t wait to try it out for my next project. I need to dig a bit more before I commit. But overall good work, this is something we need.

icme · June 23, 2022, 8:45pm

Hey @apotheosis, first off, sorry I took this forum name (icme) - I originally just made it as a pun before I knew that ICME.io was a thing. You have no idea how many people have DM’ed me thinking that I ran or was affiliated with your site.

To answer your question, the first step will be to build out CanDB as a full-fledged open source library (not a managed service).

Because CanDB utilizes a client-centric architecture to optimize application performance, it consists of a TypeScript client library, and a Motoko backend library (I’d recommend going through my Supernova demo video again where I go into the architecture around ~4:50 if you’re curious). Both of these libraries are currently private, and will remain private until the end of the alpha-testing period and the beginning of the open beta (est. Q4 2022).

The Motoko backend library currently holds both the CanDB core data structure (used in your actor class canisters that will auto-scale), and the various functions and “hooks” that you use in your Indexing canister to index each of your actor class canisters that are spun up, as well as facilitate cluster management and auto-scaling.
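As a rough mental model of that split, here is a sketch of an index that tracks which canisters hold a partition key, plus a client that asks the index once and then talks to the target canisters directly. All names here (`IndexCanister`, `Client`, `createCanisterForPK`, `getCanisterIdsByPK`) are hypothetical, and the client-side cache is my own assumption for the example, not a description of the real libraries.

```typescript
// Hypothetical model; class and method names are illustrative, not the CanDB API.
class IndexCanister {
  private readonly pkToCanisterIds = new Map<string, string[]>();
  private nextId = 0;

  // "Spin up" (here: just name) a new canister for a partition key and record it.
  createCanisterForPK(pk: string): string {
    const id = `canister-${this.nextId++}`;
    const ids = this.pkToCanisterIds.get(pk) ?? [];
    ids.push(id);
    this.pkToCanisterIds.set(pk, ids);
    return id;
  }

  // Return a copy so callers cannot mutate the index's own bookkeeping.
  getCanisterIdsByPK(pk: string): string[] {
    return [...(this.pkToCanisterIds.get(pk) ?? [])];
  }
}

class Client {
  indexLookups = 0; // exposed only so the example can show when the index is hit
  private readonly cache = new Map<string, string[]>();
  private readonly index: IndexCanister;

  constructor(index: IndexCanister) {
    this.index = index;
  }

  // Resolve where a partition lives, asking the index only on a cache miss.
  resolve(pk: string): string[] {
    const cached = this.cache.get(pk);
    if (cached !== undefined) return cached;
    this.indexLookups++;
    const ids = this.index.getCanisterIdsByPK(pk);
    this.cache.set(pk, ids);
    return ids;
  }

  // Drop a cached entry, e.g. after the partition has scaled out.
  invalidate(pk: string): void {
    this.cache.delete(pk);
  }
}
```

The point of the sketch is that the index only does bookkeeping; reads and writes go straight from the client to the canister that owns the data.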

Right now, you can almost think of CanDB as a framework and mindset you can use for designing your application with composable and scalable microservices. I’m sure there’s an even higher level abstraction that I could pull out eventually into a “low-code” type of functionality, but I’m cautious about prematurely optimizing and creating any more abstractions without feedback from developers.

This is why the upcoming closed alpha period is so important to the future of CanDB - having a wide variety of teams/projects with different use cases take part in it will help stretch CanDB and keep it as flexible and generic as it needs to be - after which the right APIs/abstractions for managed services should become more obvious.

ArjaanBuijk · June 25, 2022, 12:18am

@icme
Do you have any plans to also create a python client library?

That is what I would need for my project, where I currently use a managed db at a traditional cloud provider, and I am exploring the options to move the db to the IC.

icme · June 25, 2022, 11:11pm

Hi @ArjaanBuijk

What libraries do you currently use to interact with the IC? The ic-py Python client?

If so, I would love to work with you to bring Python client support for CanDB. In fact for my Supernova demo project I originally did some data wrangling in Python to properly format the Reddit comment data before uploading it with the candb TypeScript client to my CanDB application.

Ideally this Python client is something the community can help maintain in order to keep parity with different versions of the TypeScript client and compatibility with different versions of the CanDB backend library.

DM me if you’re interested.

AndraGeorgescu · August 13, 2022, 11:59am

This is super interesting for us at distrikt, we should talk! @dymayday and I will reach out to you next week

haida · October 1, 2022, 3:56am

CanDB looks great, but it seems that the CanDB code base is closed rather than shared as open source (“https://github.com/canscale/candb”). Since the project received a DFINITY ecosystem reward, should it really be closed at will? The reward exists to support the development of the DFINITY ecosystem, and if the code is not shared, it seems there will be no contribution to that ecosystem.

SapereAude · February 8, 2023, 12:24pm

Hi Byron, Thanks for this great job! Could you please give an update on the progress? When do you expect to have this released to the public for production use?

icme · February 8, 2023, 10:17pm

Hey @SapereAude,

I’m constantly getting requests and inviting new developers and teams to use CanDB via the CanDB alpha.

There are already a handful of projects on the IC that are using CanDB in production, and just last week, Bink migrated their application to CanDB.

The main reason I’m keeping CanDB in alpha is that there’s one major breaking change in the data structure that backs CanDB, which I’ve scheduled for later this month (before public release). I’m going to provide a migration path for the alpha users, but in general, once you release software publicly, it becomes much harder to make breaking changes without causing a lot of developer friction.

I’m more than happy to invite interested parties to the CanDB alpha if you’re burning to try it out before the public release - just send me a DM.

SapereAude · February 9, 2023, 8:22am

Thanks, great to hear – all makes sense this way!

We are currently still in an early research phase of IC, so the only thing I can do for now, is to wish you good progress and best of luck!

CanDB seems to me a crucial part of the eco-system - very exciting, thank you!

icme · July 21, 2023, 5:22pm

Hey IC devs, I wanted to update you all on some releases and DX improvements to CanDB:

- Earlier this summer, the CanDB repository was publicly released by the ORIGYN foundation (GitHub: ORIGYN-SA/CanDB). CanDB is a flexible, performant, and horizontally scalable non-relational multi-canister data storage framework built for the Internet Computer.
- The CanDB documentation is now fully on-chain.
- CanDB can now be found on Mops.
- CanDB is a schemaless NoSQL data store, meaning you can dynamically change your data schema. ORIGYN recently added the candy library as a CanDB attribute value type, making your schemas even more flexible (GitHub: icdevs/candy_library, a library for converting types and creating workable Motoko collections).

Huge kudos to @skilesare, Fernando, James, and the ORIGYN team for these releases and DX improvements to CanDB!
