CanDB, the first scalable NoSQL Database on the Internet Computer

It is with great joy that I bring you CanDB, the first flexible and truly horizontally scalable NoSQL database built for the Internet Computer.

:point_down: (Supernova Devpost submission link)


I built CanDB to enable the next generation of massively scalable applications on the Internet Computer.

I built CanDB for teams that are currently pushing their application canister storage limits, and developers with ambitions to build out the next viral application and scale it to millions of users and hundreds of gigabytes of data.


Here is a list of features that CanDB currently provides:

  1. Performant and rich CRUD + scan APIs (not just scans, as shown in the demo). Here is the endpoint that was used to backfill millions of comments into the demo application via a CanDB data structure on the Comment Actor

  2. Canister cluster management features:

    a. Support for rolling upgrades (code from the demo application deployed to mainnet)

    b. Support for targeted canister deletion by partition key (code from the demo application deployed to mainnet)

  3. Abstracted, easy-to-set-up auto-scaling

    a. Set user-defined auto-scaling limits for your canisters, but don't fear the responsibility: if you set your auto-scaling limits too high, CanDB will eventually auto-scale for you at a predefined internal threshold.

    b. Use the createAdditionalCanisterForPK() hook in your canister responsible for auto-scaling in order to scale out specific canister actors when they hit their scaling limits.

  4. Stable and persistent data through upgrades - CanDB keeps you safe by providing a flexible range of stable data types to store as attributes

  5. An easy-to-use TypeScript client SDK - Set up client interactions with your Index Canister and your Actor Canisters, then perform query and targeted update calls against the specific canister where the update should take place.


My Devpost video demo contains a special announcement at the end. If you make it that far and feel you qualify, please reach out to me via DM here, or at any of the contact handles listed in my Supernova submission.

44 Likes

What was the reason to develop this in Motoko as opposed to Rust?

2 Likes

Great question. First, I want to say that there is no need for CanDB to be written explicitly in Motoko, and I do have plans to eventually either build or collaborate with the IC Rust community to build a Rust version of CanDB.

Rust is an extremely powerful and mature language, but I decided to start writing CanDB in Motoko because its feature set fits the IC exactly, whereas Rust offers a superset of what the IC supports. This means that if I'm developing in Motoko, I'm far less likely to run into unexpected errors or dead ends, because the language won't let me make those mistakes.

This follows the philosophy that I'm trying to carry over to CanDB: letting developers focus on building their application logic in a way that feels "cloud native" to the IC, without worrying that something they're incorporating into their app will be unsupported and result in "down-the-rabbit-hole" types of errors.

For example, CanDB is able to guarantee data stability and persistence because the APIs it provides only allow you to insert stable data types, protecting all of us developers from our worst enemy when it comes to bugs: ourselves :sweat_smile:.

CanDB also sets fixed upper limits on how much data you can insert into a canister before it auto-scales, as well as before it tells you, "Hey, you hit your canister data storage limit. Did you not set up that auto-scaling hook thing I told you about? Well then, I'm going to protect you from data loss and not let you turn your canister into a zombie by inserting more data until you set that up!"
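To make the two limits described above concrete, here is a minimal in-memory sketch in TypeScript. It is not CanDB code: `PartitionStore`, `Limits`, `InsertResult`, and the byte counts are all hypothetical names and values chosen for illustration, and the real library may account for storage quite differently. The idea shown is that the effective scaling threshold is the lower of the user-defined limit and an internal one, and that inserts are refused outright once a fixed hard cap would be exceeded.

```typescript
// Hypothetical sketch: these names are illustrative and are NOT CanDB's API.
type InsertResult = "ok" | "ok_scale_requested" | "rejected_full";

interface Limits {
  userScalingLimit: number;     // threshold chosen by the developer (assumed unit: bytes)
  internalScalingLimit: number; // assumed library-defined fallback threshold
  hardLimit: number;            // assumed fixed cap; inserts beyond it are refused
}

class PartitionStore {
  private usedBytes = 0;
  private scaleRequested = false;
  private readonly limits: Limits;
  private readonly scalingLimit: number;
  private readonly onScale: () => void;

  constructor(limits: Limits, onScale: () => void) {
    this.limits = limits;
    this.onScale = onScale;
    // A user limit set too high is capped by the internal threshold.
    this.scalingLimit = Math.min(limits.userScalingLimit, limits.internalScalingLimit);
  }

  insert(sizeBytes: number): InsertResult {
    // Refuse the write rather than overfill the store.
    if (this.usedBytes + sizeBytes > this.limits.hardLimit) {
      return "rejected_full";
    }
    this.usedBytes += sizeBytes;
    // Fire the scaling callback once, the first time the threshold is reached.
    if (!this.scaleRequested && this.usedBytes >= this.scalingLimit) {
      this.scaleRequested = true;
      this.onScale();
      return "ok_scale_requested";
    }
    return "ok";
  }

  get size(): number {
    return this.usedBytes;
  }
}
```

In this sketch the `onScale` callback stands in for whatever hook the application registers to create an additional canister. If no hook ever relieves the pressure, the store keeps accepting writes only up to `hardLimit` and then rejects them.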

Hopefully I'm not building CanDB and all of these abstractions in a silo. That would make for a very one-sided product that all of you could hate working with.

That's why I'm aiming to work with 10-20 active and dedicated projects (in a closed alpha) over the next several months to refine these abstractions, further improve performance, and gather both performance and cost metrics so that developers know what to expect when deciding to use CanDB in their applications.

16 Likes

Hey! I am from ICME.io a no-code tool for the IC. It would be cool to get this as a module so that users can press to deploy a CanDB instance and manage it all from the ICME UI.

4 Likes

I can't wait to try it out for my next project. I need to dig a bit more before I commit. But overall good work, this is something we need.

1 Like

Hey @apotheosis, first off, sorry I took this forum name (icme) - I originally just made it as a pun before I knew that ICME.io was a thing. You have no idea how many people have DM'ed me thinking that I ran or was affiliated with your site :laughing:.

To answer your question, the first step will be to build out CanDB as a full-fledged open source library (not a managed service).

Because CanDB utilizes a client-centric architecture to optimize application performance, it consists of a TypeScript client library and a Motoko backend library (I'd recommend going through my Supernova demo video again, where I go into the architecture around ~4:50, if you're curious). Both of these libraries are currently private, and will remain private until the end of the alpha-testing period and the beginning of the open beta (est. Q4 2022).

The Motoko backend library currently holds both the CanDB core data structure (used in your actor class canisters that will auto-scale), and the various functions and "hooks" that you use in your Indexing canister to index each of your actor class canisters that are spun up, as well as facilitate cluster management and auto-scaling.

Right now, you can almost think of CanDB as a framework and mindset you can use for designing your application with composable and scalable microservices. I'm sure there's an even higher level abstraction that I could pull out eventually into a "low-code" type of functionality, but I'm cautious about prematurely optimizing and creating any more abstractions without feedback from developers.
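The role of the Indexing canister described above can be sketched as a simple registry from partition key to the canisters serving it. The TypeScript below is only an in-memory illustration under assumptions: `IndexRegistry`, its methods, and the placeholder canister IDs are hypothetical and are not taken from the CanDB libraries, which run this logic in a Motoko canister.

```typescript
// Hypothetical sketch: an in-memory stand-in for an index canister.
// None of these names are CanDB's API.
class IndexRegistry {
  // partition key -> IDs of the canisters that hold data for it
  private readonly canistersByPK = new Map<string, string[]>();

  // Record a newly spun-up canister under its partition key.
  register(pk: string, canisterId: string): void {
    const ids = this.canistersByPK.get(pk) ?? [];
    if (!ids.includes(canisterId)) {
      ids.push(canisterId);
    }
    this.canistersByPK.set(pk, ids);
  }

  // Return a copy so callers cannot mutate the registry's state.
  lookup(pk: string): string[] {
    return [...(this.canistersByPK.get(pk) ?? [])];
  }

  // Drop every canister registered for a partition key and report which
  // ones were removed, mirroring "targeted deletion by partition key".
  remove(pk: string): string[] {
    const removed = this.lookup(pk);
    this.canistersByPK.delete(pk);
    return removed;
  }
}

// Usage: a client would ask the index first, then call the returned canisters directly.
const index = new IndexRegistry();
index.register("comment", "canister-a"); // placeholder IDs, not real principals
index.register("comment", "canister-b");
const targets = index.lookup("comment");
```

A client-centric design along these lines keeps the index off the hot path: once the client knows which canisters serve a partition key, it can talk to them directly instead of relaying every call through the index.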

This is why the upcoming closed alpha period is so important to the future of CanDB - having a wide variety of teams/projects with different use cases take part in it will help stretch CanDB and keep it as flexible and generic as it needs to be - after which the right APIs/abstractions for managed services should become more obvious.

6 Likes

@icme
Do you have any plans to also create a python client library?

That is what I would need for my project, where I currently use a managed db at a traditional cloud provider, and I am exploring the options to move the db to the IC.

Hi @ArjaanBuijk :wave:

What libraries do you currently use to interact with the IC? The ic-py Python client?

If so, I would love to work with you to bring Python client support to CanDB. In fact, for my Supernova demo project I originally did some data wrangling in Python to properly format the Reddit comment data before uploading it with the candb TypeScript client to my CanDB application.

Ideally this Python client is something the community can help maintain in order to keep parity with different versions of the TypeScript client and compatibility with different versions of the CanDB backend library.

DM me if you're interested.

1 Like

This is super interesting for us at distrikt, we should talk! @dymayday and I will reach out to you next week

5 Likes

CanDB looks great, but it seems that the CanDB code base (https://github.com/canscale/candb) is closed rather than shared as open source. Since the project received a DFINITY ecosystem reward, should it really be closed at will? The reward exists to support the development of the DFINITY ecosystem, and if the code is not shared, it seems there will be no contribution to that ecosystem.

Hi Byron, Thanks for this great job! Could you please give an update on the progress? When do you expect to have this released to the public for production use?

3 Likes

Hey @SapereAude,

I'm constantly getting requests and inviting new developers and teams to use CanDB via the CanDB alpha.

There are already a handful of projects on the IC that are using CanDB in production, and just last week, Bink migrated their application to CanDB.

The main reason I'm keeping CanDB in alpha is that there's one major breaking change in the data structure that backs CanDB that I've scheduled for later this month (before public release). I'm going to provide a migration path for the alpha users, but in general, once you release software publicly, it becomes much harder to make breaking changes without causing a lot of developer friction.

I'm more than happy to invite interested parties to the CanDB alpha if you're burning to try it out before the public release - just send me a DM.

4 Likes

Thanks, great to hear – all makes sense this way!

We are currently still in an early research phase of IC, so the only thing I can do for now, is to wish you good progress and best of luck!

CanDB seems to me a crucial part of the eco-system - very exciting, thank you!

1 Like

Hey IC Devs, I wanted to update you all on some releases and DX improvements to CanDB.

Huge kudos to @skilesare, Fernando, James, and the ORIGYN team for these releases and DX improvements to CanDB!

10 Likes