CanDB - Rust port / implementation

I am applying for a $5k Developer Grant.

Please review my application and discuss! See DFINITY’s directions for becoming a registered reviewer here. They will be collected by DFINITY. When one week passes, DFINITY will release them and they will appear as a new section on this post.

Please review my application and discuss! If you would like to submit an official review, then please use the links below to register, see the rubric, and submit a review.

I’m looking forward to everyone’s input!

Reviewer Registration | Rubric for Evaluating | Submit a Review

MY APPLICATION:

REVIEWS:

6 Likes

What would be different about this implementation?
To my understanding the CanDB project is in separate canisters that get deployed vs being a library/integration with a Rust canister.
Is there something a rust canister could not do already with CanDB?
I havent used it with Rust, so I am in the dark here

Hi Gekctek,

There wouldn’t be any major difference in terms of features. It would be a straight port to Rust.
At the moment you can’t actually use CanDB with Rust canisters, at all.

Please see the examples in the repo CanDB/examples/multiCanister/simpleMultiCanister/src/user/User.mo at beta · ORIGYN-SA/CanDB · GitHub

We may well decide to restructure the entire project and have a core layer. With this core layer in place we could perhaps make client / library bindings for TypeScript and other ICP supported languages but that would require further talks with Bryon @icme - The original writer of the Motoko implementation. Some of the finer points about how the final code structure will end up haven’t been finalised. But at the very least this project aims to make CanDB available for Rust based canisters.

1 Like

I’m not opposed to any specific direction, just trying to get a good sense of the scope

If possible, it would probably be less maintenance long term to have a rust library but deploy the Motoko canister WASMs from rust, but I’m not sure I’d there are some trade offs with that

1 Like

Ah I see. Yeah, that could be a good strategy. It’s also a pattern that the community is used to so it makes sense to at least investigate it. I’m actually due to talk to Bryon next week to discuss some of the finer details. I am on annual leave for a week starting tomorrow so perhaps this proposal is slightly bad timed but I wanted to get some feedback and ideas from the community so big thanks for taking the time out of your day to message.

1 Like

I think what you want to build is a rust client and not the actual can db functionality. In actuality candb is more designed to be consumed from outside the IC than inside it.

Porting the actual code wouldn’t make a lot of sense and may actually be kind of difficult given the different programming models.

You probably want to port GitHub - ORIGYN-SA/candb-client-typescript: A TypeScript client for CanDB which consumes a candb datasource. I actually don’t know if we have a motoko client for that matter. Consuming data sources in the form that can db is most aligned with is not a great pattern for the IC. Canisters have a bit of trouble with unbounded data stores and responses can only be 2MB so you’ve got to chunk your retrievals over many rounds.

1 Like

Awesome! Glad to see someone take a stab at this. Where were you 2 years ago @frederico02 :rofl:

So CanDB is really a framework that has two different pieces. A backend library (currently only written in Motoko), and a client (currently only written in TypeScript).

The backend module is composed of two main parts:

  1. Storage (CanDB)
  2. Auto-scaling and partitioning module (CanDB Admin)

The storage module gets embedded in partitioned microservice actors, or Service Actors. This module is essentially an opinionated BTree.

The module responsible for auto-scaling and partitioning is embedded into the IndexCanister Actor.

The client module assumes that is working with a CanDB application, and provided the Index Canister Id, wraps query and update calls, abstracting communication between the frontend and service actor parition(s) where the data is stored, so that the client just calls the Index Canister and says call this API in this partition and the framework does the rest.

CanDB does all this in a way that doesn’t require inter-canister calls, which cuts down on latency and is how the original Supernova demo was able to run query and update calls across sharded data so quickly.

As CanDB is a framework meant for performance and low-latency UX with a frontend, it’s primary use case wasn’t originally intended for integration directly with other on-chain canisters. For that, you’d need to build a Motoko/Rust/Azle CanDB client, but inter-canister latency does slow things down considerably (it would still work though, and I’d be happy to advise someone on building additional clients).

I therefore think the primary value add from this grant would be bringing a canister partitioning & microservices framework/pattern to the Rust community.

There are a few key differences between Motoko & Rust that the implementer will need to consider:

  1. Motoko has the concept of stable variables, which actually live in the heap, and perform automatic, language-enabled serialization to stable memory during canister upgrades, and deserialization back to the heap afterwards. The CanDB framework makes frequent use of stable variables. The Rust implementation of CanDB may want to swap this out for stable memory use and the stable-structures library. Otherwise, if they want to use heap memory to store the data (for performance reasons or other), they need to build a good abstraction for the serialization and deserialization logic during upgrades.
  2. The CanDB entity type uses a compound variant type for dynamically storing different data types. The implementer will need to consider how to replicate this in Rust (with the stable/heap memory considerations described in #1).
  3. CanDB has auto-scaling functionality primarily because the canister heap is currently limited to a memory cap of 4GB (limitation of 32-bit wasm). However, the current canister memory limit is 400GB (entire memory, combination of both heap and stable memory), where stable memory has access to the full 400GB of memory. If the implementer decides to use stable memory Rust stable memory libraries, they can essentially use all 400GB. Additionally, given that the current subnet size is 750GB, does it make sense to turn on auto-scaling if a single canister can essentially be as large as a subnet?
  4. Given the points made in #3, you may actually want to be able to spin up different partitions and limit the size of each partition, rather than worry too much about auto-scaling.

Given the points made around Rust having easier access to stable memory in #3, I think the canister partitioning scheme that CanDB uses might be the most valuable part to port/adapt over to Rust, coupled with replacing the Motoko stable keyword libraries with stable structures.

There are a number of applications that have this single Index Canister to many Service Actor Canisters pattern from OpenChat to Yral, so it would be nice to see some generic and extensible multi-canister frameworks spin up that would reduce the amount of boilerplate developers need to write if they choose this app architecture.

I don’t want to distract away from the grant objective too much, but I might even recommend taking the bones of CanDB and then talking to @hpeebles and @saikatdas0790 from those respective teams to see what superpowers the Rust CDK has that would work well in a Rust adaptation of CanDB. I know Dragginz has a pretty extensible ORM and DB that @borovan’s been building out as well.

There’s also a number of utility APIs in the CanDBAdmin library that would be super helpful in managing a cluster of service actors. For example, an API that performs a paginated upgrade of all of the service actor canisters in a specific partition key range https://www.candb.canscale.dev/CanDBAdmin.html#upgradeCanistersInPKRange.

So I guess for scope:

  1. Taking into account feedback/learnings from other Rust multi-canister projects
  2. port/adaptation of the CanDB module to Rust using stable structures
  3. port/adaptation of the Index Canister pattern
  4. No auto-scaling (auto-scaling is less necessary given that Rust canister stable memory is very large).

That way, a developer can build their backend in Rust or Motoko, and the TypeScript client can consume from either without changing the client API.

2 Likes

To add onto the list of things to consider, when we get wasm64 running in a canister (hopefully soon), heaps are going to grow to the max size of a canister (400GB and counting).

This is another reason to not worry as much about auto-scaling to start off, as over the past 2 years improvements to the protocol are supporting larger and larger canisters.

1 Like

Great discussion here guys, appreciate the input. Yes it would seem with the recent changes to the increase in stable memory that the auto scaling feature is less of a concern right now.

On a slight tangent, I have in mind to create a video that compares creating an endpoint and storing data using AWS vs doing the equivalent in ICP. I want to get it so that it’s actually easier and quicker to do it in ICP with less boilerplate. Right now, it’s very easy to create the equivalent HTTP endpoint but the storage isn’t quite there and so removing boilerplate is something i’d like to see as well. I believe that the ease of setting up this combination of infrastructure is what originally attracted devs to use AWS for their hobby and start up projects and so I want ICP to be able to compete on the same level of convenience. In the end I want to get devs saying “just throw it on ICP, it’ll take a few minutes”. I believe it will drive a lot of potential adoption.

Okay, back to technicalities.

I will look at other projects to see what they’re doing for a multi canister approach and especially Dragginz as the ORM seems to provide the most value in terms of reducing boiler plate. I think leaving out the auto scaling is fine for now though I would still like to do it in the future.

Really my only concern now is using CanDb from a different rust canister - the inter-canister calls would probably increase the latency. It would require some more research on my part but given the 400GB stable limit we could effectively just keep everything on a single canister for now and keep the scope to what @skilesare suggested.

1 Like

For latency concerns it may be worth looking at the composite query scenario from rust. I know canpack make the assumption that a composite query on the same subnet is virtually synchronous and will execute in the same round when going from motoko to rust. Not sure about the reverse.

1 Like