Complete: ICDevs.org Bounty #20 - QuickStart Dapp - Scaling With Canisters - 200 ICP, 100 ICP, 50 ICP - Multiple winners

Hey @skilesare , great timing with this bounty. I had “research canister scaling” as a grant application todo, so I took a stab at it. Here’s a brief description of the project. There’s also a screen cap with the front-end, demoing the main concepts of the app.

Architecture: I went for the full scaling solution, where clients upload content to many Buckets. The Buckets build indexes from the content they receive and periodically (every 5s for the demo, but configurable) send them to an Index canister. The Index canister tells the front-end which canisters to upload new content to, based on the indexing strategy (more below). It also serves the front-end an index of canister IDs, built from the indexes received from the Buckets. So the front-end first requests the “index” for #dogs, then queries every canister where #dogs content was recorded, and displays the list.
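
To make the lookup flow a bit more concrete, here is a minimal sketch of the tag index the Index canister could keep. It is plain Rust with no IC dependencies, and every name in it (`TagIndex`, `merge_bucket_report`, `buckets_for`, string canister IDs) is made up for illustration and not taken from the repo.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a canister principal.
type CanisterId = String;

/// Index-side view: tag -> set of Buckets that hold content for that tag.
#[derive(Default)]
struct TagIndex {
    by_tag: BTreeMap<String, BTreeSet<CanisterId>>,
}

impl TagIndex {
    /// Merge the tags a Bucket reports on its periodic sync.
    /// Reporting the same tag twice is harmless because a set is used.
    fn merge_bucket_report(&mut self, bucket: &str, tags: &[&str]) {
        for tag in tags {
            self.by_tag
                .entry((*tag).to_string())
                .or_default()
                .insert(bucket.to_string());
        }
    }

    /// Canister IDs the front-end should query for a tag (sorted, possibly empty).
    fn buckets_for(&self, tag: &str) -> Vec<CanisterId> {
        self.by_tag
            .get(tag)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

The front-end would call something like `buckets_for("#dogs")` first and then query each returned Bucket directly.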

To demo “user access” I chose a trivial top-to-bottom approach. “Moderators” are added to the Index canister (based on a principal) and the Index sends the “moderator list” to every Bucket. By default the Buckets only serve content created by the requester, or by Anonymous. If a moderator is added, however, they can view every piece of content uploaded to that Bucket.
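
The access rule above can be sketched roughly like this. It is a simplified, IC-free illustration: principals are plain strings, `"anonymous"` is a placeholder for the anonymous principal, and the type and method names are assumptions rather than the repo's actual code.

```rust
use std::collections::BTreeSet;

/// Placeholder for the anonymous principal (assumption for this sketch).
const ANONYMOUS: &str = "anonymous";

struct Entry {
    owner: String,
    tag: String,
}

struct Bucket {
    moderators: BTreeSet<String>,
    entries: Vec<Entry>,
}

impl Bucket {
    fn new() -> Self {
        Bucket { moderators: BTreeSet::new(), entries: Vec::new() }
    }

    fn add_entry(&mut self, owner: &str, tag: &str) {
        self.entries.push(Entry { owner: owner.to_string(), tag: tag.to_string() });
    }

    /// Replace the local moderator list with the one pushed by the Index.
    fn set_moderators(&mut self, list: &[&str]) {
        self.moderators = list.iter().map(|m| m.to_string()).collect();
    }

    /// Moderators see everything; everyone else sees their own
    /// entries plus the ones created by Anonymous.
    fn can_view(&self, caller: &str, entry: &Entry) -> bool {
        self.moderators.contains(caller)
            || entry.owner == caller
            || entry.owner == ANONYMOUS
    }

    /// Tags of all entries the caller is allowed to see.
    fn visible_tags(&self, caller: &str) -> Vec<&str> {
        self.entries
            .iter()
            .filter(|e| self.can_view(caller, e))
            .map(|e| e.tag.as_str())
            .collect()
    }
}
```

Replacing the whole list on each push (instead of only appending) means a removed moderator loses access as soon as the Bucket receives the next update.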

I’ve used “entries” here, to denote pieces of content. This could be anything that we can track and quantize on a bucket. It could be megabytes for file storage, live sessions for proxy canisters, users served for game realms, etc. The 20 entry limit was chosen just so we can see the spawning in a live environment.

Implementation: The study is written in Rust, with a React mock-up for the front-end.

The canisters rely on heartbeat() functionality, as I wanted to also test this at scale. There are a ton of ways to optimize the flows, and the architecture could probably support having heartbeat() functionality just on the Index canister. Or, better yet, a dedicated heartbeat-enabled canister for the entire project.

Code organization: I tried to be as non-opinionated as possible. There are a lot of great production-ready projects out there that one could choose to get inspiration from on file organization. I chose to keep it as simple as possible, so that people reading the code focus on the IC stuff and not on implementation details. A lot of things could obviously be optimized and better organized.

Both the Index and Bucket canisters have 4 main source files:
lib.rs deals with canister settings and IC-related calls (queries and updates);
businesslogic.rs deals with … business logic. Here lies the main impl for most of the functionality;
env.rs is an adaptation from @hpeebles’ starting-project and deals with helpers for the cdk API;
lifetime.rs deals with pre and post upgrades and the heartbeat function.

The front-end is simply thrown together to demo the canister workflows, nothing to write home about.

Key things to note when playing with the demo (some of them can hopefully be seen in the video below):

  1. There are two indexing strategies implemented - FillFirst and BalancedLoad. The default one is BalancedLoad. The front-end first requests a list of can_id to upload content to, and then calls the first canister in the list. We can imagine the Index canister using a multitude of indexing strategies, based on business needs. (An interesting one would be grouping content so that as few canisters as possible need to be queried on content display.)

  2. The Index canister maintains a number of metrics. The main thing to notice is the relation between Free Slots, Desired Free Slots and Planned Slots. Free Slots are computed every 5 seconds. When Free Slots becomes lower than the Desired number, a new bucket is planned for and added to the spawn queue. We count the Planned Slots as well, so that we don’t add too many canisters if the spawning process takes ~4-5 seconds and heartbeat() gets called multiple times in the meantime.

  3. When sending multiple tags in a short time, notice that even though the Indexing Strategy wants to add content to the least-loaded canister, the re-indexing only happens once every 5 seconds. So multiple entries will be sent to the same canister. (This turns out not to be a problem in larger deployments.)

  4. Adding a moderator to the Index canister gets propagated to all the Buckets on the next heartbeat update. This would probably need to be as close to synchronous as possible, since a real-life implementation would also require deleting moderators, and this approach would prevent weird edge cases where removed moderators could still edit content on slow-to-update Buckets.
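
Points 1 and 2 can be sketched roughly as below. This is a simplified, IC-free illustration under a few assumptions of mine: FillFirst is taken to mean "fullest Bucket that still has room goes first", BalancedLoad means "least-loaded Bucket first", and new Buckets are planned whenever Free Slots plus Planned Slots drops below the Desired number. The names (`BucketInfo`, `upload_targets`, `buckets_to_plan`) are made up for the sketch; only the 20-entry limit comes from the demo.

```rust
/// Hypothetical per-Bucket metrics kept by the Index.
struct BucketInfo {
    id: String,
    entries: u32,
}

/// Entry limit per Bucket used in the demo.
const MAX_ENTRIES: u32 = 20;

enum Strategy {
    FillFirst,
    BalancedLoad,
}

/// Bucket IDs in upload preference order, skipping full Buckets.
/// The front-end would upload to the first ID in the list.
fn upload_targets(buckets: &[BucketInfo], strategy: &Strategy) -> Vec<String> {
    let mut open: Vec<&BucketInfo> = buckets
        .iter()
        .filter(|b| b.entries < MAX_ENTRIES)
        .collect();
    match strategy {
        // Fullest Bucket with room left comes first.
        Strategy::FillFirst => open.sort_by(|a, b| b.entries.cmp(&a.entries)),
        // Least-loaded Bucket comes first.
        Strategy::BalancedLoad => open.sort_by(|a, b| a.entries.cmp(&b.entries)),
    }
    open.into_iter().map(|b| b.id.clone()).collect()
}

/// How many new Buckets to add to the spawn queue on this heartbeat.
/// Planned slots count as capacity, so a Bucket that is still being
/// spawned is not requested again by the next heartbeat.
fn buckets_to_plan(free_slots: u32, planned_slots: u32, desired_free_slots: u32) -> u32 {
    let expected = free_slots + planned_slots;
    if expected >= desired_free_slots {
        return 0;
    }
    let missing = desired_free_slots - expected;
    // Round up to whole Buckets.
    (missing + MAX_ENTRIES - 1) / MAX_ENTRIES
}
```

Because the ordering is only recomputed when the Index gets fresh numbers from the Buckets, several uploads in the same 5-second window land on the same first ID, which is the behavior described in point 3.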

Play around with the demo, and let me know if there are any questions.

Note: please do not use this code as is in production!!! This was approached as a scalability study, and it is not optimized, and not tested well enough to be production ready. Feel free to use the code as you want, but please make sure it’s tested and stable before using this with real stakes!

Repo:

Video Demo:
