Complete: Bounty #20 - QuickStart Dapp - Scaling With Canisters - 200 ICP, 100 ICP, 50 ICP - Multiple winners

Thank you for the kind words :blush:

For the first question, that’s literally on my whiteboard right now :slight_smile: I have two diagrams, one with an index of indexes and one with simply two index canisters. There are a couple of benefits of simply having two Index canisters, maybe sorted by “frequently used stuff”. I believe that 8gb ought to be enough (famous last words), and worst case scenario sometimes the clients will query both index canisters (instead of always making two queries). I guess we’ll have to grow and see…

Your second question is a bit more complicated, and I don’t have a good answer at this point. There are a lot of factors going in how you decide to index your data, and obviously there are some tradeoffs, one way or the other. My hope for my project is that it’s not a critically real-time system, thus I have the benefit of being able to schedule tasks and as long as they “eventually” complete, the system should still work. In other words, I might be able to have many indexing tasks work on each Bucket, and the main Index would just point towards them. It’s a kind of both distributed storage and distributed computing. The tradeoff is of course the front-end needs to do more queries. Looking at existing projects, that seems to be a good tradeoff to make, as the IC seems uniquely positioned to serve this use case with ease. When in doubt, we’ll “simply” add another layer of caching, trading space for compute time.

1 Like

Hey, as promised I updated my submission with docs, some diagrams, and cleaner code and an instructional tutorial. I will remove the old reply. Thanks! I hope its not too late

1 Like

The summary is from Github
The purpose of this project is to create canisters that have shared ownership. However, this is different than allowing multiple canisters to have control of the canister in the default sense. The problem is some default canister functions give full access to anyone with control (ie. there is only one level of control). A quick analogy is in a joint bank account from bank, you can have a married couple with each of their name on it but each party is able to completely withdraw funds. In a divorce, things can become messy if one party decides to withdraw everything quickly since both parties have full control. Instead, it would be nice to create a joint bank account where maybe people can vote on money spent, split evenly the amounts, divy up the percentage owned etc. In essence if the default settings for multiple users with control over a canister is like a legacy joint bank account,this program is meant to create extra sharing functionality for canisters!

I achieve this by creating a priary canister that creates canisters. This canister retains default control of the new canisters and any customized ownership is kept track in a list in the secondary canisters. Users interface with the secondary canisters through method calls from the the primary one. Essentially, the primary canister has an assoclist (dictionary) that has the users principal ids as a key and a buffer of secondary ids per principal id. This achieves project goal 1.

Primary canister provides indexing information such that a client can distribute prallel calls across secondary canisters directly.

principalids/users can join and unjoin secondary canisters as they please. (However, custom membership, limiting number of principal ids/users, etc. can be added). For now it is just join and unjoin. However, actual calling of canisters isnt from the the principalids/users themselves but from the primary canister. If we remove all control from the primary canister after enough development, the system can be quite trustless as default control would be essentially “black holed”. This achieves project goal 2.

Provide a security interface such that secondary canisters can hold private data from many users but only deliver requests to authorized requesters. Attempt to use as few inter-canister calls as possible.

TO SEE CODES STRUCTURE, SEE DIAGRAMS AT BOTTM OF THIS PAGE. Essentially, is the primary canister, and it creates many instances of NodeCanisters ( These live in /src/Main directory.


@skilesare , any updates on this and the other bounties for the hackathon?

Office hours tomorrow.Hopefully an update today.

1 Like


Technical features of ICSP:

  1. Infinite Capacity ICSP Canister: read and write to the same Canister without having to worry about storage space.
  • Explanation:
    • infinite capacity refers to the infinite creation of the Canister contract (in the case of a sufficient Cycle) , which supports the automatic creation of the Canister when the storage Canister is full and does not block the creation of data writes, smooth and smooth switching of storage destination.
  1. Support CRUD(only support read and write at current version)
  • Explanation:
    • in the business, supports the data to add, delete and check four operations, and in the related operations on the memory of the appropriate optimization, to support the reuse of fragmented memory(Next Version).
  1. One Step Store Data and Two Steps Get Data
  • Explanation:
    • One-step storage: support back-end direct: ignore await store the key and value into ICSP Canister. When store (key, value) , do not have to wait for the return value, which creates the convenience of storage.
    • Two-step get : first obtain from the ISP which storage unit is stored in, and then obtain metadata from that storage unit(bucket).
  1. Cycle automatic monitoring
  • Explanation:
    • ICSP Heartbeat actively triggers the monitoring of ICSP and storage monomer Cycle balances and automatically Top Up, so the user only needs to monitor the ISP’s ICP balances.

Welcome to check out !


Awesome and pure key/value store! I’d love to see you replace blob with CandyLibrary so you can store all kinds of data!

1 Like

No, please don’t replace anything with CandyLibrary. The native types exist for a reason.

There’s huge overhead with variant types, as we’ve discovered today in our project. I couldn’t think of anything worse than having a “variant type of all types”

What did you figure out? I thought it was like two bytes as long as you kept it under 256 entries?

And how else are you doing hierarchical data?(If you are)

I can’t tell you exactly because it’s a black box. We replaced variants with text fields and reduced our wasm function count by 600, from 5600 down to 5000.

It wouldn’t matter if there wasn’t a 6000 function cap on the wasm binary.

wasm-objdump and wasm-opt have helped us a lot

Good to know. Thanks. We’re doing a bunch of json and whatnot and nested documents that seem to require variants of some sort.

Okey, I will try it !

1 Like

It’s with great pleasure I present the fruits of weeks of labor: Scaled Storage


Before starting this bounty, I knew I wanted to build a general solution that could be easily added into any rust project.

While working on the bounty, the following goals drove most decisions I made:

  1. Simplicity
  2. Ease of use
  3. Storage efficiency
  4. It should work as an imported library.

Scaled Storage is a generic distributed hash table tailored for the internet computer. It can scale to possibly infinite amount of canisters, with a worst case scenario of one inter-canister calls and usually a best case of zero (I’ll got into this later). I’ve currently tested it to 10 canisters. The client side never needs prior knowledge of all canisters holding data, but instead just the canister id of any one of the canisters.


  1. Imported as a Rust library
  2. Scales up or down depending on developer defined behaviour.
  3. Quine” style canister replication. All canisters are functionally alike and use the same code.
  4. House keeping operations (migrations, scaling up and down) are abstracted away.
  5. There isn’t a “primary”, “index” or “secondary” canister, any request can be taken from any canister.
  6. Tries to reduce inter-canister calls.

Developer UX

  1. Developer imports the scaled_storage library
  2. Copy-pastes a few predefined scaled_storage operations.
  3. Uploads the canister’s WASM using ss_uploader.

How It Works
Scaled Storage uses a consistent hashing algorithm to map keys to their appropriate canisters.

When a key is accessed through the exposed functions, the result is either the value or the canister where it is located, In the latter case the request is simply forwarded to that canister. Here’s a code excerpt:

match canister.with_data_mut(key.clone(), |data| data.clone()) {
   NodeResult::NodeId(canister_id) => {
    //forward request to canister_id or return to client side
   NodeResult::Result(result) => {
    //return result to client side

Scaled Storage provides the following aptly named functions:

  1. with_data_mut
  2. with_upsert_data_mut

Upgrades & Migrations
When nodes are added or deleted, the changes are broadcast to all other nodes.

pub enum CanisterManagerEvent {

These nodes then update their node list and hash function. The affected values that need to be migrated are converted to a generic byte array and sent to the relevant node.

#[derive(CandidType, Deserialize, Debug, Clone)]
pub struct MigrateArgs {
    #[serde(with = "serde_bytes")]
    data: Vec<u8>,

///candid file
type migrate_args = record {
    data: blob;

The consistent hashing algorithm in question is Anchor Hash which has “high key lookup rates,and a low memory footprint” according to the authors of its paper.

This particular algorithm guarantees that given N number of nodes and K number of keys, the hash function distributes K/N number of keys to each node, achieving uniformity. It also reduces the amount of inter-canister calls necessary by:

  1. Only needing to migrate values to a newly created canister (instead of having to migrate to other nodes).
  2. Only needing to migrate values from a deleted canister without disrupting the location of other keys


Scaled Storage can be described as a linked-list of nodes with each node having the following:

  1. The previous node id
  2. The next node id
  3. A consistent-hashing function mapping keys to a canister
  4. A LIFO list of all nodes.
  5. The underlying data

I chose this linked list structure over an “index node” → “secondary node” structure for the following reasons:

  1. Canisters/Nodes have exactly the same code and behaviour.
  2. Minimal Responsibilities - Each node is only responsible for its subsequent node and serves as its sole controller (with the developer being the controller of the first node). This reduces surface area for any accidents and gives each node a sole scape-goat.
  3. It reduces memory fragmentation - new canisters are only created after prior ones are full. Canisters cannot be removed in an arbitrary order, but must first have all subsequent nodes removed first.
  4. Reduces inter-canister calls since there’s no need to check the memory usage of all canisters, when the last canister wants to upgrade, it can simply assume all other canisters are full.

Issues and Solutions

  1. Large binary from in-lining wasm via include_bytes!:
    Wasm code for creating new canisters is added in heap memory at runtime via an init_wasm operation. I’ve created wasm_uploader for this.
  2. Payload size too large for init_wasm operation:
    The wasm binary is chunked and sent one at a time.

type wasm_init_args = record {
    position: nat8;
    wasm_chunk: blob;


  1. Update operations take at most two calls; one made by the end user and a second potential inter-canister call if the hash function points to another canister.
  2. Query operations take at most three calls; an additional call made by the end user.
  3. Since operations can be taken from any canister, operation results could contain the id of the canister and stored/cached on the front-end. Subsequent operations could then be directly requested using the stored canister id.

Tests and examples

  1. Scaled Storage Simple Scaling Test
    This demonstrates the scaling capabilities of scaled storage. The default test demonstrates 10 canisters storing simple string types.
  2. IC Snippets
    This is a more complex example, demonstrating storing more advanced data structures, and authentication to prevent unauthorised access:
        .with_upsert_data_mut(snippet_key.page_id(), |page| {
            match page.get_snippet(&snippet_key) {
                Some(snippet) => {
                    let owner = snippet.owner;
                    if owner != ic::caller() {
                        return Err("Auth Error");
               `  ...

Future updates and issues:

  1. Consistent hashing algorithm change

Although Anchor hash prevents unnecessary data migrations, it distributes data equally amongst nodes, typically this is a desirable effect, but for Scaled Storage this means “filled” nodes still have data inserted in them (although the possibility of that reduces as the number of nodes increases).
A better hash function would have the following properties:

a. Given N nodes, only distributes keys to node N.
b. When number of nodes are increased from N to N+1, only keys in N have to be migrated to N+1.
c. When number of node are reduced from N to N-1, only keys in N are migrated to N-1

This would limit the number of canisters to be migrated to only one; either the prior node (downgrades), or the next node (upgrades).

  1. A rewrite implemented as a finite state machine: easier to read, more bug free code since state transitions can be validated at compile-time, easier to write rollback operations when state changes error.
  2. Macros to reduce the amount of boilerplate to write.
  3. Downscaling hasn’t been implemented.
  4. More operations on the underlying data apart from inserting and getting data. For example map reduce operations that require data from every node.
  5. Authenticate house-keeping requests between nodes. Currently it is trivial for a bad actor to pose as a newly created node, receiving migrated data.
  6. Query calls cannot be forwarded currently, the first demo’s query forwarding only works locally, pending inter-canister query calls. For other demos, the canister id is returned to the client.
  7. Cycles sent to one canister should be automatically distributed amongst all canisters.
  8. Scaled storage uses heartbeat, it may be possible to avoid this and instead “hitchhike” on other requests, reducing cycle consumption.

There’s still significant work to be done, but the foundation for a generic easy to use scaling solution has been created. Thanks for reading.

  1. Scaled Storage repo: Github
  2. Scaled Storage library crate: Crates
  3. Scaled Storage uploader crate: Crates
  4. Canister for the simple scaling test: qjikg-riaaa-aaaag-aaf7a-cai

Curious, was this wasm function count decrease with respect to variants observed strictly when writing your code directly in Motoko, or when writing your code in Go and then using that tool to compile it to wasm?

First off, I want to say that I absolutely love this scaling idea and what you’ve done with data partitioning, especially in the context of canister reliability. Very cool work, and I can tell you put a lot of hard work into designing and building this!

If one can distribute their data enough that one canister has the same management capacity as all canisters, then it becomes much harder to DDOS an entire application (you may just take one or more canisters offline).

Although I love the creativity of this idea, I see several potential issues with the scalability of the design in terms of the involvement of inter-canister update and/or query calls required for all the nodes to remain in sync. The fact that multiple canisters would constantly be rebalancing with each other (using inter-canister update calls) quite often means that when under significant load the entire application would most likely slow down and get “behind” on rebalancing.

In this situation imagine Canister A trying to update and re-balance with Canister C, Canister C simultaneously trying to re-balance with Canister B, and Canister B is trying to re-balance with Canister A. We end up with inter-canister update call 2+ second “locks” that lock up the rest of re-balancing efforts.

In a large enough application with enough of these requests coming in, I could see this eventually leading to application failure.

A few questions.

  1. When the number of nodes is increased from N to N+1, how long does the complete re-paritioning take? How does this change as we go from N=3 to N=50?

  2. If you have N=10 canister nodes and you hit each canister with the exact same ~500 requests in a second, how long does it take to re-balance? What about ~1000 requests/sec?

  3. Does Anchor Hashing allow me any control over knowing the exact canister node/partition that my data was sent to? For example, can I easily ensure that all data for a specific user is in a specific canister, or do I need to spin up a new scaled storage actor for each user in order to do this?

  4. How is the client expected to know of and randomly pick which canister to interact with at any one time given that the management canister generates random canister ids?

1 Like

We were at about 5800 wasm functions (not using any optimisers at that point). We had about 17 entities that used variants, ie…

#common; #uncommon; #rare

I changed the go generator to skip variants and just reference them by ID (like a normal database relation), then recompiled and we were at 5200.

Going through the wasm-objdump logs and grepping for #common returned way more lines than I would have expected. It was at about 40 per variant, which makes sense if we dropped about 600 funcs.


Hi, thanks a lot for the praise.
About re-balancing, only the last canister has sole responsibility concerning upgrades or downgrades. So it’s impossible for an upgrade and downgrade to occur simultaneously. Also migrations (or re-balancing if I understood you correctly) only occur two ways: To the last node and from the last node, no other canister is affected in the re-balancing operation.


  1. This needs benchmarks, i can’t really say as it depends on the size of the values stored: a simple string contains far less bytes than a struct or a hashmap. Also there can be optimisation made to make sure the data migrated is chunked at little as possible; reducing the time it takes to complete a migration.
  2. Rebalancing should be very rare; it happens only when a canister is filled up or is empty. Also since it’s developer defined behaviour, you can only benchmark this with an actual concrete implementation.
    For example it’s possible to define that the canister should be considered filled up when it has only 2GB memory remaining and empty when it has 1.9GB; this is terrible logic and nobody in their right mind would do that, but alas with great power comes…Anyways you can imagine if there are a lot of creation and deletion of keys, canisters would be created and deleted wildly but again it is the responsibility of the developer using the library to use good upgrade/downgrade logic.
  3. Yes, you simply call a function with the data’s key and you get the exact canister id it is located (happens in nanoseconds).
  4. It’s not up to the client to know anything. Ideally the system looks like one canister to the client.
    To explain further, a client just needs to know one canister id, and update operations work seamlessly since the canister that the client knows, would forward the request automatically to appropriate canister.
    For query requests though, due to to query calls not being supported on the IC yet, requests can’t be forwarded and instead the appropriate canister id is returned to the client for another query request.

A good practice would be to return the canister id of both query and update requests (Check out the second demo on how I did it). This canister id can be cached in localstorage and called later:

  1. If the cached canister id points to a now deleted canister, the client can simply call the first canister (Since it is guaranteed to always exist).
  2. Even if the data has since been migrated to another canister, its new canister id is returned.

This looks fantastic. Amazing job!

1 Like

Update for the description of the second example (ic_snippets).
The goal was to be able to serve paginated data which is somewhat difficult given we’re storing data on what is basically a key value store. Here’s how snippets are stored:

Each page has a max number of snippets before a new page is created. So now the client can request for a page and request for prior or subsequent pages on demand

1 Like

@simdi.jinkins Thanks for correcting my misunderstanding of how and when re-balancing happens (when node “full” threshold is hit) and taking the time to answer my questions.

I hope you don’t mind me asking few follow-ups :sweat_smile:

(Follow up Q1)

I’m still not quite clear on the details of how when a new canister is spun up (we increase # of nodes from N → N + 1 → N + 2), how the client becomes aware not just of how many canister nodes exist and the canister node partition to send the request to, but the specific canisterId corresponding to that node.

When the management canister creates a new canister parition node, it generates a completely random canisterId along with that node, one where the application developer cannot fix, control, or predict what the canisterId will be (I wish we could do this!).

How would a client then be able to guess this random canisterId when making requests if it has no control over the canisterId generation? There must be some “fixed” initial canisterId representing Canister A in the client application that the client can first use to make contact with the data storage solution, correct?

I see this as your solution to the issue I mentioned above. The client is told a single canister that they will interact with, and they use this initial canister to then get all information about the current array of canister ids and the current hash ring that they can then use AnchorHash algorithm with on the client side. If a re-balancing happens during the middle of a client session, the client needs to re-fetch the new array canister ids and the new hash ring.

However, this approach means that all clients are still originally directed to the same initial canister for fetching this cluster data, and that each application client will have a hardcoded canisterId that they will use, since they can’t predict the canisterIds of dynamically created canister nodes.

Is this a correct assumption?

(Follow-up Q2)

About a month ago, I had these exact same questions about data transfer speeds (inter-canister, ingress, egress, etc.), and the question got pretty much ghosted and went unanswered :smile:

Whether no one has been brave enough to test this out or just hasn’t made their findings public, there are multiple anecdotal comments from developers who have mentioned that downloading 1-2 MB of data from the IC can take a few seconds in this thread.

By the way, consistent hashing is pretty cool - I didn’t know much about it before reading your solution → this article has a pretty neat breakdown with visualizations on how one might naively implement consistent hashing and how it works in general.

Looking into consistent hashing, it looks like for each new canister node being spun up the rebalancing cost is O(K/N), where K = the number of keys and N = the number of nodes.

This would mean that as you scale up, for each new node added the performance cost of data transfer redistribution remains constant.

One additional question (new)

  1. Thinking about consistent hashing and the way that it further subdivides the hash ring as it scales, how does consistent hashing handle unbalanced data coming in, or more specifically how does the AnchorHash algorithm handle it? Let’s say I’m seeding my scalable storage database and all of my keys are inserted in monotonically increasing order. How would that affect this architecture?