I’ve been following the web3 space for a few years, but only started looking into it more seriously in the last month. My interest has always come from the vision of actually running the web in a decentralized fashion, with micropayments replacing primarily ad-driven monetization models and protocols owned by their users. So I’m glad I’ve come across ICP, which so far seems like the most realistic approach to a general-purpose execution layer for the web.
I’ve been thinking a bit about how ICP could actually be the foundation for building scalable systems with billions of users someday. Figured I should share my ideas here to get feedback and maybe learn about other projects and ideas that I haven’t come across yet. It ended up being a bit of a long post, but I hope it makes for an interesting discussion.
Overview
Juno seems like a great piece of tech that’s very much going in the direction I imagine, so I’ll use that as a reference point in this post. While Juno doesn’t seem to have any auto-scaling beyond a single canister yet, it does have the basic ingredients you need in any scalable app:
- Datastore
- Stateless functions
- Storage
- Authentication
Notably, you want to avoid stateful backends, because anything that has state is incredibly hard to scale. Instead, we’ll solve those problems once in the data store and only use stateless functions that can easily scale horizontally. In the following, I’ll go through all components and describe ideas for making them scalable beyond a single canister.
Datastore
I think Firebase’s NoSQL data store is a good starting point. Let’s represent our data in collections that map a UUID to an arbitrary document (BSON seems like a reasonable data format to me). We also support subcollections, so you might have a document at /posts/[postId]/comments/[commentId].
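To make the data model concrete, here’s a minimal sketch in Rust (one of the languages already used for canisters) of how such a datastore canister could represent collections and subcollections. The type and field names are my own assumptions, not Juno’s actual API:

```rust
use std::collections::BTreeMap;

/// A document is an opaque blob (e.g. BSON-encoded) plus metadata.
struct Document {
    data: Vec<u8>, // BSON-encoded payload
    revision: u64, // bumped on every write (useful for sync later on)
}

type Uuid = [u8; 16];

/// A collection maps UUIDs to documents and may contain subcollections,
/// so a path like /posts/{postId}/comments/{commentId} is resolved by
/// walking collection -> document -> subcollection -> document.
#[derive(Default)]
struct Collection {
    documents: BTreeMap<Uuid, Document>,
    /// Subcollections keyed by (parent document id, subcollection name).
    subcollections: BTreeMap<(Uuid, String), Collection>,
}

#[derive(Default)]
struct Datastore {
    collections: BTreeMap<String, Collection>,
}
```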
Scaling
So far this is similar to Juno’s datastore (plus subcollections, which I don’t think would be too hard to implement). But what if we wanted to store terabytes of data in this datastore, or serve millions of queries per second?
The easiest approach seems to be allowing collections or subcollections to be sharded across different canisters. For example, if we wanted to shard the /posts collection over 8 instances, the root datastore node would just store the canister IDs of the 8 instances rather than the data of the posts collection. Assuming random UUIDs, looking at the last three bits gives you reasonable load balancing and tells you which node is responsible.
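A sketch of that routing logic, assuming random UUIDs and a power-of-two shard count (the types and function names here are hypothetical):

```rust
use candid::Principal;

/// Routing entry for one (sub)collection: either the data lives locally,
/// or it is sharded across a power-of-two number of canisters.
enum CollectionLocation {
    Local,
    Sharded { shards: Vec<Principal> }, // length must be a power of two
}

/// Picks the shard responsible for a document, using the low bits of the
/// (random) UUID. With 8 shards this looks at the last three bits.
fn shard_for(uuid: &[u8; 16], shards: &[Principal]) -> Principal {
    let n = shards.len();
    debug_assert!(n.is_power_of_two());
    let last_byte = uuid[15] as usize;
    shards[last_byte & (n - 1)]
}
```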
What if your collection of posts grows too big for 8 canisters? You could start by replicating the data to 16 new canisters in the background. While the initial replication is in progress, you ensure all writes are replicated to the new canisters as well, and once that’s done you mark the 16 new canisters as the source of truth. To avoid every operation having to go through the root node, you also want some TTL that guarantees a node is authoritative for a subset of documents until a specified time.
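One way to express that guarantee is a lease handed out by the root node, which routers and clients cache and re-resolve once it expires; everything below is an illustrative sketch, not an existing API:

```rust
use candid::Principal;

/// A lease handed out by the root datastore node: `shard` is authoritative
/// for all UUIDs whose low bits match `key_bits` (under `key_mask`)
/// until `valid_until` (nanoseconds since the epoch).
struct ShardLease {
    shard: Principal,
    key_bits: u8, // e.g. 0b101
    key_mask: u8, // e.g. 0b111 for 8 shards, 0b1111 after resharding to 16
    valid_until: u64,
}

/// Callers may cache a lease and skip the root node while it is valid;
/// once it expires they must re-resolve, which is how a background
/// resharding (8 -> 16 canisters) eventually becomes visible everywhere.
fn lease_covers(lease: &ShardLease, uuid: &[u8; 16], now_ns: u64) -> bool {
    now_ns < lease.valid_until && (uuid[15] & lease.key_mask) == lease.key_bits
}
```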
You could also use the above mechanism to create read replicas, although that might not be needed if you’re using query calls that don’t go through consensus anyway, as there would be easier ways to scale those. (On a more general note, could you make ICP nodes sign query calls and stake funds to guarantee correctness? If nodes kept a history of state for a brief period of time, they could then check responses randomly. I guess it’s too expensive to do that in general, but it could certainly work for this data store as long as you keep old values for a few seconds.)
Embedded Functions
You do want the ability to embed small pieces of code directly on your datastore canister for tasks that should be executed synchronously and atomically:
- Authorization rules
- Triggers
- Stored procedures
Juno currently supports Rust and JavaScript compiled into the canister for this. I think allowing you to change these at runtime would be important for a great developer experience. You wouldn’t want to restart Postgres just to change a trigger, would you? I guess Juno could do that since it’s already interpreting the JavaScript anyway, although I would imagine CEL or a full wasmi interpreter (supporting AssemblyScript) would be good options as well.
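A rough sketch of what registering such hooks at runtime could look like on the datastore side, assuming the hook code is shipped as an interpreted blob (a Wasm module for wasmi, or a CEL expression); the types and method names are invented for illustration:

```rust
use std::collections::BTreeMap;

/// When an embedded function runs relative to the write it is attached to.
enum HookKind {
    /// Runs before the write and may reject it (authorization rule).
    BeforeWrite,
    /// Runs after the write has been applied (trigger).
    AfterWrite,
}

struct EmbeddedFunction {
    kind: HookKind,
    /// Interpreted code, e.g. a small Wasm module for wasmi or a CEL expression.
    code: Vec<u8>,
}

#[derive(Default)]
struct HookRegistry {
    /// Hooks keyed by collection path, e.g. "/posts" or "/posts/*/comments".
    hooks: BTreeMap<String, Vec<EmbeddedFunction>>,
}

impl HookRegistry {
    /// Replacing a hook is just a map update; no canister upgrade needed,
    /// which is the whole point compared to compiling hooks into the Wasm.
    fn set_hooks(&mut self, collection: String, hooks: Vec<EmbeddedFunction>) {
        self.hooks.insert(collection, hooks);
    }
}
```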
Offline-first support
Since the datastore is decentralized, writes will always be slower than in a centralized system. That gives you even more reason to follow an offline-first paradigm, where clients keep a local copy of a subcollection that they read from and write to, and the sync happens in the background.
I created a prototype of a data store in Motoko that supports this sync out of the box by keeping a revision counter for each document. That is also useful for avoiding conflicts when multiple clients attempt to write to the same document in parallel (otherwise the last write to the server always wins, even if the client intended to update a much older version of the document). (Note: I hadn’t come across Juno when I wrote that prototype; maybe I could have built it on top of that…)
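The prototype is in Motoko, but the core check is small enough to sketch in Rust as well: a write is rejected if the client’s expected revision is stale, so the client has to merge and retry (names are illustrative):

```rust
enum WriteError {
    /// The document changed since the client last synced it.
    Conflict { current_revision: u64 },
}

struct Document {
    data: Vec<u8>,
    revision: u64,
}

/// Optimistic-concurrency write: only applies if the client saw the latest
/// revision. Otherwise the client gets the current revision back and can
/// merge locally before retrying, instead of silently losing data to
/// "last write wins".
fn update_document(
    doc: &mut Document,
    expected_revision: u64,
    new_data: Vec<u8>,
) -> Result<u64, WriteError> {
    if doc.revision != expected_revision {
        return Err(WriteError::Conflict {
            current_revision: doc.revision,
        });
    }
    doc.data = new_data;
    doc.revision += 1;
    Ok(doc.revision)
}
```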
Token integration
To me the appeal of web3 isn’t just decentralization, but also the ability to easily integrate micropayments and token transfers to build protocols and apps in which incentives are aligned better than in the web2 world (where you use stuff for free, and in return your data is sold and you have to see ads). Therefore, a web3 data store should also have native support for that!
I guess you could just track balances in the data store and then use triggers to implement things like “inserting a new document in collection X costs Y tokens”. But I’m sure there are ways to make this super easy in the data store, e.g. integrations with the ICRC-1 standard. You just need to be mindful that if you call into other canisters during a write operation, you’ll probably want a lock at the document level to handle competing writes gracefully.
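The important detail is the lock: an inter-canister call suspends the write, so another message for the same document can be processed in the meantime. A hedged sketch of the pattern, where `charge_tokens` stands in for an actual ICRC-1 `icrc1_transfer` call and all names are illustrative:

```rust
use candid::Principal;
use std::collections::BTreeSet;

type Uuid = [u8; 16];

/// Document-level locks held across inter-canister calls.
#[derive(Default)]
struct Locks {
    locked: BTreeSet<Uuid>,
}

impl Locks {
    fn acquire(&mut self, doc: Uuid) -> Result<(), String> {
        if !self.locked.insert(doc) {
            return Err("document is locked by a concurrent write".to_string());
        }
        Ok(())
    }
    fn release(&mut self, doc: Uuid) {
        self.locked.remove(&doc);
    }
}

/// Placeholder for the real payment, e.g. an ICRC-1 `icrc1_transfer`
/// call to a ledger canister.
async fn charge_tokens(_payer: Principal, _amount: u128) -> Result<(), String> {
    Ok(())
}

/// Paid insert: take the lock, await the payment (other messages can run
/// during the await, which is exactly why the lock is needed), then apply
/// the write and release the lock.
async fn paid_insert(
    locks: &mut Locks,
    doc_id: Uuid,
    payer: Principal,
    price: u128,
) -> Result<(), String> {
    locks.acquire(doc_id)?;
    let outcome = match charge_tokens(payer, price).await {
        Ok(()) => {
            // ...apply the actual document write here...
            Ok(())
        }
        Err(e) => Err(e),
    };
    locks.release(doc_id);
    outcome
}
```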
Serverless Functions
A huge class of apps can already be built with the above data store alone. The serverless functions I’m talking about here are comparable to Cloud Functions (Firebase) or Edge Functions (Supabase). It’s essentially just a higher-level abstraction that makes it easy to run scalable code on top of ICP, so you don’t have to worry about which canister your code is running on.
I imagine you could have an executor pool, which is an auto-scaling pool of canisters that executes jobs. A job would be specified through the hash of the wasm code file (which the executor can fetch from storage if needed), a calldata blob, and an optional authentication context (e.g. executing on behalf of a specific user or with specific permissions). Jobs could be triggered in a myriad of ways, e.g. through direct invocation by a client, triggers from the data store, calls from a different canister, recursive function calls, timers, or even events on a different blockchain that we listen to with Chain Fusion.
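A job specification along those lines might look like the following sketch (the types are my own assumption of what such an executor pool could accept):

```rust
use candid::Principal;

/// Who the job runs as, if anyone.
enum AuthContext {
    /// Run on behalf of a specific user principal.
    User(Principal),
    /// Run with a named permission granted to the submitter.
    Permission(String),
    Anonymous,
}

/// What gets queued on the executor pool. The executor fetches the Wasm
/// module from the storage layer by hash if it isn't cached locally.
struct Job {
    /// SHA-256 of the Wasm module to execute.
    wasm_hash: [u8; 32],
    /// Opaque arguments passed to the function.
    calldata: Vec<u8>,
    auth: Option<AuthContext>,
    /// Cycles (and possibly other funds) attached by the caller to pay
    /// for the execution.
    attached_cycles: u128,
}
```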
I’m imagining just running this code sandboxed in wasmi, but if you run the same function often with the same authentication context, you could automatically deploy it as a native canister to speed things up (authentication is a bit tricky assuming the code is untrusted, but it should be possible). In the interpreted version you could also overcome ICP’s 40 billion instruction limit by just pausing execution briefly.
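To make the “pausing execution” idea concrete: run the interpreter with a per-message budget, keep its state around, and schedule the next slice with a timer. The `Interpreter` trait below is a stand-in for whatever the sandbox exposes (e.g. wasmi’s fuel metering), not its actual API:

```rust
use std::time::Duration;

/// Assumed interface over the sandboxed interpreter. With wasmi this could
/// be backed by fuel metering; the exact API is not shown here.
trait Interpreter {
    /// Runs until either the budget is exhausted or the job finishes.
    /// Returns `true` when the job has finished.
    fn run_slice(&mut self, instruction_budget: u64) -> bool;
}

/// Executes one slice per ICP message, staying well below the per-message
/// instruction limit, and reschedules itself until the job is done.
fn run_chunked<I: Interpreter + 'static>(mut interp: I) {
    const BUDGET_PER_SLICE: u64 = 1_000_000_000; // conservative slice size
    let finished = interp.run_slice(BUDGET_PER_SLICE);
    if !finished {
        // Continue in a fresh message so we never hit the instruction limit.
        ic_cdk_timers::set_timer(Duration::ZERO, move || run_chunked(interp));
    }
}
```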
By default, the costs for the execution (not just cycles; this could also include costs for transactions on other chains) should be paid by the caller sending gas. But similar to the embedded functions in the data store, you could also have embedded functions in the executor pool that check whether the execution may be billed to a different account (based on what code is executed and by whom).
It would also be great to have an integration with zero-knowledge pools and TEE pools to offload computations where possible, although those won’t allow you to do things like HTTP outcalls or signing transactions.
Storage
I’m surprised that ICP hasn’t embraced IPFS more. IMO it solves caching and verification once and for all by having immutable objects that clients can verify themselves. I think there’s a good chance major browsers will add IPFS support at some point in the form of DNSLink (maybe with a preferred trustless gateway specified in a DNS record as well), as it’s quite easy to implement and does increase security.
Of course, actually getting data onto IPFS from a browser without a central party has been a pain so far. Theoretically you can run a local node in the browser, connect to IPFS with WebSockets and then use Filecoin to pay for pinning, but I haven’t been able to get that to work.
However, ICP makes storing data without a central party trivial. IPFS already splits files into blocks of at most about 1 MB, which you could easily upload into a canister and hash. Scaling storage shouldn’t be too hard: just spawn more canisters when needed. The trickiest bit seems to be finding the canister that stores a specific block in a scalable way, but you could just reuse the data store described above to map block hashes to canister IDs.
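Concretely, the block index could just be another collection in the datastore described above, mapping a block hash to the canister that holds it; a hypothetical sketch:

```rust
use candid::Principal;
use std::collections::BTreeMap;

/// Index state: which storage canister holds which block. In practice this
/// would itself be sharded via the datastore described earlier (e.g. by the
/// low bits of the hash).
#[derive(Default)]
struct BlockIndex {
    /// Keyed by the block's multihash (e.g. SHA-256 digest of the raw block).
    blocks: BTreeMap<Vec<u8>, Principal>,
    /// Storage canisters with free space, used in round-robin order.
    storage_canisters: Vec<Principal>,
    next: usize,
}

impl BlockIndex {
    /// Picks a storage canister for a new block and records the mapping.
    fn assign(&mut self, block_hash: Vec<u8>) -> Principal {
        debug_assert!(!self.storage_canisters.is_empty());
        let target = self.storage_canisters[self.next % self.storage_canisters.len()];
        self.next += 1;
        self.blocks.insert(block_hash, target);
        target
    }

    fn lookup(&self, block_hash: &[u8]) -> Option<Principal> {
        self.blocks.get(block_hash).copied()
    }
}
```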
You could also build IPFS nodes and gateways that communicate directly with ICP API boundary nodes to retrieve blocks and files from ICP again. At that point you should also be able to use existing IPFS gateways to access files, or transfer the files to cheaper storage protocols like Filecoin, depending on what your storage needs are. I wonder if this has been considered before?
Authentication
I haven’t looked too much into Internet Identity, but it seems like a good starting point. If I were to build a developer platform, I’d probably follow Supabase’s model and allow other authentication providers besides Internet Identity, e.g. anonymous logins, passkeys, signing in with web3 wallets directly, etc. That’d certainly be useful for apps that should work well cross-chain. I think OAuth-like flows would also be possible, e.g. allowing user X to modify a subset of the data store on behalf of user Y.
For the above infrastructure to work, you’d also need authentication of canisters, e.g. to verify that a canister is actually part of your trusted executor pool. I imagine this could be solved with a central canister in your “canister cloud” that signs certificates stating “canister with ID XYZ is trusted”.
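Such a statement could be as simple as a signed claim like the one below, checked by any canister that receives a call from an alleged pool member; the signing scheme is deliberately left open (certified variables or threshold signatures come to mind), and all names are illustrative:

```rust
use candid::Principal;

/// Claim signed by the "canister cloud" root: the listed canister is a
/// trusted member of a given pool (e.g. the executor pool) until `expires_at`.
struct TrustCertificate {
    canister_id: Principal,
    pool: String,       // e.g. "executor-pool"
    expires_at: u64,    // nanoseconds since the epoch
    signature: Vec<u8>, // signature by the root canister's key
}

/// A callee checks the certificate before accepting work from a caller.
/// `verify_signature` is a placeholder for whatever scheme the root uses.
fn is_trusted(
    cert: &TrustCertificate,
    caller: Principal,
    now_ns: u64,
    verify_signature: impl Fn(&TrustCertificate) -> bool,
) -> bool {
    cert.canister_id == caller && now_ns < cert.expires_at && verify_signature(cert)
}
```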
Summary
Due to ICP’s subnet architecture and the ability to spawn new canisters programmatically, ICP seems to be the first blockchain that lets you build applications that could automatically scale to more transactions than a single machine could handle. I’m sure the devil is in the details, but it seems to me the fundamental building blocks are already in place. Or am I missing any crucial bottlenecks?
I should note that I’m in no way claiming horizontal scaling is the current bottleneck for web3 adoption. Right now, the bottleneck seems to be building convincing user experiences that actually add value over web2. But it’s still important to think ahead, and I think some of the abstractions I outlined above would also make for a better developer experience, allowing web2 devs to easily build decentralized apps without learning Solidity, Motoko, Rust or similar.