@Hazel, @skilesare, for your protocols, did you end up deciding to stay single-subnet? This subnet topic caught me off guard. I wasn't expecting "high availability" to be a concern in blockchains; then the Solana incidents happened, and I didn't hesitate to leave that chain. I literally just woke up thinking about the single-subnet / multi-subnet choice. Guys, any insights into worst-case scenarios you've thought about, and your mitigations (those you could share openly), would be appreciated. What are the failure modes of a single subnet? What's at stake for a protocol that's integrated with ckBTC? Please advise.
I can't see a good analogy between cloud provider regions and zones on the one hand, and subnets and replicas on the other.
For one, there is no concept of region on the IC (as in a part of the network within which communication is cheaper; or across which pricing may be different).
Second, IC subnets are replicated by default: your code already runs across at least 5 regions without you having to do anything (or having much choice about it). So (ignoring any future low-replication or geography / jurisdiction constrained subnets) a single subnet already transcends the concepts of region or zone.
On the other hand, regions and zones hold thousands and thousands of machines each. A subnet is a single virtual machine, so a subnet is nowhere near a region or even a zone in terms of compute / storage / whatever capacity.
My mental model is that a subnet is simply a virtual machine. A mindbogglingly highly replicated, highly available, trustless, tamper-resistant virtual machine, but a single machine.
So whatever reasons you may have to shard a traditional application across multiple machines (apart from replication and availability) also apply to sharding an IC dapp across subnets. These reasons include Severin's two examples above: scalability and trust / security domains (e.g. in a large / important enough application you'd usually have the DB on a different, more tightly controlled machine from your frontend or backend servers).
Thanks for the detailed write-up. True, the analogy isn't that good, but I couldn't come up with anything closer for fleshing out the three-axis trade-off (cost, complication, reliability) at increasing levels than how one chooses to deploy software on processors these days. For ensuring high availability of software targeting the CPU/GPU abstraction (not mainframe, not quantum either) today, independent of marketing names, the current choices are more like: fully distributing the software as in Polkadot, Ethereum, Bitcoin, etc.; grouping machines into clusters in data centers; grouping data centers into zones; grouping zones into regions; grouping regions into multi-regions; and grouping public clouds into multi-clouds. The granularity of the choice depends on the nature of the software architecture, whether it is distributed, and so on. I'm not going into the Nakamoto coefficient or comparing chains; that's not my purpose here. In the absence of any stated number of nines for the trust level of a single subnet versus multiple subnets, I'm seeking to understand, at a high level, what specific things can be stated about launching on a single subnet vs. multiple subnets, as guidance from the makers of the IC.
Thanks a lot for all the input so far. Just a brief update: we decided to look into this topic more broadly and therefore have put the price increase of message memory on hold for now.
Thank you! Unity's recent disaster has proven that technologies have a much lower barrier to exit than their makers think. The last thing we need is to lose talent from the IC ecosystem in this bear market. Kudos to the leadership.