Long Term R&D: Subnet splitting (proposal)

Subnet splitting (one pager)

Objective

The Internet Computer is designed to have unbounded capacity by scaling out across subnet blockchains. Each individual subnet, however, has limited capacity:

  • There is a bound on how large the replicated state (the combined state of all canister smart contracts on the subnet) can grow
  • There is a limit on the number of canister smart contracts that can be installed on the subnet
  • Every subnet has one blockchain, and blocks are of bounded size, so the rate at which the subnet can accept update calls is bounded
  • Every replica of a subnet must process all update calls that reach that subnet, so the subnet's processing capacity is bounded

The load on the different subnets varies wildly. For instance, subnet jtdsg has a replicated state size of 170 GB, while most other subnets hold less than 10 GB. As it stands today, there is no convenient way to balance load between subnets.

We propose to address this issue by introducing a new NNS proposal that would “split” a subnet into two subnets. The replicas are divided into two groups, each of which becomes a separate subnet. The canisters are likewise divided between the two subnets, such that each of the two “child” subnets carries roughly half of the load that the parent carried before the split. Since all replicas already hold the state of all canisters that were on the parent subnet, no slow transfer of state is required, and the subnet downtime due to splitting should be minimal.
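
To make the idea of an even split concrete, here is a minimal sketch of one way a proposer might choose the canister assignment. It is an illustration only: the proposal does not prescribe a balancing algorithm, and the function name `split_canisters`, the use of state size as the only load metric, and the canister ids are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def split_canisters(state_bytes: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Greedy split: take canisters largest first and always give the next
    one to the child that currently holds less state.

    `state_bytes` maps a (hypothetical) canister id to its state size.
    """
    child_a: List[str] = []
    child_b: List[str] = []
    load_a = 0
    load_b = 0
    # Sort by size descending; break ties by id so the result is deterministic.
    for canister_id, size in sorted(state_bytes.items(), key=lambda kv: (-kv[1], kv[0])):
        if load_a <= load_b:
            child_a.append(canister_id)
            load_a += size
        else:
            child_b.append(canister_id)
            load_b += size
    return child_a, child_b


# Hypothetical sizes in GB, chosen only for the example.
example = {"c1": 70, "c2": 50, "c3": 30, "c4": 20}
print(split_canisters(example))  # (['c1', 'c4'], ['c2', 'c3'])
```

In this example child A ends up with 90 GB and child B with 80 GB. A real assignment would likely also weigh compute load and message traffic, and might be constrained to contiguous canister id ranges rather than individual canisters.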

Both child subnets are half as large as the original subnet, which means they could be less decentralized / secure. To address this, it seems prudent to only split large subnets. A small subnet could be split by first adding new replicas to it.

Why this is important

If there is too much load on a single subnet, the canister smart contracts on that subnet will suffer from degraded performance. This has already happened: subnet pjljw has been under significant computational load multiple times, which led to all canisters on that subnet experiencing higher latencies.

Outline of proposed technical solution

To reduce the implementation effort, the following design relies heavily on existing mechanisms such as replica upgrades. The overall design is a trade-off between canister downtime and ease of implementation; possible improvements that could be taken up at a later stage are mentioned in the text.

The following steps describe the procedure to split a parent subnet A into two equal-size child subnets A and B. Even though one of the child subnets inherits the identifier of the parent subnet, both subnets will operate under new threshold keys after the split.

  1. Expand the parent subnet: Nodes are added to the parent subnet A using existing NNS proposals until it reaches at least twice the regular size for this type of subnet. The exact number of nodes will be determined such that a >⅔ threshold-signed certification in the expanded parent subnet guarantees that at least one honest node in each of the child subnets has a full copy of the state.
    One or more NNS proposals will add new nodes to subnet A. Once accepted, the new nodes will fetch the full state from existing nodes as usual.
  2. NNS proposal to split subnet: A new NNS proposal type will be created, where the proposal describes which nodes and which canisters are moved to which of the two child subnets after the split. Once accepted, the registry will mark the subnet with a new “splitting” flag that temporarily prevents concurrent changes to the subnet (e.g., moving nodes or canisters in and out of the subnet) until after the split.
  3. DKGs for child subnets: The NNS subnet performs DKGs to generate fresh threshold keys for both child subnets A and B, assigning key shares to the nodes assigned to A and B.
  4. Parent subnet stops and creates final CUP: Parent subnet A stops processing update calls and creates a final catch-up package (CUP), as is currently done before a replica upgrade. Note that the parent subnet continues to process query calls, so that canisters are essentially running in “read-only mode”.
  5. NNS obtains the final CUP of the parent subnet: After the splitting proposal is executed, the NNS will attempt to obtain the final CUP of the parent subnet. Once it has this CUP, the NNS will execute the split and update the registry to replace the parent subnet with the two child subnets. For each child subnet, the NNS constructs a “genesis” CUP in the registry, which tells the subnet which state to start from. This genesis CUP contains the state from the final CUP of the parent subnet, meaning that the child subnet continues from the final pre-split state, and it also contains the newly generated threshold key material. Additionally, the “routing table” (which maps canister ids to subnets) is updated to split the canisters over the two child subnets.
    There are several ways in which the NNS could obtain the final CUP of the parent subnet. Ideally, it securely fetches the CUP itself from the parent subnet, but a simpler intermediate solution might be to deliver this CUP via a second NNS proposal that voters can verify.
  6. Child subnets A and B restart: All replicas in the child subnets restart, without erasing their execution state (as they would during an upgrade), from the genesis CUPs found in the registry. All replicas purge the state information of canisters that are not assigned to their subnet.
    Special care must be taken with in-transit cross-subnet messages and responses from and to canisters that moved to child subnet B. Messages that were in the outgoing streams of these canisters at the moment the parent subnet was stopped continue to be offered by child subnet A. Messages and responses on incoming streams, however, will be met with a new REJECT signal until the sending subnet updates its routing table to point to child subnet B. To preserve ordering guarantees, subnet B initially runs canisters in the “starting” state, meaning that all open call contexts of a canister have to be closed before it can accept new calls. Once all call contexts of a canister are closed, the canister can transition into the running state and continue processing new calls as usual.
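
The routing table update in step 5 and the REJECT behaviour in step 6 can be sketched with a toy model. This is a simplified illustration under stated assumptions, not the replica's actual data structures: the table is modelled as a plain mapping from individual canister ids to subnet ids, and the names `split_routing_table` and `handle_incoming`, the "ACCEPT" label, and all ids are hypothetical.

```python
from typing import Dict, Set


def split_routing_table(
    routing_table: Dict[str, str], parent: str, child_b: str, moved: Set[str]
) -> Dict[str, str]:
    """Return a new table in which every canister in `moved` is routed to
    `child_b`. Canisters not listed stay on the parent id (child subnet A)."""
    new_table = dict(routing_table)
    for canister_id in moved:
        if routing_table.get(canister_id) != parent:
            raise ValueError(f"{canister_id} is not hosted on subnet {parent}")
        new_table[canister_id] = child_b
    return new_table


def handle_incoming(target: str, receiving_subnet: str, routing_table: Dict[str, str]) -> str:
    """Decide what a subnet does with a message arriving on an incoming stream.

    A sender that still uses the pre-split table addresses moved canisters
    at child subnet A, which no longer hosts them and answers with REJECT.
    """
    if routing_table.get(target) == receiving_subnet:
        return "ACCEPT"
    return "REJECT"


# Hypothetical pre-split table: c1 and c2 live on parent A, c3 elsewhere.
before = {"c1": "A", "c2": "A", "c3": "X"}
after = split_routing_table(before, parent="A", child_b="B", moved={"c2"})

print(handle_incoming("c2", "A", after))  # REJECT: sender used the stale route
print(handle_incoming("c2", "B", after))  # ACCEPT: sender has the updated table
print(handle_incoming("c1", "A", after))  # ACCEPT: c1 stayed on child subnet A
```

The sketch leaves out the outgoing streams that child subnet A keeps offering and the “starting” state on subnet B; both would need to be modelled to reason about the ordering guarantees.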

Discussion leads

The motion proposal is driven by @derlerd-dfinity1, @gregory, and @Manu. Other team members will also be available for discussion.

Skills and Expertise necessary to accomplish this

Being able to split subnets is clearly a broad R&D effort. Despite the high-level design presented above, we expect that many open questions will need to be answered. Answering them will require the involvement of many teams across the IC stack. It will also require broad input from the community, to make sure we end up with a solution that is usable for both canister developers and end users and that meets the community's expectations.

What we are asking the community

  • Review comments, ask questions, give feedback
  • Vote to accept or reject the NNS motion
  • Participate in technical discussions as the motion moves forward