Long Term R&D: Subnet splitting (proposal)

1. Summary

This is a motion proposal for the long-term R&D of the DFINITY Foundation, as part of the follow-up to this post: Motion Proposals on Long Term R&D Plans (Please read this post for context).

This project’s objective

In the near future, subnets under heavy load can be split into two subnets via multiple NNS proposals to balance the load. This project is about upgrading the mechanisms for subnet splitting, for example by splitting via a single proposal, or by taking into consideration which canisters together form a single dapp and should therefore remain on the same subnet.

2. Discussion lead

Manu Drijvers

3. How this R&D proposal is different from previous types

Previous motion proposals have revolved around specific features and tended to have clear, finite goals that are delivered and completed. They tended to be measured in days, weeks, or months.

These motion proposals are different and are defining the long-term plan that the foundation will use, e.g., for hiring and organizational build-out. They have the following traits and patterns:

  1. Their scope is years, not weeks or months as in previous NNS motions
  2. They have a broad direction but are active areas of R&D so they do not have an obvious line of execution.
  3. They involve deep research in cryptography, networking, distributed systems, languages, virtual machines, and operating systems.
  4. They are meant to match the areas where the DFINITY foundation’s expertise is strongest.
  5. Work on these proposals will not start immediately.
  6. There will be many follow-up discussions and proposals on each topic when work is underway and smaller milestones and tasks get defined.

An example may be the R&D for “Scalability” where there will be a team investigating and improving the scalability of the IC at various stages. Different bottlenecks will surface and different goals will be met.

3. How this R&D proposal is similar to what we have seen

We want to double down on the behaviors we think have worked well. These include:

  1. Publicly identifying owners of subject areas to engage and discuss their thinking with the community
  2. Providing periodic updates to the community as things evolve, milestones are reached, proposals are needed, etc.
  3. Presenting more and more R&D thinking early and openly.

This has worked well for the last 6 months so we want to repeat this pattern.

4. Next Steps

Developer forum intro posted
1-pager from the discussion lead posted
NNS Motion proposal submitted

5. What we are asking the community

  • Ask questions
  • Read 1-pager
  • Give feedback
  • Vote on the motion proposal

Frankly, we do not expect many nitty-gritty details because these are meant to address projects that go on for long time horizons.

The DFINITY foundation’s only goal is to improve the adoption of the IC. We therefore want to sanity-check the projects we see as necessary for growing the IC by having you, the ICP community, tell us what you think of these active R&D threads.

6. What this means for the existing Roadmap or Projects

In terms of the current roadmap and proposals executed, those are still being worked on and have priority.

An intellectually honest way to look at these long-term R&D projects is to see them as the upstream or “primordial soup” from which more baked projects emerge. With this lens, these proposals are akin to asking, “what kind of specialties or strengths do we want to make sure the DFINITY foundation has built up?”

Most (if not all) projects that the DFINITY foundation has executed or is executing are born from long-running R&D threads. Even when community feedback tells the foundation, “we need X” or “Y does not work”, it is typically the team with the most relevant R&D area that picks up the short-term feature or project.

Please note:

Some folks have asked if they should vote to “reject” any of the Long Term R&D projects as a way to signal prioritization. The answer is simple: “No, please, ACCEPT” :wink:

These long-term R&D projects are the DFINITY foundation’s thesis on the R&D threads it should pursue over the coming years (3 years is the number we sometimes use internally). We are asking the community to ACCEPT (pending the 1-pager and more community feedback, of course). Prioritization can come as a separate step.

Hi all! I’m Manu, I’m the eng manager of the consensus team at the DFINITY foundation. I will soon post an outline for the plan of “subnet splitting”. Any questions, comments, or suggestions to improve the plan are very welcome! I look forward to the discussion.

Subnet splitting (one pager)

Objective

The Internet Computer is designed to have unbounded capacity by scaling out to different subnet blockchains. Each subnet, however, has a limited capacity:

  • There is a bound on how large the replicated state (the combined state of all canister smart contracts on the subnet) can grow
  • There is a limit on the number of canister smart contracts that can be installed on the subnet
  • Every subnet has one blockchain, and blocks are of bounded size, so the bandwidth for accepting updates is bounded
  • Every replica of a subnet has to process all update calls that reach the subnet, so the subnet has bounded processing capacity

The load on the different subnets varies wildly. For instance, subnet jtdsg has a replicated state size of 170 GB, while most other subnets hold less than 10 GB. As it stands today, there is no convenient way to balance load between subnets.

We propose to address this issue by introducing a new NNS proposal which would “split” a subnet into two subnets. The replicas are divided into two groups, each of which will become a separate subnet. The canisters are distributed over the two subnets in a similar fashion, such that each of the two “child” subnets carries only about half of the load that the parent subnet had before the split. Since all replicas already have the state of all canisters that were on the subnet, no slow transfer of state is required and the subnet downtime due to splitting should be minimal.
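
As a rough illustration of how a proposer might pick the two canister groups, here is a minimal sketch of a greedy split by load. It is not part of the proposed design: the function name, the canister ids, and the per-canister load figures are all hypothetical, and the proposal itself leaves the choice of distribution to whoever submits the split.

```python
from typing import Dict, List, Tuple


def split_canisters(load_by_canister: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Greedily assign canisters to two groups with roughly equal total load."""
    group_a: List[str] = []
    group_b: List[str] = []
    load_a = 0.0
    load_b = 0.0
    # Heaviest canisters first; ties are broken by canister id for determinism.
    ordered = sorted(load_by_canister.items(), key=lambda kv: (-kv[1], kv[0]))
    for canister_id, load in ordered:
        # Each canister goes to whichever group is currently lighter.
        if load_a <= load_b:
            group_a.append(canister_id)
            load_a += load
        else:
            group_b.append(canister_id)
            load_b += load
    return group_a, group_b


# Hypothetical canister ids and state sizes in GB (illustrative only).
example = {"c1": 60.0, "c2": 50.0, "c3": 40.0, "c4": 20.0}
child_a, child_b = split_canisters(example)
print(child_a, child_b)
```

A real split would likely also have to keep canisters of the same dapp together, which this sketch does not attempt.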

Both child subnets are half as large as the original subnet, which means they could be less decentralized / secure. To address this, it seems prudent to only split large subnets. A small subnet could be split by first adding new replicas to it.
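
To make the size concern concrete, here is a minimal sketch of a check one could run before choosing a split. It rests on assumptions that are not part of the proposal: each child subnet of size m is taken to tolerate at most ⌊(m − 1)/3⌋ faulty nodes, a certification counts as threshold-signed once strictly more than ⅔ of the parent's nodes have signed it, and all function names are hypothetical. Under those assumptions the sketch computes a worst-case lower bound on how many honest signers of a parent certification sit inside a given child subnet.

```python
def max_faulty(n: int) -> int:
    """Largest f with n >= 3f + 1 (assumed fault bound for a subnet of n nodes)."""
    return (n - 1) // 3


def certification_threshold(n: int) -> int:
    """Smallest number of signers that is strictly more than 2/3 of n."""
    return (2 * n) // 3 + 1


def honest_signers_guaranteed(parent_size: int, child_size: int) -> int:
    """Worst-case number of honest certification signers inside one child subnet."""
    signers = certification_threshold(parent_size)
    outside_child = parent_size - child_size
    # Worst case: every node outside the child signed the certification ...
    signers_in_child = max(0, signers - outside_child)
    # ... and every faulty node inside the child is among the remaining signers.
    return max(0, signers_in_child - max_faulty(child_size))


print(honest_signers_guaranteed(26, 13))
print(honest_signers_guaranteed(26, 8))
```

Under these assumptions, splitting a 26-node parent into two 13-node children leaves at least one honest signer in each child, whereas carving an 8-node child out of the same parent gives no such guarantee.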

Why this is important

If there is too much load on a single subnet, the canister smart contracts on that subnet will suffer from degraded performance. This has already happened: subnet pjljw has been under significant computational load multiple times, which led to all canisters on that subnet experiencing higher latencies.

Outline of proposed technical solution

To reduce the implementation effort, the following design relies strongly on existing mechanisms such as replica upgrades. The overall design is a trade-off between canister downtime and ease of implementation; possible improvements that can be taken up at a later stage are mentioned in the text.

The following steps describe the procedure to split a parent subnet A into two equal-size child subnets A and B. Even though one of the child subnets inherits the identifier of the parent subnet, both subnets will operate under new threshold keys after the split.

  1. Expand the parent subnet: Nodes are added to the parent subnet A using existing NNS proposals until it reaches at least twice the regular size for this type of subnet. The exact number of nodes will be determined such that a >⅔ threshold-signed certification in the expanded parent subnet guarantees that at least one honest node in each of the child subnets has a full copy of the state.
    One or more NNS proposals will add new nodes to subnet A. Once accepted, the new nodes will fetch the full state from existing nodes as usual.
  2. NNS proposal to split subnet: A new NNS proposal type will be created, where the proposal describes which nodes and which canisters are moved to which of the two child subnets after the split. Once accepted, the registry will mark the subnet with a new “splitting” flag that temporarily prevents concurrent changes to the subnet (e.g., moving nodes or canisters in and out of the subnet) until after the split.
  3. DKGs for child subnets: The NNS subnet performs DKGs to generate fresh threshold keys for both child subnets A and B, assigning key shares to the nodes assigned to A and B.
  4. Parent subnet stops and creates final CUP: Parent subnet A stops processing update calls and creates a final catch-up package (CUP), as is currently done before a replica upgrade. Note that the parent subnet continues to process query calls, so that canisters are essentially running in “read-only mode”.
  5. NNS obtains the final CUP of the parent subnet. After the splitting proposal is executed, the NNS will look to obtain the final CUP of the parent subnet. Once it obtains this CUP, the NNS will execute the split and update the registry to replace the parent subnet with the two child subnets. For each of the child subnets, the NNS constructs a “genesis” CUP in the registry, instructing the subnet from which state to start. This genesis CUP contains the state from the final CUP from the parent subnet, meaning that the child subnet continues from the final pre-split state, and it contains the newly generated threshold key material. Additionally, the “routing table” (that maps canister ids to subnets) is updated to split the canisters over the two child subnets.
    There are multiple approaches to how the NNS can obtain the final CUP of the parent subnet. Ideally, it securely fetches the CUP itself from the parent subnet, but a simpler intermediate solution might be to deliver this CUP via a second NNS proposal that voters can verify.
  6. Child subnets A and B restart: All replicas in the child subnets restart, without erasing their execution state (as they would during an upgrade), from the Genesis CUPs found in the registry. All replicas purge the state information of canisters that are not assigned to their subnet.
    Special care needs to be taken with in-transit cross-subnet messages and responses from and to canisters that moved to child subnet B. Messages that were in the outgoing streams of these canisters at the moment the parent subnet was stopped continue to be offered by child subnet A. Messages and responses on incoming streams, however, will be met with a new REJECT signal, until the sending subnet updates its routing tables to child subnet B. To ensure ordering guarantees, subnet B initially runs canisters in “starting state”, meaning that all open call contexts for a canister have to be closed before it can accept new calls. Once all call contexts of a canister are closed, the canister can transition into the running state and continue processing new calls as usual.
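
The routing and restart behaviour in steps 5 and 6 can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class and function names are hypothetical, subnets and canisters are plain strings, and streams, signatures, and the registry are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Canister:
    canister_id: str
    open_call_contexts: int

    @property
    def status(self) -> str:
        # A moved canister stays in "starting" until all open call contexts are closed.
        return "running" if self.open_call_contexts == 0 else "starting"

    def close_call_context(self) -> None:
        """Close one open call context, if any remain."""
        if self.open_call_contexts > 0:
            self.open_call_contexts -= 1

    def accepts_new_calls(self) -> bool:
        return self.status == "running"


def split_routing_table(routing: Dict[str, str], moved: Set[str], child_b: str) -> Dict[str, str]:
    """Return a new routing table in which the moved canisters map to child subnet B."""
    return {cid: (child_b if cid in moved else subnet) for cid, subnet in routing.items()}


def deliver(receiving_subnet: str, target: str, current_routing: Dict[str, str]) -> str:
    """Accept a message if the receiving subnet hosts the target, otherwise signal REJECT."""
    return "ACCEPT" if current_routing.get(target) == receiving_subnet else "REJECT"


# Hypothetical pre-split routing table: all three canisters live on subnet A.
before = {"c1": "A", "c2": "A", "c3": "A"}
after = split_routing_table(before, moved={"c3"}, child_b="B")

# A sender still using the old table addresses c3 on subnet A and is rejected;
# once it has picked up the new table it addresses subnet B instead.
print(deliver("A", "c3", after))
print(deliver("B", "c3", after))

# c3 had two open call contexts when the parent stopped.
c3 = Canister("c3", open_call_contexts=2)
c3.close_call_context()
c3.close_call_context()
print(c3.status)
```

The sketch treats the REJECT signal as a simple return value; in the design above it is a signal back to the sending subnet, which then retries once its routing table points to child subnet B.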

Discussion leads

The motion proposal is driven by @derlerd-dfinity1, @gregory, and @Manu. Other team members will also be available for discussion.

Skills and Expertise necessary to accomplish this

Being able to split subnets is clearly a broad R&D effort. Despite the high-level design presented above, we expect that many open questions will need to be answered, which will require the involvement of many teams across the IC stack. It also requires broad input from the community, so that we end up with a solution that is usable for both canister developers and end users and that meets the community’s expectations.

What we are asking the community

  • Review comments, ask questions, give feedback
  • Vote accept or reject on NNS Motion
  • Participate in technical discussions as the motion moves forward

NNS Motion is live: Internet Computer Network Status