TL;DR
This post explores metrics and models for node decentralization and presents strategies for future node onboarding and offboarding decisions.
- Metrics: A recap of key metrics for node decentralization, in particular the Nakamoto coefficient.
- Modeling: Introduction of an optimization model designed to reach decentralization targets with a minimum node count.
- Community involvement: After incorporating community feedback we aim to proceed with a motion proposal for NNS agreed rules for node onboarding.
For an introduction and additional background, please refer to the previous post on node diversification.
Recap of problem statement
The Internet Computer (IC) operates on physical node machines and achieves decentralization by partnering with multiple independent node providers. This decentralization reduces risks of central control, failures, and censorship. However, there is a trade-off: a high degree of decentralization comes at a cost in the form of node provider rewards to account for hardware, maintenance, and operations.
In order to strike the right balance between network size & cost and the degree of decentralization, the IC needs to tackle two aspects
- Establishing a Target Topology: This involves defining the number of subnets and their respective sizes, aligning with anticipated future demand. It also involves setting decentralization targets.
- Optimization: Given a target topology, optimize between node rewards (onboarding of additional new nodes and rewards for existing nodes) and decentralization.
This blog post explores metrics and models for assessing node decentralization, along with suggested strategies for future node onboarding and offboarding decisions.
Metrics of decentralization of Nodes in a Subnet
Definition of the Nakamoto Coefficient
The Nakamoto Coefficient traditionally signifies the smallest number of entities required to control 51% of resources in a decentralized system. It provides a measure of how centralized or decentralized control is within that system.
In the context of IC subnets, an adjusted definition is more meaningful. According to the IC consensus protocol, if a third (or more) of the nodes behave maliciously, they can stall a subnet. Hence, we define the Nakamoto Coefficient as the smallest subset of entities corresponding to a specific characteristic (such as node provider, data center, country, etc.) that collectively control at least a third of the nodes in the subnet. Taking the ānode providerā characteristic as an example, it denotes the fewest number of node providers such that their cumulative node count is at least one-third of the subnet size.
More formally we can define the Nakamoto coefficient as follows
Given:
N
is the total number of nodes in a subnet.Ļ
is a characteristic of the node (e.g., node provider, country, data center, etc.).Ī¾(Ļ)
denotes the number of nodes with the characteristicĻ
Then the Nakamoto Coefficient for the characteristic Ļ
is
NC_Ļ(N) = |P|
where P
is the smallest subset of unique Ļ
values such that:
sum(Ī¾(Ļ_i) for Ļ_i in P) ā„ N/3
Alternative Metric: Subnet Limit
A more straightforward (but also less precise) metric for node decentralization is the subnet limit, denoting the maximum instances a specific characteristic can appear within a subnet. For instance, if the subnet limit for the ānode providerā characteristic is 1, each node provider within the subnet is unique.
Connection between Nakamoto Coefficient and Subnet Limit
When each nodeās characteristic is unique within the subnet, i.e. subnet limit = 1, the Nakamoto Coefficient NC_Ļ(N) is ceil(N/3), where ceil(x)
is the ceiling function, which rounds x
up to the nearest integer. This value represents the maximum Nakamoto Coefficient achievable. In the context of a 13-node subnet, this coefficient becomes ceil(13/3) = 5.
If each characteristic can appear up to twice in a given subnet, i.e. subnet limit = 2, the Nakamoto Coefficient will be reduced. In this case Nakamoto Coefficient NC_Ļ(N) is ceil(N/(2*3)). Taking the 13-node subnet as an example again, the Nakamoto Coefficient becomes ceil(13/6) = 3.
Modeling approach
In this section, we outline a mathematical model for optimizing between node rewards (including onboarding new nodes and compensating existing ones) and network decentralization. We employ linear optimization for this purpose, describing the model input, applied constraints, and objective functions. Model results will be discussed in a later section.
Optimization Strategy
Starting with a specified decentralization goal for the network, this method calculates the least number of new nodes or the minimal node rewards necessary to meet this goal.
Model Input
- Available Nodes and Their Characteristics: Represents available resources. Every node possesses instances of characteristics influencing its allocation. Nodes can either be existing or potential additions (e.g., from a future node provider).
- Target Topology: Describes the desired network structure, detailing the count and size of subnets. For every characteristic, it specifies a decentralization goal, which could be the target Nakamoto coefficient or a subnet limit.
Formulation of Constraints
- Nodes to Subnet Allocation Constraints: Create a matrix mapping nodes to subnets, adhering to:
- Uniqueness: Each node can be linked to just one subnet.
- Subnet Fill: Each subnet must have its designated number of nodes.
- Characteristic Constraints: For every characteristic (e.g. node provider), construct a ācharacteristic matrixā that maps the instances of the characteristic to subnets. The matrix values are derived from the node-to-subnet mapping. This matrix must respect:
- Subnet Limit: If a subnet limit is set, ensure that the entries of the characteristic matrix are bounded by the subnet net limit (e.g. individual node providers can appear only twice within a subnet).
- Nakamoto Coefficient: If a Nakamoto coefficient target NC (e.g. five) is set, ensure that every possible subset of instances of this characteristic, with sizes up to and including NC, should have control over less than one-third of the nodes within the subnet (e.g. all subsets with up to five node providers should not control a third of the nodes). Note: This introduces numerous constraints.
Objective Function
The objective is to minimize the required nodes (or node rewards) to achieve the given target.
Suggested Process for Applying the Model
We propose utilizing the previously presented model framework for making informed decisions regarding future node onboarding as follows:
Establishing a Target Topology
Context
- Target Subnet structure: This involves determining the number and respective sizes of subnets, aligned with anticipated future demand.
- Decentralization Targets: Per subnet, decentralization targets should be set, utilizing either a Nakamoto coefficient or a subnet limit. The node topology matrix, described in the previous forum post, assists in evaluating achievable targets.
We propose that the NNS agrees on a single target topology at any given time. Various decisions, such as whether to onboard additional nodes, can be derived from this agreed-upon topology. As the IC evolves, updated target topologies could be proposed to the NNS, ensuring continual alignment with the networkās development and needs.
Optimizing Node Allocation
Utilizing the defined target topology, the model can determine the minimal number of nodes, or alternatively, the minimal amount of rewards required for achieving the target topology.
Deciding on Node Candidates
Utilizing the model, the following can be analyzed given a set of current nodes and node candidates:
- Effectiveness of Candidate Nodes: Can node candidates 1:1 reduce the number of additional nodes needed?
- Node Relevance of Existing Nodes: Which existing nodes are not utilized to achieve the decentralization target (and thus could potentially be offboarded) ?
Modeling Results
Utilized Model Inputs
Suggested Target Subnet Structure
The following table specifies the type, number, and size of anticipated subnets. The column labeled āSEVā indicates whether the subnet is designated to run on generation 2 SEV machines, enhancing protection against malicious actors. This table serves as a suggestion for the Subnet Target Structure for the next 6-12 months. The number of subnets is based on current and anticipated demand. There are some special subnets that are dedicated to specific use cases, e.g. ECDSA signing. The sizes of the subnets were chosen depending on the sensitivity of the services/dapps running on them, e.g. the NNS subnet has the highest sensitivity and is thus proposed to be larger than app subnets.
Subnet type | # Subnets | # Nodes | Total | SEV |
---|---|---|---|---|
NNS | 1 | 43 | 43 | no |
SNS | 1 | 34 | 34 | no |
Fiduciary (used as ECDSA signing today) | 1 | 28 | 28 | no |
II | 1 | 28 | 28 | yes |
ECDSA signing (new) | 1 | 28 | 28 | yes |
ECDSA backup (new) | 1 | 28 | 28 | yes |
Bitcoin canister | 1 | 13 | 13 | no |
European Subnet (new) | 1 | 13 | 13 | yes |
Swiss Subnet (new) | 1 | 13 | 13 | yes |
Application Subnet | 31 | 13 | 403 | no |
Reserve nodes | 120 | |||
Total | 751 |
Note: The optimization model does not yet consider the adjusted country constraints required for the planned European and Swiss subnet.
Current nodes
The current node set, extracted from the IC dashboard as of September 7, 2023, totals 1151 nodes, with 84 of these being SEV nodes.
Candidate nodes
The analysis will incorporate candidate nodes from new countries, data centers, and data center providers. We assume that for every new country, there are 20 candidate nodes (5 node providers x 4 nodes per node provider), and that these candidate nodes are generation 2 SEV machines.
Proposed Decentralization Targets for further Analysis
From the previous blog post, we know that decentralization, considered on a standalone basis, is well-achieved for data centers and data center providers and near optimal for node providers. For the country characteristic the current node set shows a lower degree of decentralization.
In light of this, we propose the following decentralization targets for further analysis, each of which assumes compliance with the SEV requirement:
- Decentralization Target 1: A subnet limit of 3 for all characteristics.
- Decentralization Target 2: A subnet limit of 2 for all characteristics.
- Decentralization Target 3: A subnet limit of 2 for country characteristic and 1 for other characteristics.
- Decentralization Target 4: A stringent subnet limit of 1 for all characteristics.
Optimizing Node Allocation
Below, we share the results of applying the optimization model to the targets outlined previously. In this analysis, we focus on minimizing the node count of candidate nodes. In a second step, this could be enhanced to minimize node rewards instead.
Decentralization Target 1
The following bar chart illustrates node allocation per subnet, focusing on the country characteristic. Each bar corresponds to a subnet, sorted by size in descending order from left to right. Segments on a bar, sharing the same color, represent nodes from the same country, delineating current (blue) and candidate (red) nodes from a presumed 12 new countries. The number atop a bar reflects the Nakamoto coefficient of the subnet with respect to the considered characteristic (here country).
Decentralization Target 2
For the stricter decentralization target 2, enforcing a subnet limit = 2 for all characteristics, additional nodes are necessary. A total of 84 candidate nodes are required. We observe that candidate nodes are mainly assigned to the larger subnets on the left (NNS and SNS subnet), as well as to the SEV subnets (subnet number 4,5,6, 8,9).
Decentralization Target 3
For Target 3, 120 candidate nodes are required. The following table illustrates the allocation to subnets, this time using the characteristic node provider. In line with the target, we observe maximum decentralization only featuring unique node provider instances per subnet.
Decentralization Target 4
Achieving maximal decentralization across all characteristics necessitates 210 candidate nodes, sourced from 28 new countries.
Preliminary Conclusion
Relying on the suggested target topology and the provided model framework, our initial conclusions are as follows:
-
Overall Node count: Utilizing the Target Subnet Structure, there are approximately 400 excess nodes, with a current total of 1150 nodes compared to the 750-node target. The excess applies in particular to Generation 1 nodes.
-
Generation 2 nodes: There remains a need to introduce additional Generation 2 SEV nodes to fill the SEV subnets (e.g. ECDSA signing subnet).
-
Decentralization target:
- The node provider, data center, and data center provider characteristics, which already showcase a higher degree of decentralization, could adhere to stricter targets, demanding maximum decentralization (subnet limit = 1).
- Currently exhibiting the least decentralization, it seems logical to, at least initially, set a comparatively weaker decentralization target for the country characteristic. This would permit the same country to appear up to twice in any given subnet (country subnet limit = 2).
- To achieve this target (referred to as Target 3 above), an addition of 120 Generation 2 nodes would be necessary.
-
Node provider bound: As a consequence of the proposed decentralization target, no single node provider should manage more nodes than the number of subnets (40). Taking into account that there are typically 14 nodes in a rack, we propose a slight adjustment to this criterion: No node provider should control more than 42 (3*14) nodes.
Suggested next steps
In the journey towards refining and implementing models for node decentralization, we suggest a progressive path with the following steps:
Inviting Feedback
- Community Participation: We invite community members to offer feedback on the model for assessing node decentralization and the suggested strategies for future node onboarding and offboarding.
- Model Sharing: We aim to share the prototype used to run the model and create the visuals showcased in this blog, providing a tangible basis for feedback and collaborative refinement.
Motion Proposal
After consideration of the feedback, we plan to submit motion proposals that articulate
- The principles and model guiding future decisions on node onboarding and offboarding.
- A target topology for the next 6-12 months.
Further Refinements
- Security Review: Should the decision be made to progress with the presented model framework, a security review of the prototype implementation will be required to ensure its robustness and reliability.
- Strategic Enhancement: Beyond its initial capabilities, the model could be enriched to recommend strategies for re-assigning nodes effectively transitioning from the current to the target subnet allocation.