The Internet Computer (IC) operates on physical node machines, commonly referred to as “nodes”. To achieve robust decentralization, these nodes are operated by different node providers, in multiple data centers, by different data center providers, and in many countries. At present:
- Decentralization is well-achieved in “data center” and “data center provider” characteristics.
- “Node provider” decentralization is close to optimal, although some providers operate more nodes than there are subnets.
- There is an overrepresentation of nodes in the US and Switzerland when considering the “country” characteristic.
This post offers a preliminary assessment of node decentralization focusing on individual node characteristics. Subsequent posts will explore more comprehensive metrics and models for assessing node decentralization, along with suggested strategies for future node onboarding and offboarding decisions.
The IC runs on physical node machines. By engaging with numerous, independent node providers, the IC ensures robust decentralization. Such an approach mitigates risks associated with single points of failure, centralized control, and potential censorship.
While a broad diversification of nodes enhances decentralization, it is vital to recognize the associated costs. Node providers face investment costs for hardware procurement, operational costs for running their node machines (in particular rental of data center rack space and bandwidth capacity), and ongoing maintenance costs (in particular replacement of failed hardware parts). In return for their contributions, the IC compensates them with node provider rewards, which are minted as ICP tokens every month.
To strike the right balance between network size and cost on the one hand and the degree of decentralization on the other, the IC needs to tackle two aspects:
- Establishing a Target Topology: This involves defining the number of subnets and their respective sizes, aligning with anticipated future demand.
- Optimization: Given a target topology, this involves balancing node rewards (for onboarding additional nodes and for existing nodes) against decentralization.
In this blog series, we will delve deeper into the optimization aspect, addressing the following key questions along the way:
- Current Status: Where do we currently stand in terms of node decentralization?
- Metrics: Which metrics effectively capture decentralization across various characteristics?
- Models: How can we model the trade-off between node rewards and decentralization?
- Operational strategy: With an established topology and set decentralization goals, how can we apply these models for future node onboarding and offboarding decisions?
In this first post, we will provide background knowledge and offer an initial evaluation of node decentralization. Our subsequent post will cover a detailed exploration of the metrics and models vital to understanding node decentralization.
The Internet Computer runs on physical node machines, the foundational building blocks of the IC. These machines execute the protocol and manage user data and computation.
These node machines are high specification servers, standardized for optimal performance. They are distributed across independent data centers worldwide. Currently, the IC operates on more than 1200 physical nodes.
A node provider is an entity that operates nodes within the IC network. Node providers are responsible for the physical hardware, connectivity, and overall maintenance of the nodes they operate.
To safeguard the network’s decentralization, the onboarding of node providers is regulated by the NNS, the network’s governing DAO. To become a node provider, one must submit a proposal accompanied by a self-declaration document that states the provision of node machines and includes proof of identity. Subsequently, the community evaluates and votes on the onboarding proposal.
The node matrix below presents a snapshot of all nodes contributing to the IC as of September 7, 2023, grouped row-wise by their node provider, in ascending order from top to bottom. For providers with over 50 nodes, only the first 50 are displayed, with the actual count indicated to the right of the bar. Nodes that are maintained by specific service providers (henceforth referred to as “Aggregators”), and not directly by the node providers, are counted under rows specific to each Aggregator.
Node provider rewards
Node providers are incentivized to operate nodes through economic rewards. Specifically, they earn ICP tokens in return for their participation in the network.
Given that node providers incur costs in fiat currencies (both for investment and operations), the reward structure is pegged to a basket of fiat currencies called XDR (Special Drawing Rights). This works as follows:
- Rewards are initially determined in terms of XDR, derived from three primary factors:
  - The generation of hardware (Gen 1 or Gen 2).
  - The geographic location of the node and corresponding operating costs.
  - The total number of nodes being operated by the provider.
- Once the XDR-based rewards are determined, they are converted to ICP tokens using a 30-day average ICP/XDR exchange rate.
- These rewards are then disbursed to node providers on a monthly basis.
For a more comprehensive breakdown on node provider rewards, see here.
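The conversion step described above can be illustrated with a minimal sketch. It assumes the exchange rate is expressed as XDR per ICP and that one rate per day is available; the function name, its inputs, and the numbers are hypothetical and not taken from the IC implementation. How the XDR amount itself follows from hardware generation, location, and node count is not modelled here.

```python
from statistics import mean


def monthly_reward_icp(reward_xdr: float, daily_xdr_per_icp: list[float]) -> float:
    """Convert an XDR-denominated monthly reward into ICP.

    daily_xdr_per_icp is assumed to hold the XDR price of one ICP for each
    of the last 30 days (a hypothetical input for illustration).
    """
    if len(daily_xdr_per_icp) != 30:
        raise ValueError("expected exactly 30 daily rates")
    avg_rate = mean(daily_xdr_per_icp)  # 30-day average, XDR per ICP
    return reward_xdr / avg_rate


# Made-up numbers: a 1000 XDR reward with ICP at a flat 2.5 XDR.
print(monthly_reward_icp(1000.0, [2.5] * 30))  # 400.0
```

Averaging over 30 days dampens the effect of short-term price swings on the number of tokens minted in a given month.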
A subnet is a collection of nodes that run their own instance of the consensus algorithm to produce a subnet blockchain that interacts with other subnets of the IC. Subnets play a pivotal role in striking the balance between scalability and security:
- Scalability: Not all nodes can accommodate every application.
- Security: Applications must run on enough nodes to guarantee data integrity and ensure uninterrupted uptime, even in the face of individual node failures or malicious activity.
The IC hosts various subnet types, each distinguished by properties (notably, the level of decentralization) making them suitable for specific applications:
- System subnets: These subnets are reserved for canisters that are an integral part of the IC protocol. Canisters on these subnets do not pay cycles. Only the NNS can deploy canisters on those subnets. Examples: NNS subnet (tdb26), currently consisting of 40 nodes, II subnet (uzr34) currently consisting of 28 nodes.
- Application subnets: These subnets are user-accessible for canister deployment. Typically, they encompass 13 nodes, and canisters here expend cycles. If users do not specify requirements, the system randomly selects an application subnet for canister creation.
- Further special subnet types: Beyond the basic system and application subnets, the IC also has specialized subnets tailored for certain dapps. These can offer enhanced features like increased replication. Examples: The Fiduciary subnet (pzp6e) is an enlarged application subnet with 28 nodes. The SNS application subnet (x33ed), catering to SNS DAOs, has 34 nodes.
The following subnet matrix shows the number of required node slots of the current subnets on the IC, sorted by size from left to right. The reason for the minimalistic presentation will become clear in the next section. Please note that the subnet matrix could also be generated with respect to a target topology (number of subnets and their respective sizes) instead of the current topology.
We are now in a position to provide a preliminary evaluation of node decentralization on the IC, focusing on four distinct characteristics: node provider, data center, data center provider, and country.
For optimal decentralization, each subnet would only have unique representations of each characteristic. For example, within any given subnet, each node provider should be represented only once.
To visualize this, consider a matrix, which we call the “node topology matrix”, where:
- Tiles in rows represent nodes sharing a certain characteristic (e.g., nodes from the same provider).
- Crosses in columns signify the required slots within a subnet.
- A cross overlaying a tile indicates that a node (from that row’s characteristic) is mapped to the subnet denoted by that column.
Under this setup, each row (that is, each value of the characteristic, such as a single node provider) can contribute at most one node to each subnet. We require that rows are arranged in ascending order from top to bottom, while subnet columns are arranged in descending order of size from left to right.
Tying back to the preceding section, this matrix emerges by superimposing the subnet matrix over the node matrix. This will be illustrated in the subsequent sections.
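One way to picture how such a matrix could be filled is the greedy sketch below. It is an illustrative heuristic under stated assumptions, not necessarily the procedure used to produce the figures in this post: subnets are processed from largest to smallest, and each subnet takes one node from the rows that still have the most unassigned nodes. The function name and the toy data are hypothetical.

```python
def assign_slots(nodes_per_row: dict[str, int], subnet_sizes: list[int]):
    """Greedily map nodes to subnet slots, at most one node per row per subnet.

    Returns (assignment, unfilled): assignment[j] lists the rows used by the
    j-th largest subnet, and unfilled[j] counts the slots left empty in it.
    """
    remaining = dict(nodes_per_row)  # unassigned nodes left in each row
    assignment, unfilled = [], []
    for size in sorted(subnet_sizes, reverse=True):
        # Prefer rows with the most unassigned nodes; break ties by name.
        candidates = sorted(
            (row for row in remaining if remaining[row] > 0),
            key=lambda row: (-remaining[row], row),
        )
        chosen = candidates[:size]
        for row in chosen:
            remaining[row] -= 1
        assignment.append(chosen)
        unfilled.append(size - len(chosen))  # crosses that overlay no tile
    return assignment, unfilled


# Toy data: three providers and two subnets with 3 and 2 slots.
rows = {"P1": 3, "P2": 1, "P3": 1}
assignment, unfilled = assign_slots(rows, [3, 2])
print(assignment)  # [['P1', 'P2', 'P3'], ['P1']]
print(unfilled)    # [0, 1]
```

In the toy run, the second subnet keeps one empty slot even though provider P1 still has a spare node, because using it would place P1 twice in the same subnet.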
The matrix depicted below showcases the node topology for the “node provider” characteristic.
From the matrix, we can infer that near-optimal decentralization is achievable. Notably, when mandating a unique node provider representation in each subnet, only 9 subnet slots, visible as crosses not overlaying tiles, remain vacant in the two largest subnets on the left.
Additionally, it becomes clear that several node providers have a disproportionately high number of nodes compared to the total subnets available. For achieving optimal diversity across node providers, providers should not operate more nodes than the number of (planned) subnets. Otherwise, the usage of surplus nodes would inevitably breach the requirement for unique provider representation in each subnet.
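The rule of thumb above amounts to simple arithmetic, sketched here with made-up numbers. The function name, the provider labels, and the subnet count are hypothetical and chosen only for illustration.

```python
def surplus_nodes(nodes_per_provider: dict[str, int], num_subnets: int) -> dict[str, int]:
    """Return, per provider, the nodes that cannot be placed without
    putting the same provider twice into some subnet."""
    return {
        provider: count - num_subnets
        for provider, count in nodes_per_provider.items()
        if count > num_subnets
    }


# Toy data: 5 (planned) subnets and three providers.
providers = {"P1": 7, "P2": 5, "P3": 2}
print(surplus_nodes(providers, 5))  # {'P1': 2}
```

A provider with exactly as many nodes as there are subnets (P2 here) can still place one node in every subnet, so only nodes beyond that count are surplus.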
The matrix below illustrates the node topology specific to the “data center” characteristic.
From this visualization, it is evident that, in terms of the “data center” characteristic, the IC has achieved full decentralization. Notably, even the largest subnet, comprising 40 nodes, can be filled while using each data center only once.
It is worth pointing out that the crosses in the bottom right, which do not overlay tiles, do not present a decentralization concern. These available slots can readily be populated using unassigned nodes (tiles without crosses) from the upper rows without compromising the decentralization objective.
Displayed next is the matrix showing the node topology for the “data center provider” characteristic.
Similar observations as for the “data center” characteristic are applicable here. The IC is already well diversified when considering the data center providers.
The matrix below provides insight into the node topology based on the “country” characteristic.
A stark contrast is evident when compared to previous characteristics.
Currently, nodes are placed across only 15 different countries. Given this limitation, it is infeasible to fill the large subnets on the left while insisting each country is represented only once within a subnet. This challenge is highlighted by the 50+ crosses in the top left that do not overlay tiles.
Even among some of the application subnets with 13 nodes, there is a scarcity of diverse nodes. This shortfall is further emphasized by the 50+ crosses in the middle that, again, do not overlay tiles.
To address this, we could moderate our decentralization standards, proposing that a specific country should, at most, be represented twice (instead of once) within a subnet. Implementing this modification would produce the subsequent matrix, where we permit double rows for each country.
The softer target appears more attainable. Only 13 subnet slots in the top left, again illustrated as crosses not overlaying tiles, remain unoccupied under this criterion. The crosses in the middle that do not overlay tiles do not present a decentralization concern. They can be populated using unassigned nodes (tiles without crosses) from the upper rows, without compromising the decentralization objective.
Furthermore, it is clear from that picture that even under this weaker decentralization target, we have far too many nodes in the US and Switzerland.
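The effect of relaxing the per-country limit can be checked with a simple lower bound for a single subnet. The sketch below only uses the number of countries and ignores how many nodes each country actually has, so real shortfalls can be larger and the totals in the matrices above are not reproduced by it. The function name is hypothetical.

```python
def min_unfilled(subnet_size: int, num_countries: int, cap: int) -> int:
    """Lower bound on empty slots in one subnet when each country may
    appear at most `cap` times."""
    return max(0, subnet_size - cap * num_countries)


# 15 countries, a 40-node subnet and a 28-node subnet.
for cap in (1, 2):
    print(cap, min_unfilled(40, 15, cap), min_unfilled(28, 15, cap))
# 1 25 13
# 2 10 0
```

With 15 countries, allowing each country twice raises the capacity of a subnet from 15 to 30 distinct placements, which is why the softer target leaves far fewer slots empty in the largest subnets.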
Based on the presented analysis, the following summary observations can be drawn regarding decentralization across different node characteristics on the IC:
- For the characteristics “data center” and “data center provider,” the nodes on the IC show robust diversification.
- Regarding the “node provider” characteristic, we are close to optimal decentralization. For optimal diversification, it is essential that providers do not operate more nodes than the total number of (planned) subnets. Presently, a few providers exceed this node count.
- Concerning the “country” characteristic, our decentralization aims may need modification. Instead of ensuring each country is represented only once in a subnet, we might consider allowing a country to be represented up to twice. Even under this weaker target, we observe that there is an overrepresentation of nodes from the US and Switzerland.
This blog post has concentrated on assessing individual node characteristics. The subsequent post will introduce a formal method for measuring decentralization and models that evaluate the combined diversification of these characteristics. Additionally, we will provide suggestions for an operational strategy on how these models can be applied to future node onboarding and offboarding decisions.