Enhancing Network Decentralization - Proposals for Identity Verification and Subnet Allocation

Introduction

During this quarter, several forum discussions have addressed network security and decentralization, raising valid concerns about node provider governance and the subnet allocation scheme. As indicated earlier, we have taken an action item to review how best to capture connections between node providers and account for them in subnet allocations, taking into account the constructive feedback received in the current discussion. That review is the scope of this forum post. Additional considerations for enhancing network decentralization will be shared separately.

Suggested Enhancements

Identity Verification

Background: At ICP genesis, DFINITY conducted a KYC process for Generation 1 node providers, requiring identity verification. For Generation 2 node providers, the process shifted to a self-declaration model where node providers provided identity documents. This raised security concerns, e.g., because such documents can be manipulated relatively easily. Discussions about enhancing identity verification were held in the forum, but progress was slow, for example due to privacy concerns.

Goal: Establish a robust identity verification process for node providers that defines clear protocols for both initial identity verification and the frequency of subsequent re-verifications.

Suggested Measures:

  • Engage several recognized third-party KYC service providers, selected by the NNS, for verifying the legal existence of node providers. The KYC service will issue certifications that node providers can use during both onboarding and periodic re-verification processes. All node providers, existing and new ones, would be in scope of this process.
  • To address privacy concerns, personal and corporate KYC data (e.g. passport information) would be securely stored at the KYC providers.
  • Enhance the on-chain node provider records by incorporating a field that indicates when the last KYC has been completed, and include a link to the corresponding certification.
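To make the last measure concrete, here is a minimal sketch of what an extended node provider record could look like. All field names, the example URL, and the 365-day re-verification interval are assumptions for illustration only; they are not the actual NNS registry schema, and the re-verification frequency is still to be decided.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class NodeProviderRecord:
    # Hypothetical fields; not the actual NNS registry schema.
    provider_id: str
    last_kyc_completed: Optional[date] = None   # date of the last completed KYC
    kyc_certificate_url: Optional[str] = None   # link to the KYC provider's certification


def kyc_is_current(record: NodeProviderRecord, today: date, max_age_days: int = 365) -> bool:
    """Return True if the record has a certification that is not older than max_age_days.

    max_age_days is an assumed placeholder for the re-verification interval.
    """
    if record.last_kyc_completed is None or record.kyc_certificate_url is None:
        return False
    return today - record.last_kyc_completed <= timedelta(days=max_age_days)


# Example with made-up values.
record = NodeProviderRecord("np-1", date(2024, 3, 1), "https://example.org/cert/np-1")
needs_reverification = not kyc_is_current(record, date(2025, 6, 1))
```

Only the date and the link would live on-chain in this sketch; the underlying documents stay with the KYC provider, in line with the privacy measure above.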

Assessment of Independence

Background: Node providers as listed in the NNS registry can be connected. For example, an individual might also be the owner of a company acting as a node provider. There can be legitimate reasons for such arrangements, like certain data centers only accepting companies as clients or tax considerations. However, information about such connections is currently not captured or used in the protocol. Further, there is no systematic process for assessing the independence of node providers.

Goal: Ensure that linked node providers are treated as a single entity in subnet allocations, or confirm the genuine independence of node providers. Genuine independence means that node providers do not share ultimate beneficial owners and are not otherwise closely linked in ways that could undermine the network's decentralization.

Suggested Measures:

  • Establish a clear policy defining what constitutes independence among node providers, including specific examples; see this thread for related discussions. Here we should try to leverage existing compliance frameworks, such as the guidance of the Financial Action Task Force (FATF), to simplify both establishing the policy and executing the assessment. Suggested example criteria are listed below.
  • Require all node providers to disclose their linkages to other node providers in their self-declaration, according to the agreed-upon criteria.
  • Use several recognized third-party services to verify the independence of node providers, ensuring no overlapping ownership or control exists. If you see viable alternatives to using these services (reducing the overall effort), please share your ideas.
  • Implement a process for periodic monitoring of the independence status of node providers.
  • The initial assessment cost could be covered by the NNS, while the cost of subsequent re-verifications could be covered by the node providers as a business expense.

Example Criteria: The FATF provides guidelines that can be adapted to define independence among node providers:

  • Ownership Percentage: A person is considered a beneficial owner if they own more than 25% of an entity.
  • Control Through Management Positions: Individuals in key management positions (e.g., CEO, CFO) that give them control over the entity, even without significant ownership, can be considered beneficial owners.
  • Control via Voting Rights: Having voting rights that allow significant influence over business decisions, irrespective of actual share ownership, indicates control.
  • Control via Family Ties: Direct family relationships where family members may exert control or influence over an entity. Examples of direct family ties are relationships between spouses, parents and children, and siblings.
  • Control via Legal Arrangements: Ownership or influence exerted through trusts, intermediaries, or other legal arrangements that confer control or substantial influence, even if the named owner of shares or rights is another party.

For forming clusters, it is suggested to apply the principle of transitivity: If node provider A is connected to node provider B, and node provider B is connected to node provider C, then node provider A should also be considered connected to node provider C.
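The transitivity principle amounts to computing connected components over the declared links. A minimal sketch using a union-find structure follows; the function name and the input format (a list of provider IDs plus a list of pairwise links) are assumptions for illustration.

```python
from collections import defaultdict


def build_clusters(providers, links):
    """Group providers into clusters, treating links as transitive.

    providers: iterable of provider IDs.
    links: iterable of (a, b) pairs; both IDs must appear in providers.
    Returns a sorted list of sorted clusters.
    """
    parent = {p: p for p in providers}

    def find(x):
        # Follow parent pointers to the cluster root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(set)
    for p in parent:
        groups[find(p)].add(p)
    return sorted(sorted(g) for g in groups.values())


# A-B and B-C are declared; A-C follows by transitivity. D stays on its own.
clusters = build_clusters(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```

In this example `clusters` contains one cluster with A, B and C, and a second cluster with only D.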

Enhance the Algorithmic Subnet Allocation

Background: The NNS utilizes two distinct tools to manage and enhance node decentralization:

  1. Node Target Topology Tool: This tool is designed to set detailed decentralization targets for subnets and determine an optimal node assignment that satisfies these targets.
  2. Decentralized Reliability Engineering (DRE) Tooling for Node Subnet Assignments: The DRE tooling is used for the actual submission of NNS proposals that assign nodes to subnets.

Currently, neither tool incorporates data regarding the connections between node providers. Furthermore, the two tools have the following differences:

  • Local vs Global: The DRE tool takes a local approach (changing one node for another), and approvals are reviewed based on the outcome of such changes. One could do multiple swaps in a row to reach an optimal state, but so far the DRE tool and review process consider only one-for-one changes.
  • Difference in metrics: The node target topology tool is concerned with minimising the number of new nodes needed to meet the decentralisation targets. The DRE tool measures the impact of a particular change on Nakamoto coefficients. Even if two setups (e.g., Subnet Setup A and B) meet the same topological requirements, they may differ in their Nakamoto coefficients, influencing the tool's preference for one setup over another.
  • Additional node data: The DRE tool takes additional data into account, e.g., whether a node is currently healthy.
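To illustrate why connection data matters for this metric, here is a minimal sketch of a Nakamoto coefficient calculation: the smallest number of entities that together control more than a given share of a subnet's nodes. The one-third threshold is an assumption chosen for illustration, and this is not the DRE tool's actual implementation.

```python
from collections import Counter
from fractions import Fraction


def nakamoto_coefficient(owners, threshold=Fraction(1, 3)):
    """Smallest number of entities whose combined nodes exceed threshold * total.

    owners: one entry per node, naming the entity that controls it.
    threshold: assumed share of nodes; Fraction avoids float rounding at the boundary.
    """
    counts = sorted(Counter(owners).values(), reverse=True)
    total = len(owners)
    controlled = 0
    for k, count in enumerate(counts, start=1):
        controlled += count
        if controlled > threshold * total:
            return k
    return len(counts)


# Seven nodes, counted per node provider.
by_provider = ["A", "A", "B", "B", "C", "C", "D"]
# Same nodes, but providers A and B are known to be connected (hypothetical cluster "AB").
cluster_of = {"A": "AB", "B": "AB", "C": "C", "D": "D"}
by_cluster = [cluster_of[p] for p in by_provider]

coefficient_by_provider = nakamoto_coefficient(by_provider)  # 2
coefficient_by_cluster = nakamoto_coefficient(by_cluster)    # 1
```

The same subnet looks less decentralized once connected providers are counted as one entity, which is the effect neither tool can see today.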

Goals: Use the enhanced data on node provider connections in the algorithmic allocation of nodes to subnets. Align the DRE tooling for subnet assignments with the target topology.

Suggested measures:

  • Data Capturing
    • Define and implement appropriate data structures within the NNS to accurately record the connections between node providers.
    • Specifically, introduce an additional field named "cluster" or similar which enables the aggregation of individual node providers into clusters, representing groups of connected providers.
  • Unified Tooling:
    • Develop a unified tool that combines target topology modeling with node subnet assignment. This tool should be open-source, though not necessarily on-chain.
    • Initially, allow the tool to source certain data, such as node provider connections, from external sources. Over time, transition to sourcing all information directly from the NNS.
    • Ensure the tool provides sufficient detail to enable independent verification of subnet assignment proposals by reviewers.
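As a rough sketch of how a reviewer could verify a proposal once a cluster field exists, the check below counts the nodes in a proposed subnet per cluster and reports any cluster above a limit. The function name, the input mappings, and the default limit of one node per cluster are assumptions for illustration, not the behaviour of the existing tools.

```python
from collections import Counter


def cluster_violations(subnet_nodes, node_provider, provider_cluster, max_per_cluster=1):
    """Return {cluster: node_count} for clusters exceeding max_per_cluster in a subnet.

    subnet_nodes: node IDs in the proposed subnet.
    node_provider: maps node ID -> node provider ID.
    provider_cluster: maps provider ID -> cluster ID; providers without an entry
        are treated as their own single-member cluster.
    """
    counts = Counter()
    for node in subnet_nodes:
        provider = node_provider[node]
        cluster = provider_cluster.get(provider, provider)
        counts[cluster] += 1
    return {c: n for c, n in counts.items() if n > max_per_cluster}


# Hypothetical data: P1 and P2 belong to the same cluster "c1".
node_provider = {"n1": "P1", "n2": "P2", "n3": "P3"}
provider_cluster = {"P1": "c1", "P2": "c1"}
violations = cluster_violations(["n1", "n2", "n3"], node_provider, provider_cluster)
```

Here the subnet would pass a per-provider check (one node each), but `violations` flags cluster `c1` with two nodes.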

Next steps

We kindly invite the community to provide feedback on the proposals outlined above, recognizing that there are details yet to be determined, such as the choice of compliance framework for the assessment of independence. We suggest evaluating the process by applying it to publicly available node provider data.

Additionally, we plan to discuss these suggestions during the upcoming node provider working group meeting on Monday, March 24. Once we have reached a conclusion, we intend to formalize it through a motion proposal.

Please note that several aspects of the plan can be implemented concurrently. For instance, we can begin by enhancing the allocation tooling to incorporate known connections between node providers. In parallel, we can initiate work to improve the data structures on-chain.

20 Likes

I'm really glad to see DFINITY leading the community through the solution space on these issues. This seems like a great path forward and captures a lot of the feedback already provided on these topics. I look forward to seeing this move forward.

5 Likes

Thanks very much for this @bjoernek. This does a good job of setting the context. I'm looking forward to seeing more details and the additional considerations that are mentioned.

@Sat's already done a brilliant job of upgrading the tooling to take hardcoded cluster information into account :clap:

One thing I'd like to understand better is what should happen if any of the checks above fail. What happens if a Node Provider is found to be in violation of any of these requirements? I think there need to be well-defined consequences, ones which are respected, known upfront, and rigorously applied when needed.

7 Likes

Thank you for the great feedback @Lorimer!

To make sure that I understand your question: You are asking what should happen if a node provider falsely declares in their self-declaration that they have no connections to other providers, but it is later verified that they do have such connections, correct?

1 Like

On the tech side I'd suggest revisiting ICRC-17 (ICRC-17 - Elective KYC Service Standard - #11 by skilesare).

We used a form of this for something... maybe Yumi used it for the Gold sale? (@Dustin would likely remember) and it worked fairly well. It needs to be revisited and updated, but what it accomplishes is that the "methods" we determine are reduced to a simple call and interface. You can change methods and the function doesn't care. If it gets "pass" back it moves on. This would be really helpful as a step for contributing to SNSs as well for DAOs that want to use a specific KYC process or vendor in their SNS process.

By putting this shim in place you can change your KYC process in two ways:

  1. Change the canisterId to point to a different process.
  2. Change the functional code on the canister and maintain the interface.

...thus creating a separation of concerns and letting the process group focus on process and the tech group focus on enabling the tech without complexity.
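A rough sketch of the shim idea, written in Python rather than as a canister: the caller only sees a single call that returns "pass" or not, so the verification method behind it can be swapped freely. The class and method names are hypothetical and this is not the ICRC-17 interface; on the IC the `check` call would be an inter-canister call to whichever canister ID is configured.

```python
from typing import Protocol


class KycVerifier(Protocol):
    """Hypothetical interface: one call, returning "pass" or "fail"."""

    def check(self, principal: str) -> str: ...


class AllowListVerifier:
    """Stand-in for one KYC process: approves principals on a fixed list."""

    def __init__(self, approved):
        self.approved = set(approved)

    def check(self, principal: str) -> str:
        return "pass" if principal in self.approved else "fail"


class RejectAllVerifier:
    """Stand-in for a different process behind the same interface."""

    def check(self, principal: str) -> str:
        return "fail"


def onboard(principal: str, verifier: KycVerifier) -> bool:
    # The onboarding logic does not care how the check is performed.
    return verifier.check(principal) == "pass"


# Swapping the verifier corresponds to pointing at a different canister ID.
accepted = onboard("np-1", AllowListVerifier(["np-1"]))
rejected = onboard("np-1", RejectAllVerifier())
```

Changing the body of `AllowListVerifier.check` while keeping its signature corresponds to option 2 in the list above.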

Idk if @Lorimer asked this, but I am asking now: how can you prevent collusion among Node Providers?

Thanks @bjoernek. Indeed, there needs to be a well established and commensurate deterrent, or the rules won't mean much.

4 Likes

Thank you @Lorimer, I will bring up the treatment of false declarations at Monday's Node Provider Working Group. I see several options but have not formed a definitive opinion yet.

1 Like

I mean... the only option is slashing stake... and the only way to pass judgment is via the NNS after finding out :rofl:

What options are you speaking of that I'm not aware of?

Loss of the right to practice as a node provider, which means a reduction or removal of remuneration, hits pretty hard for a node provider. Node providers have data center and ISP contracts that must be fulfilled. They also have capital tied up in these high end assets. While they could still sell the assets for another use case, they are likely to exit at a loss.

1 Like

Hi @bjoernek, I appreciate the thoughtful work you and your team have put into this proposed approach to node provider governance and the allocation of node servers to subnets.

As a registered and active node provider (Icaria Systems, Australia) I think it is important to state that IC Node Providers are the independent custodians of the security, reliability and reputation of the IC as a decentralised compute infrastructure platform. Therefore it is important that we are open and transparent about our identity and business operations to the extent deemed necessary by the NNS DAO. Where commercial confidentiality prevents us from sharing information publicly, we should be ready to make any relevant public statements of simple facts that can be checked by independent audit as necessary.

First...

I agree with the stated Goal and I would support these Suggested Measures for KYC of the node provider business entity and UBO(s) if the community agrees to this approach. Online access to business registration documents and other information varies by country; however, node providers could assist by identifying and sharing the primary agency or registry site relevant to their country.

Next...

I fully agree with the stated Goal, in particular "confirming the genuine independence of node providers" as the foundation of the Internet Computer's reputation as a secure and resilient decentralised platform for general compute.

I support these Suggested Measures including the FATF guidelines quoted as examples. These appear to me as standard transparency requirements for good corporate governance.

There is one area of node provider operations that appears to not have been addressed regarding our operational independence (or potentially interdependence). This relates to the technical expertise we need to competently maintain our server nodes over time and respond to hardware and network problems as they arise. The current IC topology model assumes that the only business entity that a node provider engages with for all data centre access, services and technical support is the Datacentre Owner. This is only the simplest case and there may be other colocation, network and technical service provider businesses that the node provider contracts with for the services provisioned to them in the datacentre. This is of particular importance to the IC topology where multiple node providers contract with the same colocation/network/technical service provider and therefore that business (or person) may have physical or remote network access to nodes from multiple node providers in different datacentres. Currently the IC topology data collection and tooling is unable to identify server nodes accessible to one third-party service provider and therefore avoid placing them in the same subnet (where those nodes are not in the same datacentre).
I will contribute a post about this potential addition to the reportable business relationships, and the potential security risks to IC subnets, for discussion before our Node Provider working group meeting.

5 Likes

Excellent point. @Lerak included this in his list

It certainly needs covering

1 Like

These things are complicated. There will undoubtedly need to be multiple levels of punishment to deal with varying severity of the offense.

Why was this post flagged? It wasn't offensive. I guess serious topics require serious posts. I'd personally like to understand the rules a little better. Numerous posts of mine got flagged yesterday and all I was doing was responding to people on this thread.

In any case, there's a big difference between mostly dead and all dead (please open his mouth). Now, mostly dead is slightly alive. All dead - well with all dead there's usually only one thing that you can do... go through the code and look for loose ICP (oh, and kick it out the subnet).

3 Likes

Thanks @bjoernek for these suggestions. I have some thoughts, which are really just meant as contributions to the discussions and things to consider.

In terms of scalability for the future and also being able to attract individual node providers, one should maybe also think about how far one would want to move towards a very centralised compliance culture as suggested here. Such requirements could be easier to meet for people in certain jurisdictions and de facto exclude some people from participating as node providers. If requirements are very onerous, it could even lead to a system where only larger institutional players are willing or able to meet them, as opposed to small individual node providers who might find them too burdensome or costly. This could introduce other types of risk.

In addition, requiring KYC through a third party could also introduce a further point of centralised control or failure. Depending on where the KYC entity is located, it could also be influenced by other factors (corporate interests, political), that could also possibly lead to someone's KYC being revoked etc., possibly for reasons that the community may not actually agree with.

I would personally still favour a system that sets certain requirements by the community that node providers need to meet and that get voted on by the community when someone onboards, in order to encourage open participation and decentralisation.

One suggestion for adding an element of third-party opinion or attestation on node provider KYC, node provider independence, etc.: a firm could be engaged to provide such an opinion or attestation once a year, for example.

To enhance node provider self declarations one could add some further things that need to be disclosed such as relationships between node providers. One could also narrow down further what kind of proof of identity is acceptable, without moving to a very centralized third party KYC.

If in the future there are certain subnets used for specific purposes in certain jurisdictions that require a higher level of KYC, one could add extra incentives for node providers to meet those requirements.

I would personally favor an outcome-based approach, i.e. focus on having well-performing nodes by setting the right incentives, i.e. performance-based node rewards. If someone is found to have violated the onboarding requirements, one can think of a range of measures that would come into effect, from penalties and remedy periods all the way to removal. However, I think this also needs to be looked at very carefully; the recent discussions on this forum don't really fill me with confidence as to the "due process" in such situations.

I would also question if it would really add much value to dig so deep as to which remote hands or other service a node provider may use. Given the many data centers we have in many different countries and the number of node providers, I doubt this would add much. I feel like it's a slippery slope of getting towards a system that becomes ever more centralized and compliance prescriptive rather than just focused on outcome and metrics that really matter. We could also start digging into who owns a particular data center, is there some interlink between data centers that currently host nodes, if the same parent company has data centers in different countries is that a problem then? Would we want to require node providers to find out every UBO and ownership structure of a data center? Some data centers sub-lease space to other data centers, etc. I think one can take this very far to no avail.

To me the focus should be on a risk based approach and what we want to achieve and the most important aspects to do so, while maintaining an open and decentralized approach. There is risk in every approach, we should categorize them into likelihood of occurrence and severity of impact, and then find appropriate mitigants.

1 Like

It's impressive how a theory can be both thoroughly disproven, yet endlessly rehearsed. Almost meditative.

1 Like