Data Center and Node Operator Agreements

Are the Data Centre and Node Operator agreements posted for review? I am interested in understanding the contractual obligations, in particular the privacy safeguards. I do understand there are significant cryptographic measures being applied etc. The contract is still a necessary control.

3 Likes

Did you ever get or find an answer?

Received nothing whatsoever.

Tagging @Luis, (leads node provider efforts at DFINITY).

3 Likes

Sorry, for the late response. There is no reason why such a question should stay open for so long time.
Thanks @justmythoughts for the hint.

I’m not aware of any contractual obligations between between the foundation and the node providers. That would be a serious centralisation issue if so. I remember that there was a draft for a node provider agreement almost a year before launch but AFAIK it was dropped because we didn’t know how a node provider could commit to such agreement. At that time we didn’t had motion proposals yet. I guess we would use that now to find agreements between the IC and the node provider community.

1 Like

I am puzzled that there is no contract. It would seem to me if the foundation runs boundary nodes their would be a contract between the service providers and the foundation governing things like availability expectations and bandwidth management. Likewise, rewards are paid to node operators - in this case there is a consideration for service. If a node operator fails to meet the availability requirements - they lose their reward. I appreciate your point about centralization… Thank you for the response.

1 Like

Not sure if we’re talking about the same thing. The initial question was about the node provider agreement with the foundation. For the nodes including boundary nodes that the foundation is running there are of course contracts with DC providers. And so do node providers. But that has nothing to do with a node provider agreement to the foundation.

The NNS is minting the rewards that are paid out to the node providers. The foundation placed the according proposals. It’s true that the NNS is currently not looking on the actual service level - it’s not calculating penalties for any downtime. We need to bring some SLI on chain in order to allow the NNS to calculate such penalties. We are already working on this. This is much more complicated as it sounds.

1 Like

I found out that there is an agreement between the ICA and the node providers that I wasn’t aware of.
This is node provider agreement and this is the additional storage provider agreement.
These agreements were only meant for the initial set of node providers and storage upgrades because, as I said in the beginning, there was no NNS yet that reflected that agreement. It’s very unlikely that new node providers will need to sign such an agreement, but I will ask and come back if so.

4 Likes

So that’s an interesting read, thank you very much for sharing. There are many things there that seem odd from the point of view of ICP’s promise of decentralisation?

  1. Both agreements specify that node providers and data centre storage providers must comply with countries’ lawful requests for data they hold. There are no restrictions as to which country those providers should be based in, so your data could be accessed by the American, Chinese or Russian governments, as an example, if hosted in those countries. This does not seem to be what ICP is selling?

  2. There is also a requirement not to be under sanctions from the USA or the EU, which also kind of means picking a side.

I have no issue with 1 or 2 in general, but yes with the assumptions of decentralisation and anonymity many ICP people think is what they are supporting.

  1. Re: data privacy there is a requirement to follow the relevant ISO, but I don’t see a requirement for proof that this is being done, or mechanisms for detecting and addressing violations.

  2. Is the vision to move all these kind of terms and conditions to the NNS? How would that work at the more granular level? I can see how the NNS might detect down time and enforce penalties as per the docs above, but I don’t see how the NNS would monitor, detect and respond to data breaches, selling of stored data, or government inspection of data and even back-doors in servers. I don’t see any requirements for secure enclaves.

The above, together with all the application layer data breaching possibilities seems truly ripe for abuse.

For an essentially trustless vision, the ICP seems to rely on a huge amount of trust in the good faith dapp devs, storage hosts, node providers, Dfinity itself.

3 Likes

Further findings:

There are no contractual data protections beyond the ISO reference and no duty to the customers or users, in the contracts shared. Nothing at all like:

  • Specifying the security measures that the cloud provider must comply with, depending on the risks arising out from the processing and on the nature of the data to be protected;

  • Subject and time frame of the cloud service, extent, manner and purpose of the processing of personal data by the cloud provider, as well as the types of personal data processed;

  • Conditions for returning the (personal) data or destroying it once the service is concluded;

  • Confidentiality clauses; Prohibiting the disclosure of data to third parties, except for subcontractors specifically allowed under the data processing agreement;

  • Cloud provider’s responsibility to notify the cloud customer, in the event of any data breach which affects the cloud client’s data; etc.

The contracts require “compliance” with ISO27001, not certification, meaning it’s voluntary and there is no contractual provision to confirm this. Even if they do comply, the ISO covers data security, not data privacy. That means that as long as your data isn’t hackable, there’s nothing in the contract as far as I can see preventing datacentres from selling your data or shaeing it with government.

There are only 20 datacenters in the world holding the information of every single dapp and every single user. Talk about centralisation and points of failure risks! Of these 20, 11 are in the USA, 3 in Singapore, 3 in Switzerland, 2 in Belgium and 1 in Romania.

This is how easily and with how many options the USA government and commercial entities can access all your data on US territory: How can US law enforcement agencies access your data? Let’s count the ways | Apple | The Guardian. In addition, if any of the datacentres abroad belong to US companies, the US gvt can access them via the CLOUD act, and there are also politically backed requests to foreign governments for cooperation, very often granted.

So basically, if the US Federal government lawfully asked, they could access all the data stored in 55% of ICP data centres, and if the datacentres abroad are owned by US companies, those too, and if not, probably those also at least in Romania and Singapore, taking the tally to 75%

For the Romanian data centre, law enforcement is legally allowed not just to requisition data, but to physically to enter datacentres and “install their own equipment” (Moving Data Centres to Romania: The Do’s and Don’ts | Article | Chambers and Partners). That could be there now and we’d never know.

In Singapore, a simple “breach of an agreement” entitles government and regulatory bodies to access your data in their datacentres.

Belgium is stronger in its protections, but also collaborative with the US. Switzerland is probably the safest datacentre privacy wise, but only covers 3 data centres. And that’s just the legal exposure

If you look at some of the providers, they are not exactly marquee names, with the depth of security and cyber protection capacities. So add hackers to the list of people with potential access to the 20 datacentres (just follow the map!) Over 20,000 data center management systems exposed to hackers

To quote good practice: " Both data centre operators and users should be able to identify their assets, identify threats, assess risks, develop a protective security strategy and implement the correct measures to ensure all these concerns are managed. These processes should also be reviewed periodically as risks and threats can change."

There is no way for users to identify their assets, inspect the security strategy or review the risks periodically. How confident are you that all the node providers, have these measures in place for all their data centers? Data centres are still a tempting target for hackers. Here's how to improve your security | ZDNet

I haven’t even looked at the node providers in depth yet, but the datacentre ecosystem suggests that IC is gigantically vulnerable to legal data extraction and security breaches.

Food for thought and for my coming blog!

3 Likes

Thanks @Leamsi. A lot of what you concluded makes sense. But again, these agreements were replaced by a smart contracts on the NNS. And as you already said, the NP records in the NNS don’t have any such conditions defined. How agreements are manifested between anybody, including node providers, and the IC community is something that is already discussed. The platform decentralisation depends on such agreements e.g. decentralisation of node operations: node providers contracting with hw vendors, remote hands and data centers; or decentralisation of monitoring: individuals are allowed and payed to monitor the network.

There are only 20 datacenters in the world holding the information of every single dapp and every single user. Talk about centralisation and points of failure risks! Of these 20, 11 are in the USA, 3 in Singapore, 3 in Switzerland, 2 in Belgium and 1 in Romania.

The total node count looks like you pulled the topology of the NNS subnet. That would be 27 DCs of which 9 are US-based. That’s not more than a third (=OK). The NNS subnet has a good nakamoto coefficient. It can change with every topology change. BTW this is the output of the topology change we did last Friday.

Decentralization score changes:
      node_provider: 5.00 -> 5.00  (+0%)
        data_center: 6.00 -> 6.00  (+0%)
  data_center_owner: 4.00 -> 4.00  (+0%)
               city: 4.00 -> 4.00  (+0%)
            country: 2.00 -> 2.00  (+0%)
          continent: 1.00 -> 1.00  (+0%)
    Total: 3.67 -> 3.67  (+0%)
    node_provider                                                              data_center            data_center_owner            city                        country              continent              
    -------------                                                              -----------            -----------------            ----                        -------              ---------              
    6mifr-stcqy-w5pzr-qpijh-jopft-p6jl3-n2sww-jhmzg-uzknn-hte4m-pae       1    an1               1    AtlasEdge               1    Brussels Capital       3    BE              4    Asia                  6
    6nbcy-kprg6-ax3db-kh3cz-7jllk-oceyh-jznhs-riguq-fvk6z-6tsds-rqe       3    at1               2    Datacenter United       1    Bucuresti              2    CH        10 -> 9    Europe         22 -> 21
    6r5lw-l7db7-uwixn-iw5en-yy55y-ilbtq-e6gcv-g22r2-j3g6q-y37jk-jqe       3    at2               1    Datasite                3    California        2 -> 3    DE              3    North America  12 -> 13
    7ryes-jnj73-bsyu4-lo6h7-lbxk5-x4ien-lylws-5qwzl-hxd5f-xjh3w-mqe  3 -> 2    br1               2    Digital Realty          2    Flanders               1    JP              3                           
    a24zv-2ndbz-hqogc-ev63f-qxnpb-7ramd-usexl-ennaq-4om4k-sod6u-gae  0 -> 1    br2               1    Equinix                 6    Florida                3    RO              2                           
    bvcsg-3od6r-jnydw-eysln-aql7w-td5zn-ay5m6-sibd2-jzojt-anwag-mqe       3    bu1               2    Everyware               3    Geneva            3 -> 2    SG              3                           
    dzxyh-fo4sw-pxckk-kwqvc-xjten-3yqon-fm62b-2hz4s-raa4g-jzczg-iqe       1    ch2               1    Flexential              4    Georgia                3    SI              3                           
    eipr5-izbom-neyqh-s3ec2-52eww-cyfpg-qfomg-3dpwj-4pffh-34xcu-7qe       2    dl1               1    Green.ch                3    Hesse                  3    US       12 -> 13                           
    i7dto-bgkj2-xo5dx-cyrb7-zkk5y-q46eh-gz6iq-qkgyc-w4qte-scgtb-6ae       2    fr2               3    HighDC             1 -> 2    Illinois               1                                                
    olgti-2hegv-ya7pd-ky2wt-of57j-tzs6q-ydrpy-hdxyy-cjnwx-ox5t4-3qe       1    ge1          1 -> 2    INAP               2 -> 3    Ljubljana              1                                                
    rbn2y-6vfsb-gv35j-4cyvy-pzbdu-e5aum-jzjg6-5b4n5-vuguf-ycubq-zae       4    ge2          2 -> 0    M247                    2    Maribor                2                                                
    sdal5-w2c3d-p3buy-zieck-2wyuj-eu5bn-rkfe6-uuspi-o4n2b-gpei7-iae       1    jv1               1    Nine.Ch                 1    New York               1                                                
    sixix-2nyqd-t2k2v-vlsyz-dssko-ls4hl-hyij4-y7mdp-ja6cj-nsmpf-yae       3    lj1               1    Posita.si               3    Oregon                 1                                                
    spp3m-vawt7-3gyh6-pjz5d-6zidf-up3qb-yte62-otexv-vfpqg-n6awf-lqe       3    mb1               2    Racks Central           1    Singapore              3                                                
    ucjqj-jmbj3-rs4aq-ekzpw-ltjs3-zrcma-t6r3t-m5wxc-j5yrj-unwoj-mae       1    ny1               1    SafeHost           2 -> 0    Texas                  1                                                
    wdjjk-blh44-lxm74-ojj43-rvgf4-j5rie-nm6xs-xvnuv-j3ptn-25t4v-6ae       3    or1               2    Telin                   2    Tokyo                  3                                                
    wdnqm-clqti-im5yf-iapio-avjom-kyppl-xuiza-oaz6z-smmts-52wyg-5ae       3    pl1               1    Tierpoint               3    Zurich                 7                                                
    wwdbq-xuqhf-eydzu-oyl7p-ga565-zm7s7-yrive-ozgsy-zzgh3-qwb3j-cae       3    sg1               1                                                                                                         
                                                                               sg2               1                                                                                                         
                                                                               sg3               1                                                                                                         
                                                                               sj1          2 -> 3                                                                                                         
                                                                               ty1               1                                                                                                         
                                                                               ty3               2                                                                                                         
                                                                               zh2               3                                                                                                         
                                                                               zh3               1                                                                                                         
                                                                               zh5               2                                                                                                         
                                                                               zh7               1                                                                                                         
  - w7nug-ly4on-elt3f-ctsb2-3c33h-c3p3c-iy5kg-6pcjc-xhsev-bhcfd-xae    + h22aj-3x4zu-opjab-4pb4m-n5lli-ttjzd-hhmyk-6fiiq-3cktf-urfgd-bqe
  - 5jf3x-iyvhm-z3x4w-6gmd4-a7y7k-qlavk-7qu3d-hyx33-lynbo-lxace-6ae    + ehshg-qtj5n-d5cdz-vaaec-665ol-7akv5-atlvm-jmdkw-5lo4t-rqiac-qae

It’s not a secret that every additional node provider, data center, country etc is improving the decentralisation. As long as we don’t allow AWS hosted nodes decentralisation can only improve over time :wink:

I would like to introduce my colleague @bjoern. We are working very closely on the platform decentralisation roadmap and Bjoern is the research lead on that topic.

2 Likes

@Luis Thanks a lot for taking the time to read and like my comments. As one of the architects of the Dfinity datacentre strategy, it does you a lot of credit to engage with these issues, and I’m grateful for your response. And thank you for sharing the data and the correction. I was actually going off ICP Explorer

I remain concerned (if I was deeply invested I would say alarmed!) that the datacentre exploitability is so gaping and so seemingly unprioritised, unless I’m missing something?

What’s there to detect if a datacentre provider sells off your data or implements a back door or acts as a threat vector? How would Dfinity know, let alone the IC users affected? What’s there to stop it? And to remediate it?

And who decides and how are decisions taken and how do users get visibility about what jurisdictions IC data is hosted in, who the providers are and what due diligence has been applied, etc? Is it the NNS each time? So if heavily staked players, say who happen to also be node providers, decided to boycott a country for instance, could they just keep datacentres to their own zone, or choke entry to the market? What safeguards are there for the whales to not just rig the market and create datacentre proxies they themselves control? When you can’t even, as an investor, determine who the whales are, how much voting power they hold, how much ICP is concentrated anywhere? I am guessing Dfinity can tell. Does that not put Dfinity in a big conflict of interest?

The tension between overcentralisation and lack of structural or contractual trust is massive. If you only let those with secure and reliable datacentre infrastructures be node providers, you are likely to end up with an ecosystem similar to the current cloud landscape (which brings huge advantages of scale which I suspect will be hard to resist longterm if ICP were to really take off). I.e. a few aws, gcp, azure equivalent node and data centre providers. If you truly decentralise and let everyone provide nodes from their home machines, the security and privacy vulnerabilities as currently configured become exponential.

Making secure enclaves compulsory would help with the trust, reassuring investors and users that their data is not visible to ita hosts, and potentially even to government actors (big maybe) but/and (bug/feature) it would raise the barriers to entry and constrain participation.

I have concentrated so far only on the datacentres, but similar and additional questions apply to the Node providers. Even should secure enclaves be made mandatory, how do you confirm they have in fact been implemented? If the datacentres are in turn subcontracted by node providers who don’t run them, you are now 2 degrees removed from visibility and verification. I tell the community my datacentre has secure enclaves enabled. Community has to trust my trust in my datacentre provider, and the node provider may not even have the skillset to know what a secure enclave is, let alone confirm it.

And would implementation of secure enclaves by default require redesign or rearchitecting of either the IC itself, its subnets, Dfinity and devs’ IT processes and/or the dapps built on IC? Wiyh each chip provider having their own (SDK) along with open source ones, would that make the IC dependent on specific chip providers replacing one vector of centralisation with another, greatly exarcerbating the challenges of onboarding and expansion in a context of covid, sanctions and peak extraction?

It would seem to me that without secure enclaves by default, the IC is a gruyere cheese of privacy and security holes, the more so the more decentralised node provision becomes. But making secure enclaves default would seem

a) likely to require significant design, development, and testing resources, which given even the state of current DC and node provider contracts and related discusssion it looks like priority number 9887, optimistically!
And
b) with the current invisibility of ICP and voting power distribution outside the named nodes, anonymity of whales, and ease of cartel economics, enforcing secure enclaves by default, if actually feasible, could just centralise and distort even more the infrastructure and economics and with it the democratic deficit and risk of exploitation, fraud and pump and dump than is the case this already.

As I said, massive props to you for engaging in this challenging discussion, and I hope someone in the team has a better answer to these issues than Zaphod’s danger glasses! :wink: I certainly can’t see any easy ones, and I’m glad you are collaborating cross-teams, because this is a systemic rather than a discrete issue.

2 Likes

I sense there is a deeper truth here, but not sure I follow.

Can you clarify how any blockchain would prevent this from its node providers? Any node provider for say (ETH) can see the data on-chain.

Is your main concern that there are no Secure Enclaves on the IC yet?

Any clarity would help. Thank you!

True but at least on ETH malicious nodes lose money with PoW and get slashed with PoS, do we even have mechanisms to detect attacking nodes on the IC and if so what happens to the provider?

I sought clarification on the node provider agreement because HIPPA requires a Business Associate Agreement (BAA) to ensure enforcement of security roles and responsibiliies. Likewise, PCI compliance has significant requirements. When 7 nodes are operating together to provide concensus - it not possible to have a BAA with each. I had hoped to use homomorphic encryption or another scheme to make the data opaque but it seems that using encryption does not remove the agreement obligation per the policies.

The IC design assumes that there are evil actors. The integrity consensus protections address this. Dapp confidentiality protections however requires a little more thinking - in particular key storage to access shared information (e.g. patient records, etc).

Regarding secure enclaves - most security sub-systems rely on the notion of SECURITY ASSUMPTION: NO EVIL ADMIN in order to be secure. I am not sure of the actual assurance when running in a hostile environment. Regardless, data soveignity remains an issue with distributed nodes in certain jurisdictions.

Replacing Web2 with IC is more complex than just the tech. Contractual/policy obligations have to be addressed somehow unless the policy owners/regulators evolve.

2 Likes

There are two separate issues here:

  1. Data visibility to the nodes - The quote I highlighted from @Leamsi was more about data access. I think here, the ETH analogy makes sense… and why I asked if the concern was secure enclaves. Currently, there are no strong privacy guarantees for data on the IC just like in ETH (and other blockchains). A node provider can inspect it. Strictly speaking, a node provider inspecting data is NOT against consensus protocol. That is why inspecting data in ETH will not affect rewards.

  2. Malicious nodes - The question you refer to @Zane is about malicious or faulty nodes that would try to break consensus or do malicious things to slow down the IC, etc… In a chain like ETH, these nodes would lose rewards because their blocks do not become part of the consensus. In the IC’s design, all nodes get paid constantly… but if they deviate from consensus a statistically significant number of times, they are removed by the NNS. AFAIK, the implementation of detecting statistical deviations of nodes that are always off-sync with the rest is fairly primitive currently, so the intent is to layer and improve as needed. Note that a node that is “honest, but has a slow internet connection” would be considered “faulty” and also would be eventually removed.

Does that help? Did I misunderstand your intent?

1 Like

You’re back, I see. I thought you bid us farewell.

Welcome back.

1 Like

Thanks for the engagement @diegop. Yes I think we may be talking slightly at cross purposes.

Please do correct me if I’m wrong, but my understanding is that unlike the IC, most of the data in ethereum is stored off-chain because it costs $a lot/gb, whereas IC aspires to obviate the need for arweave or the cloud, so the data risks are exponentially higher. My understanding is that a malicious node provider or datacenter could access node state and breach confidentiality. Is this mistaken? I thought this was the very reason why there’s been work on making secure enclaves possible?

This would mean that datacentres and node providers could currently a) be systematically compromised by organised crime or other malicious actors, to gain visibility into data contents, and b) penetrated by government on requisition. And most impactful, that no one would know if or when this happened and there would be no reddress, particularly with the lack of entry barriers at the nns level.

As to the assume trust in admin paradihm, it works because you generally have a huge amount of accountability, reddress, legal entities, very detailed contracts and SLAs, established reputations, technichal personnel, etc. So you know the incentives for compliance are greater than for dishonesty, and in case of the latter the opportunities for recourse, reddress and compensation are clear.

In the case of ICP node providers the instructions include: “Depending on how much time and space you have available, stage the hardware near your cabinets (either on a table or in a safe accessible area).” It takes a different order of magnitude of trust to trust admins like these!

The other concern is that requiring secure enclaves by default may rule out most of the “hardware on your table” node providers, favour centralisation with the potential of opaque data cartels and nns manipulation, and make the relevant chips a single point of failure, that becomes more and more appealing to attackers the more ICP grows.

To sum up, yes, I have concerns about data visibility, about centralisation, and about manipulation. But really happy to be corrected on anything at all.

Ps: @Sormarler no such luck! I said I would say goodbye to investing with you all, but would keep track of developments. This is me doing the latter. I guess I want to see how far the rabbithole goes as a more general case study in web3. Thanks for the welcome back!

3 Likes

Oh speaking of ledger data visibility, @diegop do you know the best way to find the ICP and voting power distribution? I have a hunch that the old ic.rocks made it possible, but not certain and I’m not yet in a position time wise to give the code on github a spin. I would have thought it should be possible to trace icp transactions from the beginning to establish current concentrations, stakes, etc? I am always amazed how few people entering crypto, usually bright eyed, seem totally uninterested in distribution statistics that can determine so much, and specially in proof of stake systems. No one has yet been able to tell me or even point me in a direction to finding out who owns what percent. In BTC, Ether, Dogecoin, etc., ownership of 50%+ is concentrated in %0.01 to 0.001%, and ownership of 0.1% distributed among 80% of coin owners. It was similar around genesis for ICP, but can’t find data for now. Which is genuinely psychologically bizarre to me, due diligence wise.

It’s germane to this thread because my argument is that anyone who stakes enough to influence NNS accreditation of node providers and datacenters could play both sides and choke market entry in their favour.

2 Likes