Long Term R&D: Boundary Nodes (proposal)

@jzxchiang - currently, all requests from the same region are likely to land at the same boundary node. We all understand that this is not perfect, hence the decentralization goals in this proposal (that we will hopefully post in detail later today to this thread).
The replica that a request is forwarded to is selected at random. This way, users receive results from different nodes and, with high probability, do not encounter a malicious node (if one exists).
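
For illustration only, here is a minimal Python sketch of the random selection described above. It is not the boundary node implementation: `pick_replica` and `chance_all_malicious` are hypothetical names, and the model assumes every request independently picks a replica uniformly at random.

```python
import random


def pick_replica(replicas, rng=random):
    """Return one replica chosen uniformly at random (hypothetical helper)."""
    if not replicas:
        raise ValueError("no replicas available")
    return rng.choice(replicas)


def chance_all_malicious(malicious_fraction, requests):
    """Probability that every one of `requests` independent picks
    lands on a malicious replica, under the uniform-choice assumption."""
    if not 0.0 <= malicious_fraction <= 1.0:
        raise ValueError("malicious_fraction must be between 0 and 1")
    return malicious_fraction ** requests


if __name__ == "__main__":
    subnet = ["replica-a", "replica-b", "replica-c", "replica-d"]  # made-up names
    print(pick_replica(subnet))
    # With a quarter of the replicas malicious, three requests in a row
    # all hit a malicious one with probability 0.25 ** 3.
    print(chance_all_malicious(0.25, 3))
```

The point of the sketch is only that the chance of being served exclusively by malicious replicas shrinks quickly as the number of requests grows.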

@ysyms - yes. Take a look at the icfront project I posted before.

Boundary Nodes Motion Proposal

Summary

The boundary nodes are the Internet Computer edge infrastructure. This motion proposal sets the future roadmap for boundary nodes. It is proposed to enhance the design and implementation of the boundary nodes in several aspects, to make their deployment and operation more decentralized, make them easier to deploy and upgrade, and increase their security.

1. Objective

Design, implement, and deploy an enhanced, scalable, decentralized, and secure network of boundary nodes for the Internet Computer. This network will serve as the edge framework of the Internet Computer and will be managed by the NNS. It will provide end-to-end security, as well as caching, for users of the IC, while being highly decentralized. The enhanced boundary nodes will also support custom domain names and TLS certificate management.

2. Background

Boundary nodes provide the network edge services of the Internet Computer (IC) including enabling standard HTTPS requests from users to canister smart contract APIs on the IC and routing canister smart contract API calls to nodes hosting those contracts on the corresponding subnet, as well as caching, load balancing, rate limiting, IPv4-IPv6 translation (as IC nodes all use IPv6), and integrity verification for content served to users.

To bring the design and implementation of the boundary nodes to the next level with respect to functionality, scalability and decentralization, the currently supported feature set is to be extended and enhanced. Furthermore, the community will be enabled to decide on the allocation and sizing of the edge infrastructure and offer custom domains.

3. Why is this important?

The enhancements for boundary nodes in this proposal will enable the IC community to sustainably grow the IC network in a decentralized fashion, while guaranteeing end-to-end security, and providing improved experience for users of the IC and of canister smart contracts that run on top of it.

4. Topics under this project

Specifically, this proposal includes the following research and development directions for boundary nodes:

  1. Scalability and Reliability - support higher request rates and more client connections by means of smarter traffic management, improved load balancing over nodes in a subnet, failover and more.
  2. Decentralization - allow more node providers to deploy boundary nodes by reducing the operational requirements.
  3. NNS management - let the NNS manage boundary nodes, including adding, removing, and upgrading nodes and compensating node providers via proposals.
  4. Security - integrate security improvements mentioned in the proposals on Trusted Execution Enhanced IC and the decentralized DNS and CA.
  5. Domains - enable additional and custom domains for canister smart contracts discovery and enhanced decentralization.
  6. Customizable - make it easier for node owners to control the operations and capabilities of their boundary node like API-only boundary nodes, filtering etc.
  7. Resiliency - related to the scalability and decentralization goals above, improve the DoS protection mechanisms for the IC.
  8. Discovery and Steering - Provide distributed discovery and steering to boundary nodes and different node providers and make it easier for the community to provide such services themselves.
  9. HTTP and query API caching - Improve caching on the boundary nodes so that they are compliant with caching standards.
  10. Semantic caching - use read-only canister state to serve queries directly from the boundary nodes.
  11. Monitoring - enhance monitoring of boundary nodes using, for example, probing, and improved metrics. Provide metrics access to the community e.g. for use in node provider remuneration.
  12. Boundary Node Economy - provide remuneration for running boundary nodes and charging canisters for support services (e.g., serving cached results).
  13. Compliance with local laws - as recently discussed by the community, the boundary node providers may be liable, by local laws, for content served through their nodes. Our intent is to research mechanisms, review community suggestions, and propose to the community possible mechanisms that would empower boundary node providers to restrict content served through their nodes, such that they remain compliant with local laws. As a consequence of the decentralization goal, the content might still be accessible from other jurisdictions.

5. Key milestones

The following milestones are indicative and may not be reached in the order listed here.

  • M1: Provide a public and open source process for building a boundary node VM deterministically.
  • M2: Enable additional boundary nodes on different domains and include support for them in the CDKs/agent code.
  • M3: Have additional boundary nodes either API-only and/or on different domains.
  • M4: Introduce an economic model for boundary nodes based on additional monitoring.
  • M5: Enable the boundary nodes to be deployed and updated via NNS proposals.
  • M6: Increase the number of NNS controlled and remunerated boundary nodes and node providers.
  • M7: Improved Scalability, Resilience and Standards Compliant Caching
  • M8: Trusted Execution for improved security
  • M9: Distributed Discovery and Steering
  • M10: Semantic Caching

6. People involved

Discussion leads: Yotam Harchol, John Plevyak, Björn Tackmann, Rüdiger Kapitza

7. Why the DFINITY Foundation should make this a long-running R&D project

Boundary nodes are necessary for the Internet Computer, to provide transparent access for web users, as well as to secure the IC. Boundary nodes are part of the IC, and therefore should be as secure and as decentralized as possible. Therefore, the DFINITY Foundation is committed to researching and designing the next generation of boundary nodes including the above-mentioned areas for the benefit of the IC as a whole.

8. Skills and Expertise necessary to accomplish this

The problems described above require the cooperation of networking experts with security and cryptography experts, to design, review, and implement the prospective solutions, as well as to provide detailed security reviews and proofs. Specifically, experts from the following fields are necessary:

  • Network systems
  • Network management
  • Network security
  • Systems security
  • Secure hardware
  • Cryptography
  • Distributed systems
  • Economics

This project would require both researchers and software engineers with expertise in the above-mentioned fields.

9. Open research questions

  • Efficiently load balance subnet nodes at the boundary nodes, without introducing high bandwidth and computation overheads
  • Fully decentralized discovery and routing for unmodified standard Web2 devices and users
  • Semantic caching - execute query calls directly on the boundary nodes, using read-only replicated state and possibly different consistency guarantees
  • Boundary node economy - develop an economic model for running the boundary nodes edge network
  • Compliance with local laws based on the location of boundary nodes

10. Examples where the community can integrate into the project

As boundary nodes are an important piece of the IC infrastructure, we expect high community interest in this proposal. We invite the community to join the engineers and researchers of DFINITY in the discussion of this topic. We welcome any ideas for the topics above, as well as any critical assessment. We plan to keep the community posted on this topic on a regular basis.

11. What we are asking the community

Please review this proposal and provide us with any feedback you have regarding the boundary nodes. Please also review the other related proposals on trusted execution environments and decentralized DNS and CA. We invite you to engage in the discussion and hope it will be fruitful and useful for the IC community and for the future edge framework of the IC.

This is a massive undertaking, but absolutely critical.

Decentralizing boundary nodes, bringing them under the control of NNS, creating economic remuneration schemes, etc… it spans the whole stack.

The more I think about it, boundary nodes are a high-risk point of failure for the IC. Even though replicas go through consensus, boundary nodes don’t. What if a malicious party runs a boundary node and modifies canister responses to make it seem like they reached consensus on something they in fact did not?

Is this where a deterministic, verified build process for boundary nodes comes into play? (Now that I think about it, this problem might have already been solved… how does DFINITY ensure that node providers are actually running the correct, unmodified replica software?)

Hmm… doesn’t that kind of defeat the purpose of DNS steering, if a us-west1 boundary node has to communicate with an asia-east2 replica? That still seems high latency.

@jzxchiang You are correct, but please take into account the following two issues:

  1. We would like the IC (and its clients) to be resilient to malicious nodes and node providers.
  2. The boundary node provides caching (and will provide more enhanced caching in the future as you can see in the proposal), so the DNS steering purpose is not completely defeated.

@jzxchiang I missed your first question, about the VM build and malicious boundary nodes. This is covered more thoroughly in the Decentralized CA and DNS and TEE Enhanced IC proposals, but yes – this is definitely something we think about and would like the community to discuss and drive forward.

NNS Motion is live! Internet Computer Network Status

Just for clarification: do you mean domains other than *.ic0.app?

Yes, we also mean such domains.

I am thinking about the following scenario: if a country doesn’t want Internet Computer dapps, they can just block the *.ic0.app domains and no one in the country can access dapps on the IC. In this proposal, you want to add more domains, but countries can just block those as well right? Is there a way for the IC to make it accessible for everyone even if countries have the power to block certain domains?
Thanks!

Do you have any updates about how we can prevent malicious boundary nodes from serving modified responses, such as a tampered service worker?

Hi Gabe, we are looking into various ways to mitigate this. In fact, we might provide several measures to address this issue, because each one has its own pros and cons. One direction is likely a web extension. It can only be a point solution, because a web extension needs substantial API support and Chrome, for example, keeps restricting its extension APIs. On the other hand, trusted execution is being explored, and we will amplify our efforts here during the next month. While trusted execution gives additional security, it is not a silver bullet, because of side-channel attacks that we have to take into account. Hope this helps a bit.

Hi, thank you for the answer. Do you mean a web extension as a replacement for the service worker obtained from boundary nodes? That sounds like a solution, but it takes away from the current seamless user experience on the IC. This should definitely not be underestimated in my opinion and may be one deciding factor for adoption.

I’m aware of the protection against intrusions at the host level that TEEs can bring, but would you care to explain how they can help ensure that, e.g., boundary nodes provide an unmodified response?

Why can’t the BNs just have a replicated state that they come to consensus on and serve the SW from that state?

@JaMarco I am not sure consensus would solve the puzzle here. The replicas can propose, agree on, and even sign the canonical version of the service worker. However, that still won’t stop a malicious boundary node from serving a corrupt service worker. I understand that such a tampered service worker won’t have the subnet signatures, but there is no abstraction or control point in the browser to check the signatures or authenticity of the service worker itself.

The signed service worker could probably be checked by the browser extension (but then again, who would authenticate the extension?).

How is this different than clients getting update responses from the IC? How are those results verified in the browser?

Hi, the web extension will not be a replacement for the service worker. It is an alternative option for security-sensitive users who like the browser experience but do not want to trust the boundary node. The idea is that the web extension will work as a drop-in solution. If you have the extension, the service worker will not be loaded; if you don’t have it, the service worker will do the job. This way user experience should not be an issue.

Regarding trusted execution, we aim to give users the means to validate, via the browser and some additional easy-to-use tooling, that they are accessing a VM that runs on top of the right hardware and executes the expected boundary node VM image. This assumes that the hardware is flawless and the attested code is correct, but it takes a rogue administrator, as well as exploits at the host OS and hypervisor level, out of the equation.

The core problem is that a browser does not know about the APIv2 protocol of the IC. To enable a clueless browser to speak to the IC, we therefore let the browser contact the boundary node. The boundary node can then either translate the ordinary HTTP request into a request that conforms to the IC protocol, or return the IC service worker to the browser. The IC service worker enables the browser to speak natively to the IC. If we replicated the state at the boundary node level, we would again run into the problem that the browser is clueless about how to speak to a replicated system, in this case the boundary nodes.

By providing a web extension, as @faraz.shaikh pointed out, we provide the knowledge of how to speak to the IC via a different path. (That path could in principle be manipulated as well. However, installing a web extension is a more explicit step, and unlike a service worker, which is frequently reloaded, a web extension needs to be updated. That update is an explicit step at which a security-sensitive user can take a closer look and inspect what has changed.)

Having the web extension as an alternative surely helps with the user experience/adoption points that I mentioned earlier. I think it is great to have extra tools for security-sensitive users. However, we should face the fact that if there is an easier path, i.e. not using the extension, the majority of users will take that path. If 1/3 of users on the IC install the extension and 2/3 work with a less secure option as their source of truth, then that can of course be really harmful for the ecosystem. And 1/3 is probably a very generous number in the context of mass adoption. Hence, it is more interesting to discuss the security aspects of the more common “path”, in my opinion.

Can you explain in more detail how a user could validate a boundary node’s image to ensure that she is served an unmodified response? Also, what exactly is meant by the assumption that “the attested code is correct”?

Hi again, I fully agree with your assessment regarding the use of a web extension. However, the idea is to at least provide an alternative. Such a web extension might also be a crystallisation core for more projects in this direction, such as making the IC protocol a native part of a browser.

Regarding the validation: the direction we are currently evaluating is to secure the VMs via SEV-SNP (and maybe later TDX; see the other roadmap proposal) and to empower users to perform remote attestation of the boundary node VM. If, via remote attestation, you can validate that only code that you trust is running in the VM, that the VM itself is protected via hardware mechanisms, and that there is no way to establish a man-in-the-middle on the connection to the boundary node, this would be a big step forward.

How can you know what code is running in the VM? Well, there needs to be a deterministic build process to create the boundary node image from the IC repository. You can inspect the code and see that nobody can get access to the VM once it is running (no SSH access and no backdoor). By repeating the build, you obtain a hash that should match the one included in the remote attestation report. Of course, there is more to say about how exactly remote attestation works, but that would go too far here.

The critical point, of course, as with any software that you (have to) trust, is that there could be bugs that can be exploited, or side-channel attacks that might enable determined attackers to circumvent the hardware protection.

Again you might ask: user experience?! The remote attestation will again be performed mainly by security-sensitive users, via another web extension or even a standalone tool. (In both cases the effective code will be very small and easy to validate and transfer.) In this case, however, the security-sensitive users will do something for the others. If, say, only 10% of users perform remote attestation, the likelihood that a rogue boundary node goes unnoticed is very small. And the commodity users will benefit.
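
As a rough illustration of the "repeat the build and compare hashes" step, here is a small Python sketch. It is deliberately simplified and rests on assumptions: it treats the measurement as a plain SHA-256 over the image bytes, whereas a real SEV-SNP launch measurement is computed differently and the attestation report must additionally be verified against the hardware vendor's certificate chain. `image_digest` and `matches_attestation` are hypothetical helper names, not part of any IC tooling.

```python
import hashlib
import hmac


def image_digest(image_bytes: bytes) -> str:
    """Hex SHA-256 of a locally rebuilt image (simplifying assumption:
    the attested measurement is just a hash over the image bytes)."""
    return hashlib.sha256(image_bytes).hexdigest()


def matches_attestation(image_bytes: bytes, reported_digest_hex: str) -> bool:
    """Compare the local digest with the digest taken from an attestation
    report that has already been verified by other means."""
    local = image_digest(image_bytes)
    return hmac.compare_digest(local, reported_digest_hex.lower())


if __name__ == "__main__":
    rebuilt_image = b"bytes of the deterministically rebuilt image"  # placeholder data
    reported = image_digest(rebuilt_image)  # stands in for the value in the report
    print(matches_attestation(rebuilt_image, reported))
    print(matches_attestation(b"a modified image", reported))
```

The comparison only means something if the build really is deterministic, so that anyone rebuilding from the same source obtains the same bytes.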
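
To put a rough number on that intuition, here is a back-of-the-envelope Python sketch. It assumes, purely for illustration, that each visitor independently performs remote attestation with the same probability and that a single attesting visitor is enough to spot a rogue node; `chance_unnoticed` is a hypothetical name.

```python
def chance_unnoticed(attesting_fraction: float, visitors: int) -> float:
    """Probability that none of `visitors` independent users attests,
    i.e. that a rogue boundary node is never checked (toy model)."""
    if not 0.0 <= attesting_fraction <= 1.0:
        raise ValueError("attesting_fraction must be between 0 and 1")
    if visitors < 0:
        raise ValueError("visitors must be non-negative")
    return (1.0 - attesting_fraction) ** visitors


if __name__ == "__main__":
    # 10% of users attest; look at how quickly the risk shrinks.
    for n in (1, 10, 50, 100):
        print(n, chance_unnoticed(0.10, n))
```

Under these assumptions, with 10% of users attesting, the chance that a rogue node is never checked drops below 1% once about 50 users have visited it.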
