Long Term R&D: Boundary Nodes (proposal)

1. Summary

This is a motion proposal for the long-term R&D of the DFINITY Foundation, as part of the follow-up to this post: Motion Proposals on Long Term R&D Plans (Please read this post for context).

This project’s objective

The boundary nodes are the gateways to the IC. Their main purpose is to translate HTTP requests from users into calls to canister smart contracts on the IC and route calls to nodes on the corresponding subnet. In addition, boundary nodes provide load balancing, caching, rate limiting, IPv4-IPv6 translation (as IC nodes all use IPv6), and integrity verification for content served to users. This motion proposal sets the future roadmap for boundary nodes. It is proposed to enhance the design and implementation of the boundary nodes in several aspects, to make their deployment and operation more decentralized, make them easier to deploy and upgrade, and increase their security.

2. Discussion lead

Yotam Harchol

3. How this R&D proposal is different from previous types

Previous motion proposals have revolved around specific features and tended to have clear, finite goals that are delivered and completed. They tended to be measured in days, weeks, or months.

These motion proposals are different and are defining the long-term plan that the foundation will use, e.g., for hiring and organizational build-out. They have the following traits and patterns:

  1. Their scope is years, not weeks or months as in previous NNS motions
  2. They have a broad direction but are active areas of R&D so they do not have an obvious line of execution.
  3. They involve deep research in cryptography, networking, distributed systems, language, virtual machines, operating systems.
  4. They are meant to match the areas where the DFINITY Foundation’s expertise is best suited.
  5. Work on these proposals will not start immediately.
  6. There will be many follow-up discussions and proposals on each topic when work is underway and smaller milestones and tasks get defined.

An example may be the R&D for “Scalability” where there will be a team investigating and improving the scalability of the IC at various stages. Different bottlenecks will surface and different goals will be met.

3. How this R&D proposal is similar to what we have seen

We want to double down on the behaviors we think have worked well. These include:

  1. Publicly identifying owners of subject areas to engage and discuss their thinking with the community
  2. Providing periodic updates to the community as things evolve, milestones are reached, proposals are needed, etc.
  3. Presenting more and more R&D thinking early and openly.

This has worked well for the last 6 months so we want to repeat this pattern.

4. Next Steps

Developer forum intro posted
1-pager from the discussion lead posted
NNS Motion proposal submitted

5. What we are asking the community

  • Ask questions
  • Read 1-pager
  • Give feedback
  • Vote on the motion proposal

Frankly, we do not expect many nitty-gritty details because these are meant to address projects that go on for long time horizons.

The DFINITY Foundation’s only goal is to improve the adoption of the IC, so we want to sanity-check the projects we see as necessary for growing the IC by having you (the ICP community) tell us what you think of these active R&D threads.

6. What this means for the existing Roadmap or Projects

In terms of the current roadmap and proposals executed, those are still being worked on and have priority.

An intellectually honest way to look at these long-term R&D projects is to see them as the upstream, or “primordial soup”, from which more fully baked projects emerge. Through this lens, these proposals are akin to asking, “what kind of specialties or strengths do we want to make sure the DFINITY Foundation has built up?”

Most (if not all) projects that the DFINITY Foundation has executed or is executing are born from long-running R&D threads. Even when community feedback tells the foundation, “we need X” or “Y does not work”, it is typically the team with the most relevant R&D area that picks up the short-term feature or project.

Please note:

Some folks have asked if they should vote to “reject” any of the Long Term R&D projects as a way to signal prioritization. The answer is simple: “No, please, ACCEPT” :wink:

These long-term R&D projects are the DFINITY Foundation’s thesis on the R&D threads it should have across years (3 years is the number we sometimes use internally). We are asking the community to ACCEPT (pending the 1-pager and more community feedback, of course). Prioritization can come as a separate step.

Hi, I’m Yotam, a researcher at DFINITY.
My research is focused on networked systems, distributed systems, and edge computing.
I am driving this proposal, but many other people at DFINITY are working on the Boundary Nodes with me.
I would be happy to answer any question, take suggestions from the community, and engage in more detailed discussion.

Hi, thanks for sharing this information and the proposal for the long-term R&D.

Hi @yotam I have a question: if a client calls a canister and the client’s agent sees that the response has an invalid BLS signature or some other invalidity in the certification of the response, how can the client (the agent) request to speak to a different node on the subnet?

Are boundary nodes simple enough that they would run on more basic and diverse hardware? Could this component of the network be opened up to independent “mom and pop” operators?

When my browser queries DNS for the IP address of, say, https://erxue-5aaaa-aaaab-qaagq-cai.raw.ic0.app/, which boundary node does the eventual request hit? Is it round-robin across multiple boundary nodes in different jurisdictions?

I think a helpful start would be a one pager explaining boundary nodes as they exist right now.

I thought boundary nodes also provided TLS termination for HTTPS requests? Or is the traffic between boundary nodes and IC nodes also encrypted, so the replica software provides TLS termination?

Can each user run a boundary node client to access dapps?

Hi, thank you for these questions! Here are some answers:

@levi When you make a request to a boundary node, it is directed at a random node in the corresponding subnet, so a client can retry the call in such a case as you describe. The client also receives the node ID in the response, so they can see which node responded. We are checking with our security team the option to provide a feature that allows clients to specify a node.
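
The retry behavior described above can be sketched on the client side as follows. This is only a minimal illustration, not the actual agent code: `Response`, `call_with_retry`, and the `certified` flag are hypothetical stand-ins, and a real agent would verify the response certificate instead of reading a boolean.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Response:
    node_id: str     # ID of the replica that answered (returned to the client)
    body: bytes
    certified: bool  # hypothetical stand-in for a real certificate check


def call_with_retry(
    send: Callable[[], Response],
    verify: Callable[[Response], bool],
    max_attempts: int = 3,
) -> Response:
    """Re-send the request until a response passes verification.

    Each call to `send` is assumed to go through a boundary node, which
    forwards it to a randomly chosen replica of the subnet.
    """
    rejected: List[str] = []
    for _ in range(max_attempts):
        response = send()
        if verify(response):
            return response
        # Remember which node produced the invalid response.
        rejected.append(response.node_id)
    raise RuntimeError(
        f"no valid response after {max_attempts} attempts; rejected nodes: {rejected}"
    )
```

Because the client cannot currently choose the replica, the sketch simply re-sends the request and relies on the random selection to reach a different node.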

@jorgenbuilder Our goal is to make the IC as distributed as possible. This of course includes the boundary nodes. Currently we do have some strict requirements for them, such as available bandwidth and CPU, but we are looking at ways to reduce these requirements as we grow the network of boundary nodes. On the other hand, we are looking at using trusted execution environments for the boundary nodes (for enhanced security, see the separate proposal on this topic), and this might be another hardware requirement. Nonetheless, TEEs are becoming more widely available, so this may not be too bad.

@jzxchiang Re: DNS, the DNS records for *.ic0.app point to all boundary nodes. DNS Steering is used to direct you to the nearest boundary node.
Re: TLS, traffic to boundary nodes (HTTPS) and between boundary nodes and IC nodes is encrypted with TLS (as is traffic between IC nodes). The boundary nodes terminate TLS because they have to authenticate with the *.ic0.app certificate, and then tunnel the traffic to the corresponding IC node over a TLS connection established between the two.

Users are encouraged to do their own TLS termination for their own domains, for example like Fleek. As a long term solution, we are working on having custom domains served securely. One way of doing that is proposed in the icfront project.

@ysyms see my answer to @jzxchiang right above here, would that be what you are looking for?

Thanks for the response.

I’m curious what type of DNS Steering policy is used. Is it based on node health or geographic proximity?

Geographic proximity
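
As a rough illustration of a proximity-based steering policy, the sketch below picks the boundary node with the smallest great-circle distance to the client. It is not the actual steering implementation: the node names and coordinates are made up, and a real DNS steering service would typically estimate the client location from the resolver or client IP address rather than receive coordinates directly.

```python
import math

# Hypothetical boundary node locations as (latitude, longitude) in degrees.
BOUNDARY_NODES = {
    "bn-zurich": (47.37, 8.54),
    "bn-new-york": (40.71, -74.01),
    "bn-singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_boundary_node(client_location, nodes=BOUNDARY_NODES):
    """Return the name of the boundary node closest to the client."""
    return min(nodes, key=lambda name: haversine_km(client_location, nodes[name]))
```

A policy like this sends every client in one region to the same node, regardless of that node's load.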

Interesting. What if all the requests are coming from the same region?

For example, let’s say you have a dapp that’s only available in New York. Then, DNS steering would route all requests from the dapp to some boundary node near New York. That seems like it could overload the node pretty quickly.

How does that boundary node then select which replica (i.e. IC node) to forward the request to? Is that also based on geographic proximity as is suggested here, or is that round-robin?

Apologies if you’ve answered this already.

Can we make each user run a boundary node to interact with canisters in an IC subnet?

@jzxchiang - currently, all requests from the same region are likely to land at the same boundary node. We all understand that this is not perfect, hence the decentralization goals in this proposal (that we will hopefully post in detail later today to this thread).
The replica to forward a request to is selected at random. The reason is that users then receive results from different nodes, so with high probability they do not encounter a malicious node (if one exists).
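
To make the "high probability" argument concrete, here is a small back-of-the-envelope calculation. It assumes that every request is forwarded to a replica chosen uniformly and independently at random; the function name and the subnet size used in the note below are illustrative assumptions, not figures from the proposal.

```python
def p_all_malicious(n_nodes: int, n_malicious: int, attempts: int) -> float:
    """Probability that `attempts` independent, uniformly random picks
    all land on a malicious replica."""
    if n_nodes <= 0 or not 0 <= n_malicious <= n_nodes or attempts < 1:
        raise ValueError("invalid parameters")
    return (n_malicious / n_nodes) ** attempts
```

For example, in a hypothetical subnet of 13 nodes with 4 malicious ones, a single request hits a malicious node with probability 4/13 (about 31%), while three independent attempts all hit one with probability of roughly 3%.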

@ysyms - yes. Take a look at the icfront project I posted before.

Boundary Nodes Motion Proposal

Summary

The boundary nodes are the Internet Computer edge infrastructure. This motion proposal sets the future roadmap for boundary nodes. It is proposed to enhance the design and implementation of the boundary nodes in several aspects, to make their deployment and operation more decentralized, make them easier to deploy and upgrade, and increase their security.

1. Objective

Design, implement, and deploy an enhanced, scalable, decentralized, and secure network of boundary nodes for the Internet Computer. This network will serve as the edge framework of the Internet Computer and will be managed by the NNS. It will provide end-to-end security, as well as caching, for users of the IC, while being highly decentralized. The enhanced boundary nodes will also support custom domain names and TLS certificate management.

2. Background

Boundary nodes provide the network edge services of the Internet Computer (IC) including enabling standard HTTPS requests from users to canister smart contract APIs on the IC and routing canister smart contract API calls to nodes hosting those contracts on the corresponding subnet, as well as caching, load balancing, rate limiting, IPv4-IPv6 translation (as IC nodes all use IPv6), and integrity verification for content served to users.
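
The request-routing part of this description can be sketched roughly as follows. This is a simplified illustration under several assumptions: the canister ID is taken to be the leftmost label of the hostname (as in the `erxue-5aaaa-aaaab-qaagq-cai.raw.ic0.app` URL quoted earlier in this thread), `ROUTING_TABLE` and `SUBNET_NODES` are hypothetical in-memory tables, and the IPv6 addresses come from the documentation prefix. The random node choice follows the behavior described earlier in the discussion.

```python
import random

# Hypothetical routing table: canister ID -> subnet ID (illustrative only).
ROUTING_TABLE = {"erxue-5aaaa-aaaab-qaagq-cai": "subnet-a"}

# Hypothetical subnet membership: subnet ID -> replica IPv6 addresses.
SUBNET_NODES = {"subnet-a": ["2001:db8::1", "2001:db8::2", "2001:db8::3"]}


def canister_id_from_host(host: str, base_domain: str = "ic0.app") -> str:
    """Extract the canister ID, assumed to be the leftmost hostname label."""
    host = host.lower().rstrip(".")
    if not host.endswith("." + base_domain):
        raise ValueError(f"{host} is not under {base_domain}")
    return host.split(".")[0]


def route(host: str, rng=random):
    """Map a hostname to (canister ID, subnet ID, randomly chosen node)."""
    canister_id = canister_id_from_host(host)
    subnet_id = ROUTING_TABLE[canister_id]
    node = rng.choice(SUBNET_NODES[subnet_id])
    return canister_id, subnet_id, node
```

Calling `route("erxue-5aaaa-aaaab-qaagq-cai.raw.ic0.app")` returns the canister ID, its subnet, and one of that subnet's nodes.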

To bring the design and implementation of the boundary nodes to the next level with respect to functionality, scalability and decentralization, the currently supported feature set is to be extended and enhanced. Furthermore, the community will be enabled to decide on the allocation and sizing of the edge infrastructure and offer custom domains.

3. Why is this important?

The enhancements for boundary nodes in this proposal will enable the IC community to sustainably grow the IC network in a decentralized fashion, while guaranteeing end-to-end security, and providing improved experience for users of the IC and of canister smart contracts that run on top of it.

4. Topics under this project

Specifically, this proposal includes the following research and development directions for boundary nodes:

  1. Scalability and Reliability - support higher request rates and more client connections by means of smarter traffic management, improved load balancing over nodes in a subnet, failover and more.
  2. Decentralization - allow more node providers to deploy boundary nodes by reducing the operational requirements.
  3. NNS management - let the NNS manage boundary nodes, including adding, removing, and upgrading nodes and compensating node providers via proposals.
  4. Security - integrate security improvements mentioned in the proposals on Trusted Execution Enhanced IC and the decentralized DNS and CA.
  5. Domains - enable additional and custom domains for canister smart contracts discovery and enhanced decentralization.
  6. Customizability - make it easier for node owners to control the operations and capabilities of their boundary nodes, for example API-only boundary nodes, filtering, etc.
  7. Resiliency - related to the scalability and decentralization goals above, improve the DoS protection mechanisms for the IC.
  8. Discovery and Steering - Provide distributed discovery and steering to boundary nodes and different node providers and make it easier for the community to provide such services themselves.
  9. HTTP and query API caching - Improve caching on the boundary nodes so that they are compliant with caching standards.
  10. Semantic caching - use read-only canister state to serve queries directly from the boundary nodes.
  11. Monitoring - enhance monitoring of boundary nodes using, for example, probing, and improved metrics. Provide metrics access to the community e.g. for use in node provider remuneration.
  12. Boundary Node Economy - provide remuneration for running boundary nodes and charging canisters for support services (e.g., serving cached results).
  13. Compliance with local laws - as recently discussed by the community, the boundary node providers may be liable, by local laws, for content served through their nodes. Our intent is to research mechanisms, review community suggestions, and propose to the community possible mechanisms that would empower boundary node providers to restrict content served through their nodes, such that they remain compliant with local laws. As a consequence of the decentralization goal, the content might still be accessible from other jurisdictions.
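
To illustrate what the caching item above (item 9) could mean in practice, here is a minimal sketch of a cache that honors a small subset of the HTTP `Cache-Control` header. It is not the boundary node implementation, and it covers only `max-age`, `no-store`, `no-cache`, and `private`; the class name `TtlCache` and its interface are assumptions made for this example.

```python
import re
import time

_MAX_AGE = re.compile(r"(?<![\w-])max-age=(\d+)")


class TtlCache:
    """Tiny response cache driven by the Cache-Control max-age directive."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, body)

    def store(self, key, body, cache_control):
        """Cache `body` if the header allows it; return True if stored."""
        directives = cache_control.lower()
        # Be conservative: never cache responses marked as uncacheable.
        if any(d in directives for d in ("no-store", "no-cache", "private")):
            return False
        match = _MAX_AGE.search(directives)
        if not match or int(match.group(1)) == 0:
            return False
        self._entries[key] = (self._clock() + int(match.group(1)), body)
        return True

    def get(self, key):
        """Return the cached body, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return body
```

A standards-compliant cache would also need to handle revalidation, `Vary`, `s-maxage`, and the other rules of the HTTP caching specification, which this sketch deliberately leaves out.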

5. Key milestones

The following milestones are indicative and may not be reached in the order listed here.

  • M1: Provide a public and open source process for building a boundary node VM deterministically.
  • M2: Enable additional boundary nodes on different domains and include support for them in the CDKs/agent code.
  • M3: Have additional boundary nodes either API-only and/or on different domains.
  • M4: Introduce an economic model for boundary nodes based on additional monitoring.
  • M5: Enable the boundary nodes to be deployed and updated via NNS proposals.
  • M6: Increase the number of NNS controlled and remunerated boundary nodes and node providers.
  • M7: Improved Scalability, Resilience and Standards Compliant Caching
  • M8: Trusted Execution for improved security
  • M9: Distributed Discovery and Steering
  • M10: Semantic Caching

6. People involved

Discussion leads: Yotam Harchol, John Plevyak, Björn Tackmann, Rüdiger Kapitza

7. Why the DFINITY Foundation should make this a long-running R&D project

Boundary nodes are necessary for the Internet Computer, to provide transparent access for web users, as well as to secure the IC. Boundary nodes are part of the IC, and therefore should be as secure and as decentralized as possible. Therefore, the DFINITY Foundation is committed to researching and designing the next generation of boundary nodes including the above-mentioned areas for the benefit of the IC as a whole.

8. Skills and Expertise necessary to accomplish this

The problems described above require the cooperation of networking experts with security and cryptography experts, to design, review, and implement the prospective solutions, as well as to provide detailed security reviews and proofs. Specifically, experts from the following fields are necessary:

  • Network systems
  • Network management
  • Network security
  • Systems security
  • Secure hardware
  • Cryptography
  • Distributed systems
  • Economics

This project would require both researchers and software engineers with expertise in the above-mentioned fields.

9. Open research questions

  • Efficiently load balance subnet nodes at the boundary nodes, without introducing high bandwidth and computation overheads
  • Fully decentralized discovery and routing for unmodified standard Web2 devices and users
  • Semantic caching - execute query calls directly on the boundary nodes, using read-only replicated state and possibly different consistency guarantees
  • Boundary node economy - develop an economic model for running the boundary nodes edge network
  • Compliance with local laws based on the location of boundary nodes
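
For the first question above, one well-known low-overhead technique is the "power of two choices": sample two nodes at random and forward to the less loaded one. The sketch below only illustrates that general technique; it is not something the proposal commits to. The `in_flight` counters are a hypothetical load signal that a boundary node could track locally.

```python
import random


def pick_node(in_flight, rng=random):
    """Sample two distinct nodes and return the one with fewer in-flight requests.

    `in_flight` maps node ID -> number of requests currently outstanding.
    """
    if not in_flight:
        raise ValueError("no nodes available")
    if len(in_flight) == 1:
        return next(iter(in_flight))
    a, b = rng.sample(list(in_flight), 2)
    return a if in_flight[a] <= in_flight[b] else b
```

The selection stays randomized, so requests are still spread over different replicas, and the boundary node only needs to compare two local counters per request.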

10. Examples where the community can integrate into the project

As boundary nodes are an important piece of the IC infrastructure, we expect high community interest in this proposal. We invite the community to join the engineers and researchers of DFINITY in the discussion of this topic. We welcome any ideas for the topics above, as well as any critical assessment. We plan to keep the community posted on this topic on a regular basis.

11. What we are asking the community

Please review this proposal and provide us with any feedback you have regarding the boundary nodes. Please also review the other related proposals on trusted execution environments and decentralized DNS and CA. We invite you to engage in the discussion and hope it will be fruitful and useful for the IC community and for the future edge framework of the IC.

This is a massive undertaking, but absolutely critical.

Decentralizing boundary nodes, bringing them under the control of NNS, creating economic remuneration schemes, etc… it spans the whole stack.

The more I think about it, boundary nodes are a high-risk point of failure for the IC. Even though replicas go through consensus, boundary nodes don’t. What if a malicious party runs a boundary node and modifies canister responses to make it seem like they reached consensus on something they in fact did not?

Is this where a deterministic, verified build process for boundary nodes comes into play? (Now that I think about it, this problem might have already been solved… how does DFINITY ensure that node providers are actually running the correct, unmodified replica software?)

Hmm… doesn’t that kind of defeat the purpose of DNS steering, if a us-west1 boundary node has to communicate with an asia-east2 replica? That still seems high latency.

@jzxchiang You are correct, but please take into account the following two issues:

  1. We would like the IC (and its clients) to be resilient to malicious nodes and node providers.
  2. The boundary node provides caching (and will provide more enhanced caching in the future as you can see in the proposal), so the DNS steering purpose is not completely defeated.

@jzxchiang I missed your first question, about the VM build and malicious boundary nodes. This is covered more thoroughly in the Decentralized CA and DNS and TEE Enhanced IC proposals, but yes – this is definitely something we think about and would like the community to discuss and drive forward.
