HTTPS Outcalls to IPv4 Destinations - Update, and request for comments

Hi,

Yotam from DFINITY here.

We would like to give an update on the HTTPS outcalls feature and our efforts in allowing canisters to make such calls to destination hosts that only support IPv4.

First, let me explain the problem, or, why currently HTTPS outcalls only work if the destination server has an IPv6 address. The Internet Computer nodes only have IPv6 addresses. Each node has its own public IPv6 address, but no node, with the exception of boundary nodes, has a public IPv4 address. For this reason, nodes cannot directly communicate with IPv4 hosts on the Internet. The IC is available over IPv4 due to the fact that all user traffic is routed to it through the boundary nodes, which serve as proxies in this aspect.

The reason nodes have only IPv6 addresses is because the IC cannot scale well if we require IPv4 addresses for nodes. IPv4 addresses are scarce and expensive, and this will only get worse with time. Therefore, it was a design decision to have the IC as an IPv6-only network.

When a node tries to make an HTTPS outcall, it first resolves the IP address of the requested domain name (as part of the given URL) by making a DNS query. If the domain name does not resolve to one or more IPv6 addresses (that is, the DNS response does not contain any AAAA record), the request fails instantly.

However, we see a great value in the ability to make HTTPS outcalls to web 2.0 services that are only available over IPv4, and we are therefore actively looking for a good solution. Unfortunately, this is not an easy problem, and we would like to share our thoughts with the wider IC community so we all work together in finding a solution.

Part 1: Solution Space

Since assigning IPv4 addresses to all IC nodes is impractical, and since we want all IC nodes to eventually be able to make outcalls to IPv4 servers, we must use some sort of proxy mechanism to provide IPv4 connectivity from these IPv6 nodes.

The two main approaches in the industry are NAT64 and SOCKS. NAT64 usually is coupled with the use of DNS64, which resolves requests to IPv4 destinations to an IPv6 address of a NAT64 server, where the TCP connection is proxied over IPv4 to the real destination server. SOCKS is a lower layer protocol where any connection from one side is proxied to the other side, so if the SOCKS proxy is placed on a dual-stack machine (i.e., supports both IPv4 and IPv6), it can accept connections over IPv6 and initiate proxied connections over IPv4.

The problem with using proxies is that they require some additional trust. In our case, HTTPS outcalls are end-to-end encrypted, so such proxies cannot look into the traffic or modify it, but they can certainly affect availability (e.g., by blocking requests, or delaying them).

Another problem with using proxies is that sharing a proxy may increase their vulnerability to availability attacks, for example by malicious canisters, or malicious nodes. We should therefore partition the use of proxies. We can reason about separating them among different canisters, different subnets, and possibly different node operators. However, this requires having many available proxies.

Part 2: Proposal

We propose to have proxies as part of the IC infrastructure. For example, the boundary nodes may serve as SOCKS proxies as they are already dual-stacked. A mapping from node to its proxy, based on its corresponding node operator and subnet type, will be maintained in the registry through proposals.

As discussed above, there are certain security and availability concerns to take into account when sharing a proxy. For this reason, the price for a call that may go through a shared proxy should be higher, and developers should be aware of the reduced security and availability guarantees when making such calls. For the same reason, I would like to explicitly note that we are still in the phase of analyzing this proposed solution, and we cannot guarantee that it will be eventually implemented.

Part 3: Pricing and Interface

It is clear that the price for a call that goes through a shared proxy must be higher than a direct call as it is done today to IPv6 destinations. The proxy is a bounded resource and a higher fee would incentivise users to use IPv6 when possible. Due to the way we reach consensus over HTTPS outcall responses, this means that we need to add a field in the request to specify whether the call should be made over IPv6 or whether a proxy should be used. Here again we have several options: either keep the default IPv6 only, and then the default pricing is as it is today, or make IPv4+IPv6 the default, making the default pricing more expensive, but enabling more outcalls by default.

Part 4: Discussion

Please share with us your thoughts on this matter. Specifically, we would like to understand better the following points:

  1. Do you think HTTPS outcalls should support IPv4 destinations?
  2. Do you have any suggestions or comments on the proposed solution?
  3. Do you have any preferences on the default pricing options?
  4. Do you see any other options or do you have any suggestions?

Thank you!

11 Likes

Bumping for visibility!

2 Likes

Hi @yotam Trade secret - if we need IPv4 services we just use our own server to call and aggregate that data, and then serve that data to the IC via IPv6 Outcalls. Why?

a. We can control the way the data looks, prune it, before is gets replicated across a subnet.

b. If there are a chain of calls needed and we only care about the final result… we just need an endpoint for the final result in the IC.

c. We can have the server in our physical possession or a controlled environment. (only bad things happen if we let them) Cannot say the same about calling a third party provider directly from the IC.

  1. Do you think HTTPS outcalls should support IPv4 destinations?

Would it be convenient? yes. Many prominent and trusted services have IPv6 endpoints.
This is a bit like adding Oauth into Internet Identity because most people use Gmail login.

Future proof is my bias.

  1. Do you have any suggestions or comments on the proposed solution?

Your proxy idea seems correct.

  1. Do you have any preferences on the default pricing options?

Engineers will most likely do what we do if IPv6 is not available for a API provider. Build their own aggregator and serve it via IPv6. Particularly if there is a price difference.

  1. Do you see any other options or do you have any suggestions?

Secure data enclaves have more business use cases to pull new orgs into the IC. I would like that to be high on the dev to-do list :wink:

1 Like

Thank you for your response @apotheosis !
Using an intermediate server to aggregate requests and bridge IPv6->IPv4 is of course an option every canister developer has, but it requires some additional trust in that additional server and its operator. Our goal is to enable direct communication between the IC and Web2 servers, without any additional intermediate trust. I understand that for some use cases it is less important, and these such solutions can work (and I’m really happy for that).

Regarding secure enclaves, I can assure you that other teams at DFINITY are (very) actively working on that.

2 Likes

If possible (no idea if it is), it may be good to make the default the cheapest possible for each individual call, so that the call may attempt IPv6 and fallback to IPv4 if no record is found, hence minimising cost while maximising functionality by default. It may be good to also log a warning to alert the developer that the call used IPv4, was more expensive, and suggest finding an alternative IPv6 source for future calls.