HTTPS Outcalls to IPv4 Destinations - Update, and request for comments

Hi,

Yotam from DFINITY here.

We would like to give an update on the HTTPS outcalls feature and our efforts in allowing canisters to make such calls to destination hosts that only support IPv4.

First, let me explain the problem, i.e., why HTTPS outcalls currently only work if the destination server has an IPv6 address. Internet Computer nodes only have IPv6 addresses. Each node has its own public IPv6 address, but no node, with the exception of boundary nodes, has a public IPv4 address. For this reason, nodes cannot communicate directly with IPv4 hosts on the Internet. The IC is reachable over IPv4 only because all user traffic is routed to it through the boundary nodes, which act as proxies in this respect.

Nodes have only IPv6 addresses because the IC cannot scale well if we require an IPv4 address for every node. IPv4 addresses are scarce and expensive, and this will only get worse over time. It was therefore a design decision to make the IC an IPv6-only network.

When a node makes an HTTPS outcall, it first resolves the domain name in the given URL by issuing a DNS query. If the domain name does not resolve to one or more IPv6 addresses (that is, the DNS response contains no AAAA record), the request fails immediately.
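
As an illustration, here is a minimal sketch (not the actual replica code) of the check a node effectively performs, using a hypothetical helper: resolve the URL's host and fail early if no IPv6 address comes back.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

// Illustrative only: resolve `host` and keep the IPv6 results, mirroring the
// behaviour described above. The real replica uses its own resolver logic.
fn resolve_ipv6(host: &str, port: u16) -> std::io::Result<Vec<SocketAddr>> {
    let addrs: Vec<SocketAddr> = (host, port)
        .to_socket_addrs()?            // system resolver: returns both A and AAAA results
        .filter(|addr| addr.is_ipv6()) // keep only IPv6 (AAAA) addresses
        .collect();
    if addrs.is_empty() {
        // No AAAA record: this is the case in which the outcall fails today.
        return Err(std::io::Error::new(
            std::io::ErrorKind::NotFound,
            "destination has no IPv6 (AAAA) address",
        ));
    }
    Ok(addrs)
}
```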

However, we see great value in the ability to make HTTPS outcalls to web 2.0 services that are only available over IPv4, and we are therefore actively looking for a good solution. Unfortunately, this is not an easy problem, and we would like to share our thoughts with the wider IC community so we can all work together to find a solution.

Part 1: Solution Space

Since assigning IPv4 addresses to all IC nodes is impractical, and since we want all IC nodes to eventually be able to make outcalls to IPv4 servers, we must use some sort of proxy mechanism to provide IPv4 connectivity from these IPv6 nodes.

The two main approaches in the industry are NAT64 and SOCKS. NAT64 is usually coupled with DNS64, which synthesizes an IPv6 address for IPv4-only destinations; connections to that address reach a NAT64 gateway, which proxies the TCP connection over IPv4 to the real destination server. SOCKS is a generic proxying protocol: a connection accepted on one side is relayed to the other, so if the SOCKS proxy runs on a dual-stack machine (i.e., one that supports both IPv4 and IPv6), it can accept connections over IPv6 and open the proxied connections over IPv4.
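
For concreteness, here is a minimal sketch of a SOCKS5 CONNECT handshake (RFC 1928, no authentication) as a node could run it against a dual-stack proxy. The proxy address is a placeholder, the destination is passed as a domain name so the proxy resolves it (typically over IPv4), and error handling is reduced to the essentials.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

// Minimal SOCKS5 CONNECT (RFC 1928), no authentication. The caller would then
// run the TLS handshake with the destination over the returned stream, so the
// proxy never sees plaintext. `proxy` is a placeholder IPv6 address and port.
fn socks5_connect(proxy: &str, dest_host: &str, dest_port: u16) -> std::io::Result<TcpStream> {
    let mut stream = TcpStream::connect(proxy)?; // IPv6 connection to the proxy

    // Greeting: version 5, one auth method offered, method 0x00 = "no auth".
    stream.write_all(&[0x05, 0x01, 0x00])?;
    let mut reply = [0u8; 2];
    stream.read_exact(&mut reply)?;
    if reply != [0x05, 0x00] {
        return Err(std::io::Error::new(std::io::ErrorKind::Other, "auth method rejected"));
    }

    // CONNECT request: ATYP 0x03 = domain name; the proxy resolves it and
    // opens the outbound connection, typically over IPv4.
    let mut req = vec![0x05, 0x01, 0x00, 0x03, dest_host.len() as u8];
    req.extend_from_slice(dest_host.as_bytes());
    req.extend_from_slice(&dest_port.to_be_bytes());
    stream.write_all(&req)?;

    // Reply: VER, REP (0x00 = succeeded), RSV, ATYP, then the bound address.
    let mut head = [0u8; 4];
    stream.read_exact(&mut head)?;
    if head[1] != 0x00 {
        return Err(std::io::Error::new(std::io::ErrorKind::Other, "proxy refused connection"));
    }
    // Skip the bound address; its length depends on the ATYP byte.
    let skip = match head[3] {
        0x01 => 4 + 2,   // IPv4 address + port
        0x04 => 16 + 2,  // IPv6 address + port
        0x03 => {
            let mut len = [0u8; 1];
            stream.read_exact(&mut len)?;
            len[0] as usize + 2
        }
        _ => 0,
    };
    let mut buf = vec![0u8; skip];
    stream.read_exact(&mut buf)?;

    Ok(stream)
}
```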

The problem with using proxies is that they require additional trust. In our case, HTTPS outcalls are end-to-end encrypted, so such proxies cannot inspect or modify the traffic, but they can certainly affect availability (e.g., by blocking or delaying requests).

Another problem is that sharing a proxy may increase its exposure to availability attacks, for example by malicious canisters or malicious nodes. We should therefore partition the use of proxies, for example separating them among different canisters, different subnets, and possibly different node operators. However, this requires having many proxies available.

Part 2: Proposal

We propose to make proxies part of the IC infrastructure. For example, the boundary nodes could serve as SOCKS proxies, since they are already dual-stacked. A mapping from each node to its proxy, based on the node's operator and subnet type, would be maintained in the registry through proposals.
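
As a rough sketch of what such a registry entry could look like (all names and fields below are hypothetical and do not correspond to the actual registry schema):

```rust
// Hypothetical sketch of a registry record assigning a proxy to nodes,
// keyed by node operator and subnet type as described above. These types
// do not exist in the actual registry; they are purely illustrative.
enum SubnetType {
    Application,
    System,
}

struct Ipv4ProxyAssignment {
    node_operator_id: String, // the node operator this assignment applies to
    subnet_type: SubnetType,  // e.g. application vs. system subnets
    socks_proxy_addr: String, // IPv6 address and port of the assigned SOCKS proxy
}
```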

As discussed above, there are certain security and availability concerns to take into account when sharing a proxy. For this reason, the price of a call that may go through a shared proxy should be higher, and developers should be aware of the reduced security and availability guarantees when making such calls. For the same reason, I would like to explicitly note that we are still analyzing this proposed solution and cannot guarantee that it will eventually be implemented.

Part 3: Pricing and Interface

It is clear that the price of a call that goes through a shared proxy must be higher than that of a direct call as it is made today to IPv6 destinations. The proxy is a bounded resource, and a higher fee would incentivise users to use IPv6 when possible. Because of the way we reach consensus over HTTPS outcall responses, this means we need to add a field to the request specifying whether the call should be made over IPv6 only or whether a proxy may be used. Here again we have two options: either keep IPv6-only as the default, in which case the default pricing stays as it is today, or make IPv4+IPv6 the default, which makes the default pricing more expensive but enables more outcalls out of the box.
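
To make the interface question concrete, a hypothetical extension of the outcall request could look like the sketch below. The field and type names are made up for illustration and do not reflect the actual management canister interface.

```rust
// Hypothetical: which destinations the canister is willing to pay for.
enum IpPreference {
    Ipv6Only,      // today's behaviour and pricing
    Ipv4Fallback,  // may route via a shared proxy; higher fee, weaker guarantees
}

// Sketch of an outcall request carrying the extra field discussed above.
// Field names are illustrative, not the actual management canister types.
struct HttpOutcallRequest {
    url: String,
    ip_preference: IpPreference,
    max_response_bytes: Option<u64>,
    // ... method, headers, body, transform as in today's interface
}
```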

Part 4: Discussion

Please share with us your thoughts on this matter. Specifically, we would like to understand better the following points:

  1. Do you think HTTPS outcalls should support IPv4 destinations?
  2. Do you have any suggestions or comments on the proposed solution?
  3. Do you have any preferences on the default pricing options?
  4. Do you see any other options or do you have any suggestions?

Thank you!

Bumping for visibility!

Hi @yotam Trade secret: if we need IPv4 services, we just use our own server to call and aggregate that data, and then serve that data to the IC via IPv6 outcalls. Why?

a. We can control the way the data looks, and prune it, before it gets replicated across a subnet.

b. If a chain of calls is needed and we only care about the final result… we just need an endpoint for that final result on the IC.

c. We can keep the server in our physical possession or in a controlled environment (bad things only happen if we let them). We cannot say the same about calling a third-party provider directly from the IC.

  1. Do you think HTTPS outcalls should support IPv4 destinations?

Would it be convenient? Yes. Many prominent and trusted services have IPv6 endpoints.
This is a bit like adding OAuth to Internet Identity because most people use Gmail login.

Future-proofing is my bias.

  2. Do you have any suggestions or comments on the proposed solution?

Your proxy idea seems correct.

  3. Do you have any preferences on the default pricing options?

Engineers will most likely do what we do if IPv6 is not available for an API provider: build their own aggregator and serve it via IPv6. Particularly if there is a price difference.

  4. Do you see any other options or do you have any suggestions?

Secure data enclaves have more business use cases to pull new orgs into the IC. I would like that to be high on the dev to-do list :wink:

Thank you for your response @apotheosis !
Using an intermediate server to aggregate requests and bridge IPv6->IPv4 is of course an option every canister developer has, but it requires additional trust in that server and its operator. Our goal is to enable direct communication between the IC and Web2 servers, without any additional intermediate trust. I understand that for some use cases this is less important, and such solutions can work (and I’m really happy about that).

Regarding secure enclaves, I can assure you that other teams at DFINITY are (very) actively working on that.

If possible (no idea if it is), it may be good to make the default the cheapest possible option for each individual call, so that the call first attempts IPv6 and falls back to IPv4 if no AAAA record is found, minimising cost while maximising functionality by default. It may also be good to log a warning alerting the developer that the call used IPv4 and was more expensive, and suggesting finding an alternative IPv6 source for future calls.
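
A rough sketch of that suggested default, reusing the hypothetical resolve_ipv6 and socks5_connect helpers sketched earlier in the thread (the warning mechanism is likewise only illustrative): try IPv6 first and fall back to the proxied IPv4 path only when no AAAA record exists.

```rust
use std::net::TcpStream;

// Illustrative only: the fallback behaviour suggested above, expressed as a
// node-side decision. `resolve_ipv6` and `socks5_connect` refer to the
// sketches earlier in the thread; `proxy` is a placeholder proxy address.
fn connect_with_fallback(host: &str, port: u16, proxy: &str) -> std::io::Result<TcpStream> {
    match resolve_ipv6(host, port) {
        // Cheap, direct IPv6 path: connect to one of the resolved addresses.
        Ok(addrs) => TcpStream::connect(&addrs[..]),
        // No AAAA record: warn the developer and take the pricier proxied path.
        Err(_) => {
            eprintln!("warning: {host} has no AAAA record; using IPv4 proxy (higher cost)");
            socks5_connect(proxy, host, port)
        }
    }
}
```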

Another way could be to form an IPv4 subnet (or more than one?) which costs a little bit more than the others.
People could split their canisters, put the IPv4-related logic there, and access the data with cross-subnet calls. The capacity and performance of the subnet could be low, because canisters would be idle most of the time, i.e., waiting either for the request or for the response.
Over time, as the transition to IPv6 continues, the demand for the IPv4 subnet would decrease.

Of course cross-subnet calls are slower and more expensive, but this further incentivises the use of IPv6 sites.

Hi @janosroden , I am sorry for my late reply. Indeed this is one direction we are looking at. But it also has its downsides. It means more heterogeneity across IC nodes, which we try to minimize in general. It also requires careful address assignment and inventory management, as nodes may move across subnets, subnets may split, etc. But indeed, we are looking at this option, among others.

Hi @yotam , I came across this thread as I was working on a proxy server for browser requests (which I thus learned can fail because of IPv4). Is there any potential workaround you could recommend for now? Thank you

Hi @patnorris , I am not sure what exactly this proxy server you are working on is. Is it off-chain? If so, can you assign it an IPv6 address? Otherwise, I cannot think of a solution for canisters to send HTTPS outcalls to IPv4-only servers.

Hi @yotam , thanks for your fast reply. The proxy server is actually a Motoko canister. So I might have to look into running an IPv4- and IPv6-compatible proxy server off-chain. I’ll see first, though, whether sticking to IPv6-compatible websites will be enough for now.

The transition to IPv6 is a really important aspect of protecting networks for the future, and it is interesting to learn about the design decisions made to keep the IC an IPv6-only network. IPv4 address scarcity and the associated costs are serious issues that many networks face, and buying proxies is also a pressing problem because it’s not easy to find something worthwhile. It is commendable that you are actively seeking a solution to enable HTTPS outcalls to IPv4 destinations, recognizing the importance of connectivity to web 2.0 services that do not yet support IPv6. Collaborative efforts within the IC community are key to finding innovative solutions to such complex problems.

Is there an update here? I was helping a team at a hackathon; HTTP outcalls were working great for them locally, but when they pushed to mainnet they were hit by this IPv4 issue, we think.

Two things: it would obviously be great not to hit this limitation at all, but it would also be great if the local replica reflected mainnet behaviour. It’s a bit surprising and maybe frustrating to have a working app locally that just breaks on mainnet with the same code.

Are you sure about this statement? I literally implemented HTTP outcalls last week and I recall hitting the IPv4 issue already using a local replica.

I am not sure, actually, since it wasn’t my project and I was hearing things second-hand; this just seemed to be the case.

I see. I’m not saying the message was easy to understand at all: the error message didn’t explicitly mention something like “IPv4 is not supported”, it looked more like a generic networking issue, which even led me to think the problem was in my Docker image. That’s why I recall encountering it locally. A pity I didn’t open a post about it at the time.