Hi,
Yotam from DFINITY here.
We would like to give an update on the HTTPS outcalls feature and our efforts in allowing canisters to make such calls to destination hosts that only support IPv4.
First, let me explain the problem, or, why currently HTTPS outcalls only work if the destination server has an IPv6 address. The Internet Computer nodes only have IPv6 addresses. Each node has its own public IPv6 address, but no node, with the exception of boundary nodes, has a public IPv4 address. For this reason, nodes cannot directly communicate with IPv4 hosts on the Internet. The IC is available over IPv4 due to the fact that all user traffic is routed to it through the boundary nodes, which serve as proxies in this aspect.
The reason nodes have only IPv6 addresses is because the IC cannot scale well if we require IPv4 addresses for nodes. IPv4 addresses are scarce and expensive, and this will only get worse with time. Therefore, it was a design decision to have the IC as an IPv6-only network.
When a node tries to make an HTTPS outcall, it first resolves the IP address of the requested domain name (as part of the given URL) by making a DNS query. If the domain name does not resolve to one or more IPv6 addresses (that is, the DNS response does not contain any AAAA record), the request fails instantly.
However, we see a great value in the ability to make HTTPS outcalls to web 2.0 services that are only available over IPv4, and we are therefore actively looking for a good solution. Unfortunately, this is not an easy problem, and we would like to share our thoughts with the wider IC community so we all work together in finding a solution.
Part 1: Solution Space
Since assigning IPv4 addresses to all IC nodes is impractical, and since we want all IC nodes to eventually be able to make outcalls to IPv4 servers, we must use some sort of proxy mechanism to provide IPv4 connectivity from these IPv6 nodes.
The two main approaches in the industry are NAT64 and SOCKS. NAT64 usually is coupled with the use of DNS64, which resolves requests to IPv4 destinations to an IPv6 address of a NAT64 server, where the TCP connection is proxied over IPv4 to the real destination server. SOCKS is a lower layer protocol where any connection from one side is proxied to the other side, so if the SOCKS proxy is placed on a dual-stack machine (i.e., supports both IPv4 and IPv6), it can accept connections over IPv6 and initiate proxied connections over IPv4.
The problem with using proxies is that they require some additional trust. In our case, HTTPS outcalls are end-to-end encrypted, so such proxies cannot look into the traffic or modify it, but they can certainly affect availability (e.g., by blocking requests, or delaying them).
Another problem with using proxies is that sharing a proxy may increase their vulnerability to availability attacks, for example by malicious canisters, or malicious nodes. We should therefore partition the use of proxies. We can reason about separating them among different canisters, different subnets, and possibly different node operators. However, this requires having many available proxies.
Part 2: Proposal
We propose to have proxies as part of the IC infrastructure. For example, the boundary nodes may serve as SOCKS proxies as they are already dual-stacked. A mapping from node to its proxy, based on its corresponding node operator and subnet type, will be maintained in the registry through proposals.
As discussed above, there are certain security and availability concerns to take into account when sharing a proxy. For this reason, the price for a call that may go through a shared proxy should be higher, and developers should be aware of the reduced security and availability guarantees when making such calls. For the same reason, I would like to explicitly note that we are still in the phase of analyzing this proposed solution, and we cannot guarantee that it will be eventually implemented.
Part 3: Pricing and Interface
It is clear that the price for a call that goes through a shared proxy must be higher than a direct call as it is done today to IPv6 destinations. The proxy is a bounded resource and a higher fee would incentivise users to use IPv6 when possible. Due to the way we reach consensus over HTTPS outcall responses, this means that we need to add a field in the request to specify whether the call should be made over IPv6 or whether a proxy should be used. Here again we have several options: either keep the default IPv6 only, and then the default pricing is as it is today, or make IPv4+IPv6 the default, making the default pricing more expensive, but enabling more outcalls by default.
Part 4: Discussion
Please share with us your thoughts on this matter. Specifically, we would like to understand better the following points:
- Do you think HTTPS outcalls should support IPv4 destinations?
- Do you have any suggestions or comments on the proposed solution?
- Do you have any preferences on the default pricing options?
- Do you see any other options or do you have any suggestions?
Thank you!