Enable canisters to make HTTP(S) requests

The overhead would not be the problem. The main problem with this approach would still be that there is no way to know whether the replica designated for sending out the request does the right thing, e.g., it could send a modified request or not send a request at all. A compromised replica could do anything it wishes.

The easiest approach for making a state-changing POST call would be to simply select one replica to make it, and then have all replicas make a full-quorum call to validate that the server’s state has changed accordingly, as outlined further above. This requires that this kind of verification is possible, but if it is, this is a viable way to do POSTs with those HTTP servers. Note, though, that the feature of only one replica sending a request is a future extension and will not be readily available. However, it seems to be something we should prioritize, as it can help solve quite a few use cases.

2 Likes

Firstly, this feature is a game-changer that I can’t wait to use.

To solve an immediate need, I hacked together a proof of concept/workaround that uses a “Web2 Bridge” canister to receive HTTP requests and a Node.js server that polls, processes, and returns the results. (I’ve only tested it on the local replica so far.) Sharing here in case anyone else is in the same situation.

Also, is there an IC equivalent to thread::sleep? I’m using an iterating counter that goes up to 2 billion as a delay hack and it feels very wrong indeed 🚩🚩🚩

2 Likes

Don’t query calls only read data from 1 replica? How is it verified for query calls that 1 replica returned the correct data?

Correct. If one wants a higher level of security or guarantee, one should use Update calls (since they go through consensus).

A common pattern devs have been using on front-ends is:

  1. Fire off a query call on the frontend for quick answer
  2. Simultaneously fire off an update call for higher security
  3. Use frontend code to reconcile if #1 and #2 are different
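The three steps above can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not real frontend code: `query_call`, `update_call` and `on_mismatch` are hypothetical callables standing in for whatever agent library the frontend actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def query_then_verify(query_call, update_call, on_mismatch):
    """Run a fast query and a consensus-backed update side by side.

    All three arguments are hypothetical placeholders:
    - query_call():  the quick, single-replica answer
    - update_call(): the slower answer that went through consensus
    - on_mismatch(fast, verified): reconciliation hook for step 3
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Steps 1 and 2: fire both calls at the same time.
        fast_future = pool.submit(query_call)
        verified_future = pool.submit(update_call)
        fast = fast_future.result()
        verified = verified_future.result()

    # Step 3: reconcile if the two answers differ.
    if fast != verified:
        on_mismatch(fast, verified)
    return fast, verified
```

A real frontend would typically render the query answer as soon as it arrives and only correct the display if the update call later disagrees.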

3 Likes

… or use certified variables to allow the client to check the response of the query call.

8 Likes

Summary

This is a draft/preview of a motion proposal for “HTTP Requests from Canisters” which we will submit tomorrow (February 23, 2022) for the community to vote on.

Once the proposal is live, I will update the forum.

Background

Canister smart contracts cannot make requests to HTTP/HTTPS services on the Internet by default. Doing so is a challenge on any blockchain because different replicas / nodes can (and will) receive different responses for the same call, for example because of timestamps or ids contained in the response. If each replica based its further processing on a different response, the state of the replicas would diverge, destroying the determinism property of the computation, with the result that no consensus could be achieved. Thus, replicas cannot just make HTTP/HTTPS requests to outside services without creating a major problem for the subnet.

However, it is technically possible to enable canisters on the IC to safely make HTTP/HTTPS (henceforth just “HTTP”) requests with an extension of the Internet Computer protocol. This proposal discusses such an extension to the Internet Computer protocol. We think that this is a crucial step towards better integrating the Internet Computer with the public Internet and thus breaking down the so-far inherent borders that blockchains face. We think that opening up the Internet Computer to interface with HTTP services on the Internet is a major step towards the future of the Internet Computer as a platform that can run general-purpose workloads, and also towards a more open, integrative blockchain world at large.

Today, obtaining data from the outside world requires, as on any blockchain, the use of blockchain oracles, or oracles. However, oracles lead to a more complicated programming model, charge substantial fees, add complexity and indirections, and require additional trust assumptions to be made. Allowing canister code to directly make HTTP requests would remove the dependence on oracles and their disadvantages. Of course, oracles may still be useful for certain use cases when we have HTTP calling capabilities, but many use cases could be covered with direct HTTP calls.

Allowing canisters to make HTTP calls to services on the public Internet has long been a community-requested feature for the Internet Computer. It will give smart contract canisters the ability to autonomously connect to services on the Internet to retrieve or submit data and will thus enable a large range of additional or enhanced use cases, e.g., obtaining exchange rate data from external servers for DeFi applications, obtaining weather data for decentralized insurance services, or sending notifications to users via traditional communications channels, all without using oracles. HTTP support for canisters is one of the features on our strategic R&D initiative on “General Integration” (see Long Term R&D: General Integration (Proposal) - #4 by dieter.sommer), thus we want to now proceed with launching a motion proposal for this feature and ask for community approval w.r.t. going forward. If accepted, a launch in the Q1 Chromium release is planned, meaning an aggressive timeline for our engineering teams.

Goals and Requirements

This feature should enable canisters to directly make requests to HTTP URLs, using the GET method initially, and receive the corresponding response back into the canister’s state in a deterministic fashion. The functionality should be realized in a direct, trustless, way. Direct, trustless, integration is a common theme in other integrations of the IC, e.g., the Bitcoin integration that is to launch on mainnet in the near future. Direct integration means that we do not need to make any additional trust assumptions or involve any additional parties to realize the functionality.

For the first version, all replicas in the subnet will send out the request and return a response that goes through the IC Consensus mechanism. In the future, we may present an option for canister developers to reduce the size of the required quorum, so that even only one replica may make the request, if desired, but the guarantees on such a request would be accordingly lower. Another envisioned future extension is POST requests. In combination with the reduced quorum, those can be of tremendous utility for many use cases where reliability of the calls is less important than compatibility with APIs out there.

Proposed Design

We next outline the proposed design at a high level.

System API (Management Canister)

We implement a system API in the Management Canister that provides a method for making an HTTP/HTTPS call to an outside service and receiving back a response. See below for the original proposal w.r.t. the API, and note that not all community feedback from the discussions in this forum topic has been included here yet.

type http_header = record { 0: text; 1: text };

type http_response = record {
  status: nat;
  headers: vec http_header;
  body: blob;
};


type http_request_error = variant {
  no_consensus;
  timeout;
  bad_tls;
  invalid_url;
  transform_error;
  dns_error;
  unreachable;
  conn_timeout;
};

New method in ic0:

  http_request : (record {
    url : text;
    method : variant { get };
    headers: vec http_header;
    body : opt blob;
    transform : opt variant {
      function: func (http_response) -> (http_response) query
    };
  }) -> (variant { Ok : http_response; Err: opt http_request_error });

See the (draft) PR on the interface specification repository, which is now public, for details regarding the proposed API and related discussions: IC-530:Canister HTTP requests by ielashi · Pull Request #7 · dfinity/interface-spec · GitHub.

High-Level Request and Response Flow

Calling this API will store the request in a specific area of replicated state that is periodically read by a component at the networking / consensus layer. This component, once it sees a new request, provides this request to an HTTP Adapter at the networking layer which performs the actual request and provides a response in return.

Responses are put into a new HTTP Artifact Pool, are signed by the replica to endorse the response, and the signature is gossiped to all replicas in the subnet. Once a request has support by at least 2/3 of the replicas of the subnet in the view of the current block making replica, it adds this endorsed response to an IC block that is going through Consensus. Because at least 2/3 of the replicas of the subnet have supported the response, it is ensured that the subnet can achieve consensus on it.

Once the IC block with the HTTP response has made it through the IC consensus layer, it is routed back to the system API and is provided back to the calling canister, which concludes the original API call for making an HTTP request.
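The endorsement step in this flow can be illustrated with a small sketch: a response is eligible for a block once at least 2/3 of the replicas have endorsed it. This is only an illustration of the threshold logic under simplifying assumptions. Signatures and gossip are left out, and the names `has_enough_support` and `pick_endorsed_response` are made up for this example rather than taken from the replica code.

```python
import hashlib
from collections import defaultdict


def has_enough_support(endorsements: int, subnet_size: int) -> bool:
    """True if at least 2/3 of the subnet endorsed (integer arithmetic, no rounding)."""
    return 3 * endorsements >= 2 * subnet_size


def pick_endorsed_response(shares, subnet_size):
    """Return the response bytes backed by >= 2/3 of the replicas, or None.

    `shares` is an iterable of (replica_id, response_bytes) pairs; in the real
    protocol each share would carry a signature, which this sketch omits.
    """
    supporters = defaultdict(set)  # response digest -> replica ids endorsing it
    bodies = {}                    # response digest -> response bytes
    for replica_id, response in shares:
        digest = hashlib.sha256(response).hexdigest()
        supporters[digest].add(replica_id)  # a replica counts once per response
        bodies[digest] = response

    for digest, replicas in supporters.items():
        if has_enough_support(len(replicas), subnet_size):
            return bodies[digest]
    return None  # no response has enough support (yet)
```

On a hypothetical 13-node subnet, 9 matching responses clear the threshold while 8 do not.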

We do not go into the details of the error handling. In short, possible error scenarios are that a request times out or that no consensus can be reached on the responses, in which case a corresponding error response is generated and returned in response to the request.

Handling Differences in Responses

Many HTTP-based services like API providers include fine-granular timestamps or unique ids in their responses, implying that it would not be possible to achieve consensus on the responses received by the different replicas in the subnet. This can be addressed by allowing the caller to specify a response processing function to be performed on the responses before they are provided to the consensus layer. This makes the feature applicable to a much broader class of responses, since more of them can reach consensus.

The canister may specify a response processing method that, when a response is received by each replica, is applied on the response on each replica to transform it accordingly into a response that is intended to be the same on each replica and thereby will be accepted by the IC consensus mechanism.
The transformation may, for example, only keep specific fields from the responses, while removing other values that might differ across responses, such as timestamps or unique identifiers. The transformation can also just retain a single value of interest from the whole response, e.g., an exchange rate value, which would substantially reduce the required IC “block bandwidth”.
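As an illustration of such a transformation, the sketch below keeps only a few fields of a JSON response and serializes them canonically, so that two replicas that received different timestamps or ids end up with identical bytes. It is written in Python purely to show the logic; in the proposed design the transformation would be a canister query method, and the field names (`base`, `quote`, `rate`, `timestamp`, `request_id`) are assumptions made up for this example.

```python
import json


def transform(raw_body: bytes, keep=("base", "quote", "rate")) -> bytes:
    """Reduce a JSON response body to the fields that should be identical on all replicas.

    Fields not listed in `keep` (e.g. timestamps, request ids) are dropped.
    """
    data = json.loads(raw_body)
    kept = {key: data[key] for key in keep if key in data}
    # Sorted keys and fixed separators give a byte-identical serialization
    # regardless of the field order in the original response.
    return json.dumps(kept, sort_keys=True, separators=(",", ":")).encode()


# Two replicas see the same rate but different timestamps and ids.
replica_a = b'{"base":"ICP","quote":"USD","rate":"6.21","timestamp":1645600000123,"request_id":"a1"}'
replica_b = b'{"request_id":"b2","timestamp":1645600000456,"rate":"6.21","quote":"USD","base":"ICP"}'
assert transform(replica_a) == transform(replica_b)
```

Keeping only the value of interest also shrinks the payload that has to go through consensus.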

The design choice to expose a canister method to perform the transformation and not do the transformation directly in replica code has multiple reasons behind it:

  • The computational effort for the transformation can be directly accounted for through consuming the canister’s cycles. Thereby certain kinds of denial of service attacks that would be possible and would need to be addressed for a replica implementation are not possible.
  • It is fully flexible in terms of which transformations can be implemented. Implementations in replica code would use a specific approach, e.g., a templating language, for defining the transformation.

A drawback of exposing a canister method for the transformation, compared to the alternative design we considered of allowing a set of transformation types parameterized by a template as input to the method call, is that it may be slightly more effort for the canister author to implement the canister method. However, the tradeoffs are substantially in favour of exposing a canister method, as with the other approach it would be difficult to charge for the transformation effort and to prevent DoS attacks using long-running transformations.

Roadmap

We plan, assuming a supportive community vote, to build the feature to be ready for a release around the end of Q1 / 2022. We propose a design and scope that is reasonable to implement for a first MVP as outlined in this motion proposal to trigger further discussions and support a community vote.
The implementation cuts through all the layers of the IC protocol stack and thus requires tight collaboration between the core IC engineering teams. Most engineering effort is expected on the consensus layer, next are networking, execution, and message routing in descending order of effort. In order to meet the tight timeline to a Q1 release, the different teams will work in parallel as far as reasonably possible.

In order to ensure the high quality of the feature, we will perform extensive automated testing in our system testing environment and a security review.

Extensions

The envisioned feature implementation is a first MVP that provides the core functionality of allowing canisters to make HTTP requests. We have already identified some enhancements that we have decided not to implement as part of the first release, but that we can realize as separate features in the future.

  • POST/PUT requests: Those would be pretty similar in terms of implementation assuming idempotency of the requests. Not assuming this inherently requires us to use a reduced quorum of size 1 to emulate traditional POST/PUT calls or to extend the called API such that it can handle multiple requests for the same POST/PUT to be done and execute them only once.
  • Customizable quorum (unsafe requests): This allows the canister to specify the quorum size to make a tradeoff between performance, resource consumption, and compatibility with traditional HTTP-based servers on the one hand and security on the other hand. The most relevant reduced quorum size in practice will be quorum size 1. This extension, together with the possibility of making POST/PUT requests will enable another large array of use cases without making changes to external services.
  • Persistent connections: This is an extension purely for better performance, and thus left as an extension instead of implementing it already as part of the MVP.
  • Different numerical response values: Some APIs will return slightly different response values if called at slightly different times, which is typically the case when all replicas make the same call to a service. A further extension to the feature can allow consensus to be reached on such differing numerical response values and return them in an appropriate form, e.g., their median, or all values, so that the calling canister can either directly receive the “actual” response value or apply an appropriate function to determine it.
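The last extension can be illustrated with a tiny sketch: given the slightly different numbers seen by the replicas, return either the median or all values. This only shows the arithmetic; how such values would actually be agreed on is not specified here, and the function name `aggregate` and the sample rates are made up for the example.

```python
from statistics import median


def aggregate(values, mode="median"):
    """Combine per-replica numeric responses.

    mode="median": one robust value; a minority of outliers cannot move it far.
    mode="all":    every value (sorted), so the canister can apply its own function.
    """
    if not values:
        raise ValueError("no responses to aggregate")
    if mode == "median":
        return median(values)
    if mode == "all":
        return sorted(values)
    raise ValueError(f"unknown mode: {mode}")


# Five replicas queried an exchange rate at slightly different times.
rates = [6.20, 6.21, 6.23, 6.21, 6.19]
assert aggregate(rates) == 6.21
```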

10 Likes

At first blush, this is a clean solution. I do have a question, though.

What would be the scope/namespace and capabilities of the pre-consensus hook function? That function is SUPER cool, but seems like it could be abused.

What kinds of abuse of the function are you envisioning? It’s a function that can only be executed as a query.

Update:
At first glance, I do not see any major means of abusing this. It essentially behaves like a pure function, so no state of the canister can be changed.
It should be similar in terms of abuse potential to any query call that a canister offers.

Proposal is live: Internet Computer Network Status

1 Like

I foresee the need to make http requests from query calls.

1 Like

Responses are put into a new HTTP Artifact Pool, are signed by the replica to endorse the response, and the signature is gossiped to all replicas in the subnet. Once a request has support by at least 2/3 of the replicas of the subnet in the view of the current block making replica, it adds this endorsed response to an IC block that is going through Consensus. Because at least 2/3 of the replicas of the subnet have supported the response, it is ensured that the subnet can achieve consensus on it.

Once a request has support by at least 2/3 of the replicas of the subnet in the view of the current block making replica, it adds this endorsed response to an IC block that is going through Consensus.

Did you mean “Once a response has support…”?

A couple of questions inspired by this:

  • Will this integration support HTTP/2?
  • What happens if the HTTP response is large, much larger than 2 MB?
  • Will this integration support compression, i.e. Content-Encoding?
  • Will this integration support streaming large responses via methods like HTTP/1.1 chunked transfer encoding?
  • Will this integration support range requests?

This is exciting stuff! Easy yes vote.

3 Likes

Do you have specific use cases in mind for HTTP requests from query calls?

Technically, it would be completely different from what we are implementing now. In many ways, one could realize HTTP requests for query calls simply by having the replica make the HTTP request without any Consensus involvement, as the replica is either honest or not, and depending on this the query result is trustworthy or less so.
The biggest conceptual problem I can see here is that query calls are synchronous and complete within fractions of a second, whereas making an HTTP request is an asynchronous operation with an entity in the outside world and might take a relatively long time. This mismatch might be large enough to make your request very hard to realize. Also, to make it clear, this is wild thinking in response to your request and currently not planned to be implemented.

Other opinions on HTTP requests for query calls?

2 Likes

Just as there is currently a great want for Inter-Canister Query Calls, there will be a similar desire for http requests from query calls. Both of these features presume a developer wants to request data from somewhere outside of their own canister on the IC or off of the IC, and do it quickly. There will be many uses for this, I can’t foresee them all.

Specifically for me I want to use a canister as a proxy for podcast downloads. Currently podcasts require a server proxy in many cases to download audio files to a web client, because the web clients have CORS restrictions that servers simply ignore. To create an entirely on-chain podcast ecosystem we’ll need canisters to act as proxies (or boundary nodes, but something needs to do this).

A web app served from the IC might want to perform an http get request to a canister that quickly aggregates information from two other canisters and three http endpoints in the legacy world.

If the IC is going to replace traditional cloud then it needs to allow what traditional cloud allows. Coming from Node.js, I feel very strongly that flexible http request functionality is essential to achieving this vision. And I can foresee many asking for this feature in the future.

3 Likes

Yes, indeed I meant to say “Once a response has support…” in the proposal text. Those things happen…
Nice to see that people read our texts so carefully! 🙂

1 Like

You may not get a response to this unrelated question in this topic. If you re-post it in a better-suited topic I am pretty sure you will quickly get help on this.

1 Like

Good points! We will discuss this internally. The main issue I can see, as already mentioned further above, is the unpredictable time the http requests take and the effects this has for the synchronous query calls. It might clash with the general architecture behind query calls.
Good points to discuss, however. Thanks for the inputs!

2 Likes

As a first MVP we plan to only support HTTP/1.0 or 1.1. It would be quite straightforward to add HTTP/2 support for simple requests.

There must be some size limit on the response, and for now we have set it to 2MB as it is the maximum size for payload in a block. We could allow bigger responses given that the transform function can then reduce them below 2MB, but we would still want to have some upper bound so the feature is not abused. For starters, we’ll just have it at 2MB.

For MVP we do not plan to support more than one response per request or content encoding. Decoding compression would be easy to implement at the adapter level so we might do that, but it could then violate size limits. Let me discuss that with the team.

I am not sure what you mean by “range requests”.

2 Likes

Thanks, this makes sense.

When this integration eventually supports query calls as @lastmjs suggested, I foresee a use case where a canister may want to stream and serve a large media file. Why? I’m not entirely sure at the moment, but I can see some wanting it.

By range requests, I mean the case where a server returns an Accept-Ranges header and a client requests portions of a large blob using the Range header.
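To make that concrete, here is a minimal sketch of the client side: splitting a blob of known size into the Range header values a client would send, one per chunk. Byte ranges in HTTP are inclusive on both ends. The helper name `range_headers` and the sizes used are made up for this example, and nothing here says the canister feature will support this.

```python
def range_headers(total_size: int, chunk_size: int):
    """Build one Range header per chunk for a blob of `total_size` bytes.

    HTTP byte ranges are inclusive, so the first 2 bytes are "bytes=0-1".
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size > 0")
    headers = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # inclusive last byte
        headers.append({"Range": f"bytes={start}-{end}"})
    return headers


# A 5-byte blob fetched in 2-byte chunks needs three requests.
assert [h["Range"] for h in range_headers(5, 2)] == ["bytes=0-1", "bytes=2-3", "bytes=4-4"]
```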

Totally see how this would be useful, but right now if you’re aggregating data it might be best to set up a cron-job to pre-fetch what you might need and have it ready on your canister(s).

If you’re setting up a download, what about setting up some sort of a WebRTC streaming solution? I haven’t done this myself, but was listening to the OpenChat episode of the Internet Computer Weekly podcast where Matt Grogan talks about using WebRTC to make the chat experience feel instant while processing the update calls simultaneously. It seems like a pretty sweet solution if they’re actually doing that and not just blowing smoke.

1 Like