(I am going to get on my soapbox a bit and defend Dieter and some of my fellow colleagues working hard on this feature)
Let’s give the folks working on this project some slack here…
To be fair, enabling HTTP requests from smart contracts (in a secure way) is a feature that has eluded every blockchain. This team has taken a few months to do what no other chain has in years.
Did they underestimate the work necessary? Yes, but that is not surprising: they are literally inventing things, so they cannot know what they will find as they build (not to mention the litany of security reviews needed to make sure it is bulletproof).
I second this opinion from @diegop… copying the gist of something I picked up elsewhere, in a completely different context (the BTC integration):
"…They are being super cautious, even if the math adds up. That is to say, we know we can go to Mars, but we really don't need a disaster when we land on Mars.
… This is …literally… a mission-to-Mars project. It hasn't been done before. Paving the road and learning along the way.
We have one shot, and we had better not screw it up. On the positive side, this is the world's best team to handle these cryptographic needs…"
We still need to conclude the following tasks before a release on the IC:
Making the infrastructure changes mentioned further above. Parts of this have been implemented already; parts are still to be done.
Reviewing and merging the code for a major performance improvement resulting from changing the signature scheme used to sign artifacts in the consensus layer. We wanted to get this out before a first release.
Writing and publishing the documentation for the feature.
To clarify, the abovementioned performance improvement is a major reduction in CPU overhead, achieved by switching to a different signature scheme.
So I’ve been working with HTTP requests in the local replica, and I have a question about cycle costs. It seems like you have to send a few hundred billion cycles with each http_request call; why is that?
const http_result: CanisterResult<HttpResponse> = yield ManagementCanister.http_request({
    url: ethereum_url,
    max_response_bytes: null,
    http_method: {
        POST: null
    },
    headers: [],
    body: utf8.fromString(
        JSON.stringify({
            jsonrpc: '2.0',
            method: 'eth_blockNumber',
            params: [],
            id: 83
        })
    ),
    transform_method_name: 'eth_block_number_transform'
}).with_cycles(300_000_000_000n); // TODO why is it asking for this many cycles?
Hi @lastmjs!
Each HTTP request currently costs a flat fee of 400M cycles, plus cycles per request byte and per response byte. For charging, the maximum response size is used as a parameter, and the default is 2 MB, which makes requests really costly. You must set max_response_bytes to a value in the range you actually expect in order not to be charged an enormous amount of cycles.
Hope that helps.
We are currently revising pricing, so the pricing for this feature may change. At the moment it is priced rather conservatively (i.e., expensively).
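To make the charging model concrete, here is a small TypeScript sketch of the cost formula. The per-byte prices are assumptions for illustration only (chosen so that the 2 MB default works out to roughly the 200B cycles mentioned elsewhere in this thread); check the official pricing documentation for the real values.

```typescript
// Hypothetical pricing constants for illustration; real prices may differ.
const BASE_FEE = 400_000_000n;                 // flat fee per HTTP request (cycles)
const REQUEST_BYTE_FEE = 100_000n;             // assumed cycles per request byte
const RESPONSE_BYTE_FEE = 100_000n;            // assumed cycles per response byte
const DEFAULT_MAX_RESPONSE_BYTES = 2_000_000n; // 2 MB default

function estimateHttpOutcallCycles(
    requestBytes: bigint,
    maxResponseBytes: bigint | null
): bigint {
    // Charging uses the declared maximum response size, not the actual size
    const responseBytes = maxResponseBytes ?? DEFAULT_MAX_RESPONSE_BYTES;
    return (
        BASE_FEE +
        requestBytes * REQUEST_BYTE_FEE +
        responseBytes * RESPONSE_BYTE_FEE
    );
}

// With the default 2 MB cap, the response portion alone is
// 2_000_000 * 100_000 = 200_000_000_000 cycles, which is why leaving
// max_response_bytes unset is so expensive.
```

Setting `max_response_bytes` to a realistic value is the single biggest lever on cost here.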
I see, thanks for the clarification. Another quick question: my transform method doesn’t seem to be necessary locally; requests return just fine even if the transform method doesn’t do anything. Is the local replica in dfx 0.11.0 working the way it will in production? I just don’t want to run into any nasty surprises.
Yes, correct observation.
The transform is there to ensure that, when there is more than one replica, their respective responses are made identical. For example, responses often contain timestamps or other items that change between responses. This only becomes an issue with more than one replica, because differing responses can then prevent consensus from being reached.
The behaviour between the dfx environment and an IC deployment can unfortunately vary a lot for this feature, exactly because on IC mainnet all replicas of the subnet make the request, and if the responses differ in some parts you need a proper transform function to make them identical so the response can go through consensus. There are some more pitfalls that I am currently writing up in the feature documentation to help folks avoid wasting time on problems we already know about.
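As an illustration of such a transform, here is a minimal TypeScript sketch that strips headers whose values typically vary between replicas' responses. The `HttpResponse` shape and the header list are assumptions for illustration, not the exact management canister types.

```typescript
// Assumed shapes for illustration; the real management canister types differ.
type HttpHeader = { name: string; value: string };
type HttpResponse = { status: bigint; headers: HttpHeader[]; body: Uint8Array };

// Headers that commonly change between otherwise identical responses.
const VARIABLE_HEADERS = new Set(['date', 'age', 'set-cookie', 'x-request-id']);

function stripVariableHeaders(raw: HttpResponse): HttpResponse {
    return {
        status: raw.status,
        // Drop the headers that would make the replicas' responses differ,
        // so every replica ends up with an identical response.
        headers: raw.headers.filter(
            (h) => !VARIABLE_HEADERS.has(h.name.toLowerCase())
        ),
        body: raw.body
    };
}
```

A real transform would likely also normalize variable fields inside the body, not just the headers.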
Stay tuned, we are very close to finalizing this and releasing the documentation.
What are you working on if I may ask? Some form of Ethereum integration based on cloud nodes by any chance?
Exactly! I’m getting Azle ready for outgoing HTTP requests, and as part of that I’m writing some Ethereum examples that pull data from (and hopefully write data to, using POST requests) Ethereum through a Web2 service.
So, I really think the local replica needs to simulate the HTTP consensus; otherwise it could be extremely difficult to figure out how to properly transform the data. It’s strange that this isn’t simulated, since my understanding is that the local replica simulates the consensus delay for update calls precisely so that developers aren’t surprised on mainnet. dfx 0.10+ also has a local cycles environment that more closely resembles production.
Are there plans to make the local replica work like production? I doubt that any of the code I’m writing now will work once I deploy it.
Currently, the single replica in the dfx environment will behave differently (not causing the problems one may run into on IC mainnet), and we currently have no plan to change this, as it would essentially mean a completely different architecture with multiple replicas in the dfx environment, or implementing this “simulation” by hand. What we are planning to do is provide documentation that covers the pitfalls we are aware of, either from theory or from having run into them while writing the sample dApp. This should already help folks a lot.
You definitely need to analyze the responses from the service you are querying for variable response fields, or simply extract the data items you are interested in and throw away the rest of the response. Pro tip: also look at the response headers, as they may contain timestamps.
So do I understand correctly that the best way to debug this right now locally is to create a transform function and just log the HttpResponse that is the parameter to that function, looking for non-determinism?
I would start by making the same request twice and diffing the two responses to find the variable parts, both in the body and in the headers. Then write the transform function based on the diff you observe. Then wait for the IC mainnet release and test it there. If you proceed like this and work thoroughly, you should get a solution that works immediately on mainnet, or close to it.
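The "request twice and diff" step can be sketched in plain TypeScript (the response shape here is an assumption for illustration): compare the two responses and report which header values differ and whether the body differs.

```typescript
type Header = { name: string; value: string };
type Response = { headers: Header[]; body: string };

// Return the header names whose values differ between two responses to the
// same request, plus whether the bodies differ. Headers present in only one
// of the two responses are not detected by this simple sketch.
function findVariableParts(
    a: Response,
    b: Response
): { variableHeaders: string[]; bodyDiffers: boolean } {
    const bValues = new Map(
        b.headers.map((h) => [h.name.toLowerCase(), h.value])
    );
    const variableHeaders = a.headers
        .filter((h) => bValues.get(h.name.toLowerCase()) !== h.value)
        .map((h) => h.name);
    return { variableHeaders, bodyDiffers: a.body !== b.body };
}
```

Any header name this reports, and any varying field in the body, is a candidate for removal or normalization in your transform function.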
Our own engineers ran into some problems in that area as well when writing example code, so you really do need to get used to it and know the pitfalls. You (and others) should have a good starting point with the information in my last few posts. But think of it this way: this is the first time in history that a smart contract can make HTTP requests to Web 2.0 services, and you are one of the first people implementing such a smart contract. We are very much operating at the forefront of technology here.
Let’s say I send a lot of cycles, much more than required based on the response size, will the IC refund me the cycles? Or if I send a lot of cycles will the system just take them all from me?
I have a lot of questions actually, I don’t want to spam this thread. Is this the best place to ask these? I think they will be useful to others as well.
The system refunds unused cycles; it only deducts what the request actually costs.
Yes, this is the best place; please ask them here. They will also be valuable for the documentation, as others will have the same questions. I will answer as time permits.
Yes, POST support is already implemented. Please note that all replicas will make the same POST call, so there must be some way to prevent it from being executed 13 times on the server (once per replica). The standard solution for this is idempotency keys.
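One way to do this, sketched in TypeScript: derive the key deterministically from the request content, so every replica sends the same `Idempotency-Key` header value and a server that honors that header executes the POST only once. The FNV-1a hash below is just for illustration; a production canister would use a stronger hash, and the header shape is an assumption.

```typescript
// Deterministic key derived from the request itself: every replica computes
// the same value, unlike a random or time-based key would.
function idempotencyKey(method: string, url: string, body: string): string {
    // FNV-1a 32-bit hash, for illustration only.
    let hash = 0x811c9dc5;
    const input = `${method}:${url}:${body}`;
    for (let i = 0; i < input.length; i++) {
        hash ^= input.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return (hash >>> 0).toString(16);
}

// Attach it as a header on the outgoing request (header shape assumed):
const headers = [
    {
        name: 'Idempotency-Key',
        value: idempotencyKey('POST', 'https://example.com/tx', '{"id":83}')
    }
];
```

The key point is determinism: anything derived from randomness or the local time would differ between replicas and defeat the purpose.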
I’m running into some confusing behavior here. If I set max_response_bytes to null and send 300_000_000_000 cycles with the HTTP request just to be safe, and my response ends up being only 200 bytes, will I be charged an outrageous amount, or will the system refund everything it didn’t need to use? In my local testing, with max_response_bytes set to null and 300_000_000_000 cycles sent, my response bodies are around 100 bytes, yet I’m being charged about 1T cycles for 6 HTTP requests.
I’m also a little confused about why we even have to set max_response_bytes and send cycles with the request. Can’t the IC just charge the canister for what it actually used? Why can’t we get rid of these two requirements?
If you do not specify max_response_bytes, the system uses the default of 2 MB. At 100k cycles per byte, this results in 2M × 100k = 200B cycles charged for the response, which is in line with what you are observing. Always set max_response_bytes so you are not charged for the maximum response size on every HTTP request.
The reason we have max_response_bytes is that it would be technically too involved to charge for what actually went over the wire in the responses (this information would also need to go through consensus). So we decided to introduce the max_response_bytes parameter and always charge for the maximum response size instead of the actual size. That is why it is important to set the parameter to a value close to the expected response size, so you are not overcharged.
The default way of charging is to send cycles along with the call; in principle it would also be possible to deduct directly from the canister's balance. We decided not to do that (I can't recall the exact reason, but I think it was consistency with the typical way of charging), so you have to send cycles along. You can always send the maximum and the system deducts the actual cost and returns the rest, so it should be convenient enough.