Availability
- WebAssembly: (import "ic0" "performance_counter" (func $ic0_performance_counter (param i32) (result i64))), available on all the subnets of the Internet Computer.
- Motoko: ExperimentalInternetComputer.countInstructions(comp), available in the latest Motoko and Motoko Playground. Note: as the Performance Counter doesn’t support measuring computations with inter-canister calls, the Motoko library is, for now, designed to reject async functions to prevent misuse.
- Rust: ic_cdk::api::call::performance_counter(counter_type: u32) -> u64, available in the latest Rust CDK.
- The Internet Computer SDK (i.e. dfx): available in the latest beta.
What Is It?
Canisters are compiled to WebAssembly, so executing a Canister method in fact means executing WebAssembly instructions.
The Performance Counter is a new System API call that lets a Canister ask the Internet Computer how many instructions it has executed so far in the current message.
The counter is reset between messages. Note that if you use async calls, each async point corresponds to a new message.
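To make the idea concrete, here is a minimal off-chain sketch of measuring a piece of code by reading the counter before and after it. The helper `count_instructions` and the stand-in counter are hypothetical and exist only so the example can run outside a Canister; inside a Canister the reader closure would presumably wrap the Rust CDK call listed under Availability, with counter type 0 assumed.

```rust
use std::cell::Cell;

/// Returns how many instructions `comp` consumed, as seen through `read_counter`.
/// `read_counter` is a stand-in for the real System API call (an assumption of
/// this sketch), so the helper can be exercised off-chain.
fn count_instructions(read_counter: impl Fn() -> u64, comp: impl FnOnce()) -> u64 {
    let before = read_counter();
    comp();
    read_counter() - before
}

fn main() {
    // Fake counter: the "computation" below pretends to execute 34 instructions.
    let executed = Cell::new(0u64);
    let cost = count_instructions(
        || executed.get(),
        || executed.set(executed.get() + 34),
    );
    println!("{}", cost);
}
```

Both readings have to happen within the same message, since the counter starts again from zero in the next one.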
Why Do We Need It?
The Performance Counter is a way to know, at runtime, the complexity and hence the cost of a piece of code.
We can use this information to profile our Canisters, benchmark our libraries, or even to dynamically charge other Canisters based on the exact work done.
Are There Any Alternatives?
There are many profilers, flame graphs etc. The key differences are:
- The Performance Counter is an exact number of WebAssembly Instructions, not an estimation.
- The information is available at runtime, during the execution, so the Canister can make decisions based on it.
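As an illustration of the second point, the sketch below stops working through a batch once an instruction budget for the current message is used up. It is only a sketch under stated assumptions: `process_within_budget`, the budget of 35 and the cost of 10 per item are all made up for the example, and the counter is again a stand-in closure rather than the real System API call.

```rust
use std::cell::Cell;

/// Processes items until the instructions used since the start reach `budget`.
/// Returns how many items were processed. `read_counter` is a stand-in for the
/// real performance counter (hypothetical, so the sketch runs off-chain).
fn process_within_budget<T>(
    items: &[T],
    budget: u64,
    read_counter: impl Fn() -> u64,
    mut process: impl FnMut(&T),
) -> usize {
    let start = read_counter();
    let mut done = 0;
    for item in items {
        // Runtime decision: do not start another item once the budget is spent.
        if read_counter() - start >= budget {
            break;
        }
        process(item);
        done += 1;
    }
    done
}

fn main() {
    // Fake counter: every processed item "costs" 10 instructions.
    let executed = Cell::new(0u64);
    let items = [1, 2, 3, 4, 5, 6];
    let done = process_within_budget(&items, 35, || executed.get(), |_| {
        executed.set(executed.get() + 10)
    });
    println!("processed {} of {} items", done, items.len());
}
```

The check runs before each item, so the last item started may overshoot the budget; a real Canister would likely keep some reserve.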
Demo
Here is a small demo on Motoko Playground:
Fun fact: as I’m a backend engineer, it’s probably my first Motoko program.
At a high level, it’s a slightly modified Counter Motoko tutorial, but with two stable counters.
Click Deploy, and then tick Enable Profiling to enable the Flame Graphs.
incCounter()
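// Inside the actor; assumes: import EIC "mo:base/ExperimentalInternetComputer";
stable var counter : Nat = 0;
public func incCounter() : async Nat64 {
  // Returns the number of instructions consumed by the lambda.
  EIC.countInstructions(func() { counter += 1 });
};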
The first public function, incCounter(), just increments a stable counter. Note that the actual work is wrapped inside countInstructions(), which returns the number of instructions consumed by the lambda inside.
In the Candid UI on the left, call incCounter.
It returns 638 WebAssembly instructions. Why so many?
First of all, the ic0.performance_counter() System API call itself costs 200 instructions. That’s the overhead of exiting the WASM execution and providing the information to the Canister.
Next, there is the profiling needed to build the Flame Graph. If we deploy the Demo with no profiling enabled, it takes just 234 instructions, out of which 200 instructions are the fixed overhead of the Performance Counter itself. Profiling is expensive, but invaluable!
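The arithmetic behind those numbers can be spelled out in a few lines. This is just a worked example using the figures quoted above; the helper name `without_overhead` is made up for the illustration.

```rust
/// Instructions left after removing a known fixed overhead from a measurement.
fn without_overhead(measured: u64, overhead: u64) -> u64 {
    measured.saturating_sub(overhead)
}

fn main() {
    // Figures reported in the demo.
    let with_profiling = 638;
    let without_profiling = 234;
    let counter_overhead = 200;

    // Instructions spent on `counter += 1` itself: 234 - 200.
    println!("increment: {}", without_overhead(without_profiling, counter_overhead));
    // Extra instructions added by the profiling instrumentation: 638 - 234.
    println!("profiling: {}", without_overhead(with_profiling, without_profiling));
}
```

So the increment itself accounts for 34 instructions, and profiling adds another 404 on top.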
Let’s compare the results with the Flame Graph. Hover the mouse over incCounter and it shows 929 WebAssembly instructions estimated by the profiler. The Flame Graph includes many other things, like Candid argument deserialization and function result serialization, so it’s expected to be more than the pure countInstructions() call.
Those are still comparable results. So far so good!
incBigCounter()
stable var bigCounter : Nat = 12345678901234567890;
public func incBigCounter() : async Nat64 {
  EIC.countInstructions(func() { bigCounter += 1 });
};
Let’s try the second counter. As you can see, everything is exactly the same, but the initial value of the counter is huge.
Click Call in the Candid UI to call incBigCounter().
Previously it took 638 WebAssembly instructions, and now it’s 5330.
Flame graphs to the rescue! We see that Motoko has started to use the BigInt library underneath, as the initial value was too big to represent in normal form.
That’s a super simple, yet great example of why we need to profile our Canisters with real data structure sizes. The more data flows in, the more instructions it might take to process the requests, even at such a basic level as a single counter.
Comparing with the Flame Graph estimations, we now start to see a difference. The Flame Graph estimates the call at ~2K instructions, while in fact it took ~5K. But it will be super clear in the last example.
readStableMemory()
public func readStableMemory() : async Nat64 {
  EIC.countInstructions(func() {
    // Grow the stable memory by one 64 KiB page...
    let o = StableMemory.grow(1);
    // ...and read that page back as a binary blob.
    let b = StableMemory.loadBlob(0, 65536);
  });
};
In readStableMemory() we just grow the stable memory and read a binary blob from it.
Let’s call it…
Ok, so it took ~66K WebAssembly instructions, but since most of the work was done on the backend, at the System API level, the flame graph estimation shows just ~1K.
And that is why we need the Performance Counter! It’s precisely what the Canister will be charged for: it includes all the work done in WebAssembly and at the System API level.
Links
- The Internet Computer Interface Specification: The Internet Computer Interface Specification | Internet Computer Home
- Canister Cycles vs Performance Counter Wiki (or why async functions are tricky): Comparing Canister Cycles vs Performance Counter - Internet Computer Wiki
- This Demo on Motoko Playground: Motoko Playground - DFINITY
Next Iterations
The Performance Counter interface is extensible, so a few new Counters might be added. At the moment we’re considering adding:
- A counter which returns Cycles, not WebAssembly instructions.
- A stable counter which keeps growing across async calls/responses.
It was a small feature, but it took many changes across components and teams.
Thanks @roman-kashitsyn @ulan @claudio @chenyan @lwshang and everyone in Execution, Languages, Runtime, SDK teams!