I just hit the instruction limit!

I just hit the instruction limit with dfx 0.20.1 and azle 0.20.1 (whoa, cool version coincidence).

This is a great example of what a major blocker the instruction limit can be: I am trying to use the motoko npm package, calling into its API. Overcoming the limit as an end-user developer might be practically impossible, since it would require improvements to ICP, Azle, or the motoko npm package itself.

Here’s the code:

import { Server } from "azle";
import express from "express";

export default Server(() => {
  const mo = require("motoko");

  const app = express();

  app.post("/compile", (req, res) => {
    // Write the Motoko source into the package's virtual file system
    mo.write(
      "/main.mo",
      `
          actor {
              public query func helloWorld(): async Text {
                  "Hello World!"
              };
          }
      `
    );

    // Compile to Wasm for the IC target (the expensive call)
    const result = mo.wasm("/main.mo", "ic");

    // Send as a string: Express treats a bare number as a status code (deprecated)
    res.send(result.wasm.length.toString());
  });

  return app.listen();
});

Calling the /compile endpoint returns, in part: Replica Error: reject code CanisterError, reject message Error from Canister bkyz2-fmaaa-aaaaa-qaaaq-cai: Canister exceeded the instruction limit for single message execution.

When executing the same code during post_upgrade, the instruction limit was not reached (the instruction limit for canister upgrades is much higher than the per-message limit).

Hello there, I am experiencing the same issue. I have reached the instruction limit, even though the code is not very complex.

Code explanation:

  1. This is an API that validates the initData from the Telegram web app to check whether the hash is legitimate.
  2. If the hash is legitimate, it is used as the “password”, hashed with bcrypt, and then updated in the database.
  3. The API responds to the user.

Does this mean the process of a few loops, two HMACs (secret and Telegram hash generation), querying the database, and then updating the data is too much?
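
For concreteness, the two HMACs in that check follow Telegram's documented scheme: first derive a secret key as HMAC-SHA256 over the bot token with the literal key "WebAppData", then HMAC-SHA256 the sorted data-check string with that derived key. A minimal sketch in Rust (the language the thread benchmarks against below), assuming the hmac, sha2, and hex crates; validate_init_data is a hypothetical helper name, not code from this thread:

use hmac::{Hmac, Mac};
use sha2::Sha256;

type HmacSha256 = Hmac<Sha256>;

// Sketch of the Telegram Web App initData check described above.
// `data_check_string` is the sorted "key=value" lines from initData,
// excluding the hash field itself.
fn validate_init_data(bot_token: &str, data_check_string: &str, hash_hex: &str) -> bool {
    // HMAC #1: derive the secret key from the bot token.
    let mut secret = HmacSha256::new_from_slice(b"WebAppData").expect("any key size works");
    secret.update(bot_token.as_bytes());
    let secret_key = secret.finalize().into_bytes();

    // HMAC #2: sign the data-check string with the derived key.
    let mut mac = HmacSha256::new_from_slice(&secret_key).expect("any key size works");
    mac.update(data_check_string.as_bytes());
    let computed = mac.finalize().into_bytes();

    hex::encode(computed) == hash_hex.to_lowercase()
}

Natively, two HMAC-SHA256 calls over short strings cost very little; the question in this thread is how much the interpreted JS path multiplies that cost.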

There’s clearly an issue if we’re hitting computation limits. Are these tools even necessary? ICP should be built with tools that don’t hit instruction limits.

Does this mean the process of a few loops, two HMACs (secret and Telegram hash generation), querying the database, and then updating the data is too much?

If you wrote the same code in Rust or Motoko, I think it would be 10x - 100x faster. (Edit: I originally wrote 100x - 1000x; the benchmarks below show 10x - 100x.)

The issue is that interpreting JavaScript currently has high performance overhead and AFAIK no one is looking into that.

My suspicion is the crypto operations in this case, especially since we’re using crypto-browserify under the hood.

Wasi-crypto would allow us to use a much more performant version of Node’s crypto module.

Are you thinking of this for general JS interpretation, or just for crypto operations?

At least the QuickJS benchmarks put it at about 35x less performant than JIT-compiled V8.

Are you thinking of this for general JS interpretation, or just for crypto operations?

I meant it for general JS interpretation, but crypto operations are probably at the top range.

Note that the numbers I wrote were based on my intuition from seeing a few examples. Maybe my impression was biased by the slow JS Candid implementation.

To get more concrete numbers, I did a small experiment: a program that computes sum((i % 100)^2) for all 0 <= i < 1M. The expression was chosen to make it difficult for the compiler to optimize it away.

Here are the results:

  • Rust: instructions: 13_761_694, sum: 3_283_500_000
  • Motoko: instructions: 92_001_970, sum: 3_283_500_000
  • Azle: instructions: 1_341_172_853, sum: 3_283_500_000

In this experiment Azle is 97x slower than Rust and 14x slower than Motoko.

I corrected 100x-1000x to 10x-100x in my post based on these results.

At least the QuickJS benchmarks put it at about 35x less performant than JIT-compiled V8.

Assuming that V8 is close to native, I wonder if there is a factor of ~3x that’s missing here? (Azle’s 97x over Rust divided by the 35x QuickJS-vs-V8 gap leaves roughly 3x unaccounted for.)

Attaching the source code of the programs in case anyone wants to double-check.

Rust:

#[ic_cdk::query]
fn bench() -> String {
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        sum += (i % 100) * (i % 100); 
    }
    format!("instructions: {}, sum: {}", ic_cdk::api::performance_counter(0), sum)
}

Motoko:

import IC "mo:base/ExperimentalInternetComputer";
import Nat64 "mo:base/Nat64";

actor {
  public query func bench() : async Text {
    var sum : Nat64 = 0;
    var i: Nat64 = 0;
    while (i < 1_000_000) {
      sum += (i % 100) * (i % 100);
      i += 1;
    };
    let instructions = IC.performanceCounter(0);
    return "instructions: " # Nat64.toText(instructions) # ", sum: " # Nat64.toText(sum);
  };
};

Azle (TypeScript):

import { IDL, query, instructionCounter } from 'azle';

export default class {
    @query([], IDL.Text)
    bench(): string {
        let sum = 0;
        for (let i = 0; i < 1_000_000; ++i) {
            sum += (i % 100) * (i % 100);
        }
        let instructions = instructionCounter(0);
        return `instructions: ${instructions}, sum: ${sum}`;
    }
}

Awesome, thanks for going through the effort! For anyone who is interested, Demergent Labs has some ideas for closing the performance gap over time, including potentially using SpiderMonkey or V8 in the future, looking into JIT in Wasm, and looking into compiling JS directly into Wasm.

Swapping QuickJS for SpiderMonkey is the most promising of those options, but it’s unknown how much better the performance will be.

Apparently JIT in Wasm is either very difficult or impossible (hoping it’s not impossible).

V8 is apparently so complicated that it would be hard to compile it to Wasm/WASI.

What’s very interesting is that this project, which compiles JS directly into Wasm, is showing surprisingly good performance results: GitHub - CanadaHonk/porffor: A from-scratch experimental AOT JS engine, written in JS

Hi Ulan, interesting numbers. Did you try Motoko with the *% and +%= operators, which avoid the overflow check?

Great to see those figures, thanks!

In my project, C–ATTS, I am running JS-based “recipes” in the Rust-based canister using Javy (QuickJS). But perhaps I should reconsider that choice. Slight tangent from the topic of this thread, but would it be difficult to run compiled Wasm binaries in the canister? In my case, the recipes could just as well be written in AssemblyScript and compiled before use.

Easy to run basic Wasm binaries with Wasmi: crates.io: Rust Package Registry

We have Wasmi integrated into Azle, not with full general host imports yet, but it works with basic binaries.

I’m not sure about the performance trade-off yet for running Wasmi inside Wasm, but I feel optimistic.

OK, just tried Wasmi. Set up a new dfx project and copy-pasted the example from the Wasmi docs. Worked great; the gzipped canister weighs in at 700 KB. Promising!

I would love to see some performance numbers if you can!

That gives a good speedup!

sum +%= (i % 100) *% (i % 100);

instructions: 57_001_970, sum: 3_283_500_000

@lastmjs @ulan

Rust-based Wasm run by Wasmi in the canister:

instructions: 205_812_831, sum: 3_283_500_000

  • 15x slower than Rust
  • 3.5x slower than Motoko
  • 6.5x faster than JS

I believe there might be ways to optimise the generated Wasm. I haven’t looked into that and don’t know if it applies to this kind of thing, but the Wasmi usage guide mentions it.

// Guest program: compiled to a standalone Wasm module and executed by Wasmi inside the canister.
#[no_mangle]
pub extern "C" fn run() -> i64 {
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        sum += (i % 100) * (i % 100);
    }
    sum
}
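
For anyone who wants to reproduce this, the host side inside the canister looks roughly like the following. This is a minimal sketch based on Wasmi's standard embedding API; run_guest is my own hypothetical wrapper, not the exact code behind the numbers above:

use wasmi::{Engine, Linker, Module, Store};

// Instantiate the guest module above and call its exported `run` function.
// `wasm_bytes` would be the compiled guest module embedded in the canister.
fn run_guest(wasm_bytes: &[u8]) -> Result<i64, Box<dyn std::error::Error>> {
    let engine = Engine::default();
    let module = Module::new(&engine, wasm_bytes)?;
    let mut store = Store::new(&engine, ());
    let linker = Linker::<()>::new(&engine);
    let instance = linker
        .instantiate(&mut store, &module)?
        .start(&mut store)?;
    let run = instance.get_typed_func::<(), i64>(&store, "run")?;
    Ok(run.call(&mut store, ())?)
}

Wrapping run_guest in a #[ic_cdk::query] method and reading ic_cdk::api::performance_counter(0) afterwards, as in the earlier Rust bench, gives the instruction count.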

I started a new thread for a related topic: the wasmtime execution environment used to run the canisters should expose functionality, accessible through ic_cdk, to let us run Wasm from Wasm, so we don’t have to run a Wasm runtime inside a Wasm runtime as in my benchmark test.
