Divided message into chunks, yet still receiving "Canister exceeded the limit" after a few rounds

I divided a long run into chunks, and the first few calls go through fine until a call down the line breaks with a “Canister exceeded the limit” error.
Steps to reproduce are as follows:

  1. clear && icpp build-wasm --config=icpp_libraries.toml && wasm2wat build/SimSoccerServer.wasm --output=build/SimSoccerServer.wasm.wat
  2. Open SimSoccerServer.wasm.wat in Notepad++
  3. Mark all lines that start with export
  4. Find start_match & play_match and remove the bookmarks from their export lines
  5. Search > Bookmark > Remove Bookmarked Lines, then save
  6. wat2wasm build/SimSoccerServer.wasm.wat --output=build/SimSoccerServer_noexports.wasm && …/binaryen/bin/wasm-opt build/SimSoccerServer_noexports.wasm -o build/SimSoccerServer.wasm -Oz --enable-bulk-memory && dfx deploy
  7. dfx canister --network local call SimSoccerServer start_match
  8. dfx canister --network local call SimSoccerServer play_match '(1674211940 : nat64, 2 : nat64)'
  9. Repeat step 8 until the error is received.

Repo Link:

The full error says something like “Canister exceeded the limit of 42949672960 instructions for single message execution”, right? Normally I’d suggest using canbench to see where your canister is spending the most instructions - is there any support for something similar in icpp (@icpp)?


Here’s a sample run. I’ve set up the simulation so that it runs for the number of seconds I send in the play_match call (as seen in the screenshot, the second parameter is set to 2 seconds).
When I make the call a few times it runs fine, until it breaks at some point. Does the instruction limit accumulate across several calls, or does it only count a single call?
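For reference, here is a minimal sketch of what such a chunked play_match entry point could look like with icpp-pro’s IC_API / CandidArgs pattern. The state variable and the per-second step are illustrative placeholders, not the actual SimSoccerServer code; only the two nat64 arguments mirror the dfx call above.

```cpp
#include <cstdint>
#include <string>

#include "ic_api.h"

// Ordinary globals survive between calls via Orthogonal Persistence.
static uint64_t g_match_clock = 0;  // simulated seconds already played (placeholder)

void play_match() {
  IC_API ic_api(CanisterUpdate{std::string(__func__)}, false);

  uint64_t seed = 0;     // first nat64 argument (e.g. the 1674211940 above)
  uint64_t seconds = 0;  // second nat64 argument: seconds to simulate in this chunk
  CandidArgs args_in;
  args_in.append(CandidTypeNat64{&seed});
  args_in.append(CandidTypeNat64{&seconds});
  ic_api.from_wire(args_in);

  // Each call only advances the simulation by `seconds`, so a single message
  // stays below the instruction limit as long as one simulated second has a
  // bounded cost.
  for (uint64_t s = 0; s < seconds; ++s) {
    // simulate_one_second(seed);  // placeholder for the real per-second step
    ++g_match_clock;
  }

  ic_api.to_wire(CandidTypeNat64{g_match_clock});
}
```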

The count does not accumulate across calls, so I suspect the calculations really are different from one call to the next.

Perhaps some data in Orthogonally Persisted memory is updated or corrupted, resulting in a different calculation path? Or an infinite loop?

There are no great tools, so adding print statements is your best bet…
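In case it is useful, here is one self-contained way to get print statements out of the wasm, assuming you don’t already have a wrapper for it: the ic0.debug_print system call writes to the replica log that dfx prints to the console. The ic0_debug_print declaration and the debug_log helper below are my own sketch; icpp may well ship an equivalent already.

```cpp
#include <cstdint>
#include <string>

// Hand-rolled import of the ic0.debug_print system API
// (spec signature: (src : i32, size : i32) -> ()).
extern "C" __attribute__((import_module("ic0"), import_name("debug_print")))
void ic0_debug_print(uint32_t src, uint32_t size);

// Print a message to the local replica console (visible where dfx start runs).
static void debug_log(const std::string &msg) {
  ic0_debug_print(static_cast<uint32_t>(reinterpret_cast<uintptr_t>(msg.data())),
                  static_cast<uint32_t>(msg.size()));
}
```

With that, sprinkling calls like debug_log("tick " + std::to_string(g_match_clock)) through the simulation loop narrows down where the calls start to diverge.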

Yep, this is correct. Is there anything changing in the state that would cause longer runs to use more instructions?

You can try using the performance_counter API directly to see how many instructions you’re using. It could be that there’s slight variation between the calls and most are just under the limit, but some go just over.
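If icpp doesn’t expose it yet, the raw system API is straightforward to call from C++; ic0.performance_counter(0) returns the instructions used so far in the current message execution. The import declaration and helper below are my own sketch, not an icpp API.

```cpp
#include <cstdint>

// Hand-rolled import of the ic0.performance_counter system API
// (spec signature: (counter_type : i32) -> i64).
// counter_type 0 = instructions used so far in the current message execution.
extern "C" __attribute__((import_module("ic0"), import_name("performance_counter")))
uint64_t ic0_performance_counter(uint32_t counter_type);

// Call this at checkpoints (e.g. once per simulated second) and log the value
// to see how close each play_match chunk gets to the per-message limit.
static uint64_t instructions_used_so_far() {
  return ic0_performance_counter(0);
}
```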

@abk @icpp you were absolutely right. The simulation was running differently on chain: the agent was untrained, so it produced different results than a native run, which caused an endless loop 15 seconds into the simulation. Good news: the simulation with NN agents is running fine now, and next I’ll try training on chain.
Thanks a lot for the valuable input.
