Heap out of bounds, error code Some("IC0502") on C++ code run

The function is inlined, correct.

I added logging until I found that the program breaks at this specific part.

It would be great if dfx could be modified to raise the heap limit so we can verify whether that fixes the issue. Thanks a lot.

I’m saying sCollideShapeVsShape is actually not being inlined, and we can see that the out-of-bounds (OOB) access is not happening in it because it’s not in the backtrace. I also don’t think this is related to any dfx/IC limits, because I can reproduce the issue in pure wasmtime. Here’s how you can do that:

  1. Apply these changes to your jolt sample to avoid using IC APIs (except ic0.trap):
diff --git a/src/HelloWorld.cpp b/src/HelloWorld.cpp
index 58a5d8f..b8efbda 100644
--- a/src/HelloWorld.cpp
+++ b/src/HelloWorld.cpp
@@ -3,7 +3,7 @@
 // SPDX-License-Identifier: MIT
 
 #include "HelloWorld.h"
-#include "ic_api.h"
+// #include "ic_api.h"
 
 // The Jolt headers don't include Jolt.h. Always include Jolt.h before including any other Jolt header.
 // You can use Jolt.h in your precompiled header to speed up compilation.
@@ -219,10 +219,10 @@ public:
 // Program entry point
 void hello()
 {
-       IC_API ic_api(CanisterQuery{std::string(__func__)}, false);
+       // IC_API ic_api(CanisterQuery{std::string(__func__)}, false);
 
-       // Get the principal of the caller, as cryptographically verified by the IC
-       CandidTypePrincipal caller = ic_api.get_caller();
+       // // Get the principal of the caller, as cryptographically verified by the IC
+       // CandidTypePrincipal caller = ic_api.get_caller();
 
        // Get the name, passed as a Candid parameter to this method
        // uint64_t seed{0};
@@ -353,7 +353,7 @@ void hello()
                Vec3 velocity = body_interface.GetLinearVelocity(sphere_id);
 
                std::string msg_step = "Step " + std::to_string(step) + ": Position = (" + std::to_string(position.GetX()) + ", " + std::to_string(position.GetY()) + ", " + std::to_string(position.GetZ()) + "), Velocity = (" + std::to_string(velocity.GetX()) + ", " + std::to_string(velocity.GetY()) + ", " + std::to_string(velocity.GetZ()) + ")" + "\n";
-               IC_API::debug_print(msg_step); // print it
+               // IC_API::debug_print(msg_step); // print it
                msg.append(msg_step);              // msg send back over wire
 
                // If you take larger steps than 1 / 60th of a second you need to do multiple collision steps in order to keep the simulation stable. Do 1 collision step per 1 / 60th of a second (round up).
@@ -383,6 +383,6 @@ void hello()
        //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
        // Create a msg, to be passed back as Candid over the wire
 
-       // Send the response back
-       ic_api.to_wire(CandidTypeText{msg});
+       // // Send the response back
+       // ic_api.to_wire(CandidTypeText{msg});
 }
  2. Make these changes to icpp.toml so that we can get debug symbols with @icpp's rc release:
diff --git a/icpp.toml b/icpp.toml
index 3d75854..f2bb359 100644
--- a/icpp.toml
+++ b/icpp.toml
@@ -15,6 +15,31 @@ cpp_compile_flags = [
     "-D JPH_PLATFORM_SINGLE_THREAD",
 ]
 cpp_link_flags = []
+cpp_compile_flags_defaults = [
+    # "-O3",
+    # "-flto",
+    "-fno-exceptions", # required for IC
+    # "-fvisibility=hidden",
+    "-D NDEBUG",
+    "-D ICPP_VERBOSE=0",
+]
+cpp_link_flags_defaults = [
+    "-nostartfiles",
+    "-Wl,--no-entry",
+    # "-Wl,--lto-O3",
+    # "-Wl,--strip-all",
+    # "-Wl,--strip-debug",
+    "-Wl,--stack-first",
+    "-Wl,--export-dynamic", # required for IC
+]
+c_compile_flags_defaults = [
+    # "-O3",
+    # "-flto",
+    "-fno-exceptions", # required for IC
+    # "-fvisibility=hidden",
+    "-D NDEBUG",
+    "-D ICPP_VERBOSE",
+]
 c_paths = []
 c_header_paths = []
 c_compile_flags = []
  3. Create a file ic0.wat that provides a stub for the ic0.trap API:
❯ cat ic0.wat
(module
  (func (export "trap") (param i32) (param i32) unreachable)
)
  4. Install the icpp rc version with pip install icpp-pro==3.11.0rc1 and build your project with icpp wasm-build.
  5. We can now run your example in pure wasmtime and see that we hit the same OOB access in ProcessBodyPair (this is using wasmtime version 15.0.0):
❯ wasmtime run --preload ic0=ic0.wat --invoke 'canister_query hello' build/joltsample.wasm
Error: failed to run main module `build/joltsample.wasm`

Caused by:
    0: failed to invoke `canister_query hello`
    1: error while executing at wasm backtrace:
           0: 0x566f10 - <unknown>!JPH::PhysicsSystem::ProcessBodyPair(JPH::ContactConstraintManager::ContactAllocator&, JPH::BodyPair const&)
           1: 0x566ac4 - <unknown>!JPH::PhysicsSystem::JobFindCollisions(JPH::PhysicsUpdateContext::Step*, int)
           2: 0x57f437 - <unknown>!JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2::operator()() const
           3: 0x57f3c6 - <unknown>!decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2&>()()) std::__2::__invoke[abi:v160000]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2&)
           4: 0x57f37c - <unknown>!void std::__2::__invoke_void_return_wrapper<void, true>::__call<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2&)
           5: 0x57f332 - <unknown>!std::__2::__function::__default_alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2, void ()>::operator()[abi:v160000]()
           6: 0x57f2c6 - <unknown>!void std::__2::__function::__policy_invoker<void ()>::__call_impl<std::__2::__function::__default_alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_2, void ()>>(std::__2::__function::__policy_storage const*)
           7: 0x3b4d9d - <unknown>!std::__2::__function::__policy_func<void ()>::operator()[abi:v160000]() const
           8: 0x3b40d4 - <unknown>!std::__2::function<void ()>::operator()() const
           9: 0x3b3c57 - <unknown>!JPH::JobSystem::Job::Execute()
          10: 0x3b388a - <unknown>!JPH::JobSystemSingleThreaded::QueueJob(JPH::JobSystem::Job*)
          11: 0x3b4283 - <unknown>!JPH::JobSystemSingleThreaded::QueueJobs(JPH::JobSystem::Job**, unsigned int)
          12: 0x5618ba - <unknown>!JPH::JobSystem::JobHandle::sRemoveDependencies(JPH::JobSystem::JobHandle const*, unsigned int, int)
          13: 0x55ff66 - <unknown>!void JPH::JobSystem::JobHandle::sRemoveDependencies<32u>(JPH::StaticArray<JPH::JobSystem::JobHandle, 32u>&, int)
          14: 0x55d6a4 - <unknown>!JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)
          15: 0x384a16 - <unknown>!canister_query hello
          16: 0x83a6bd - <unknown>!canister_query hello.command_export
       note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
    2: memory fault at wasm address 0x10000cd6c in linear memory of size 0xad0000
    3: wasm trap: out of bounds memory access

So this makes me think the problem is somewhere in the compilation/build process.

While playing around with this, I noticed a couple of things:

  • Increasing the stack size with something like wasmtime run --preload ic0=ic0.wat --invoke 'canister_query hello' -W max-wasm-stack=4000000000 build/joltsample.wasm doesn’t change anything, so I doubt we have a stack overflow.
  • Looking at the wasm where the oob access occurs, we can see that it’s happening in the function setup:
start of Wasm code for ProcessBodyPair:
 566eef: 23 80 80 80 80 00          | global.get 0 <__stack_pointer>
 566ef5: 21 03                      | local.set 3
 566ef7: 41 c0 a5 04                | i32.const 70336
 566efb: 21 04                      | local.set 4
 566efd: 20 03                      | local.get 3
 566eff: 20 04                      | local.get 4
 566f01: 6b                         | i32.sub
 566f02: 21 05                      | local.set 5
 566f04: 20 05                      | local.get 5
 566f06: 24 80 80 80 80 00          | global.set 0 <__stack_pointer>
 566f0c: 20 05                      | local.get 5
 566f0e: 20 00                      | local.get 0
 566f10: 36 02 9c 9e 04             | i32.store 2 69404 <--OOB occurs here
 566f15: 20 05                      | local.get 5
 566f17: 20 01                      | local.get 1
 566f19: 36 02 98 9e 04             | i32.store 2 69400
 566f1e: 20 05                      | local.get 5
 566f20: 20 02                      | local.get 2
 566f22: 36 02 94 9e 04             | i32.store 2 69396
 566f27: 20 05                      | local.get 5
 566f29: 28 02 9c 9e 04             | i32.load 2 69404
 566f2e: 21 06                      | local.set 6
 566f30: 41 08                      | i32.const 8
 566f32: 21 07                      | local.set 7
 566f34: 20 06                      | local.get 6
 566f36: 20 07                      | local.get 7
 566f38: 6a                         | i32.add
 566f39: 21 08                      | local.set 8
 566f3b: 20 05                      | local.get 5
 566f3d: 28 02 94 9e 04             | i32.load 2 69396
 566f42: 21 09                      | local.set 9
 566f44: 20 08                      | local.get 8
 566f46: 20 09                      | local.get 9
 566f48: 10 b7 8b 80 80 00          | call 1463 <JPH::BodyManager::GetBody(JPH::BodyID const&)>

Anyway, we probably need something that can run by itself in wasmtime before it will work on the IC.


@abk,
Thank you so much for doing this investigation and giving your summary. Good to hear it is not the IC, and we now have a path to dig deeper trying to find the cause.

@abk thanks a lot for all the details and info.
@icpp any thoughts on what might be causing this?

@ktimam ,
I do not yet have a clue what it could be, but from the above experiment we learned that:

  • it is not the IC, because it happens in regular wasmtime too
  • it is not the C++ Candid, because that was stripped out

What’s remaining as the potential cause of this issue:

  • the C++ code itself might have an issue that is only happening when compiling to wasm
  • perhaps the compiler is doing something wrong or we use a wrong combination of compile and link flags

We should try to simplify the test case further, to the point where we are 100% sure there is nothing wrong with the C++ code, and then we can reach out to the wasi-sdk community. I have opened issues in that GitHub repo before and they’re very responsive.

I also noticed there is a pre-release of wasi-sdk 21, which upgrades the backend to LLVM 17.

It is worth doing a quick test with that version. You just never know…

@ktimam , @abk ,

Today, I ran into the heap out-of-bounds error while implementing the http_request_update method.

In my case, the issue was introduced when I used Orthogonal Persistence for a variable stored in the static/global memory, like this:

// This leads to heap out-of-bounds after calling http_request_update
...
// Orthogonally Persisted counter
uint64_t counter{0};

...
void http_request_update() {
  ...
  ++counter;
  ...
}

I have used Orthogonal Persistence for much more complex cases, so I was really surprised by this error.
In all my other cases, though, I only stored a pointer in the static/global memory and dynamically allocated the memory in canister_init.

I updated my code to use that same approach, and then the heap out-of-bounds error went away.

This is what the working code looks like:

file: canister.h

#pragma once
#include "wasm_symbol.h"
#include <cstdint> // uint64_t
#include <memory>

// Self managed pointer to a wrapped uint64_t
class Counter {
public:
  uint64_t counter;
};

extern Counter *p_counter;

void canister_init() WASM_SYMBOL_EXPORTED("canister_init");

file: canister.cpp

// Initialization of the canister
#include "canister.h"

#include <algorithm>
#include <memory>
#include <new> // std::nothrow
#include <string>
#include <variant>

#include "ic_api.h"

// Orthogonally Persisted counter

Counter *p_counter{nullptr};

void canister_init() {
  IC_API ic_api(CanisterInit{std::string(__func__)}, false);

  // Create a Counter instance
  if (p_counter == nullptr) {
    IC_API::debug_print(std::string(__func__) + ": Creating Counter Instance.");
    p_counter = new (std::nothrow) Counter();
    if (p_counter == nullptr) {
      IC_API::trap("Allocation of p_counter failed");
    }
  }
}

file: …cpp

#include "canister.h"
...
void http_request_update() {
  ...
  if (p_counter) {
    ++p_counter->counter;
  }
  ...
}

This does not explain what you are seeing, but perhaps this example of using static/global memory for orthogonal persistence provides a hint towards the solution.


@ktimam
In your C++ code, are you storing any variables in static/global memory, similar to the counter?

If yes, can you replace it with a pointer approach, like p_counter?

Finally got it: adding "-Wl,-z,stack-size=1048576" to cpp_link_flags_defaults increases the stack size to 1 MiB, which resolved the issue (clang defaults to a limited stack size when building wasm).
Many thanks @abk & @icpp, we now have a fully working physics engine running on ICP :fire:


Awesome!!! :tada:

Congratulations @ktimam on getting the physics engine running on the IC. That is a huge milestone.

I am truly impressed with the grit you showed in battling through these hurdles. That is what it takes when you’re the first to try something, and we all learned a lot more about WASM along the way :slightly_smiling_face:
