Security Sandboxing (Community Consideration)


Proper sandboxing has come up as a security concern in a few projects:


I am getting worried about the security of canisters running on non-system subnets, that hold large amounts of monetary value.

I’m not sure enough security precautions have been taken to feel confident that canisters running alongside possibly malicious canisters will be safe.

I did not realize that each canister was running within the same process on each replica within a subnet, at least that seems to be the case.

Perhaps a prerequisite to this project moving forward is process sandboxing, so that even if malicious canisters break out of the Wasm environment, they’ll be stopped by the process boundary.

People involved

Helge Bahmann (@hcb), Dieter Sommer (@dieter.sommer)


Early stages: Formulating a plan and discussing.

Next steps

@hcb will engage with the community and propose a plan for sandboxing, which will be voted on via an NNS motion proposal. If it passes, there will be subsequent code upgrade proposals as the implementation is developed. If it fails, the community will go back to the drawing board.


Update on this project:

@hcb has been quietly working away with @ulan. He has a project proposal for sandboxing that he is polishing. I will post it in a subsequent message for folks to take a look.


Proposal: Sandboxing mechanism for canister wasm execution

Main Authors: @hcb, @ulan


Protect IC nodes and the canisters hosted on them from rogue canisters that try to exploit holes in the WebAssembly runtime through maliciously crafted canister code. The attack scenarios include:

  • side-channel data reads of secrets in nodes and canisters,
  • all classes of remote code execution vulnerabilities in the WebAssembly JIT compiler and runtime


Canister code execution is confined by the WebAssembly runtime. The constraints of the runtime are enforced by a) limiting access through the system API and b) correctness of the JIT-compiled native code derived from the WebAssembly code. Given the complexity of these components, their implementation cannot be assumed to be fully correct.

Additionally, even assuming a perfectly correct JIT compiler, the generated native code executes in the context of its host process. To the CPU, this makes it indistinguishable from all other code executing in the same process. As a consequence, the CPU fundamentally cannot be stopped from performing speculative access to any memory reachable by the host process. Completely legitimate WebAssembly code can abuse this to trigger such speculative access, and timing measurements against code execution paths may then reveal secrets.

Stopping this class of attacks at its core is difficult or outright impossible. The attacks can, however, be rendered harmless by restricting their scope to a strongly confined “sandbox process”. This process must neither hold nor have access to any sensitive information. Information that an attacker can obtain through other means (e.g. through the official system API via their own canisters) is not sensitive. As a consequence, a breach of the sandbox process in itself does not gain an attacker anything.
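To make the idea concrete, here is a minimal sketch of the "no secrets in the sandbox process" principle. It is not the proposed IC implementation: `UNTRUSTED_CODE`, `run_in_child`, `NODE_SECRET`, and `ALLOWED_ENV` are hypothetical names, Python stands in for the Wasm runtime, and the OS-level confinement the proposal calls for is not shown. It assumes a POSIX-like system where the interpreter starts with a reduced environment.

```python
import json
import os
import subprocess
import sys

# Hypothetical stand-in for untrusted canister code. It runs in its own process,
# reads a request from stdin, and reports whether it can see the parent's secret.
UNTRUSTED_CODE = """
import json, os, sys
request = json.loads(sys.stdin.read())
print(json.dumps({
    "sum": sum(request["args"]),
    "pid": os.getpid(),
    "sees_secret": "NODE_SECRET" in os.environ,
}))
"""

# Assumption: only these variables are needed to start the interpreter.
ALLOWED_ENV = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def run_in_child(args, timeout_s=30.0):
    """Run the untrusted code in a separate process that is handed no secrets."""
    child_env = {k: os.environ[k] for k in ALLOWED_ENV if k in os.environ}
    completed = subprocess.run(
        [sys.executable, "-I", "-c", UNTRUSTED_CODE],
        input=json.dumps({"args": args}),  # the only data the child receives
        capture_output=True,
        text=True,
        env=child_env,  # allowlisted environment, nothing else is inherited
        timeout=timeout_s,
        check=True,
    )
    return json.loads(completed.stdout)


# The parent (the "node") holds a secret; the child is never given it.
os.environ["NODE_SECRET"] = "not-a-real-key"
reply = run_in_child([1, 2, 3])
print(reply["sum"], reply["sees_secret"])
```

The point of the sketch is only the data flow: the parent decides exactly what crosses the process boundary, so a compromised child has nothing sensitive to leak.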


Implement a process sandboxing mechanism for canister wasm execution. The scope will be “one sandbox per canister” to protect both the IC node itself and other canisters. Guarantee integrity and confidentiality of all system components under the following attack scenarios:

  • side-channel attacks allowing reads of arbitrary memory of the host process
  • WebAssembly runtime flaws allowing remote code execution

The design of the sandbox process is such that the security properties of the IC hold under the assumption that the sandbox process is under complete attacker control.

The implementation consists of a re-design of the canister runtime to allow for separation into multiple processes, and system means to enforce isolation and confinement of the processes.

Compatibility & performance

The sandboxing mechanism will introduce no user-visible functional changes to how canisters operate – there is nothing developers need to change about their canisters. The isolation mechanism may have observable performance effects. We expect the initial version to exhibit some performance degradation, but we also expect that a fully optimized revision will, in the long term, even outperform non-sandboxed execution through better memory management parallelism in the system.


This proposal aims to improve security of the IC through canister process sandboxing. Under the assumption of a compromise through specially crafted code in one canister, the system guarantees the confidentiality of the following assets (meaning that a successful attacker cannot read them):

  • IC node data itself (including private keys)
  • User data of other canisters (including Wasm memory, stable memory and all metadata)
  • System data of other canisters (including cycles, tokens & similar assets)
  • Artifact pool
  • Ingress / egress messages of other canisters

The system guarantees the integrity of the following assets (meaning that a successful attacker cannot modify them):

  • IC node data itself (including private keys)
  • User data of other canisters (including Wasm memory, stable memory and all metadata)
  • System data of all canisters (including cycles, tokens & similar assets)
  • Artifact pool
  • Ingress / egress messages of all canisters

In the initial version we may not be able to guarantee availability of the system under all attack scenarios. This means that a successful attacker might still be able to hamper or disrupt the IC service until corrective administrative action can be taken.

Development and rollout plan

The risks of the feature are two-fold:

  • complexity/stability: sandboxing is an architectural change that incurs stability risks due to complexity of implementation
  • performance: we anticipate initial performance degradation, so heavily loaded subnets might be pushed beyond their operational bounds

Development and rollout structure is intended to minimize these risks.

An initial functional, slow, and non-launchable version of the feature is under preparation already. It will not be rolled out onto mainnet but will serve as validation for proper design proposals to be published in the course of this process. It also allows very early rigorous continuous testing to validate system stability.

The initially launched version of this feature will provide all of the desired security properties, but may need to make performance compromises. It needs to be rolled out to the IC mainnet in a controlled fashion: we anticipate that it will be activated first on head-runner subnet blockchains for live testing (especially regarding performance considerations), and then throughout the entire IC mainnet over time. During this roll-out process, the code for both “sandboxed” and “non-sandboxed” WebAssembly execution will be part of the image distributed as updates to IC nodes. Per-subnet / per-node customizations will allow selective activation.
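As an illustration of what per-subnet / per-node selective activation could look like, here is a small sketch. The class `SandboxRollout`, its fields, the identifiers used, and the rule that a node override wins over a subnet override are all assumptions made for this example; the proposal does not specify the actual configuration format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SandboxRollout:
    """Hypothetical rollout config: a global default plus explicit overrides."""

    default_enabled: bool = False
    subnet_overrides: dict = field(default_factory=dict)  # subnet id -> bool
    node_overrides: dict = field(default_factory=dict)    # node id -> bool

    def enabled_for(self, subnet_id: str, node_id: str) -> bool:
        # Assumed precedence: node override, then subnet override, then default.
        if node_id in self.node_overrides:
            return self.node_overrides[node_id]
        if subnet_id in self.subnet_overrides:
            return self.subnet_overrides[subnet_id]
        return self.default_enabled


# Sandboxing on for one test subnet, but switched off again for one of its nodes.
rollout = SandboxRollout(
    subnet_overrides={"subnet-test": True},
    node_overrides={"node-7": False},
)
print(rollout.enabled_for("subnet-test", "node-1"))
print(rollout.enabled_for("subnet-test", "node-7"))
print(rollout.enabled_for("subnet-other", "node-2"))
```

Shipping both execution paths in the same image, as described above, is what makes a switch like this possible: flipping it back is a configuration change rather than a new release.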

Future versions will close the performance gap and will likely also go beyond the performance of the status quo. They will be rolled out as incremental improvements over the sandboxing mechanism activated on all IC nodes.


Thanks for the update.

A few questions I had reading this:

  1. Will this block the BTC integration?
  2. What is the “artifact pool” referenced above?
  3. What type of changes will “close the performance gap”? My understanding is that process sandboxing incurs a performance penalty for the same reason multiprocessing is generally more expensive than multithreading: context switches are heavier with processes, IPC is expensive, etc. How will those fundamental OS-imposed limitations be overcome in a “one canister, one process” world?



I will let @hcb and @ulan answer, but I did want to note three things:

  1. This project is being accelerated because others depend on it and it is baked enough
  2. Ulan and Helge are focusing on it
  3. We have a basic enough plan for folks to question and ask about, so we are scheduling a community conversation and an NNS motion proposal for folks to engage with

@jzxchiang partial answers to your questions:

  1. I do not think there is a decision that this will block BTC integration, but some people have expressed an opinion that stronger security measures should be a prerequisite
  2. The “artifact pool” is an internal data store of an IC node replica that holds data communicated between nodes for replication purposes.
  3. Context switching between threads and between processes differs only marginally, to the point that it practically does not matter. There is no fundamental OS-imposed limitation, and multiple processes will eventually be faster than threads for the IC: canister operation relies heavily on dynamically mapping and unmapping memory, and within a single process this causes contention on the locks protecting the address space data structures. With multiple processes this contention goes away (we verified all of this with measurements). The performance gap exists at present for two reasons: data structures were not originally designed to place data in shared memory (which would avoid the copies otherwise needed to transfer data across process boundaries), and the system API implementation is not structured to “collect & batch” updates but instead causes IPCs because its resolution is tied too closely to internal data structures. There is no conceptual hurdle to overcoming any of this; it just takes time to implement, and we would rather temporarily sacrifice performance to ensure correctness & security from day 1, leaving performance to be reclaimed over time.
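The “collect & batch” point in item 3 can be sketched as follows. This is a toy model, not the replica's system API: `CountingChannel`, `UnbatchedApi`, and `BatchedApi` are hypothetical names, and the "IPC" is simulated by a counter rather than a real process boundary.

```python
class CountingChannel:
    """Stand-in for an IPC channel; counts round trips instead of doing real IPC."""

    def __init__(self):
        self.round_trips = 0
        self.state = {}

    def send(self, updates):
        # One call models one cross-process round trip, however many updates it carries.
        self.round_trips += 1
        for key, value in updates:
            self.state[key] = value


class UnbatchedApi:
    """Every write crosses the process boundary immediately."""

    def __init__(self, channel):
        self.channel = channel

    def write(self, key, value):
        self.channel.send([(key, value)])


class BatchedApi:
    """Writes are collected locally and sent in a single round trip on flush()."""

    def __init__(self, channel):
        self.channel = channel
        self.pending = []

    def write(self, key, value):
        self.pending.append((key, value))

    def flush(self):
        if self.pending:
            self.channel.send(self.pending)
            self.pending = []


unbatched_channel = CountingChannel()
unbatched = UnbatchedApi(unbatched_channel)
for i in range(100):
    unbatched.write(f"k{i}", i)

batched_channel = CountingChannel()
batched = BatchedApi(batched_channel)
for i in range(100):
    batched.write(f"k{i}", i)
batched.flush()

print(unbatched_channel.round_trips, batched_channel.round_trips)
```

Both variants end up with the same state on the receiving side; only the number of boundary crossings differs, which is the cost the batching is meant to reduce.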

Wise words, very happy with them