Allow installation of large Wasm modules

We continue to get pummelled by this limit. Today’s example is just one of a number of issues stemming from it that we deal with at Demergent Labs while working on Azle and Kybra.

Our latest issue is updating Kybra’s Python interpreter and other important dependencies. We went through the updates over the last couple of days and now we are just over the binary limit, yet again. We’ve been through this before of course, and each time we have to add some config or hack to overcome it.

Unfortunately we’re almost at the end of our rope. We’ve tweaked various optimization parameters in our Cargo.toml, we’re using ic-cdk-optimizer, and we gzip with the highest compression setting. None of this is good enough anymore. So now we may have to use the hack where we ask our users and ourselves to chunk-upload the Kybra binary to an intermediate canister and then deploy from there, to take advantage of the difference in message size limits for cross-canister calls.

This will upend the deployment flow for our users, getting rid of the ability to use dfx deploy.

This limit is really killing us.

Does anyone have other ideas for optimization in the meantime? We’ve tried so many things.
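For reference, the size-focused settings we’ve been tuning in our Cargo.toml look roughly like this (an illustrative sketch, not our exact config):

```toml
# Illustrative size-focused release profile; exact values vary per project.
[profile.release]
opt-level = "z"     # optimize for binary size rather than speed
lto = true          # link-time optimization: slower builds, smaller Wasm
codegen-units = 1   # better cross-crate optimization at the cost of build time
panic = "abort"     # drop unwinding machinery from the binary
```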

6 Likes

@lastmjs Thanks for your input. I can empathise with the problems you mentioned but unfortunately I don’t think we can offer something more than what you described as a workaround (maybe someone in the community has thought about a clever alternative).

We will be looking more closely at a proper design for this, but a fair warning: as usual, we have realized that things are not as easy as they might seem at first glance. We have discovered a couple of potential problems with larger Wasm modules that we will need solutions for before we can really allow them, e.g. potentially longer compilation times and interactions with the sandbox, given how data is transferred between the replica and the sandbox.

I will update this thread once we have more to share. Again, I understand the pain this limitation is causing but we also want to make sure we don’t accidentally add regressions in other places by rushing any solution.

1 Like

I’ve been experimenting with various designs lately and came to the following as the most reliable. Some of the concerns (besides supporting larger Wasm modules) are:

  • One command to deploy the canister and one to upgrade
  • Optimise as much as possible to not compile or deploy unnecessarily
  • Stay as close to the default dfx config as possible

This is an example repo:

How it works:

  1. There’s a deployer canister that handles both deploys and upgrades
  2. A script runs the build command and deploys the canister Wasm through the deployer canister

Although there’s nothing new so far, I think there’s a simple feature that could handle many use cases. A new “deploy” property could be introduced to dfx.json. Such a property would allow customising the deployment scripts without introducing new commands.
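To make this concrete, here is a rough sketch of what it might look like (the “deploy” key and the script path are hypothetical, since dfx doesn’t support this today):

```json
{
  "canisters": {
    "backend": {
      "type": "rust",
      "package": "backend",
      "candid": "src/backend/backend.did",
      "deploy": "./scripts/deploy_via_deployer.sh"
    }
  }
}
```

dfx deploy would then run the custom script instead of its built-in install step.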

For new projects that need this setup, it can be included from the start, since it would come in dfx.json along with the necessary build scripts. Existing projects that need a larger Wasm module can introduce their custom deploy scripts without breaking their existing config.

@lastmjs Will a custom “deploy” script solve some of the issues with Azle / Kybra?

1 Like

That could be nice, yes. I’ve long wanted more control over the deploy process from dfx.json; perhaps pre-deploy and post-deploy hooks should be considered as well.

But this will only help alleviate some of the issues; I think that’s well understood.

Since we basically have to use a deployer canister now, a custom deploy command could help us ease the process for our users, because asking them to use a custom script is going to upend our whole deploy process, when thus far they’ve been able to use dfx essentially as they would for any other Rust or Motoko canister.

I wonder if the dfx extensions feature that is coming out is of any help here.

2 Likes

Another question is how high you are going to be able to raise the limit. To be safe a minimum of 50MiB is preferable for our use case, based on what I remember from the last time I looked at the numbers necessary for compiling RustPython with the whole stdlib. I could check again if necessary, but will we be able to make Wasm binaries of arbitrary size?

To be safe a minimum of 50GiB is preferable for our use case

Just to make sure, this is not a typo, right? This is 50GiB? If yes, this is orders of magnitude more than we had in mind. I remember numbers in the order of ~100MiB that you had mentioned for Kybra (maybe they are outdated numbers though).

50GiB is a lot. We certainly cannot go to such numbers in one go, if we can ever go there at all. As I mentioned, we need to ensure that compilation times stay reasonable for such big Wasm modules and that we also don’t regress other parts of the system, like the communication with the canister sandbox process. Also, a chunked upload of 50GiB would potentially be quite slow if done “naively”, i.e. just in chunks of 2MiB; we’re bound by how fast the network is.

I could check again if necessary, but will we be able to make Wasm binaries of arbitrary size?

I think it would be useful if you can double-check, so at least we know what numbers we need to be looking at. Arbitrary size is hard to promise – usually we do need to impose some limit to make reasonable assumptions about DoS attacks/scenarios and be able to predict how the system will behave under those.

Sorry about that, definitely a typo. Mega, not giga :slight_smile: 100MiB would be fantastic and I think well more than we need at the moment; 50MiB could cut it close. I will double-check our numbers, but I’m hoping we can have a safe margin so we’re not always right up against the limit.

1 Like

Ok, thanks for the confirmation, I was really worried there for a moment :sweat_smile:.

100MiB is the rough target we have in mind and I’m optimistic we can get this (we might still go there in a few steps rather than a big jump from 2MiB to 100MiB but that’s mostly to ensure there will be no issues in production).

4 Likes

Sounds amazing! Sorry for the scare, that would be very very concerning to hear.

2 Likes

For our foreseeable use cases, it looks like a minimum increase to 25MiB is needed, but that could be cutting it close. Hopefully we can start with an increase to 30-40+MiB?

Here are some of the results from testing I did a while back with RustPython, which has given us the most trouble with its larger binary sizes: [RFC] Reduce Wasm binary size · Issue #4203 · RustPython/RustPython · GitHub

And I just did another test with an example project, adding everything into RustPython that I think we would need, and the binary rose from ~9MiB to ~25MiB.

If I had to choose an absolute minimum increase to target for the initial rollout of this feature, I would love for it to be 30MiB.

2 Likes

Thank you for providing this information Jordan! Really useful for us. I suppose these are the numbers for uncompressed Wasm modules; do you have the numbers if you try to compress them?

1 Like

I know you asked about compression, but one of our main goals is to remove all optimization requirements. Having to install and use tools like ic-cdk-optimizer, ic-wasm, and wasm-opt has been the source of many problems and long compilation times for Azle and Kybra.

gzip takes less time and isn’t as much of a burden, but it would be ideal to not have to rely on that machinery either. It still introduces extra moving parts in our codebase and requires the user to change the location of the Wasm binary to the gzipped version. Ideally dfx could just gzip automatically when pointing to any Wasm file.

With that in mind, I don’t know the numbers for gzipping but I can check. I hope we can shoot for non-gzipped non-optimized binaries.

2 Likes

The recent increase in the Wasm binary limit in dfx 0.14.2 has been awesome, allowing us to reduce compilation times in Kybra.

For those who aren’t aware, my understanding is that dfx 0.14.2 has a new limit of 30MiB total for uncompressed Wasm binaries, with a maximum code section size of 10MiB and a maximum data section size of 30MiB.

But the issues of the 2MiB ingress message limit and the 10MiB cross-canister message limit still exist. The 2MiB limit is the most sinister for us right now. It has required us to adopt a rather complicated deploy process.

What’s the status on dfx automatically chunk uploading Wasm binaries?

4 Likes

We’re actively working on this feature: we aim to allow uploading Wasm modules to some replica storage and installing modules from there. Uploading will be done in chunks but installing would not suffer from the message limits.

6 Likes

Hey everyone,

I wanted to take a moment to give a huge shoutout to @peterparker and his incredible work on the Juno app. It’s a fantastic tool that showcases the power and flexibility of the Internet Computer Protocol. If you haven’t checked it out yet, I highly recommend you do.

In my project, B3Wallet, I’ve implemented two unique approaches to handle the uploading and installation of Wasm modules, effectively bypassing the message size limit that can often be a bottleneck when dealing with larger Wasm modules.

Approach 1: Uploading to a System Canister and Creating a New Canister

  1. Chunking the Wasm Module: The Wasm module is broken down into smaller pieces, or chunks, using Node.js. Each chunk is small enough to fit within the message size limit.
  2. Uploading the Chunks: Each chunk is then uploaded to a system canister. The system canister stores these chunks until all pieces of the Wasm module have been received.
  3. Reassembling the Wasm Module: Once all chunks have been uploaded, the system canister reassembles them into the original Wasm module.
  4. Creating and Installing a New Canister: The system canister then creates a new canister and installs the reassembled Wasm module into it.
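Here is a rough sketch of what the system-canister side of that flow can look like (the method names are simplified and not the exact B3Wallet API, and ic-cdk signatures may differ slightly between versions):

```rust
// Sketch of the system canister: buffer chunks, reassemble, create + install.
// Method names and the cycles amount are illustrative, not B3Wallet's real API.
use ic_cdk::api::management_canister::main::{
    create_canister, install_code, CanisterInstallMode, CreateCanisterArgument,
    InstallCodeArgument,
};
use std::cell::RefCell;

thread_local! {
    // Chunks are buffered in heap memory until the full Wasm module has arrived.
    static CHUNKS: RefCell<Vec<Vec<u8>>> = RefCell::new(Vec::new());
}

#[ic_cdk::update]
fn upload_chunk(chunk: Vec<u8>) {
    // Each chunk is kept under the ~2MiB ingress message limit by the Node.js script.
    CHUNKS.with(|c| c.borrow_mut().push(chunk));
}

#[ic_cdk::update]
async fn create_and_install() -> Result<(), String> {
    // Reassemble the original Wasm module from the uploaded chunks.
    let wasm_module: Vec<u8> = CHUNKS.with(|c| c.borrow().concat());

    // Create a fresh canister, funding it with cycles from this canister
    // (recent ic-cdk versions take the cycles amount as a second argument).
    let (record,) = create_canister(CreateCanisterArgument { settings: None }, 500_000_000_000)
        .await
        .map_err(|(_, e)| e)?;

    // Install the reassembled module via a canister-to-canister call, which is
    // subject to the larger cross-canister message limit rather than the ingress one.
    install_code(InstallCodeArgument {
        mode: CanisterInstallMode::Install,
        canister_id: record.canister_id,
        wasm_module,
        arg: vec![],
    })
    .await
    .map_err(|(_, e)| e)?;

    CHUNKS.with(|c| c.borrow_mut().clear());
    Ok(())
}
```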

Approach 2: Upgrading an Existing Canister

  1. Chunking the Wasm Module: The Wasm module is broken down into smaller pieces, or chunks, directly on the frontend. Each chunk is small enough to fit within the message size limit.
  2. Uploading the Chunks: Each chunk is then uploaded directly to the canister that is set to be upgraded. The canister stores these chunks until all pieces of the Wasm module have been received.
  3. Reassembling the Wasm Module: Once all chunks have been uploaded, the canister reassembles them into the original Wasm module.
  4. Requesting Installation: After the Wasm module has been fully reassembled, a request is sent to the management canister to install the new module. This is done by calling the install_code method on the management canister, passing in the reassembled Wasm module as an argument.
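And a rough sketch of what steps 3 and 4 can look like on the canister being upgraded (again with simplified method names, not the exact B3Wallet API; note the canister must be one of its own controllers for the management canister to accept the install_code call):

```rust
// Sketch of the self-upgrade path: buffer chunks from the frontend, reassemble,
// then ask the management canister to upgrade this very canister.
use ic_cdk::api::management_canister::main::{
    install_code, CanisterInstallMode, InstallCodeArgument,
};
use std::cell::RefCell;

thread_local! {
    static CHUNKS: RefCell<Vec<Vec<u8>>> = RefCell::new(Vec::new());
}

#[ic_cdk::update]
fn upload_chunk(chunk: Vec<u8>) {
    CHUNKS.with(|c| c.borrow_mut().push(chunk));
}

#[ic_cdk::update]
async fn request_upgrade() -> Result<(), String> {
    // Reassemble the new Wasm module from the chunks uploaded by the frontend.
    let wasm_module: Vec<u8> = CHUNKS.with(|c| c.borrow().concat());

    // Request an upgrade of this canister with the reassembled module.
    install_code(InstallCodeArgument {
        // Plain `Upgrade` variant in ic-cdk ~0.11; newer versions take optional flags.
        mode: CanisterInstallMode::Upgrade,
        canister_id: ic_cdk::id(),
        wasm_module,
        arg: vec![],
    })
    .await
    .map_err(|(_, e)| format!("install_code failed: {e}"))
}
```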

These approaches ensure that the process of creating or upgrading a canister with a new Wasm module is smooth and efficient, even for larger modules that exceed the message size limit. It’s a testament to the flexibility of the Internet Computer Protocol and the innovative solutions it enables.

You can check out these functionalities live at b3wallet.live and explore the open-source code on the B3Wallet GitHub repository. I welcome any feedback or contributions from the community.

Let’s continue to build and innovate, pushing the boundaries of what’s possible with the Internet Computer Protocol!

Cheers,
Behrad

3 Likes

Thanks @b3hr4d! :smiling_face::pray:

B3Wallet looks really neat and showcases some interesting patterns that I might also consider using.
Starred :star::white_check_mark:

3 Likes

Hi @b3hr4d ,

maybe you can point me in the right direction with this…?

1 Like

Hello, @sadernalwis,

If you’re trying to deploy it to Mainnet, you should use the load script first. It uploads the wasm into the system canister, and then you can create and install a new wallet using the system canister. If you want to install it locally, you should not encounter this error!

May I ask which version of dfx you are using? I’d like to provide more specific assistance based on your setup.

Is there an update on this feature? We would love to allow users to point to Wasm files over 2MiB in gzipped size and have all of this chunking done automatically for them.

3 Likes

@b3hr4d Thank you!

I’m still trying to get the local build up, using dfx 0.14.3.

And what further info can I provide?

1 Like