Dynamic Canister Method Registration

As far as I can see, nothing in this manipulation would be dynamic – it would be a step of the build/linking process. Or perhaps I misunderstand what you mean by that.

To be sure, shifting indices is what tools like wasm-merge (a static linker that’s part of Binaryen) also do as part of their normal operation; there is nothing shady about that. Programming language nerds would simply call this alpha-renaming :).

That said, a tool like the one described shouldn’t even need to do that if it simply inserts the additional data segments at the end?
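To make the index shifting concrete, here is a minimal sketch of the renumbering, assuming new entries are inserted at position `at` in some index space (functions, data segments, and so on). The helper names are hypothetical and are not taken from wasm-merge or Binaryen.

```rust
/// Remap one index after `inserted` new entries were placed at position
/// `at` in the index space. Indices below `at` are unaffected.
/// (Hypothetical helper, for illustration only.)
fn shift_index(index: u32, at: u32, inserted: u32) -> u32 {
    if index >= at {
        index + inserted
    } else {
        index
    }
}

/// Apply the remapping to every reference collected from the module.
fn shift_all(refs: &mut [u32], at: u32, inserted: u32) {
    for r in refs.iter_mut() {
        *r = shift_index(*r, at, inserted);
    }
}
```

If the new entries are appended after the last existing index, `at` is larger than every existing reference and nothing is renumbered, which is the point made above about inserting at the end.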

I had issues inserting active data segments at the end. Even after manually inspecting everything and ensuring all indexes were correct, a seemingly spurious Rust ownership/borrowing error would occur where it shouldn’t have. When I removed the data section, the error disappeared. After studying all the ways to accomplish this, it seemed I couldn’t accurately update the data section without essentially doing what the Rust compiler would do, so why not just use the Rust compiler? But then we’re back to the original problem I aim to solve, which is removing the need to ship a Rust compiler environment.

By dynamic I just mean I have to manipulate the Wasm binary “dynamically” to add the functions into the binary and create their exports.

I think the missing piece is really adding the data segment into the binary. An active segment doesn’t seem very easy to do; a passive segment I was definitely aware of, but I didn’t pursue it for some reason I don’t remember.
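For reference, the two kinds of segment differ only slightly at the byte level. The sketch below encodes a single data segment entry following the WebAssembly binary format: flag `0x00` plus an offset expression for an active segment in memory 0, flag `0x01` for a passive one. The function names are hypothetical. It deliberately covers only the entry itself: a real tool would also have to update the data section's entry count and size and, for passive segments used with `memory.init`, the data count section.

```rust
/// Append `value` as unsigned LEB128.
fn write_u32_leb128(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Append `value` as signed LEB128 (the immediate of `i32.const`).
fn write_i32_leb128(out: &mut Vec<u8>, mut value: i32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7; // arithmetic shift keeps the sign
        let sign_bit_clear = byte & 0x40 == 0;
        if (value == 0 && sign_bit_clear) || (value == -1 && !sign_bit_clear) {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Passive segment: flag 0x01, then the length-prefixed bytes.
fn encode_passive_segment(bytes: &[u8]) -> Vec<u8> {
    let mut out = vec![0x01];
    write_u32_leb128(&mut out, bytes.len() as u32);
    out.extend_from_slice(bytes);
    out
}

/// Active segment in memory 0: flag 0x00, the offset expression
/// (`i32.const offset; end`), then the length-prefixed bytes.
fn encode_active_segment(offset: i32, bytes: &[u8]) -> Vec<u8> {
    let mut out = vec![0x00, 0x41]; // 0x41 = i32.const
    write_i32_leb128(&mut out, offset);
    out.push(0x0b); // 0x0b = end
    write_u32_leb128(&mut out, bytes.len() as u32);
    out.extend_from_slice(bytes);
    out
}
```

The passive form carries no address at all, which is why the injecting tool does not need to know anything about the module's memory layout; the module itself chooses the destination at runtime.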

Another way to do it, which @ulan has suggested, is to create a static array in Rust and initialize it with something like 10 MiB of zeros, so that the compiler lays out the data segment correctly. As long as less than 10 MiB of source code is needed, I would then just write into that segment. The downside is that the binary would be at least 10 MiB; it would compress very well when gzipped, but still.
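A rough sketch of the patching side of that idea, under some loudly stated assumptions: the reserved region starts with a recognizable marker (an all-zero static may be placed in `.bss` and never show up as bytes in the binary, so a marker or non-zero fill is assumed here), and the payload is stored as a 4-byte little-endian length followed by the data. The function name and layout are hypothetical.

```rust
/// Find `marker` in `binary` and overwrite the reserved region starting
/// there with a 4-byte little-endian length followed by `payload`.
/// `reserved_len` is the total size of the region, marker included.
/// Returns the byte offset of the region. (Hypothetical helper.)
fn patch_reserved_region(
    binary: &mut [u8],
    marker: &[u8],
    reserved_len: usize,
    payload: &[u8],
) -> Result<usize, String> {
    if marker.is_empty() {
        return Err("marker must not be empty".to_string());
    }
    let start = binary
        .windows(marker.len())
        .position(|w| w == marker)
        .ok_or_else(|| "marker not found".to_string())?;
    if start + reserved_len > binary.len() {
        return Err("reserved region runs past the end of the binary".to_string());
    }
    if payload.len() + 4 > reserved_len {
        return Err("payload does not fit in the reserved region".to_string());
    }
    let body = start + 4;
    binary[start..body].copy_from_slice(&(payload.len() as u32).to_le_bytes());
    binary[body..body + payload.len()].copy_from_slice(payload);
    // Clear whatever is left of the marker or fill bytes.
    binary[body + payload.len()..start + reserved_len].fill(0);
    Ok(start)
}
```

Because the region is overwritten in place, no section sizes or indices change, which is what makes this approach simple despite the size cost.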

It’s still a hack; I would love to have an elegant solution.

Hm, I can’t speak to the specific problems Rust was creating with this, other than saying that Rust is obviously not a great language for writing compilers and similar tools in. In principle, this transformation should be fairly straightforward to implement. FWIW, I don’t see how passive segments change it much.

I looked for existing tools that might help here. Wizer seems promising:

First we instantiate the input Wasm module with Wasmtime and run the initialization function. Then we record the Wasm instance’s state:

  • What are the values of its globals?
  • What regions of memory are non-zero?

Then we rewrite the Wasm binary by initializing its globals directly to their recorded state, and removing the module’s old data segments and replacing them with data segments for each of the non-zero regions of memory we recorded.
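The "non-zero regions" step in the quoted description can be illustrated with a small scan over a memory snapshot. This is only a sketch of the idea, not Wizer's actual code; the function name is hypothetical.

```rust
/// Return the half-open `(start, end)` byte ranges of `memory` that
/// contain non-zero data. Each range could then become one data segment.
fn nonzero_regions(memory: &[u8]) -> Vec<(usize, usize)> {
    let mut regions = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &byte) in memory.iter().enumerate() {
        match (byte != 0, start) {
            // A non-zero byte opens a new region.
            (true, None) => start = Some(i),
            // A zero byte closes the region that was open.
            (false, Some(s)) => {
                regions.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    // Close a region that runs to the end of memory.
    if let Some(s) = start {
        regions.push((s, memory.len()));
    }
    regions
}
```

A practical tool would probably merge regions separated by only a few zero bytes, since every segment carries its own header overhead.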

@lastmjs: I wonder if the tool would work for you. If not, I think we could write something similar but custom to your case that can embed bytecode and also export static endpoints.

IIUC, in your use case, developers need to upload a new Wasm binary and use standard canister installation. In such cases, dynamic endpoints might be overkill since we have the new binary anyway.

That said, dynamic endpoints might still be useful for interpreted languages to enable completely new use cases. For example, running multiple versions of the application in the same canister (e.g. for A/B testing or for zero-downtime upgrades). This would work if Azle supported multiple JS contexts in the same JS engine. I am not sure if there is an appetite for such use cases in the community right now.



Here’s a small POC I made which shows how you could create a “static” wasm module which expects a passive data segment to be added later and can read from that data segment at runtime: GitHub - adambratschikaye/wasm-inject-data: Example of injecting a data section into an existing wasm module

It currently uses a Rust crate, wasm-transform, to mutate the Wasm. This is code that I’ve copied over from our instrumentation code in the replica, but we could easily publish it as a standalone crate if you’d find it useful.


Thank you so much.

After thinking and discussing with @ulan I will pursue the Wasm binary modification again for now. If I could continue to get help as I go then hopefully we can get this sorted.

Any limitations should become apparent, and then we can weigh those against possible protocol changes. As @ulan said, dynamic method registration may provide more use cases down the road.


Nice! Is there a reason why you made passive_data_size a function instead of just a global?

I think it would be better to have it as a global, but I ran into some issues when trying to make that work. I’m sure it’s doable, but I went with the function for this simple example because I got it working faster.


@rossberg @abk Is it necessary to use a passive data segment?

It should also be doable with an active segment, but I think it might require some more complicated logic when you inject the new data segment. Since the segment is active, you’ll need to decide exactly where it ends up in the Wasm memory when you do the injection, and you’ll need to make sure that region doesn’t later get overwritten by the Rust stack or heap while the module executes. A Wasm module compiled from Rust seems to have globals called __stack_pointer, __data_end, and __heap_base, so it might be enough to place the data at __data_end and then increment both __data_end and __heap_base, but I haven’t tried that.
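The arithmetic behind that untested idea might look like the sketch below. Everything here is an assumption for illustration: the `Layout` struct, the function names, and the 16-byte rounding of the heap base are hypothetical, and the sketch only computes new values. It does not rewrite the globals in a binary, and it has not been validated against a real Rust-compiled module.

```rust
const WASM_PAGE: u32 = 65_536;

/// Values read from the module's `__data_end` and `__heap_base` globals.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Layout {
    data_end: u32,
    heap_base: u32,
}

/// Round `value` up to a multiple of `align` (which must be non-zero).
fn align_up(value: u32, align: u32) -> Option<u32> {
    let rem = value % align;
    if rem == 0 {
        Some(value)
    } else {
        value.checked_add(align - rem)
    }
}

/// Place `len` bytes at the (aligned) current `data_end`. Returns the
/// offset for the new active segment and the updated layout, with
/// `heap_base` moved by the same growth rounded up to 16 bytes.
fn place_active_segment(layout: Layout, len: u32, align: u32) -> Option<(u32, Layout)> {
    let offset = align_up(layout.data_end, align)?;
    let data_end = offset.checked_add(len)?;
    let growth = align_up(data_end - layout.data_end, 16)?;
    let heap_base = layout.heap_base.checked_add(growth)?;
    Some((offset, Layout { data_end, heap_base }))
}

/// Minimum number of 64 KiB pages needed to cover addresses below `end`.
fn min_pages(end: u32) -> u32 {
    end / WASM_PAGE + u32::from(end % WASM_PAGE != 0)
}
```

The page count matters because the memory's declared minimum size may also need raising if the moved heap base crosses a page boundary.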

Is there any reason you’d prefer to have it as an active data segment?


I would prefer whatever works best. I had the manipulation itself working with an active segment, but the resulting binary just didn’t work, I’m assuming for reasons similar to what you described.

I just don’t remember why I didn’t pursue passive segments, so I was curious. I’m excited to try passive segments.

I really appreciate all of the help, I’ll be getting to this soon.
