Let's discuss reproducible builds and code verification once again

marc0olo · March 5, 2025, 1:00pm

Hey there

we have some ongoing discussion around reproducible builds again.

I think ever since there is a high demand to be able to verify the source code of deployed canisters as easy as possible.

Although there are quite some projects providing instructions how to verify the source code via a reproducible build, this is currently not standardized and thus it is almost impossible to provide a generic service that can perform a reproducible build and check the hash of the build artifact.

Most recently, @timo created a template for Motoko that aims to solve this:

GitHub - research-ag/motoko-build-template: Reproducible build template for single Motoko canister

The general ideas are:

a repository should be self-contained, i.e. the reproducible build instructions are checked in and part of the repo
the standard says a Dockerfile and docker-compose.yml must be in the repo
those two files together are the build instructions, i.e. the build can be reproduced by a single docker-compose command
- the two files bind the toolchain
this allows mass verification by a service for every repo that follows this guideline
docker images are layered so that base image can be cached
one canister - one repo, not for multi-canister projects in mono repo
the second layer Docker images contains all dependencies, toolchain and source
- → supply chain attacks leave a detectable trace

What the repo of @timo is:

a template adhering to this standard
only for Motoko at the moment (Rust is to be done)
designed for building without dfx
- a version for dfx, if desired, can be introduced as a different set of base images

We’d appreciate as much feedback as possible on this - so please check it out and test it!

Ideally we can define a standard for that and not only cover Motoko builds, but also Rust, Azle and possibly other CDKs.

icme · March 6, 2025, 1:27am

Reproducible builds are a core requirement of some of the features on our roadmap (mid-late 2025)

Having a standard and this tooling in place not just for locally reproducible builds, but that also comes with a standard CI/GH action setup baked in would save weeks/months of time from our end, and it’s a huge ecosystem time saver for canisters that have received an audit to be able to tie the audit to a reproducible build and verifiable commit hash.

So yes please, need a reproducible build standard for all the languages, ideally with some overlap and standard commands/config that external services can plug into.

skilesare · March 6, 2025, 4:24am

I almost never have one canister per repo. Most of what I build is modules that people can use to build canisters. I guess spinning up a repo that pulls in the module and constructs just one canister in it is possible. Maybe a bit chatty, but possible.

Why no dfx?

marc0olo · March 6, 2025, 7:31am

just to clarify:

this discussion is not to get confirmation / feedback that a standard for reproducable builds is needed (I think we all agree that this would be awesome to have)
I shared the repo of Timo for others to test / provide feedback on the general approach
- does anybody have ideas to improve / simplify the solution?
- does anybody have a proposal to handle this for Rust / Azle / other CDKs?

yeah, this needs to be figured out how to incorporate this as easy as possible in a multi-canister repo

I am not sure if I understand this correctly. can you share an example of module builds? or do you mean that you load other dependencies into your canister repo? (e.g. via mops)

did you check the template of Timo? it works with mops.

this is a first suggestion to see if people would be willing to adopt this and if we can potentially agree on the path to choose for reproducible builds. ideally, this will be incorporated into dfx somehow in the future and would also be defined for Rust, Azle, …

it should be as easy as possible to build, deploy and verify those reproducible builds. and ideally the developer will have as little chances as possible to make mistakes that could “break” the approach.

EnzoPlayer0ne · March 6, 2025, 8:46am

For Rust you can take a look at WaterNeuron’s repository. We are using nix to pin dependencies to their checksum, and a custom Rust script, ic_wasm_utils, to build local canisters and fetch remote ones from DFINITY’s remote storage over at https://download.dfinity.systems.

We do not use dfx, but rather a combination of cargo, ic-wasm, and gzip to get a more granular control on our build process.

ic_wasm_utils/src/lib.rs#L152

    let rustflags = format!(
        "RUSTFLAGS=\"--remap-path-prefix={}= --remap-path-prefix={}=\"",
        WORKSPACE_ROOT.display(),
        cargo_dir.display()
    );

    let file_name = format!("{0}{1}", name, if self_check { "_self_check" } else { "" });

    let build_steps = [
        format!(
            "{0} cargo canister -p {1} --release --bin {1} --locked {2}",
            rustflags,
            name,
            if self_check { "--features=self_check"} else {""}
        ),
        format!("ic-wasm target/wasm32-unknown-unknown/release/{0}.wasm -o artifacts/{0}.wasm metadata candid:service -f {0}/{0}.did -v public", name),
        format!("ic-wasm artifacts/{0}.wasm -o artifacts/{1}.wasm metadata git_commit_id -d $(git rev-parse HEAD) -v public", name, file_name),
        format!("ic-wasm artifacts/{0}.wasm shrink", file_name),
        format!("gzip -cnf9 artifacts/{0}.wasm > artifacts/{0}.wasm.gz", file_name),
        format!("rm artifacts/{0}.wasm", file_name),
    ];

It runs in our CI at every commit, checks the hash of remote canisters is correct, and display the checksums of local ones along with build properties of canisters:

SHA256 Checksums:
─────────────────────────────────────────────
boomerang    2faf01570df4e90212a55f972eee0d4c7cb179c10624384ec94b9c8318eb67df
           → "/home/runner/work/WaterNeuron/WaterNeuron/artifacts/boomerang.wasm.gz"
           ✓ git commit metadata
           ✓ candid metadata

water_neuron 8509cbcc68a97e1e4ad6ea70f203a33ed0949cc0f8f079c305bb48c7a7b819ad
           → "/home/runner/work/WaterNeuron/WaterNeuron/artifacts/water_neuron.wasm.gz"
           ✓ git commit metadata
           ✓ candid metadata
           ✓ does not have `self_check`

water_neuron_self_check 3d373772ab653e8426e085351c59db0f81db4f29529ed2be48123f143b6b141c
           → "/home/runner/work/WaterNeuron/WaterNeuron/artifacts/water_neuron_self_check.wasm.gz"
           ✓ git commit metadata
           ✓ candid metadata
           ✗ does not have `self_check`

sns_module   18c0891d4ab095d45ba7f0cc04713a97055b828837502f658ce3e78213373eb9
           → "/home/runner/work/WaterNeuron/WaterNeuron/artifacts/sns_module.wasm.gz"
           ✓ git commit metadata
           ✓ candid metadata

Git commit:   9e194ac4ed71f44a2b928021edd6feef622bb72b

Local canisters are then uploaded to GitHub as artifacts, and stored for 30 days.

If we want to fetch a remote canisters for testing purposes, it has to be defined in the ic_wasm_utils library, we are using one source of truth to make sure different canister all have the same dependencies.

lib.rs#L56

            CanisterName::NnsGovernance,
            WasmBinary {
                hash: "8f76b2de37197b3ff0ae188f1ef99ddd5bd75cb8f83fb87c2889822ece0b5576",
                ic_version: "ad5629caa17ac8a4545bc2e3cf0ecc990c9f681e",
                name: "governance-canister.wasm.gz",
            },

You can then import them in your testing environment to get the Wasm bytes payload
water_neuron/src/state_machine/mod.rs#L511

        let governance_id = env
            .install_canister(governance_wasm(), arg.encode_to_vec(), None)
            .unwrap();

As a result of all this, you can check the checksums of every SNS canister upgrade, and make sure the code in the repository is the one running on the IC. For instance on proposal #2147:

git fetch
git checkout 9e194ac4ed71f44a2b928021edd6feef622bb72b
./build.sh

Community members will then verify the checksums independently:

@quint and @Lorimer verifying proposal#2147 on the WaterNeuron Telegram

marc0olo · March 6, 2025, 10:45am

thanks for sharing the WaterNeuron approach @EnzoPlayer0ne!

I want to quote @timo again here to give more context to about the goals that we aim to achieve with the introduction of a standard:

The important thing to settle on is what we actually want to achieve. The goal is to make deployed canisters verifiable. This happens in three steps:

Get from canister id to wasm module hash.

Get from source code to module hash.

Semantically understand the source code.

What we are concerned with here is step 2. That is to verify the mechanical procedure to get from human-readable source code to the wasm module hash. Step 1 is provided by the IC and includes the canister history feature because previously deployed version are also important to verify the current canister state. Step 3 is about finding bugs or backdoors in the source code.

Step 2 is about giving the verifier access to the source code and assuring him that it is exactly the source code used to build the given wasm module hash. The developer is not trusted in this process. In fact, we have to assume that the developer is malicious. We have to assume that the developer wants to hide a backdoor in the source code. He can either hope that it won’t be found in step 3 or he can make the verifier obtain the wrong source code in step 2. Since it is the developer who provides the reproducible build instructions, we have cannot trust the build instructions. That means that the verifier has to read and understand the instructions as well as part of verification. In the current setup this means the verifier has to read and verify the Dockerfile, docker-compose.yml and build.sh. For example, he has to check:

What is the bare linux base image that is used? Is it a standard one, or is it a manipulated one? Does it come from a trusted registry, or one that the developer controls?

Where is the toolchain downloaded from? Is it from github or a less reliable source? If on github who made the releases on github?

Where do the dependencies come from (e.g. mops packages)?

Etc.

This is more than most verifiers would do or have the expertise to do. It would therefore be good to define a rather tight standard which leaves less freedom to the developer. Simply saying the reproducible build instructions are “a URL where a Dockerfile is supposed to be found” is nice and flexible but it’s too broad. It gives too much freedom to the developer and makes life hard for the verifier. The build process should be more standardized. For example, the Dockerfile and build script should be pre-defined and all projects should use them same one. The developer should ideally only be able to specify compiler version (moc) and dependencies (mops.toml) and nothing else.

This means that if we are to define a standard then the Dockerfile and build script has to be part of the standard. It will be hard to agree on one, to find one that we think is general enough to cover all projects. But we can try at least.

If the standard does not specify a Dockerfile and build script then we are mixing step 2 and step 3. Step 2 is supposed to be a mechanical step. One that can be easily automated. Semantical verification is supposed to happen in step 3. If as a verifier I have to semantically understand the Dockerfile and the build script then step 2 isn’t purely mechanical anymore

@EnzoPlayer0ne is there a specific reason you decided against using Docker for this? if I may recall Timo and if we decide to go with a Docker based approach, the Dockerfile as well as the build script would ideally be standardized.

do you want to try coming up with a generalized approach for Rust builds that could also be adopted by WaterNeuron easily?

Vivienne · March 6, 2025, 10:54am

Also, does this work across different CPU architectures? IIRC we went with a Docker-based solution for the cycles ledger because otherwise we would have had differences between our macOS and Linux machines

timo · March 6, 2025, 11:25am

Yes, of course the canister repo has all kinds of dependencies (your modules) which can live elsewhere. The question is do you have multiple canisters in the same repo, i.e. multiple entries in “canisters” in your dfx.json? And do you feel strongly about it? Or are you willing to split it up into one canister per repo for the sake of adhering to a reproducibility standard?

It would make the standard more complicated if we have to support multiple canisters in the same repo. It raises all kinds of new questions that don’t arise otherwise. For example, does the single Dockerfile in the repo now have to build multiple canisters? Or are there now multiple Dockerfiles in the repo, one for each canister? How do you distinguish the canisters from each other, they now need alias names so that you know which hash belongs to which one. Etc.

timo · March 6, 2025, 11:33am

To my knowledge nix pins the executables’ binaries but that alone is not enough to be cpu architecture independent. You can still end up with different module hashes on Linux vs on Mac. I would be interesting to get some feedback here: Have you observed cpu architecture dependence for Rust canisters? For Motoko we have observed it. If @Vivienne says that a docker-based build was used for the cycles ledger and the cycles ledger is in Rust that probably means that there was an architecture dependence. But can someone confirm who has explicitly observed it a Rust build?

marc0olo · March 6, 2025, 11:55am

yes, but probably we have to come up with a good approach to this. I assume there would be quite some devs that want to include multiple canisters in their repo.

actually dfx already uses alias names for canisters.

it probably starts to become more complicated if devs want to mix canister source code of different CDKs in one repo. but I personally think it would be fine to not support this specific case. I am wondering if some team out there is having such a setup. I assume that case is very rare

Vivienne · March 6, 2025, 12:02pm

Only 90% sure since it’s been more than a year(?). IIRC Rust also has problems similar to gzip if you skip -n. If you compile the project in a different path the binaries can also have different hashes. But even if it is ok for Rust we may want to consider that the standard ideally also works for languages with such issues

EnzoPlayer0ne · March 6, 2025, 12:33pm

We had this, we solved that issue with the following flag:

RUSTFLAGS=\"--remap-path-prefix={}= --remap-path-prefix={}=\

EnzoPlayer0ne · March 6, 2025, 12:55pm

Docker does not allow you to pin build dependencies.

Nix, specifically with flakes, allows us to pin cargo, gzip, ic-wasm, and all the tools we use to build canisters to a specific checksum. In Docker, when you rebuild an image, you re-fetch all the packages and they all update too. Your image will also contain a filesystem where you won’t use 90% of it.

Nix allows you to build extremely compact container images:

Using nix with Dockerfiles by Mitchell Hashimoto
Building a Rust service with Nix by FasterThanLime

Additionally, if what you are using is a nix package, you can have a dependency list of all the dependencies your dependencies have. It might seem like over the top, however it helps or at least is a first step to mitigate supply-chain attacks.

I use nix as I was influenced @basvandijk and @nmattia. They are nix experts and convinced me to use it in production.

We did not make ours work for multiple architecture but you can, yes.

timo · March 6, 2025, 1:42pm

That’s right. To take the example of a Motoko build the only relevant binaries are moc, ic-wasm and gzip. The Dockerfile installs those directly from pinned releases. I don’t care about pinning the rest of the system. That isn’t necessary and not a goal.

For Rust, you would have to do the same with cargo.

My Docker image is 75 MB and I once counted that at least 60 MB are binaries that I absolutely need (moc + ic-wasm + mops-cli). So at most 20% are not used.

I see two problems with using nix alone (no Docker):

Nix does not emulate a cpu architecture (unlike docker does with qemu). It does not solve the problem we have with dependency on cpu architecture. So it simply isn’t a replacement for docker.
If you rely on nix at the top-level, i.e. every verifier has to have nix installed, then you are cutting down the number of verifiers that are willing to do validation by 90% (my random guess). I would say people are 10 times more likely to run a Docker command than to run nix.

Maybe nix inside Docker is possible? I think docker has to be at the top-level.

timo · March 6, 2025, 2:09pm

Thanks for bringing this up. Supply-chain attacks are the most important thing that has to be discussed. The the threat model is the most important thing that we have to agree on.

The verifier wants to get to see some source code and wants to be convinced that the source code he is looking at is what produced a given wasm module hash. The developer is the adversary in this context (assume developer wants to hide a backdoor from the verifier). So the developer pinning something isn’t productive at all here. Because it would mean that the verifier has to go through every single thing that the developer has pinned and check if it is malicious or not. That is a lost cause.

What has to happen is that the developer specifies as few things as possible. Maybe just a .ini style config files with the version numbers of the toolchain. Like 1-3 version numbers, that’s it. Then the verifier chooses his own image to perform the verification. He can take an off-the shelf Dockerfile that fits the 3 version numbers or an off-the-shelf nixpkgs or his own creation. It doesn’t matter. But the developer must not be in control of what the verifier chooses.

With this context it becomes clear that version pinning by the developer is wrong.
Instead, the developer should only be allowed to specify the toolchain in high-level terms from a relatively small set of allowed choices. And the verifier should be able to figure out for those few choices how to get or produce an image that runs this toolchain.

Now how do we deal with supply chain attacks? I postulate that
a) supply-chain attacks cannot be completely prevented (on a fundamental level)
b) the solution to deal with them in the end will always boil down to having as many verifiers as possible and hope that at least one verifier is not a victim to the attack and can raise an alarm.

This brings us to the same point as before: The less we allow the developer choose (e.g. the source of something) the better, because that makes it harder fro him to control the supply chain.

Again to illustrate the problem: Usually we try to mitigate supply-chain attacks by specifying hashes of everything. For example the hash of the docker linux base image, the hash of installed binary releases, github commit hashes of source code dependencies, etc. But in the end what does it help us? Who provides those hashes? If the developer provides them then that is worthless. If the developer has complete freedom to choose any hash for anything then it means the validator has to download everything behind every single hash and check it in detail. That approach cannot work. We have to do the opposite. If we continue to speak in terms of hashes then it must be like this: The validators have to come up with a set of hashes that they all agree on that they want to allow the developer to use. And then the developer can choose from just a few allowed sets of hashes.

peterparker · March 6, 2025, 2:19pm

Not sure if it’s the kind of inputs you are looking for, but I literally had to search my codebase to find following arguments since the build in a new repo wasn’t reproducible. So that got me thinking to share that it might be worth documenting gzip --no-name somewhere if it hasn’t been already done.

On a related note, beyond the backend, if there are discussions about reproducibility, it might be helpful to document some tips for making the frontend reproducible as well, as I’d bet that few, including SNSes, aren’t.

marc0olo · March 7, 2025, 8:45am

timo:

Again to illustrate the problem: Usually we try to mitigate supply-chain attacks by specifying hashes of everything. For example the hash of the docker linux base image, the hash of installed binary releases, github commit hashes of source code dependencies, etc. But in the end what does it help us? Who provides those hashes? If the developer provides them then that is worthless. If the developer has complete freedom to choose any hash for anything then it means the validator has to download everything behind every single hash and check it in detail. That approach cannot work. We have to do the opposite. If we continue to speak in terms of hashes then it must be like this: The validators have to come up with a set of hashes that they all agree on that they want to allow the developer to use. And then the developer can choose from just a few allowed sets of hashes.

I think this is crucial for adoption. we need to set the boundaries as tight as possible and put ourselves more in the perspective of the validators. but of course the solution needs to make sense and projects should be able to easily adhere to the standard.

EnzoPlayer0ne · March 7, 2025, 11:16am

We should care about it for deterministic builds. Build leaks, files or paths get included which you did not intend on. This changes the final binary, which changes reproducibility and produces non-deterministic builds.

For Motoko, it probably is not a problem, as the Foundation controls the compiler and language specs. However when using Rust, which is outside Foundation control, it is important.

I fully agree with you here. I think it can mitigate some supply chain attacks by forcing the pinning and prevent packages from updating when it is not necessary.

With the “pure” docker method, in 6-months time, if I rebuild your image from scratch some packages might get updated which will leak into the build. Without this, it removes the possibility of verifying the reproducibility of the canister from its inception to its current status.

The approach I like about nix is you can fetch and re-build every package with a specific pinned version from 6-months ago (as long as the source is still available). This steps ensure build reproducibility through time.

The two quotes below from Nix is a better Docker image builder than Docker's image builder - Xe Iaso summarize my argument:

Above all though, the biggest advantage Nix gives you is the ability to travel back in time and build software exactly as it was in the past. This lets you recreate a docker image exactly at a later point in the future

You don’t just build your software though, you crystallize a point in time that describes the entire state of the world including your software to get the resulting packages and docker images.

We should also be using container images instead of Dockerfile and use podman instead of docker, as it needs less permissions to run.

I agree with you, nix is too much to ask for for most developers, which is part of the reason why the Foundation moved away from it for the mono-repo.

Our goal here is to create an environment where developers have every tool necessary to build a canister from source, and so can validators. It should be easily available, and quick to spin up. It should be light weight while containing everything you might want for Rust or Motoko. You should be able to use it in your CI system.

If we come up with a standard it should be used at the Foundation to build reproducible and deterministic canisters (tagging the IDX team @basvandijk @nmattia @marko).

I have a proposition: we build one container image with a single flake.nix file. This container image is built on CI. We can have the most minimal filesystem. Pin every dependencies or build them from source. We can import Motoko and Rust related build tools. We then roll it out and try to have as many canisters as possible adopt the container image.

Ideally we should also have a library in Rust and Motoko (not bash) which helps to build the canisters and fetch them remotely just like we did with ic_wasm_utils. In case of a non-deterministc build this library should also have diffoscope where we can diff the two different builds.

For anyone who’s more curious about reproducible builds you should check out https://reproducible-builds.org/.

marc0olo · March 7, 2025, 12:29pm

+1 for the frontend side, especially after what happend to Bybit

is somebody aware of a good approach there?

this and maybe even Azle (and frontends?! )

+1 yeah, ideally every project adopts this standard then.

peterparker · March 7, 2025, 1:53pm

Frontend reproducibility across OS is really tricky due to the hidden magic Node performs at build time. So, I generally limit my checks to reproducing the frontend build on the same OS by asserting same files with same content are bundled when re-running builds. I have a script and a CI action for this, for example in OISY (CI / script) or same in Juno.

I’m not sure, but I would also assume that NNS dapp or II goes a step further with additional CI checks by asserting that their WASM remains reproducible once the assets have been statically embedded.

Topic		Replies	Views
Non reproducibility of ic-os build Governance	54	2013	November 13, 2023
Issue Verifying Proposal#130106 NNS Governance	5	404	August 12, 2024
Secure icrc ledgers for free :mage: Developers	25	1003	November 3, 2023
[RFC] Canister Metadata Standard Developers	30	2384	December 18, 2024
Voting is open for two new IC OS releases Governance	25	1832	February 15, 2024

Let's discuss reproducible builds and code verification once again

Related topics