Voting is open for two new IC OS releases

Hello there!

We are happy to announce that voting will soon be open for two new releases. These will be action-packed because they pack two weeks of changes into a single release.

The first proposal is for the standard release – find the information for the feature release at the bottom of this post.

Elect new replica binary revision 8d4b6898d878fa3db4028b316b78b469ed29f293

Some changes did not fit in the proposal. They are included in the forum post.

Release Notes:

Features:

  • [ae14ceb79] Boundary Nodes,Node: some nginx-related tweaks, cleanups
  • [9c2eb38f4] Boundary Nodes,Node: Further improve logging, add more fields
  • [0f707cd1d] Boundary Nodes,Node: Adds Crowdsec bouncer to BN
  • [882bcfaa9] Boundary Nodes,Node: add crowdsec, refactor base docker a bit
  • [751cf03f5] Boundary Nodes,Node: reconfigure vector
  • [5627103f6] Boundary Nodes,Node: bump nginx to 1.25.3 in the base image
  • [65d3fcad4] Consensus(ecdsa): tECDSA signer component for reduced latency
  • [40ef17f4d] Consensus(ecdsa): Payload builder/verifier for improved tECDSA latency
  • [735ebc8ef] Consensus(ecdsa): Populate key_unmasked_ref in PreSignatureQuadrupleRef
  • [aab055b89] Consensus(ecdsa): Move new tECDSA priority function behind feature flag
  • [1dd119090] Consensus: Add hard bound on notarization/certification and notarization/CUP gap
  • [e0993fb0b] Consensus(ecdsa): Add get_oldest_ecdsa_state_registry_version()
  • [44fdb947c] Consensus(ecdsa): populate key_id in QuadrupleId for newly crated Quadruples
  • [683a67c5f] Consensus(ecdsa): add ecdsa_key_id label to ECDSA metrics
  • [2c828fa99] Consensus(ecdsa): Serialize key transcript ref in quadruple
  • [2f64ebd43] Consensus(ecdsa): Refill matched quadruples
  • [adba2f937] Consensus: Add metrics for the size of the certification pools
  • [19517642d] Consensus: Add an artifact_type label to consensus_pool_size metric.
  • [8d45d7035] Crypto: accept also unmasked random origin of kappa in PreSignatureQuadruple
  • [f1b703465] Crypto: Add RandomUnmasked transcript operation type
  • [895ee8bcb] Execution,Runtime: Canister can use itself as store for chunked install
  • [9ad470e83] Execution,Runtime: Metric for large intra subnet calls
  • [939a5e672] Execution,Runtime: Add fetch_canister_logs simple query call stub
  • [9ff6758fe] Execution,Runtime: Change instruction buckets to be more granular for install_code messages
  • [e841826d5] Execution,Runtime: Add log_visibility logic when checking whether an ingress message should be accepted
  • [c49d40680] Execution,Runtime: Add fetch_canister_logs stub to the management canister API
  • [4bb93cbc9] Execution,Runtime: Add log_visibility to canister_settings
  • [130a4a6af] Execution,Runtime: Query Cache: Initial composite query support
  • [dec4b8872] Execution,Runtime(ecdsa): Add metrics for delivered quadruples and completed ECDSA contexts
  • [3c5cfc671] Execution,Runtime(ecdsa): Match quadruples with contexts in replicated state
  • [ed10b3961] Node: Remove old vsock code
  • [96127923c] Node: Update config.ini with ipv4 values
  • [6f75846ce] Node: add the IC OS Name Service Switch library to enable guestos and hostos host name resolution.
  • [1e942c61b] Prodsec,Crypto(fuzzing): Implement arbitrary for the callservice fuzzer

Bugfixes:

  • [e2808ad89] Boundary Nodes,Node: remove content type nosniff
  • [c4ac0218b] Boundary Nodes,Node(custom-domains): nginx statement order
  • [2c8721428] Boundary Nodes,Node: update bouncer, config
  • [2a4477c33] Boundary Nodes,Node: update cs-bouncer, decrease freq
  • [96b16d04f] Boundary Nodes,Node: rollback nftables counters
  • [402549862] Consensus(artifact_pool): Remove warning if unvalidated artifact to be removed doesn’t exist
  • [31ade35ab] Consensus: use a proper key type for the ECDSA pool keys
  • [67730a29e] Consensus: Stop ingress selector prematurely removing canisters from selection
  • [6fed0c308] Consensus: Fix parameter order in CUP maker error log
  • [1dce834f7] Consensus: Make certification lookups more efficient (Part 2)
  • [ed617e740] Consensus(ecdsa): Do not count transient retain_active_transcripts errors as critical
  • [b20777873] Consensus(ecdsa): fix reporting of the key_transcripts_created metric
  • [e30aaaf82] Execution: only reveal cycles top up balance of frozen canisters to controllers
  • [a02fbc926] Execution,Consensus(consensus): Fix query stats aggregation epoch
  • [b34272ac0] Execution,Message Routing: fairness of the Ingress Messages to Management Canister
  • [dfcdd4e90] Execution,Runtime: Query Cache: Avoid caching errors
  • [a8a1f93e7] Execution,Runtime: Query Cache: Expire data certificate after 1 min
  • [d88a1bd92] Execution,Runtime: Track large Wasm assembly charge
  • [97e4d6aaf] Execution,Runtime(execution): Include bitcoin canisters as aliases for IC_00 routing
  • [794a7ba92] Message Routing: Accidental full manifest for some files
  • [323fde86a] Message Routing,Runtime: Create base files efficiently with LSMT
  • [37b70b4dc] Networking(state_sync_manger): Join all outstanding task downloads.
  • [29fc6ad1f] Networking(state_sync_manager): reject state adverts that differ from current state sync
  • [b5859f990] Node: Move IPv4 connectivity check to be a separate service
  • [c6728d951] Node: Use correct inputs to setupos-inject-tool
  • [363b3e25d] Node: - Retry ethtool parsing
  • [fd19e5739] Node: only enable systemd units that are enableable.
  • [a47bd693b] Node: Update telemetry datacenters in HostOS.
  • [df4871796] Node(toolchains): clean up temp dirs after regression and specify a temp dir prefix
  • [c2a3d1575] Node,T&V: Revert IPV6 prefix changes
  • [570512674] Runtime: Raise unzipped Wasm limit

Performance improvements:

  • [d205e8fb2] Crypto: extend IDKG complaint benchmarks with other modes
  • [7405dfaae] Crypto: Add support for multiplication by generator

Chores:

  • [bd7f49857] Boundary Nodes,Networking: bump h2
  • [aa735436a] Boundary Nodes,Node(boundary-node): extend rosetta list
  • [6fb24fcaa] Boundary Nodes,Node(custom-domains): clean up after SW removal
  • [1c35b20b9] Boundary Nodes,Node(ic-boundary): update ic-boundary args
  • [a212196cc] Consensus(orchestrator): Remove unused metrics
  • [0f805c2e0] Consensus(ecdsa): add ecdsa_key_id label to EcsdaPayloadMetrics
  • [3f0220c71] Consensus: split ic_consensus_utils::get_active_data_at into three separate functions
  • [60c4dd818] Consensus(ecdsa): Add metric for available quadruples with key transcript
  • [4cdbd4e1e] Consensus: remove unnecessary clone in ic_consenus_utils::get_adjusted_notary_delay_from_settings
  • [5d9e3bf5a] Crypto: bump tarpc version to 0.34
  • [1b47dee89] Crypto: Avoid using use ic_types::*
  • [0026fd4e2] Crypto: Remove dependency on ic-types in seed crate
  • [40717aafc] Crypto,Interface(crypto): Add AlgorithmId for threshold BIP340 Schnorr
  • [6808e9dd9] Execution: use CanisterSettingsArgsBuilder instead of corresponding constructor
  • [398044605] Execution,Consensus(consensus): Return earlier in QueryStatsPayloadBuilder if there are no messages
  • [35dcd0e1d] Execution,Consensus: QueryStatsPayloadBuilderMetrics
  • [5280b28d9] Execution,Runtime: Fix ic00_permissions comment for fetch_canister_logs
  • [9b5356b65] Execution,Runtime: Cleanup execute_subnet_message match statement
  • [d2bfbb4cd] Execution,Runtime: Add fetch_canister_logs feature flag
  • [36f4f5cf4] Execution,Runtime: Increase query cache max expiry time to 10min
  • [fb709378c] Execution,Runtime: Increase query cache max expiry time to 5min
  • [9d3651cf5] Execution,Runtime: Add round_inner_iteration_exe to scheduler metrics
  • [fcc524f77] Message Routing: Metric for number of files in checkpoint
  • [1e90c37f4] Message Routing: Better mr_routed_payload_size_bytes description
  • [46595aed4] Message Routing,Interface: Make fields in RequestMetadata non-optional.
  • [c65c46810] Networking: build call service with builder and expose it
  • [b315a1724] Networking: build query service with builder and expose it
  • [9c4c77455] Networking: build read state service with builder and expose it
  • [9514174fd] Node: Replace libusb with rusb
  • [fba6c0ab4] Node: Remove ipv4 nameserver propagation and hard-code values into generate_network_config.rs
  • [84e4f37ed] Node: Remove ipv6_subnet parameter remnants
  • [252d3c1c0] Runtime: Combine replica controller crate
  • [15418fb44] Runtime: Combine launcher binary with sandbox crate
  • [89302e259] Runtime: Combine sandbox binary and backend_lib crates
  • [037419050] Runtime: Sync wasmparser and wasm-encoder to wasmtime
  • [e198508fd] Runtime,Execution: Use let-else in embedders and execution_environment

Refactoring:

  • [d5fd7bf03] Consensus: Refactor malicious_code in ecdsa component
  • [a4895b9db] Execution,Message Routing: Provide a more efficient ReplicatedState::message_memory_taken() method
  • [0e23b0a01] Execution,Runtime: Optimize and simplify heartbeat and global timer scheduling
  • [76804dbde] Networking: rename some types to match better the intent
  • [24abd67ac] Networking,Message Routing: move the StateSyncArtifactId into the P2P interfaces
  • [2e614cf9d] Networking,Message Routing: Make the error code when adding chunks sane

Tests:

  • [f8b728841] Consensus(ecdsa): Add more unit tests for signer with improved latency
  • [0794a082d] Consensus: fix assert_consistency test utility function
  • [597bca3a2] Consensus: Refactor validator unit test dependencies
  • [4ff506b7b] Consensus: Share aggregator unit tests
  • [6dea12aca] Consensus: add malicious ecdsa test to consensus test framework
  • [98eceedbf] Consensus(ecdsa): Purge unmatched quadruples referencing old key transcripts once certified height reaches the latest summary height
  • [09bb6f365] Consensus: Run ecdsa component in consensus test framework
  • [a41412346] Crypto: add local vault tests for idkg_load_transcript
  • [0c216754a] Crypto: remove large flag from crypto integration tests
  • [4864e7f44] Crypto: move crypto integration test utils into crates
  • [496008542] Crypto: Add test that IDKG transcript sizes are as expected
  • [e5b319595] Crypto: fix cargo test for remote vault tests
  • [a450f4da2] Crypto: add local vault tests for idkg_create_dealing()
  • [d35c2a7e1] Crypto: Mark NIDKG transcript size test as #[ignore]
  • [6cddb86d8] Crypto: better names for Nodes::receivers and some other fns
  • [e1ebc89a9] Crypto: split remote vault integration tests
  • [09f53f4c3] Crypto: extend IDKG test infrastructure
  • [f6a72969b] Crypto: add EccPoint deserialization fuzzer
  • [82c6a46e5] Crypto: Add a cheating dealer test
  • [7be5fcccd] Crypto: fix test_combined_secret_key that fails if num_receivers=0
  • [27e0d682d] Crypto: use clib functions to corrupt dealings
  • [1985b9a43] Crypto: fix should_verify_transcript_reject_reshared_transcript_with_dealings_swapped
  • [ac6d92d0e] Crypto: fix flakiness in a remote vault test
  • [df4d8328d] Execution,Consensus: System Tests for QueryStats feature
  • [2ad42420b] Execution,Runtime: Enable DTS by default in execution and scheduler tests
  • [3beed4393] Execution,Runtime(run-895): Add tests to check whether the DTS slicing for dirty memory copy is disabled for system subnets.
  • [efffe3bbd] Execution,Runtime: Add more fetch_canister_logs tests to verify submitting ingress messages
  • [a7fbbd64b] Message Routing,T&V: ensure malicious state sync chunks are rejected
  • [125126042] Networking(http-endpoint): Refactored helper function to start endpoint for integration tests
  • [b398704c6] Networking(consensus_manager): fix flakiness in unit tests
  • [058672d9b] Node,T&V: Pass IPv6 prefixes via test driver
  • [d83a79d4e] Runtime: Remove extra test block in integration test

Documentation:

  • [2dfa7b678] Node: Update docs on where config info comes and goes

Other changes:

  • [5b07a6bd7] Boundary Nodes,Node: disable throttling for now
  • [35bb6c0ee] Boundary Nodes,Node: refactor nginx config
  • [f0373c631] Boundary Nodes,Node: () Limit the number of open tcp connections in BN per ip
  • [8868cfa22] Boundary Nodes,Node: feat() fix nginx config to match v1.25.3
  • [da9fc0553] Consensus: Fix QueryStatsFeatureGate
  • [261b6a947] Consensus,Execution,Message Routing(crypto): extract time test utilities into separate crate
  • [fd9304538] Consensus,Execution,Runtime,Interface: Make Time method names more idiomatic
  • [86c70dfde] Consensus,NNS,T&V,Node(ipv4-for-nodes): make orchestrator pick up IPv4 config
  • [2710b1a8a] Crypto: remove obsolete dependency on ic-crypto
  • [7a512af09] Crypto: remove unused dependencies in crypto code
  • [7b65c7a6e] Crypto: remove obsolete getrandom-for-wasm dependency from non-canister code
  • [19533e685] Execution: Avoid traps in the ic0.call_perform System API
  • [f63f066d7] Execution,T&V: Revert " Enable chunked install"
  • [e7c2f86a2] Execution,Message Routing,Interface: Assume that Callback::originator, respondent and prepayment_.* are always present
  • [563434696] Execution,Runtime: Add new metric for advance install code executions
  • [3253f9d58] Execution,Runtime: [hotfix]chore: Query Cache: Revert max expiry time to be below 5 min
  • [42cfbcff4] Execution,Runtime: Enable chunked install
  • [88160bf34] Execution,Runtime: Charge for chunked install on hash mismatch
  • [879331e82] IDX,Consensus,Cross Chain,Execution,Runtime: fix existing cargo clippy errors and make sure we run cargo clippy on the whole repository only with relevant lints
  • [76f5ebf17] Message Routing: () read api_boundary_nodes from the registry
  • [ad2d9b3b9] Message Routing,Execution: () Save Api Boundary Nodes in ReplicatedState metadata
  • [3c9dd30e5] Networking,Message Routing,Runtime: remove the dependency on the old static_assertions crate
  • [ef23f44ad] Networking,NNS: remove various unused dependencies
  • [58aa6d6cb] Node: Updating container base images refs [2024-02-06-1029]
  • [f7c33787c] Node: Updating container base images refs [2024-02-01-0814]
  • [cb783e1aa] Node: Clean up systemd services
  • [bdd6081e6] Node: Adjust nftables ratelimits
  • [5021a28c2] Node: Updating container base images refs [2024-01-31-0730]
  • [390eab8b4] Node: Bump vsock version
  • [3d7899f69] Node: Updating container base images refs [2024-01-25-0815]
  • [d24bf43e6] Node: Updating container base images refs [2024-01-24-1007]
  • [1c069ced6] Node: SEV Cleanup
  • [954e4a850] Node: Updating container base images refs [2024-01-23-0736]
  • [2c806877c] Node: Updating container base images refs [2024-01-18-0814]
  • [78c333172] Node: Reduce network dependencies for replica service
  • [3c8292125] Node: Updating container base images refs [2024-01-17-1411]
  • [0dc5c64d4] Node: Updating container base images refs [2024-01-15-1422]
  • [bf0a7cbbb] Node: Updating container base images refs [2024-01-11-1533]
  • [99bffe617] Node,T&V: Update firewall configuration through SetupOS and HostOS
  • [69386e157] Node,T&V: () Add domain at node registration
  • [d43bca767] Runtime: Improve DTS slicing for memory copy and remove from system subnets
  • [cfd9466de] Runtime: Allocate memory as requested by Wasmtime
  • [273b11aef] Runtime: Remove LinearMemory trait
  • [5a76af861] Runtime: Upgrade wasmtime to 15.0.1
  • [8f2ae8ec7] Runtime,Execution: Add DTS slicing for messages that touch many pages.
  • [630ea70d0] Runtime,Message Routing,Execution: Populate Request Metadata

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/8d4b6898d878fa3db4028b316b78b469ed29f293/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 8d4b6898d878fa3db4028b316b78b469ed29f293

The two SHA256 sums printed above from a) the downloaded CDN image and b) the locally built image,
must be identical, and must match the SHA256 from the payload of the NNS proposal.
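As a rough illustration of the check described above, the following Python sketch computes a file's SHA256 and confirms that the CDN image, the local build, and the hash from the proposal payload all agree. The function names are hypothetical; repro-check.sh performs its own comparison, and this sketch is not a replacement for it.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_match(cdn_hash, local_hash, proposal_hash):
    """True only if all three hex digests are identical (case-insensitive)."""
    normalized = {h.strip().lower() for h in (cdn_hash, local_hash, proposal_hash)}
    return len(normalized) == 1
```

A reviewer could call sha256_of_file on the downloaded and locally built images and pass both results, together with the hash copied from the proposal payload, to hashes_match.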

The feature release proposal follows here:

Elect new replica binary revision 3e25df8f16f794bc93caaefdce41467304d1b0c7

Release Notes:

Features:

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/3e25df8f16f794bc93caaefdce41467304d1b0c7/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 3e25df8f16f794bc93caaefdce41467304d1b0c7

The two SHA256 sums printed above from a) the downloaded CDN image and b) the locally built image,
must be identical, and must match the SHA256 from the payload of the NNS proposal.

Tiny mismatch for SetupOS

@dmanu @sat @nikola-milosa @pietrodimarco

It looks like this was a big change (based on the number of commits). Do you mind explaining why there wasn’t a Replica Version Management proposal last week? Is this a new normal to submit these proposals every other week or was this just an anomaly? I’m interested from a resource planning perspective for CodeGov. Thanks for any insight you can provide.

Hi there!

Yes, last week there was no RVM proposal, because we were fixing tests that weren’t passing. This week is extraordinary because of that – the changes from last week accumulated into this week’s release.

We’re looking into the SetupOS nonreproducibility issue. Please be patient.

What is the intention behind the addition of IP addresses to rosetta?
"[aa735436a] Boundary Nodes,Node(boundary-node): extend rosetta list"

@dmanu @sat @nikola-milosa @pietrodimarco

Would you please help us recalibrate expectations on the IC-OS Verification? DFINITY has spent a considerable amount of time working on the reproducibility of the SetupOS and HostOS hashes, and most of the time it works, but historically only the GuestOS hash was expected to match between the sha256sum of the local build, the sha256sum of the downloaded build, and the sha256sum reported in the proposal payload. Even though the SetupOS and the HostOS match most of the time these days, the payload of the proposal still only references the GuestOS.

The mismatch of the SetupOS hash in the proposal this week poses an interesting dilemma. I’ve only seen one reviewer for CodeGov who posted a match for SetupOS, but 4 other reviewers so far have reported a mismatch. My inclination is that our team should vote to Reject this proposal since the IC-OS Verification script that is provided in the proposal does not result in reproducible builds for all sha256 sums that are evaluated by the script. The Summary of the proposal no longer indicates that IC-OS Verification should only include GuestOS even though that is the only sha256sum and the only release package url that is listed in the payload.

Should there be an expectation that IC-OS Verification matches for GuestOS, SetupOS, AND HostOS or do we still only need to make decisions based on GuestOS? Personally, I want to hold us to the higher standard of expecting all builds to match, but just want to verify if this position is consistent with expectations presented in the proposal.

For reference, here is a link to the thread on OpenChat where CodeGov reviewers are posting their results. They still have 24 hours to complete their reviews, so it is actively changing at this time. Some people post their IC-OS verification first and then come back to post their full review when they are done reviewing the code.

For me, the build script gets stuck here and doesn’t build any other os image

@wpb Reproducibility is a very hard problem, but also very important for us. So far we have spent considerable effort on 1) making builds reproducible at all, and 2) building CI infrastructure that will catch non-reproducibility. In the last few weeks I have also noticed a few CI heisenfailures in the repro checks, but the checks always succeeded when we tried to reproduce the failures locally.

Regarding guest/host/setup OS: in these proposals we care about the guest OS. In the host version elect proposals we care about the host OS. The setup OS is only used when deploying new nodes. We have no proposal for the setup OS yet, since NPs are free to pick any version to deploy a new node and there is no way to prevent them from doing so. We can consider adding a proposal for the setup OS as well, but that should be considered a separate activity from the guest OS upgrades.

I will reach out to the IDX team to ask them to take a deeper look into the reproducibility failures. More than one person has reported them now, so these are no longer just internal heisenbugs; you have seen them in the wild as well. Maybe the team can get the binaries that were built by the CodeGov members, which may help with debugging and finding the root cause of the issues.

There’s one more question that I’d like to tackle here: how many reproductions do we need in order to claim a successful rebuild and verify that a build is indeed what it claims to be? Why do we need reproducibility in the first place? With reproducible builds we verify that the code does not contain any unintended changes, which means that the only thing we need is two or three independent and trustworthy rebuilds with the same sha256 sum. Do we have those? Or do you all get different results?
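One way to make the "two or three independent rebuilds" question concrete is to tally the hashes that reviewers report and see whether enough of them agree with the published one. The sketch below assumes a hypothetical mapping from reviewer name to reported SHA256 (None for a failed build) and a hypothetical threshold of two matches; neither comes from the proposal tooling.

```python
from collections import Counter


def summarize_rebuilds(reported, expected_hash, required_matches=2):
    """Summarize independent rebuild results against the expected hash.

    reported: dict of reviewer name -> hex SHA256 string, or None if the build failed.
    required_matches: assumed number of matching rebuilds needed to call it verified.
    """
    expected = expected_hash.lower()
    hashes = [h.lower() for h in reported.values() if h is not None]
    counts = Counter(hashes)
    matches = counts.get(expected, 0)
    return {
        "matches": matches,
        "mismatches": len(hashes) - matches,
        "failed_builds": sum(1 for h in reported.values() if h is None),
        "distinct_hashes": len(counts),
        "verified": matches >= required_matches,
    }
```

A distinct_hashes value above 1 indicates that reviewers are getting different results from each other, which is a different situation from everyone agreeing on a hash that differs from the published one.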

I’ll also ping @marko, as he was most active in developing the reproducible builds and the corresponding CI checks.

Have checked just now, so far:

  • 6 reviewers failed to build or match
  • 2 reviewers succeeded

You can see full thread here:

Something happened, because for months (dozens of proposals) the builds were reproducible.

One reviewer noted that the order mattered: building the latest version (v694) first and then v692 allowed v692 to pass. (I haven’t tested this though.)

Thanks for looking into this.

@tiago89 @wpb it would be great if you could try to prune podman images and then try again. In the past we had some problems with stale images being used.

In this case, it seems that both a clean install and a “cached” run resulted in failed builds.

The stale images might help with passing though.

My bot runs a “cleaner” step that is more aggressive than “podman prune”.

It removes tmp and .cache, then runs podman container cleanup --all --rm --rmi.

Source: runner-replica/fargate_task/src/runners/cleanerRunner.js at main · CodeGov-org/runner-replica · GitHub

Since both the bot and the reviewers had problems, it leads me to think it’s unrelated to stale images.
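For readers who want to try a similar clean-up locally, here is a minimal Python sketch of the steps described above. It is not the bot's actual code (which is the JavaScript file linked as the source); the function names are hypothetical, and the directories to remove are supplied by the caller because the post does not give their full paths. It defaults to a dry run so that nothing is deleted by accident.

```python
import shutil
import subprocess
from pathlib import Path


def cleanup_commands():
    """Podman commands to run after the cache directories have been removed."""
    # This command line is quoted from the post above.
    return [["podman", "container", "cleanup", "--all", "--rm", "--rmi"]]


def clean_build_environment(cache_dirs, dry_run=True):
    """Remove cache directories and clean podman state.

    With dry_run=True nothing is deleted or executed; the planned
    actions are only returned as strings.
    """
    planned = []
    for directory in map(Path, cache_dirs):
        planned.append(f"remove {directory}")
        if not dry_run:
            shutil.rmtree(directory, ignore_errors=True)
    for command in cleanup_commands():
        planned.append(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=False)
    return planned
```

Calling clean_build_environment with dry_run=False performs the deletion and runs podman, so it is worth inspecting the returned plan first.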

Also, please see below the full logs of the bot.

It might have details that help.

The CodeGov neuron has voted to reject proposal 127692 based on consensus of our voting members who are configured as Followees. There are a variety of reasons why we voted to reject, which I will summarize later today when all the reviews are complete.

Reviewers for the CodeGov project have completed our review of these replica updates.

Proposal ID: 127692
Vote: REJECT
Full report: CodeGov community Replica Version Management Reviews channel on OpenChat

Proposal ID: 127694
Vote: REJECT
Full report: CodeGov community Replica Version Management Reviews channel on OpenChat

At the time of this comment on the forum there are still 2 days left in the voting period, which means there is still plenty of time for others to review the proposal and vote independently.

We had several very good reviews of the Release Notes on these proposals by @Zane, @cyberowl, @ZackDS, @massimoalbarello, @ilbert, @Gekctek, and @hpeebles. The IC-OS Verification was also performed by @jwiegley and @tiago89. I recommend folks take a look and see the excellent work that was performed on these reviews by the entire CodeGov team. Feel free to comment here or in the thread of each respective proposal in our community on OpenChat if you have any questions or suggestions about these reviews.

As mentioned previously, we had 3 reviewers who could not reproduce the build for the SetupOS and 4 reviewers who had complete build failures and could not verify any of the hashes that are included in the IC-OS verification script for proposal 127692. However, there were 2 reviewers who were successful. We understand that the primary objective is to verify the GuestOS, but most of our reviewers opted for a more conservative position by voting to reject 127692 due to the inconsistencies that we were observing.

It is also noteworthy that @hpeebles, @ilbert, and @massimoalbarello commented that there were many commits included in this release that we have seen in previous proposals. @massimoalbarello noticed that at least one of these prior commits was not preceded by a proposal to revert the previous changes. Hence, it raises questions and some concern about whether or not there was a mix-up in the release notes or which commits should be included in this release.
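A quick way to check for commits that already appeared in earlier release notes is to extract the bracketed commit hashes from two sets of notes and compare them. The Python sketch below assumes the notes use the nine-character "[hash]" prefix seen in this thread; the function names are hypothetical.

```python
import re

# Matches abbreviated commit hashes such as [ae14ceb79]; date-like
# brackets such as [2024-02-06-1029] are not matched.
COMMIT_PATTERN = re.compile(r"\[([0-9a-f]{9})\]")


def commit_hashes(release_notes):
    """Extract the abbreviated commit hashes that appear in square brackets."""
    return set(COMMIT_PATTERN.findall(release_notes))


def compare_release_notes(previous_notes, current_notes):
    """Return (repeated, new) commit hashes of current relative to previous, sorted."""
    previous = commit_hashes(previous_notes)
    current = commit_hashes(current_notes)
    return sorted(current & previous), sorted(current - previous)
```

A non-empty "repeated" list does not by itself prove an error, but it flags entries that a reviewer may want to check against the previously elected version.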

For proposal 127692, we had 2 vote Adopt and 7 vote Reject, which was sufficient to reach consensus to reject.

It was also observed that the similarities between these proposals (since 127694 just enables one additional feature relative to 127692) made several reviewers (@Zane, @cyberowl, @tiago89) uncomfortable adopting proposal 127694 even though the ic-os verification passed. Several other reviewers (@ZackDS @massimoalbarello @ilbert) adopted 127694 while also expressing concern that the right answer might be to reject. I think @Zane made a good argument in his description of how he voted when he said “as other reviewers have reported I’ve had issues consistently reproducing the hashes for SetupOS. During the first attempt I got mismatching hashes, Interestingly, after building proposal 127694, I retried and, this time, all images validated successfully. Due to this inconsistencies and considering there is no urgency in pushing this build on mainnet, I’ve voted to reject it.”

For proposal 127694, we had 6 vote Adopt and 3 vote Reject, which was not sufficient to reach consensus (since we have 12 Followees configured for this proposal topic). Hence, I have taken the action to manually vote to Reject this proposal after giving consideration to the feedback provided by our reviewers. This is the more conservative approach. In almost all cases the choice to Adopt a Replica Version Management proposal is very easy because the CodeGov reviewers are unanimous. In this case, there were too many discrepancies for us to sign off on these proposals.
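The counts reported for both proposals are consistent with a following rule in which a neuron adopts once more than half of its followees adopt, and rejects once at least half of them reject. The Python sketch below encodes that rule as an assumption for illustration; it is not taken from the NNS source code, and the function name is hypothetical.

```python
def followee_decision(adopt_votes, reject_votes, total_followees):
    """Return 'adopt', 'reject', or 'undecided' for a following neuron.

    Assumed rule: adopt if more than half of the followees adopt;
    reject if at least half reject (a majority to adopt is then out of reach).
    """
    if adopt_votes + reject_votes > total_followees:
        raise ValueError("more votes than followees")
    if 2 * adopt_votes > total_followees:
        return "adopt"
    if 2 * reject_votes >= total_followees:
        return "reject"
    return "undecided"
```

With 12 followees, followee_decision(2, 7, 12) gives "reject" and followee_decision(6, 3, 12) gives "undecided", which matches the outcomes described for 127692 and 127694.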

We hope that folks will take a closer look at our reviews for this proposal here (127692) and here (127694).

Since SetupOS is used by node providers to streamline hypervisor and virtual machine installation, I suppose a SetupOS version mismatch is not as critical.

However, since there can be a many-to-many relationship between HostOS and GuestOS versions, and presumably newer versions of HostOS may have better security and other guarantees at the hypervisor level, I have an indirect question.

Is any impact analysis done for installing a newer GuestOS on an older HostOS? The real intent of this question is: with which HostOS version is the proposed GuestOS tested?

Hello there!
After thorough examination, it has come to our attention that these proposals contain incorrect Release Notes, which could potentially lead to confusion and misalignment within our ecosystem.
In light of this, DFINITY is proposing that we reject the current proposals (127694, 127692).
Furthermore, to address the issues identified and ensure that we are moving forward with clarity and precision, we are preparing to introduce two new proposals as replacements for (127694, 127692). These proposals will exclusively include changes from the latest rolled-out RC (release-2024-01-25_14-09), ensuring that all modifications are accurately reflected and communicated.
We kindly ask for your support in this matter by voting to reject the current proposals (127694, 127692) with incorrect Release Notes. Your participation is invaluable as we strive to make decisions that best serve the interests of our community and the ongoing development of the ICP ecosystem.
Thank you for your attention to this important matter.

Hello there!

We are happy to announce that replacement proposals for (127694, 127692) are ready for voting.


The first NNS proposal is here: IC NNS Proposal 127706.
Here is a summary of the changes since the last release:

Features:

  • [65d3fcad4] Consensus(ecdsa): tECDSA signer component for reduced latency
  • [40ef17f4d] Consensus(ecdsa): Payload builder/verifier for improved tECDSA latency
  • [735ebc8ef] Consensus(ecdsa): Populate key_unmasked_ref in PreSignatureQuadrupleRef
  • [aab055b89] Consensus(ecdsa): Move new tECDSA priority function behind feature flag
  • [1dd119090] Consensus: Add hard bound on notarization/certification and notarization/CUP gap
  • [8d45d7035] Crypto: accept also unmasked random origin of kappa in PreSignatureQuadruple
  • [f1b703465] Crypto: Add RandomUnmasked transcript operation type
  • [895ee8bcb] Execution,Runtime: Canister can use itself as store for chunked install
  • [9ad470e83] Execution,Runtime: Metric for large intra subnet calls
  • [939a5e672] Execution,Runtime: Add fetch_canister_logs simple query call stub
  • [9ff6758fe] Execution,Runtime: Change instruction buckets to be more granular for install_code messages
  • [e841826d5] Execution,Runtime: Add log_visibility logic when checking whether an ingress message should be accepted
  • [ed10b3961] Node: Remove old vsock code
  • [96127923c] Node: Update config.ini with ipv4 values

Bugfixes:

  • [e2808ad89] Boundary Nodes,Node: remove content type nosniff
  • [c4ac0218b] Boundary Nodes,Node(custom-domains): nginx statement order
  • [402549862] Consensus(artifact_pool): Remove warning if unvalidated artifact to be removed doesn’t exist
  • [31ade35ab] Consensus: use a proper key type for the ECDSA pool keys
  • [67730a29e] Consensus: Stop ingress selector prematurely removing canisters from selection
  • [6fed0c308] Consensus: Fix parameter order in CUP maker error log
  • [e30aaaf82] Execution: only reveal cycles top up balance of frozen canisters to controllers
  • [b34272ac0] Execution,Message Routing: Improve fairness of the Ingress Messages to Management Canister
  • [dfcdd4e90] Execution,Runtime: Query Cache: Avoid caching errors
  • [a8a1f93e7] Execution,Runtime: Query Cache: Expire data certificate after 1 min
  • [c2a3d1575] Node,T&V: Revert IPV6 prefix changes
  • [570512674] Runtime: Raise unzipped Wasm limit

Performance improvements:

  • [d205e8fb2] Crypto: extend IDKG complaint benchmarks with other modes
  • [7405dfaae] Crypto: Add support for multiplication by generator

Chores:

  • [aa735436a] Boundary Nodes,Node(boundary-node): extend rosetta list
  • [6fb24fcaa] Boundary Nodes,Node(custom-domains): clean up after SW removal
  • [a212196cc] Consensus(orchestrator): Remove unused metrics
  • [5d9e3bf5a] Crypto: Bump tarpc version to 0.34
  • [1b47dee89] Crypto: Avoid using use ic_types::*
  • [0026fd4e2] Crypto: Remove dependency on ic-types in seed crate
  • [40717aafc] Crypto,Interface(crypto): Add AlgorithmId for threshold BIP340 Schnorr
  • [5280b28d9] Execution,Runtime: Fix ic00_permissions comment for fetch_canister_logs
  • [9b5356b65] Execution,Runtime: Cleanup execute_subnet_message match statement
  • [d2bfbb4cd] Execution,Runtime: Add fetch_canister_logs feature flag
  • [fcc524f77] Message Routing: Metric for number of files in checkpoint
  • [1e90c37f4] Message Routing: Better mr_routed_payload_size_bytes description
  • [c65c46810] Networking: build call service with builder and expose it
  • [9514174fd] Node: Replace libusb with rusb
  • [fba6c0ab4] Node: Remove ipv4 nameserver propagation and hard-code values into generate_network_config.rs
  • [252d3c1c0] Runtime: Combine replica controller crate
  • [15418fb44] Runtime: Combine launcher binary with sandbox crate
  • [89302e259] Runtime: Combine sandbox binary and backend_lib crates

Refactoring:

  • [a4895b9db] Execution,Message Routing: Provide a more efficient ReplicatedState::message_memory_taken() method
  • [0e23b0a01] Execution,Runtime: Optimize and simplify heartbeat and global timer scheduling
  • [76804dbde] Networking: rename some types to match better the intent

Tests:

  • [f8b728841] Consensus(ecdsa): Add more unit tests for signer with improved latency
  • [0794a082d] Consensus: fix assert_consistency test utility function
  • [a41412346] Crypto: add local vault tests for idkg_load_transcript
  • [0c216754a] Crypto: remove large flag from crypto integration tests
  • [4864e7f44] Crypto: move crypto integration test utils into crates
  • [496008542] Crypto: Add test that IDKG transcript sizes are as expected
  • [e5b319595] Crypto: fix cargo test for remote vault tests
  • [a450f4da2] Crypto: add local vault tests for idkg_create_dealing()
  • [d35c2a7e1] Crypto: Mark NIDKG transcript size test as #[ignore]
  • [6cddb86d8] Crypto: better names for Nodes::receivers and some other fns
  • [e1ebc89a9] Crypto: split remote vault integration tests
  • [09f53f4c3] Crypto: extend IDKG test infrastructure
  • [f6a72969b] Crypto: add EccPoint deserialization fuzzer
  • [82c6a46e5] Crypto: Add a cheating dealer test
  • [df4d8328d] Execution,Consensus: System Tests for QueryStats feature
  • [2ad42420b] Execution,Runtime: Enable DTS by default in execution and scheduler tests
  • [3beed4393] Execution,Runtime(run-895): Add tests to check whether the DTS slicing for dirty memory copy is disabled for system subnets.
  • [efffe3bbd] Execution,Runtime: Add more fetch_canister_logs tests to verify submitting ingress messages
  • [0014352d0] Message Routing: State machine based LSMT test
  • [125126042] Networking(http-endpoint): Refactored helper function to start endpoint for integration tests
  • [058672d9b] Node,T&V: Pass IPv6 prefixes via test driver

Other changes:

  • [5b07a6bd7] Boundary Nodes,Node: disable throttling for now
  • [86c70dfde] Consensus,NNS,T&V,Node(ipv4-for-nodes): make orchestrator pick up IPv4 config
  • [2710b1a8a] Crypto: remove obsolete dependency on ic-crypto
  • [7a512af09] Crypto: remove unused dependencies in crypto code
  • [7b65c7a6e] Crypto: remove obsolete getrandom-for-wasm dependency from non-canister code
  • [e7c2f86a2] Execution,Message Routing,Interface: Assume that Callback::originator, respondent and prepayment_.* are always present
  • [563434696] Execution,Runtime: Add new metric for advance install code executions
  • [ad2d9b3b9] Message Routing,Execution: Save Api Boundary Nodes in ReplicatedState metadata
  • [ef23f44ad] Networking,NNS: remove various unused dependencies
  • [58aa6d6cb] Node: Updating container base images refs [2024-02-06-1029]
  • [f7c33787c] Node: Updating container base images refs [2024-02-01-0814]
  • [cb783e1aa] Node: Clean up systemd services
  • [bdd6081e6] Node: Adjust nftables ratelimits
  • [5021a28c2] Node: Updating container base images refs [2024-01-31-0730]
  • [390eab8b4] Node: Bump vsock version
  • [3d7899f69] Node: Updating container base images refs [2024-01-25-0815]
  • [99bffe617] Node,T&V: Update firewall configuration through SetupOS and HostOS

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/8d4b6898d878fa3db4028b316b78b469ed29f293/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 8d4b6898d878fa3db4028b316b78b469ed29f293

The two SHA256 sums printed by the script, one for a) the downloaded CDN image and one for b) the locally built image, must be identical and must match the SHA256 in the payload of the NNS proposal.
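
For anyone who wants to double-check the comparison by hand, here is a minimal sketch of the same three-way check. It is not part of repro-check.sh: the function names, the image file names, and the way the proposal hash is passed in are all assumptions made for illustration. It only relies on the standard sha256sum tool.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: sha256_of and check_images are hypothetical helpers,
# not functions provided by repro-check.sh.

# Print only the hex digest of a file; fail if the file cannot be hashed.
sha256_of() {
  local out
  out="$(sha256sum "$1")" || return 1
  printf '%s\n' "${out%% *}"
}

# Succeed only if both images hash to the same value AND that value
# equals the hash copied from the NNS proposal payload.
check_images() {
  local cdn_image="$1" local_image="$2" proposal_hash="$3"
  local cdn_hash local_hash

  cdn_hash="$(sha256_of "$cdn_image")" || return 2
  local_hash="$(sha256_of "$local_image")" || return 2

  if [ "$cdn_hash" != "$local_hash" ]; then
    echo "MISMATCH: CDN image and local build differ"
    return 1
  fi
  if [ "$cdn_hash" != "$proposal_hash" ]; then
    echo "MISMATCH: images do not match the proposal hash"
    return 1
  fi
  echo "OK: $cdn_hash"
}

# Example call (paths and hash are placeholders):
# check_images ./cdn-image.tar.gz ./local-image.tar.gz "<sha256 from the proposal>"
```

The function returns non-zero on any mismatch, so it can be chained with `&&` in a larger verification script.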


The second NNS proposal is here: IC NNS Proposal 127707.

Features:

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/3e25df8f16f794bc93caaefdce41467304d1b0c7/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 3e25df8f16f794bc93caaefdce41467304d1b0c7

The two SHA256 sums printed by the script, one for a) the downloaded CDN image and one for b) the locally built image, must be identical and must match the SHA256 in the payload of the NNS proposal.


@wpb @tiago89 we double- and triple-checked this time, and the same version is now reproducible. The IDX team will follow up separately with an explanation of what happened. Spoiler: one file in GuestOS had a timestamp that was off by +2 seconds, for a reason nobody can explain yet. There were no other differences. They are still looking into this.

The current plan is to have these follow-up proposals adopted on Wed/Thu and to roll out these versions by the end of the week. After that, we hope to return to the regular release cycle.


Yes, and they are. Almost all CodeGov reviewers have already attempted the build, and all of them succeeded (as usual).

Thanks for all your attention :pray:
