Voting for a new IC release - 2024-02-21_23-01

Hello there!

We are happy to announce that voting is now open for a new IC release. The NNS proposal is here: IC NNS Proposal 128087.

Change log since git revision 8d4b6898d878fa3db4028b316b78b469ed29f293

Features:

  • [26f30f055] Consensus: Purge non-finalized blocks and notarizations below the finalized height
  • [b4673936a] Consensus(ecdsa): Make key_unmasked_ref in PreSignatureQuadrupleRef required
  • [b733f7043] Consensus(ecdsa): Extend Quadruple state machine in preparation for random unmasked kappa
  • [6a4d8962c] Consensus(ecdsa): Make masked kappa config optional
  • [e76c5a374] Consensus(ecdsa): Stop relaying tECDSA signature shares
  • [2d63da24c] Consensus(ecdsa): Add optional kappa_unmasked config to QuadrupleInCreation
  • [24bda32f6] Crypto: Add Ed25519 utility crate
  • [473094dd3] Crypto: construct vault directly in the crypto component
  • [f74f81d40] Execution: implement saving canister log records in sandbox safe system state
  • [f61d37df1] Execution,Message Routing: Metric recording call durations
  • [7f96d2c24] Execution,Message Routing: Snapshot related types
  • [42086140a] Execution,Runtime: Query Cache: Support composite queries
  • [1b6d16dc0] Execution,Runtime: Query Cache: Track nested execution errors
  • [c991c9974] Execution,Runtime: Bump canister version on start/stop
  • [d2ac9f79c] Message Routing: Certify api boundary nodes in the state tree
  • [04fb976a1] Node(icos): Open up public telemetry port.
  • [50483cbe9] Runtime(fuzzing): Generate exports for wasm-smith
  • [636d33426] Runtime,Execution: collect canister logs for update and replicated query calls
  • [a888d6229] Runtime,Message Routing: Use single range overlays as base

Bugfixes:

  • [a04186acb] Consensus(ecdsa): Fix delivered quadruples metric
  • [f7832c20a] Consensus(ecdsa): Don’t replenish quadruples that were matched to signature requests for disabled keys
  • [d7569f06b] Consensus(guestos): initialize basic firewall upon ic-replica start.
  • [a21041c05] Consensus(ecdsa): Fix log prefix in get_completed_signature_from_context
  • [888f0ac06] Execution,Runtime: Fix composite query freezing threshold
  • [0fcf6c16f] Execution,Runtime: Change stored_chunks return type
  • [bbb03a280] Execution,Runtime(fuzzing): disable dts for execute_subnet_message_update_settings fuzzer
  • [224d7c73a] Networking: keep running the old P2P event loop that is responsible for maintaining correct transport connections
  • [0412fb6d7] Networking: log common errors once every minute
  • [5ab50b293] Networking: fix wrong error propagation
  • [c239950fc] Networking: use a tokio channel for the unvalidated artifacts
  • [a27b2f9ed] Node: Freeze clock when using faketime
  • [b5e0d6787] Node: IPv4 - IC-OS Tools: Prefer ‘domain’ over ‘domain_name’ to match all other arg names
  • [ead28e26a] Node: Add ability to inject ipv4 info into SetupOS image
  • [cc64e58eb] Node: IPv4 connectivity check
  • [012349154] Runtime: Allow export of imports

Performance improvements:

  • [e6ccca2e9] IDX,Crypto(crypto): add a pin_cpu flag to rust_bench and set to True in crypto benches
  • [b63589213] Runtime,Message Routing: Do not open overlay files twice

Chores:

  • [245c4878e] Boundary Nodes,Node(ic-boundary): add configuration at deploy
  • [f15d7613c] Consensus: change the visibility of several structs/enums/functions in /rs/consensus/src/ecdsa
  • [7512e739e] Consensus: when deleting an artifact from an LMDB pool, use cursor::get(last) instead of cursor::iter::last to get the last element in the database.
  • [63384bde1] Consensus: Adapt the signatures of some of functions to support multiple keys
  • [dc4f7ad58] Consensus: allow duplicates of the same block payload in the lmdb pool & remove the associated block payload when we remove a block proposal from the lmdb pool
  • [1742b201a] Consensus(ecdsa): Remove get_refs_and_update
  • [eff778a6e] Consensus: remove some From implementations for BlockPayload
  • [6464b3cb9] Crypto: remove obsolete visibility targets
  • [579b55479] Crypto: fix mistakes in IDKG benchmarks
  • [b07fff63c] Execution: Query Cache: Remove ic0.call_perform tracking
  • [accf7f40c] Execution,Message Routing: add storing canister_log_records to canister’s system_state
  • [8abd00f00] Execution,Message Routing: Fix the typo in IngressQueue comment
  • [051a5b1df] Execution,Runtime: Make canister logging feature flag more generic
  • [e33ff2842] Execution,Runtime: Query Cache: Tests with subnet messages
  • [defbdf5f9] Execution,Runtime: Measure the state height difference during query scheduling
  • [c3de2db62] Message Routing: Make members of StreamHeader private.
  • [b7c2aa9eb] Message Routing: Ensure strictly monotonic batch times
  • [deebcd984] Networking: use thiserror instead of strum
  • [5efe31e25] Networking(p2p): Add graceful shutdown to P2P sender
  • [45e7f0875] Networking: remove the timeout in the P2P/state sync
  • [11a3ded1d] Node: - Remove SEV-SNP from ic-os
  • [226246b0a] Node: Remove ipv6 name server propagation and hard-code values into networking code
  • [1059c10d0] Runtime: add method to save log messages when calling ic0.debug_print

Refactoring:

  • [0de37de38] Crypto: cleanup CSP constructors
  • [46399e04d] Crypto: initialize setup variables only when IDKG benchmark is requested
  • [2af225e49] Crypto: remove obsolete CryptoComponent::new_with_fake_node_id
  • [ce270ec25] Execution,Message Routing: Faster ReplicatedState::output_into_iter()
  • [23cfc89de] Interface,Networking: restrict the usage of unbounded channels
  • [de89c122c] Message Routing: Pass necessary functionality of StateLayout into Storage
  • [40858329f] Message Routing,Runtime: Simplify MemoryInstructions; get rid of IndexEntry
  • [acf4af3d0] Networking(p2p-state_sync): Removed duplicate cancellation token

Tests:

  • [88fd0dd5c] Consensus(ecdsa): Add unit tests for tECDSA payload builder with improved latency
  • [a37afc595] Consensus,Execution: remove mock_time() test function
  • [9106b6b07] Crypto: add crypto vault tests for idkg_load_transcript_with_openings
  • [289d08664] Crypto: add more tests for idkg_retain_active_keys
  • [9887b2009] Crypto: add tests of decoding to clib’s tecdsa
  • [125d88c5e] Crypto: add RandomUnmasked to complaint tests
  • [c40b1e684] Execution,Message Routing: Add roundtrip encoding tests for canister logging types
  • [61ae4eba0] Execution,Runtime: Use test_strategy for all EE proptests
  • [b740b0150] T&V,Message Routing: subnet IDs in StateMachine tests derived from the subnets’ public keys

Other changes:

  • [e7d3b50e5] Boundary Nodes,Networking,Crypto,IDX,T&V: use workspace versions for the rustls, tokio-rustls and hyper-rustls crates
  • [905f714de] Boundary Nodes,Node: 5s for 404 responses, enlarge item size, default timeouts for lock
  • [c93f9b18d] Consensus: Revert “fix(guestos): initialize basic firewall upon ic-replica start.”
  • [1e14f3c87] Execution,Runtime: Show dirty page info per message type
  • [f3d614b6e] Execution,Runtime,NNS: Rename ic00_types to management_canister_types
  • [6dfa13163] Networking: upgrade zstd and use workspace version for some p2p deps
  • [178fb9795] Networking,Execution,Message Routing: Add canister snapshots to ReplicatedState
  • [837725d65] Networking,NNS: upgrade the crossbeam crates
  • [d8e7e87ac] Node: Revert “Merge branch ‘base-image-refs-update-2024-02-15-0814’ into ‘master’”
  • [be03351ed] Node: Updating container base images refs [2024-02-15-0814]
  • [b51312bf7] Node: Updating container base images refs [2024-02-08-0815]
  • [1579c4614] Node,T&V,IDX: Revert “Merge branch ‘eero/re-apply-firewall’ into ‘master’”
  • [630f80e17] Node,T&V,IDX: Re-apply reverted firewall changes
  • [cd2cedacd] Runtime: feat:, Compilation in a separate proceess
  • [170c5bd4b] Runtime,NNS,Financial Integrations: bump Rust version to 1.76.0

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/2e921c9adfc71f3edc96a9eb5d85fc742e7d8a9f/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 2e921c9adfc71f3edc96a9eb5d85fc742e7d8a9f

The two SHA256 sums printed above from a) the downloaded CDN image and b) the locally built image, must be identical, and must match the SHA256 from the payload of the NNS proposal.

And this week we also have another special build with the new p2p consensus layer enabled. This feature was deployed last week to a number of subnets and performed very well.

Hey @Luka there were actually 3 Replica Version Management proposals (128086, 128087, and 128088) submitted to the NNS to change the replica. However, your forum post only references proposals 128087 and 128088.

You indicated that proposal 128087 is the new IC release and references the change log since git revision 8d4b6898d878fa3db4028b316b78b469ed29f293. That doesn’t seem to match the details of proposal 128087. However, it does match the details of proposal 128086. It seems like proposal 128087 might be a mistake and that proposal 128086 is the actual proposal for the new IC release.

Also, normally when we elect a new IC replica, we also retire old replicas. That is consistent with proposal 128086, but not proposal 128087.

Would you please explain further in case I am missing something? I’m inclined to recommend to the CodeGov team that we reject proposal 128087 and that we review proposals 128086 and 128088. Please advise if you think that would be an incorrect recommendation.

image

4 Likes

The proposals 128086 and 128087 are electing the same version. The key differences between them is that 128086 doesn’t have fallback update URL, and 128087 has a copy/paste error which includes text for both its release and the release proposed in 128088. (you can that there are 2 features sections and second one includes the same text as 128086.

Change log since git revision 8d4b6898d878fa3db4028b316b78b469ed29f293

Additionally, you’re right that proposal 128087 doesn’t retire replicas. This is suboptimal, but we can place another proposal to retire versions or retire them on the next release. The reason they’re missing from this proposal is that tooling already parses out active proposals and skips retiring versions that were proposed to retire in proposal 128086.

In conclusion - 128086 is not usable because it’s missing fallback URL, 128087 is usable but has an extra description (copy/pasted from next release in 128088 proposal) which doesn’t accurately describe the release, and doesn’t retire old versions which we can mitigated by placing another proposal.

Two possible solutions to resolve the issues we have here are: (1) adopt 128087 and place another proposal to retire old versions, or (2) reject 128087 and place another proposal that fixes all the issues. Proposal 128086 has to be rejected in any case.

1 Like

Thanks for the explanation @Luka. Unfortunately, it appears the CodeGov neuron has reached consensus on all three proposals already and our vote has been cast.

It appears we made the incorrect choice for proposal 128086 because we voted to adopt. I will summarize the reviews later when they are all complete, but I suspect that none of us realized that the second url is a fallback url. I don’t recall the functional purpose of it being discussed in the past among our team. I do recall thinking it was odd that there are two release package urls and that they are the same url, but never realized the payload actually needs both. I now realize this is something that I should have asked for clarification about previously, especially since it is information in the payload. Fortunately, DFINITY can and will still reject this proposal.

We voted to reject 128087. I would have been uncomfortable adopting 128087 anyway since the summary is not accurate due to the copy/paste error you described (the wrong change log since git revision and the claim that it implements the p2p for consensus). Hence, I’m glad we did reject that one in this case.

At this point, my suggestion is that both 128086 and 128087 should be rejected and a new proposal submitted to correct the issues.

Our vote on proposal 128086 highlights a flaw in how the CodeGov neuron is set up. We’ve known about this issue and have plans to mitigate it programmatically in the future. We are building an app that we will use as our primary tool for submitting reviews, summarizing results, casting votes, paying bounties, and keeping records. We are currently calling this the GaaS App (Governance as a Service), which will be open source for anyone to copy and will also be designed for other organizations to use off the shelf independent of CodeGov. Anyway, the voting mechanism that we plan to implement will account for the issue we experienced here where a small minority of reviewers identify an issue with a proposal that justifies a reject from the CodeGov neuron, but too many other reviewers have already cast their irreversible votes. Since our reviewers are configured as Followees for the CodeGov neuron, we currently cast our vote according to the consensus rules built into every neuron. A better approach would be for us to wait until all reviewers have completed their work and cast a vote after evaluating all information that comes from our reviews. It’s a more complicated voting mechanism to do it programmatically and we don’t have that feature available yet. We could do this manually at this time, but so far have elected to stick to automatic voting. It’s something we should reconsider based on this result.

4 Likes

Reviewers for the CodeGov project have completed our review of these replica updates.

Proposal ID: 128086
Vote: ADOPT
Full report: CodeGov community Replica Version Management Reviews channel on OpenChat
SPECIAL NOTE: After the CodeGov neuron voted by consensus of our Followees, we learned that there is an error in the payload of this proposals and it will need to be rejected. Please see this comment from @Luka as well as my explanation for this vote. For anyone who has not voted yet, we recommend that you REJECT proposal 128086.

Proposal ID: 128087
Vote: REJECT
Full report: CodeGov community Replica Version Management Reviews channel on OpenChat

Proposal ID: 128088
Vote: ADOPT
Full report: CodeGov community Replica Version Management Reviews channel on OpenChat

At the time of this comment on the forum there are still 2 days left in the voting period, which means there is still plenty of time for others to review the proposal and vote independently.

We had several very good reviews of the Release Notes on these proposals by @Zane, @cyberowl, @ZackDS, @massimoalbarello, @ilbert, @Gekctek, and @hpeebles. The IC-OS Verification was also performed by @jwiegley and @tiago89. I recommend folks talk a look and see the excellent work that was performed on these reviews by the entire CodeGov team. Feel free to comment here or in the thread of each respective proposal in our community on OpenChat if you have any questions or suggestions about these reviews.

4 Likes

Hey @Luka after reading all the reviews, I can confirm that nobody commented about the lack of a fallback url. Hence, I don’t think anyone realized it is required.

After further discussion among the CodeGov team, we have decided that we will not change our neuron Followee configuration at this time. Instead, we plan for all reviewers to check for correctness of the proposal payload. Also, @tiago89 will soon modify his automated build to add several checks regarding proposal correctness that will flag when there are discrepancies. This should suffice until we get a smart contract built into the GaaS App that will offer a more robust solution.

5 Likes

Hello @wpb and the CodeGov team.

Since we haven’t had a release last week, and the Friday’s NNS proposal for electing a new IC/Replica version had quality issues, we wanted to submit a new proposal today (Monday), hoping that it would be adopted quickly enough to be able to roll it out by the end of the week. Ideally starting the rollout from Wed, if stars align properly this time.

It is essentially the same as Proposal: 128087 - ICP Dashboard, just fixing the observed issue in the proposal payload.

Thanks a lot for doing the great work that you and the CodeGov team are doing!

6 Likes

Hey @sat the CodeGov team has reached consensus on this new proposal already and our neuron has voted to Adopt. Everyone has been able to build the replica successfully and nobody has concerns about the proposal details so far. Here is a link to our results, which are still being posted.

2 Likes

Thank you @wpb, that was blazing fast! :grin:
Much appreciated!

1 Like