Voting for a new IC release - 4ee6971

Hello there!

We are happy to announce that voting is now open for a new IC release .
The NNS proposal is here: IC NNS Proposal 125342 .

Here is a summary of the changes since the last release:

  • [1f1f92959] Consensus: feat(consensus,backup): Remove hashes from the backed up artifact names
  • [daeb71b7f] Consensus: fix: Use the same minimum chain length for consensus and certification
  • [6d3be90c1] Crypto: chore: Correct renamed function name in docs
  • [930bf0eda] Crypto: chore: Fix typo in function name
  • [3730aae62] Crypto: feat(cketh): Limit number of parallel operations in withdrawals ckETH → ETH
  • [c3db1dc4c] Crypto: feat: Only acquire SKS write lock when necessary for NI-DKG retain
  • [ad4d94cbf] Crypto: perf: Faster verification of discrete logarithm equality and product proofs
  • [61231a3ab] Crypto: perf: Take advantage of k256 MulByGenerator
  • [255265d14] Crypto: refactor: add DomainSeparator instead of many constants and check for duplicates/prefixes
  • [b7deb9cd7] Execution: chore: clarify log message
  • [77659e877] Message Routing: chore: Pass overlay files to PageMap::open
  • [156f47297] Message Routing: feat: hash_of_map supports nested Map
  • [db7cd6c94] Networking: build: use the tokio version from the workspace
  • [6e83a17ea] Networking: chore(p2p-consensus): Disable Pushing of artifacts that are relayed.
  • [a152fa88d] Networking: chore(p2p-consensus): Skip missed ticks in event loop
  • [297f119a1] Networking: chore: remove the CanisterHttpAttribute and use the full share as the key of the pool
  • [896c9017b] Networking: chore: remove the optimization that delays the addition of validated DKG artifact to the block
  • [bf7135051] Networking: chore: update the two dashboards to have the same json model as the prod dashboards
  • [e4e3b1c5f] Networking: feat(state_sync_manager): prefer peers with fewer outstanding chunk requests
  • [221b22685] Networking: fix(p2p-consensus): Populate send view on startup from validated pool
  • [d71613ced] Networking: fix(transport): Don’t hold peer state lock across TLS handshake
  • [2e3589427] Networking: fix: Revert “dbf1b2039d chore: Remove the filter templating and use directly the height”
  • [d84a1cecd] Networking: fix: Revert “feat(consensus): Don’t relay shares during retransmissions”
  • [687d7830d] Networking: fix: don’t expose publicly the production NNS_SUBNET_ID
  • [62401dd27] Networking: refactor: rename CertifiedStateReader to CertifiedStateSnapshot
  • [ac9caa930] Node: Add orchestrator task watching for HostOS upgrades
  • [d3a004e01] Node: Avoid a div by zero during upgrades
  • [eda91b67b] Node: Chore: Remove unnecessary ARM firmware
  • [82f2c35cb] Node: HostOS upgrade follow-ups
  • [d404e2885] Node: Ignore errors from old VSOCK code
  • [617325f64] Node: Wipe more headers when clearing disks in SetupOS
  • [a23727abb] Node: feat(): Ensure determinism in tar commands
  • [840e08501] Node: feat(): Remove pcsc-tools package
  • [7353e3e80] Runtime: , Change arguments of system_api host functions to unsigned
  • [c812b557f] Runtime: Enable new instrumentation on verified app subnets.
  • [a5e741d51] Runtime: Implement InstallChunkedCode
  • [bae0f8a22] Runtime: Remove unnecessary panic in wasmtime embedder
  • [15da2e805] Runtime: Track query stats for each canister individually when doing composite query calls
  • [074f6d16f] Runtime: fix: Set the weight of a Wasm function prologue to 1 instead of 0
  • Various tech-debt management: code refactoring, docs, bug fixes, test updates

Link to the forum post: Voting for a new IC release - 4ee6971

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/4ee6971d99024a38e06e4ab72119d0139dd6dd85/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c 4ee6971d99024a38e06e4ab72119d0139dd6dd85

The two SHA256 sums printed above from a) the downloaded CDN image and b) the locally built image, must be identical, and must match the SHA256 from the payload of the NNS proposal.

1 Like

And a follow-up proposal for the build with QUIC-enabled p2p layer:

1 Like

Reviewers for the CodeGov project have completed our review of these replica updates.

Proposal ID: 125343
Vote: ADOPT
Full report: CodeGov portal on DSCVR

Proposal ID: 125342
Vote: ADOPT
Full report: CodeGov portal on DSCVR

Neuron: CodeGov
NeuronID: 2649066124191664356
Voting history: Dashboard
Website: codegov.org

At the time of this comment on the forum there are still 2 days left in the voting period, which means there is still plenty of time for others to review the proposal and vote independently.

Further summary details will be posted soon…

1 Like

Hey @sat there may be several improvement opportunities that were identified by the CodeGov team. We adopted both proposals, but there is at least one item that we want to bring to your attention could be an important issue worth escalating. We recommend that DFINITY assess further and consider rejecting these proposals. One of our reviewers, @cyberowl, REJECTED proposal 125342 because of an error he found in the code of a consensus update that he felt could be a significant issue. There were several other issues identified with this particular commit by several CodeGov reviewers.

image

Please see the list below for all improvement opportunities identified by the CodeGov team.

We also noticed that DFINITY picked up several if the issues we identified last week and corrected them this week. That was nice to see.

[1f1f92959]
Category: Consensus
Comment by @cyberowl

  • It worries me that block_proposalbin is wrong. It should be block_proposal.bin . I don’t know the implications but need to be extra harsh because it is under consensus topic.

Comment by @Zane

  • it appears there is a typo at line: 1553 , block_proposalbin , should be block_proposal.bin

Comment by @ilbert

  • The backup helper now passes --ignore-existing to rsync command. This skips updating files on the receiver. Not clear how this relates to removing hashes from artifact names.

Comment by @massimoalbarello

  • each artifact at a given height is stored in a bin file named with only the artifact specific name. However, they did not update the documentation of the file_location function accordingly.

[2e3589427]
Category: Networking
Comment by @Zane

  • This appears to be identical to c2e5826a7 from last week, no idea why this commit has been reverted twice. The expanded commit description also seems to confuse this revert with the one in d84a1cecd

Comments by @ilbert

  • The MR message is wrong. This MR is reverting dbf1b20 (even if it was already done in the previous proposal, see c2e5826a7 not d1ac316 as stated).
  • It’s not clear why dbf1b20 commit has been reverted two times now.

[d84a1cecd]
Comments by @ilbert
This MR reverts d1ac316 as stated instead. Not clear why this commit has been reverted though.

[3730aae62]
Comments by @ilbert

  • Not explained why we need to limit the parallel operations of the cketh minter canister, but I believe to have some sort of “rate limiting”.
  • I wasn’t able to find the piece code that does what’s stated in the 3.4 bullet point: The frequency of the timer will double (3min instead of 6min) as long as there is still work to do.

[62401dd27]
Comments by @ilbert

  • The variables at lines: R167, R169, R144, and R146 should be called *snapshot instead of *reader.
8 Likes

After giving it more thought, I believe the block proposal typo in 1f1f92959 is actually intentional, the subsequent assert! statement is only satisfied if the path doesn’t exist, which it shouldn’t due to the incorrect argument. However, it seems that the error messages have been mixed up.

4 Likes

Just to bump this out of curiosity since it was already executed it looks like nothing critical since the example they gave from testing seems to work as intended, I would guess it’s more of a refactoring typo since the assert!() macro uses the !proposal_path.exists(), and would fail, indicating that the backup of the non-final proposal failed in case the file exists.
So yeah feedback much appreciated @sat . Thanks

1 Like

I reached out to the component teams, it’s best if they respond here directly. Thanks for the pedantic review, keep up the good work!

3 Likes

About commit 3730aae commented by @ilbert

  • Not explained why we need to limit the parallel operations of the cketh minter canister, but I believe to have some sort of “rate limiting”.

The number of operations involved for withdrawals is rate-limited to ensure that there is not a huge queue of transactions that will take a very long time to be processed. This is because each withdrawal involves a threshold ECDSA signature + 1 HTTP outcalls to send the transaction via JSON-RPC.

  • I wasn’t able to find the piece code that does what’s stated in the 3.4 bullet point: The frequency of the timer will double (3min instead of 6min) as long as there is still work to do.

This point refers to this line in particular, where the interval for retrial defined by PROCESS_ETH_RETRIEVE_TRANSACTIONS_RETRY_INTERVAL is 3 minutes (instead of 6 minutes for the “regular” interval PROCESS_ETH_RETRIEVE_TRANSACTIONS_INTERVAL).

Thanks for the careful review! Don’t hesitate to let me know if some things are still unclear related to this topic.

4 Likes

Hi folks, I would like to address the question related to the commits in the “Networking” category.

As @ilbert correctly commented, the MR messaging indeed was wrong for 2e3589427. Indeed the MR is reverting dbf1b20. I apologise for the mistake.

As with respect to why it is reverted twice. We had problems with release qualification. So we had to revert the offending MR both on the branch that we wanted to rollout out and on the branch that is used for releasing new features.

Please let me know if you have more question,

Rosti

5 Likes

Hi Wenzel, thank you for highlighting the issues and well done on establishing a productive external review group! :pray:

Regarding the finding from @cyberowl: nice catch! There is no bug in the actual consensus code, but the test was passing for a wrong reason. We, the consensus team, fixed the test already: the fix will appear soon on the public github repository.

Thanks again, please keep up this great work!

5 Likes