Proposal to elect new release rc--2024-05-22_23-01

Hello there!

We are happy to announce that voting is now open for a new IC release.
The NNS proposal is here: IC NNS Proposal 130083.

Here is a summary of the changes since the last release:

Release Notes for release-2024-05-22_23-01-base (ec35ebd252d4ffb151d2cfceba3a86c4fb87c6d6)

Changelog since git revision 5ba1412f9175d987661ae3c0d8dbd1ac3e092b7d

Features:

  • 715c74e4f Crypto: remove the legacy TlsHandshake interface
  • 4b6c64f75 Execution,Message Routing: Implement all message stats in MessagePool
  • a5bb2a8ab Interface: Enable new storage layer
  • b8c05ba56 Message Routing: Don’t use the legacy internal API for establishing TLS connection, instead use the rustls configs
  • 9e2fc26c3 Node: Consolidate rootfs utils #8
  • 9a670a31b Node,Consensus: guestos: enable bouncer in API BN
  • cbf83e6ce Runtime: Support for wasm64 in native stable64
  • cd8c75eb3 Execution: Enable canister logging feature flag
  • c55eee381 Runtime,Execution: Separate ContractViolation errors

Bugfixes:

  • b81813125 Networking: use the rustls client config instead of calling perform_tls_client_handshake
  • 2ec1dbd9a Runtime,Execution: Use saturating multiplication to calculate byte transmission cost

Chores:

  • 141e59113 Execution: Implement routing for iDKG messages and extend network topology with signing subnets
  • dc3af91f8 Boundary Nodes,Node(boundary-node): shell script cosmetics
  • 52e0587d5 Boundary Nodes,Node(boundary-node): Add rate limit for new connections and global connection limit [S3\_UPLOAD]
  • bd620bdc0 Boundary Nodes,Node(boundary-node): cleanup firewall
  • 376ecf7fd Consensus(schnorr): Populate master public key ID in EcdsaKeyTranscript
  • 1b8c37441 Consensus(ecdsa): Use new payload layouts for all newly created ECDSA payloads
  • cb5715b32 Crypto: adjust crypto_fine_grained_verify_dealing_private_duration_seconds buckets
  • e60766877 Crypto: Modify domain separator logic in threshold signatures
  • 70268d330 Execution: Remove data_certificate from replicated query
  • 13daaafe5 Interface: truncate replica logs
  • 1da1a0d27 Networking: use hyper directly in http endpoint instead of axum server
  • 9fd5bdca0 Node: Organize ic-os/cpp
  • d1944bf4a Node: Update SetupOS failure message to direct to Node Deployment troubleshooting guide
  • a2abc42c3 Node,Consensus(api-bn): remove ICMP type parameter-problem
  • 335781a96 Runtime: Split system api list for validation
  • 36ed9bc99 Runtime,Execution: Add instruction limit to the relevant error message

Refactoring:

  • d0af7abc2 Crypto: don’t pull tokio-rustls into crypto and remove unused deps from orchestrator
  • 257a8fa77 Crypto: use rustls instead the re-exported module from tokio-rustls
  • ba603cf0e Crypto: use the TlsConfig trait instead of TlsHandshake
  • 4ca4593c1 Execution,Message Routing: Wrap Callbacks in an Arc
  • 1579e0d85 Execution,Runtime: Clean up process_stopping_canisters
  • 35661b175 Networking,Message Routing: (re)move unneed rustls deps

Tests:

  • 8a3a779ec Crypto: add different message sizes in basic sig bench
  • 01ca07876 Crypto: improve parameters in IDKG reshare_key_transcript tests
  • 24adb86c3 Execution: Refactor system API availability tests
  • 6b13e7b88 Execution,Consensus(consensus): Add a query stats log message
  • 0e1b47806 Interface: Add round trip encoding test for StopCanisterContext
  • d7a771575 Networking(http-endpoint): Parameterise integration tests with rstest crate
  • db017130c Networking(http-endpoint): Remove agent-rs dependency for integration tests.
  • d522089b7 Networking(http-endpoint): Remove agent-rs dependency for read_state tests
  • 497e403b4 Networking(http-endpoint): Remove agent-rs dependency for Query tests
  • 6fe66d4f9 Networking(http-endpoint): Remove agent-rs dependency for update call tests
  • 2ed2eb131 Node: Remove config validation test

Other changes:

  • 8d09e42d8 Execution,Interface,Consensus: Remove old reject code field
  • d32e03dde Node: Updating container base images refs [2024-05-16-0816]
  • 4e221b549 Node,Consensus: () rate and connection limits in nftables for API BN

IC-OS Verification

To build and verify the IC-OS disk image, run:

# From https://github.com/dfinity/ic#verifying-releases
sudo apt-get install -y curl && curl --proto '=https' --tlsv1.2 -sSLO https://raw.githubusercontent.com/dfinity/ic/ec35ebd252d4ffb151d2cfceba3a86c4fb87c6d6/gitlab-ci/tools/repro-check.sh && chmod +x repro-check.sh && ./repro-check.sh -c ec35ebd252d4ffb151d2cfceba3a86c4fb87c6d6

The two SHA256 sums printed above from a) the downloaded CDN image and b) the locally built image, must be identical, and must match the SHA256 from the payload of the NNS proposal.

3 Likes

I see the Proposal: 130082 - ICP Dashboard has the same revision (ec35ebd2) as this one, except that it lists a bigger amount of commits and has a different build hash.

Can someone explain in more detail?

2 Likes

Yes,

My automatic script and even manual, wasn’t able to fetch the proposal 130082 (NOT the 130083, that went fine).

My automatic script log:
2024/05/24 | 09:53:44 | 1716544424 [+] Check the environment
2024/05/24 | 09:53:44 | 1716544424 [+] x86_64 architecture detected
2024/05/24 | 09:53:44 | 1716544424 [+] Ubuntu OS detected
2024/05/24 | 09:53:44 | 1716544424 [+] Version ≥22.04 detected
2024/05/24 | 09:53:44 | 1716544424 [+] More than 16GB of RAM detected
2024/05/24 | 09:53:44 | 1716544424 [+] More than 100GB of free disk space detected
2024/05/24 | 09:53:44 | 1716544424 [+] Update package registry
Hit:1 https://mirror.hetzner.com/ubuntu/packages jammy InRelease
Get:2 https://mirror.hetzner.com/ubuntu/packages jammy-updates InRelease [119 kB]
Hit:3 https://mirror.hetzner.com/ubuntu/packages jammy-backports InRelease
Get:4 https://mirror.hetzner.com/ubuntu/security jammy-security InRelease [110 kB]
Get:5 https://mirror.hetzner.com/ubuntu/packages jammy-updates/main amd64 Packages [1,683 kB]
Get:6 https://mirror.hetzner.com/ubuntu/packages jammy-updates/universe amd64 Packages [1,075 kB]
Get:7 https://mirror.hetzner.com/ubuntu/packages jammy-updates/universe Translation-en [246 kB]
Fetched 3,233 kB in 1s (4,756 kB/s)
Reading package lists…

2024/05/24 | 09:53:46 | 1716544426 [+] Install needed dependencies
Reading package lists…

Building dependency tree…
Reading state information…

jq is already the newest version (1.6-2.1ubuntu3).
curl is already the newest version (7.81.0-1ubuntu1.16).
git is already the newest version (1:2.34.1-1ubuntu1.10).
podman is already the newest version (3.4.4+ds1-1ubuntu1.22.04.2).
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
2024/05/24 | 09:53:46 | 1716544426 [-] Field .payload.replica_version_to_elect does not exist in proposal-body.json

And the printscreen when I manually tried (a few minutes ago):
Screenshot 2024-05-24 at 16.31.23
(I am running a repro check that is already a few months old)

1 Like

Hi Dfinity,

I noted that this proposal retires 12 IC-OS versions all in one go. I got curious about the details and scripted some analysis, which raised a couple questions for me. If you’re able to answer either of these when you get a chance that would be much appreciated.

  • Does Dfinity have a well defined policy for when it is and isn’t appropriate to retire a specific IC-OS version? i.e. a means of minimising risks, for example maintaining a sufficient number of versions that can be rolled back to in the event of a latent IC-OS bug.
  • Can I ask why Dfinity moved away from the ‘RetireReplicaVersion’/‘BlessReplicaVersion’ NNS functions in late 2023 (in favour of using a single ‘ReviseElectedGuestosVersions’ function)? I see the motion where RetireReplicaVersion was proposed, but not where it was replaced (presumably because it’s still available, just not being used). If a member of the community wants to object to the unelection of some IC-OS versions (due to rollback risk tolerance), but they have no issue with the election aspect of the proposal, it seems they have no option but the reject the whole proposal (which previously wasn’t an issue when the relevant NNS function actioned by the proposal was decoupled allowing separate proposals).

The current proposal seems like a potentially good example for the points above.

Based on IC-OS election proposal history, there currently appear to be 17 blessed replica versions stored in the registry canister (so that they can be readily deployed), 12 of which would be removed by this proposal. I’ve listed these below, ordered by elected date, and crossed out the versions that would be unelected/removed.

  • 19dbb5c, elected 2024-04-15 (proposal 129081), UNELECTION PROPOSED, running on 0 subnets
  • 02dcaf3, elected 2024-04-15 (proposal 129084), UNELECTION PROPOSED, running on 0 subnets
  • abcea3e, elected 2024-04-22 (proposal 129378), UNELECTION PROPOSED, running on 0 subnets
  • 0a51fd7, elected 2024-04-22 (proposal 129379), UNELECTION PROPOSED, running on 0 subnets
  • 33dd2ef, elected 2024-04-22 (proposal 129408), UNELECTION PROPOSED, running on 0 subnets
  • 4e9b02f, elected 2024-04-23 (proposal 129423), UNELECTION PROPOSED, running on 0 subnets
  • 687de34, elected 2024-04-24 (proposal 129427), UNELECTION PROPOSED, running on 0 subnets
  • 63acf4f, elected 2024-04-24 (proposal 129428), UNELECTION PROPOSED, running on 0 subnets
  • 80e0363, elected 2024-04-29 (proposal 129493), UNELECTION PROPOSED, running on 0 subnets
  • 5e285dc, elected 2024-04-29 (proposal 129494), UNELECTION PROPOSED, running on 0 subnets
  • bb76748, elected 2024-05-06 (proposal 129627), UNELECTION PROPOSED, running on 0 subnets
  • f58424c, elected 2024-05-06 (proposal 129628), UNELECTION PROPOSED, running on 0 subnets
  • 2c4566b, elected 2024-05-13 (proposal 129696), running on 1 subnets
  • 9866a6f, elected 2024-05-13 (proposal 129697), running on 0 subnets
  • 30bf45e, elected 2024-05-16 (proposal 129706), running on 0 subnets
  • 5ba1412, elected 2024-05-20 (proposal 129746), running on 3 subnets
  • b6b2ef4, elected 2024-05-20 (proposal 129747), running on 33 subnets

Relevant Subnet Version History

I’ve focused on the subnet IC-OS version history of a few of the most important subnets below. The current version is in bold, on the left of which are prior deployed versions (crossed out if due to be unelected), and on the right of which are versions that have not yet been deployed to that subnet and are not due to be unelected.

  • tdb26 (system), has been running 2c4566b since 2024-05-21 (4 days):
    • 19dbb5c,33dd2ef,687de34,80e0363,bb76748,2c4566b,9866a6f,30bf45e,5ba1412,b6b2ef4
  • uzr34 (system), has been running 5ba1412 since 2024-05-22 (3 days):
    • 19dbb5c,33dd2ef,687de34,80e0363,bb76748,2c4566b,bb76748,30bf45e,5ba1412,9866a6f,b6b2ef4
  • w4rem (system), has been running b6b2ef4 since 2024-05-23 (2 days):
    • 19dbb5c,33dd2ef,687de34,80e0363,bb76748,9866a6f,b6b2ef4,2c4566b,30bf45e,5ba1412
  • x33ed (application), has been running 5ba1412 since 2024-05-23 (2 days):
    • 19dbb5c,33dd2ef,687de34,80e0363,bb76748,2c4566b,5ba1412,9866a6f,30bf45e,b6b2ef4
  • pzp6e (fiduciary), has been running 5ba1412 since 2024-05-23 (2 days):
    • 19dbb5c,33dd2ef,687de34,80e0363,bb76748,30bf45e,5ba1412,2c4566b,9866a6f,b6b2ef4

In case there’s an unexpected need to rollback to the prior deployed version, it seems sensible to always leave at least one prior deployed version for each subnet remaining in the registry (otherwise the only option would be to roll foward, or await a new IC-OS release if necessary, which seems suboptimal or possibly dangerous due to needing to wait).

All but 1 of the subnets above will still be able to rollback to a prior version after this proposal is executed, but as far as I can tell tdb26 won’t be able to, it would only be able to roll forward to a version that has not yet been deployed to that subnet (or await a patch election). It’s been running that version for 4 days now, so I’m obviously talking about something that’s unlikely. But I’m still interested in whether there’s some sort of risk aversion policy being adhered to when it comes to unelecting specific IC-OS versions, and what a community member should be looking out for when asserting that such a policy is being adhered to.

Any context you’re able to provide would very helpful. Thanks in advance!

2 Likes

Given that there doesn’t appear to be a forum post for 130082 - Elect new IC/Hostos revision, I hope it’s okay for me to ask those questions here.

I’ve noted that HostOS version elections used to take place under the topic of Node Admin, but have now been switched to IC OS Version Election for clarity. However the history of Node Admin HostOS elections and deployments seems a little confusing. Am I right in observing that the last HostOS version to be elected (ec140b7) has never been deployed to a node, and instead they’re all still running the prior elected version (e268b98)? I’ve gathered this from reviewing Node Admin proposal history.

If I’ve understood this correctly, can I ask why ec140b7 was never deployed to any nodes? Presumably there’s no need to worry about compatibility issues with nodes upgrading from e268b98 straight to ec35ebd2 (skipping ec140b7)?

3 Likes

Reviewers for the CodeGov project have completed our review of these replica updates.

Proposal ID: 130082
Vote: ADOPT
Full report on OpenChat
Note: 2 CodeGov reviewers elected to Reject this proposal due to hash mismatches in the IC-OS Verification build. The other reviewers elected to Adopt. There was a lot of discussion in this proposal review thread linked above as well as posted here and here on the forum. There wasn’t a direct post on the forum for this proposal, so our questions seem to be scattered in different locations.

Proposal ID: 130083
Vote: ADOPT
Full report on OpenChat

At the time of this comment on the forum there are still 2 days left in the voting period, which means there is still plenty of time for others to review the proposal and vote independently.

We had several very good reviews of the Release Notes on these proposals by @Zane, @cyberowl, @ZackDS, @massimoalbarello, @ilbert, @hpeebles, and @Lorimer. The IC-OS Verification was also performed by @jwiegley, @tiago89, and @Gekctek. I recommend folks take a look and see the excellent work that was performed on these reviews by the entire CodeGov team. Feel free to comment here or in the thread of each respective proposal in our community on OpenChat if you have any questions or suggestions about these reviews.

3 Likes

Absolutely! The HostOS rollouts are not fully automated (yet) so the forum posts do not get automatically created. I guess it should be fine to reuse the same forum thread to discuss both.

You are right @Lorimer – impressive observation and analysis skills! The second HostOS version was caused by me heavily underestimating the time that will be needed to roll out a single HostOS version. So I submitted both election proposals hoping that the first one will be rolled out quickly, but not really calculating or checking of the following can be technically done. But then we needed to roll out very slowly (it was the first HostOS rollout ever), and the following could not be easily set up so the community needed to vote on each step, and in result we could have at most 2 steps per week. So then after a month of a rollout, the second version didn’t make much sense to roll out.

We have now reorganized the topics (thanks to the amazing @aterga who did all the work!) and the following for the rollout is already set up, so the HostOS rollout should be much faster and easier both for the community (== less voting) and for us.

3 Likes