TLDR: I’m voting to reject all three proposals. Not because I’m concerned about the actual content of the changes (I anticipate being alone in this, and do not intend on blocking these proposals). Instead I’m rejecting on prinicpal, for the sake of raising awareness and hopefully further discussion.
I think it’s critical that the GuestOS change log in the proposal summary does not intentionally or accidentally hide commits that do indeed alter GuestOS behaviour. I think proposals that do this should be seen as taboo, as a matter of principle (regardless of the scope of the changes in question). Allowing this to be normalised means building up dangerous precedent that can be used by bad actors in the future who may succeed in obfuscating malicious changes. I strongly believe that IC OS proposal summary change logs should be validated as a comprehensive list of everthing that modifies GuestOS behaviour. It’s my contention that proposals that do not meet this requirement should be rejected.
As far as I can see, this is the case for release 35153c7cb7b9d1da60472ca7e94c693e418f87bd. Given that the other two releases are based on that release, I see this as applying to them too.
133142
Build successful and hashes generated on my machine match (CDN and local build), and the GuestOS hash matches the proposal payload.
There are 87 commits since the previous release, 28 of which are referenced in this proposal. There are 45 files that have been modified both by commits referenced in this proposal as well as commits that weren’t. Browsing through these revealed one commit that was omitted from the proposal summary, but should have been included. In addition, the PocketIC commit is not a GuestOS change and ideally wouldn’t have been included in this proposal summary. Raised questions about these here.
All commits appear to match their commit messages well and seem reasonable. If you're interested in my comments to this effect for every commit, then please expand.
Features:
dbce2fa6d
Consensus,Interface(consensus): Increase block maker delay when there have been too many non-rank-0 blocks notarized (#1531)- This looks like a great change, particularly in light of the recent subnet config updates to significantly reduce notarisation delay. It’s similar to the sort of change I had in mind here →
-
a8464ac47
Execution,Interface: Execute subnet messages on aborted canisters (#1378)- The function
can_execute_msg
is refactored tocan_execute_subnet_msg
with additional logic to handle different states of canisters (paused, aborted). Ensures that only safe subnet messages are executed on aborted canisters.
- The function
-
9b242019f
Execution,Interface: Propagate hook execution status to SystemState (#667)- Introduces
OnLowWasmMemoryHookStatus
to track the status of the hook. Adds logic to update and check the hook status based on memory conditions. Updates protobuf definitions to include the new status. Adds tests to verify the new functionality.
- Introduces
-
b36319f9e
Execution,Interface: Capture backtrace in syscalls (#1505)- Modifies error handling to include backtraces in
HypervisorError
. Updates tests to check for backtrace inclusion.
- Modifies error handling to include backtraces in
-
a96b75d28
Execution,Interface: Canister Backtrace on Trap (#1449)- Adds backtrace capture to
HypervisorError::Trapped
. Updates tests to verify backtrace capture.
- Adds backtrace capture to
-
bfd6fa8fb
Execution,Interface: Validate initial wasm memory size for Wasm64 (#1534)- Adds validation logic to check initial wasm memory size. Updates tests to verify the new validation.
-
da884ed19
Execution,Interface,Message Routing: Callback expiration priority queue (#1532)- Again, the commit message accurately describes the changes made in the code. It mentions the implementation of a priority queue for callback expiration times, a function to return newly expired callbacks, and persistence for this queue.
-
2259be58d
Execution,Interface,Networking: Enable canister snapshots (#919)- Does what it says on the tin. Proposal 133144 provides a rollback for this commit, just in case.
-
a9e76c402
Interface,Message Routing: maybe_parallel_map util function (#1376)- Introduces a new utility function
maybe_parallel_map
to handle both parallel and sequential mapping based on the presence of a thread pool. The changes seem to be a straightforward refactor to reduce code duplication.
- Introduces a new utility function
-
220baf8e1
Interface,Networking(call-v3): Return a certificate for duplicate requests that are already executed (#1523)- Added logic to check if a message is already in the certified state before submitting it to the ingress pool
-
b2ce10e4a
Interface,Networking: Introduce metrics to the block stripper/assembler (#1488)- Introduces various metrics to track the performance and behavior of the block stripper/assembler. Metrics include counts of ingress messages, download durations, and block assembly durations.
-
6ab95d4e5
Owners(IDX): Bump bazel to 7.0.1 (#1578)- The commit message accurately describes the changes made, including the upgrade to Bazel 7.0.1 and the necessary adjustments to
rules_haskell
and default options.
- The commit message accurately describes the changes made, including the upgrade to Bazel 7.0.1 and the necessary adjustments to
Bugfixes:
ff4d43607
Consensus,Interface(consensus): Correct several artifact bounds and update docs (#1074)- Corrects the bounds for finalizations and update the related documentation
1914efda5
Consensus,Interface(consensus): Include equivocation proofs in artifacts returned by get_all_validated (#1579)- Ensures that equivocation proofs are included in the artifacts returned by
get_all_validated
, as described in the commit message.
- Ensures that equivocation proofs are included in the artifacts returned by
b383408d6
Execution,Interface: fix a metric for counting actually executed canisters per round (#1596)- Refines the logic for counting executed canisters. Switches the metric from a gauge to a histogram.
ec89de506
Interface,Message Routing: Fix a race condition in StateManager tests (#1673)- Adds a
flush_deallocation_channel
method to ensure that deallocation requests are processed before assertions in tests
- Adds a
ee5a50001
Interface,Networking: start the BTC adapter only by accepting a config (#1584)- The changes include refactoring the initialisation process and ensuring the metrics server starts correctly when managed by systemd.
8d630c57d
Interface,Node: Fix long-standing typo (#1602)- A simple correction of a comment typo from “an partition” to “a partition”. I hope this bug fix was thoroughly tested
Performance improvements:
146430974
Interface,Message Routing: Defragment correct state (#1683 )- The previous commit (a438bb7) aimed to defragment the memory layout but did not target the correct state. This fix ensures that the intended performance improvements are achieved by defragmenting the correct state. I’m not sure why a438bb7 wasn’t actually referenced in this change log though (it should have been).
Chores:
9e9f3653d
Consensus,Interface(consensus): Add info log when producing equivocation (#1623)- Adds a log entry when an equivocation is detected
974ec76f5
Consensus,Interface(consensus): move get_block_maker_delay function from consensus_utils crate to consensus crate (#1527)- Moves the
get_block_maker_delay
function to the consensus crate and remove unnecessary dependencies
- Moves the
a368e8f26
Execution,Interface: make allowed viewers feature flag configurable and enable it for ic-starter (#1598)- Adds a feature flag for allowed viewers in the execution environment config. Enables the feature flag in
ic-starter
- Adds a feature flag for allowed viewers in the execution environment config. Enables the feature flag in
66389f30e
Interface,Networking: refine the exposed public interface of the adapters and start them in consistent way (#1622)- The changes include modifications to the build files and the main initialisation logic
905909681
Interface,Networking: hide some structs and functions from the BTC adapter (#1617)- Hides certain structs and functions from the BTC adapter, making them private
Refactoring:
I’ve also validated the unelection component of this proposal below.
There currently appear to be 13 blessed replica versions registered, 6 of which would be unelected by this proposal. These unelected versions are not running on any subnets, nor any unassigned nodes, so appears safe to unelect. Expand for details.
I’ve listed blessed replica versions below, ordered by elected date, and crossed out the versions that would be unelected.
36c1976, elected 2024-09-09 (proposal 132375), UNELECTION PROPOSED, running on 0 subnets843e71b, elected 2024-09-09 (proposal 132376), UNELECTION PROPOSED, running on 0 subnets3318f74, elected 2024-09-11 (proposal 132412), UNELECTION PROPOSED, running on 0 subnets290fd2a, elected 2024-09-11 (proposal 132413), UNELECTION PROPOSED, running on 0 subnets5d1beac, elected 2024-09-15 (proposal 132500), UNELECTION PROPOSED, running on 0 subnetsafe1a18
, elected 2024-09-16 (proposal 132481), running on 0 subnets1799735
, elected 2024-09-16 (proposal 132482), running on 0 subnetsc180069, elected 2024-09-16 (proposal 132507), UNELECTION PROPOSED, running on 0 subnetsc664899
, elected 2024-09-18 (proposal 132547), running on 1 subnets and all unassigned nodes (since proposal 133093)cacf86a
, elected 2024-09-18 (proposal 132548), running on 0 subnets0441f40
, elected 2024-09-23 (proposal 133061), running on 0 subnets7f6a81f
, elected 2024-09-23 (proposal 133062), running on 27 subnetsc87abf7
, elected 2024-09-23 (proposal 133063), running on 9 subnets
133143
Build successful and hashes generated on my machine match (CDN and local build), and the GuestOS hash matches the proposal payload.
This proposal is largely the same as 133142 (above), except that this proposal also upgrades GuestOS to Ubuntu 24.04. This is a significant upgrade that affects multiple components, including shell scripts, systemd services, SELinux policies, Dockerfile configurations, and minor adjustments in Rust code. The implication would be that this commit ensures compatibility and/or improved functionality with the new OS version. But there are numerous aspects about this commit that I’m unclear about, such as the removal of cleanup commands that were in place to avoid indeterministic builds.
If I knew Eero Kelly’s handle on this forum, I’d ask if he could provide a bit more context and reasoning for each of the changes in that commit. @DRE-Team is this something you could request?
133144
Build successful and hashes generated on my machine match (CDN and local build), and the GuestOS hash matches the proposal payload.
This proposal is largely the same as 133142 (above), except that this proposal also disables canister_snapshots
(a compile-time flag requiring a separate build).