Bringing clarity to ICP upgrade proposals

TLDR

Upgrading the Internet Computer Protocol involves NNS governance proposals grouped by topics. Over time, these upgrade-related topics have evolved in a way that is now somewhat confusing. Understanding proposal topics is crucial for neuron holders who delegate voting decisions. DFINITY is proposing changes that make it easier for neuron holders to understand what upgrade proposals and topics mean, so that they can make more informed voting decisions.

If the community approves this plan (and adopts the corresponding NNS Governance and NNS dapp releases), neuron holders will need to review their following and may need to adjust it to reflect their precise intentions. More information about what needs reviewing will follow; no action from NNS users is required until then.

Background

The Internet Computer Protocol runs on a distributed network of nodes grouped into subnets. Each node runs a stack of operating systems, including HostOS (runs on bare metal) and GuestOS (runs inside HostOS; contains, e.g., the ICP replica process). HostOS and GuestOS are distributed via separate disk images. The umbrella term IC OS refers to the whole stack; for the sake of this discussion, think of IC OS as the combination of HostOS and GuestOS.

There is a process for upgrading IC OS versions via NNS governance proposals. The upgrade process involves two phases, where the first phase is the election of a new IC OS version and the second phase is the deployment of a previously elected IC OS version on all nodes of a subnet or on some number of nodes (including nodes comprising subnets and unassigned nodes).

API boundary nodes are a special case: they route API requests to a replica of the right subnet. API boundary nodes run a different process than the replica, but their executable is distributed via the same disk image as GuestOS. Therefore, electing a new GuestOS version also results in a new version of boundary node software being elected.

Motivation

Electing and deploying new IC OS versions currently happens through multiple proposal types within different topics with obscure names. For example, a new GuestOS version is elected under the topic Replica Version Management and deployed to a subnet under the topic Subnet Replica Version Management. New HostOS versions, however, are elected and deployed under one topic: Node Admin. The reasons for this inconsistency are mostly historical, e.g., upgrading HostOS via NNS proposals is a relatively new feature, while for GuestOS this has been the way since genesis.

Scope

The plan affects only proposals in the context of upgrading IC OS: renaming proposal topics and proposal types; adding new proposal types under existing topics; making some proposal types obsolete.

The two-phase IC OS upgrade process will not be changed.

Two topics for all IC OS upgrade proposals

Since upgrading GuestOS and HostOS are conceptually similar processes, neuron holders might want to make one single decision on whom to follow regarding the election of new versions of these operating systems. Therefore, there should be just one proposal topic for GuestOS and HostOS version elections. We propose to call this topic “IC OS Version Election.”

Once a new IC OS version is elected, it can be deployed to some nodes. The deployment process makes use of software that has already been approved by the community; the main decision at this phase is about the order in which software is deployed onto the nodes. Such decisions are made by experts who monitor the health status of nodes after each deployment to ensure that it is safe to proceed. For example, to ensure optimal reliability of the Internet Computer, the GuestOS versions must be deployed to all nodes of one subnet via one proposal, while HostOS versions have to be deployed to one node per subnet at a time. Having just one proposal topic to follow for all proposals related to IC OS deployment would aid this process. We propose to call this topic “IC OS Version Deployment.”

Summary of the planned changes

| Action | Status Quo: Topic | Status Quo: Type | Proposed New State: Topic | Proposed New State: Type |
|---|---|---|---|---|
| Change the set of elected GuestOS versions | Replica Version Management | Update Elected Replica Versions | IC OS Version Election | Revise Elected GuestOS Versions |
| Change the set of elected HostOS versions | Node Admin | Update Elected HostOS Versions | IC OS Version Election | Revise Elected HostOS Versions |
| Deploy GuestOS version to subnet | Subnet Replica Version Management | Update Subnet Replica Version | IC OS Version Deployment | Deploy GuestOS To All Subnet Nodes |
| Deploy GuestOS version to unassigned nodes | Node Admin | Update Unassigned Nodes Config | IC OS Version Deployment | Deploy GuestOS To All Unassigned Nodes |
| Deploy GuestOS version to API boundary nodes | API Boundary Node Management | Update API Boundary Nodes Version | IC OS Version Deployment | Deploy GuestOS To Some API Boundary Nodes |
| Deploy HostOS version to node | Node Admin | Update Nodes HostOS Version | IC OS Version Deployment | Deploy HostOS To Some Nodes |

Implementation

To achieve the above, we propose the following order of changes:

  1. Rename proposal topics:
    1.1. Replica Version Management → IC OS Version Election
    1.2. Subnet Replica Version Management → IC OS Version Deployment

  2. Rename proposal types:
    2.1. Update Elected Replica Versions → Revise Elected GuestOS Versions
    2.2. Update Subnet Replica Version → Deploy GuestOS To All Subnet Nodes

  3. Add new proposal types that did not exist before:
    3.1. Deploy GuestOS To All Unassigned Nodes
         with topic IC OS Version Deployment
         with the action derived from Update Unassigned Nodes Config.
    3.2. Deploy GuestOS To Some API Boundary Nodes
         with topic IC OS Version Deployment
         with the same action as Update API Boundary Nodes Version.
    3.3. Revise Elected HostOS Versions
         with topic IC OS Version Election
         with the same action as Update Elected HostOS Versions.
    3.4. Deploy HostOS To Some Nodes
         with topic IC OS Version Deployment
         with the same action as Update Nodes HostOS Version.

  4. Make some old proposal types obsolete (return an error if these are submitted):
    4.1. Update Unassigned Nodes Config
    4.2. Update API Boundary Nodes Version
    4.3. Update Elected HostOS Versions
    4.4. Update Nodes HostOS Version
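
For anyone updating scripts or filters, the renames and replacements listed above could be captured in a small lookup. The following is a minimal sketch only: the name pairs come from the summary table, while the helper names `new_proposal_type` and `new_proposal_topic` are hypothetical and not part of any DFINITY tool.

```shell
#!/bin/sh
# Hypothetical helper: map a status-quo proposal type name to the name
# proposed in this thread (pairs taken from the summary table).
new_proposal_type() {
    case "$1" in
        "Update Elected Replica Versions")   echo "Revise Elected GuestOS Versions" ;;
        "Update Elected HostOS Versions")    echo "Revise Elected HostOS Versions" ;;
        "Update Subnet Replica Version")     echo "Deploy GuestOS To All Subnet Nodes" ;;
        "Update Unassigned Nodes Config")    echo "Deploy GuestOS To All Unassigned Nodes" ;;
        "Update API Boundary Nodes Version") echo "Deploy GuestOS To Some API Boundary Nodes" ;;
        "Update Nodes HostOS Version")       echo "Deploy HostOS To Some Nodes" ;;
        *) echo "unknown proposal type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: the proposed topic for a (new) proposal type name.
# All "Revise Elected ..." types are elections; all "Deploy ..." types are deployments.
new_proposal_topic() {
    case "$1" in
        "Revise Elected "*) echo "IC OS Version Election" ;;
        "Deploy "*)         echo "IC OS Version Deployment" ;;
        *) echo "unknown proposal type: $1" >&2; return 1 ;;
    esac
}

# Example: look up the new name and topic for a current HostOS election proposal.
new_type=$(new_proposal_type "Update Elected HostOS Versions")
echo "$new_type"
new_proposal_topic "$new_type"
```

The prefix match in `new_proposal_topic` relies on the proposed naming convention itself, which is one way to see that the new names encode the election/deployment split directly.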

Next steps

DFINITY will propose the relevant upgrades for the NNS dapp, the NNS Governance, and IC Registry in the next few weeks. This will implement the changes summarized above.

If the upgrades are adopted by the community, we will post a reminder to review your neuron following in the NNS dapp (an action prompt will also be displayed there).

@aterga thank you for providing this explanation of potential changes for ICP upgrade proposals. I have a few questions and concerns.

Currently the CodeGov team gets involved in reviewing the replica update proposals. It is easy to identify which proposals we are tasked with reviewing because they are all submitted under the Replica Version Management proposal topic. This proposal topic is manageable because there are typically 1-2 proposals per week and they are almost always submitted on Fridays, which means our developers have 2 days over the weekend to review these proposals. This is a side job that pays a reasonable bounty in exchange for a modest amount of time from each reviewer. However, we do not review the Node Admin, Subnet Replica Version Management, and API Boundary Node Management proposal topics. One reason is that there are far more proposals under these other topics than under Replica Version Management, and they are not typically submitted on a consistent day of the week. This proposed change groups some of the Node Admin proposals into the same proposal topic as Replica Version Management. I don’t yet have a good sense of how this would complicate CodeGov’s work, but I would appreciate it if you could offer some clarity.

Below is a screen capture of the dashboard with all proposals filtered to display Replica Version Management and Node Admin only. As you can see, there are far more Node Admin proposals than Replica Version Management proposals. The vast majority are identified as the Update Node Operator Config type, which is not one of the Node Admin types in your table. I do see a few Update Unassigned Nodes Config proposals in recent history, but none of the other two types that you listed (Update Elected HostOS Versions and Update Nodes HostOS Version). The dashboard does not filter proposals by type; the only option is to filter by topic. Sometimes the topic title shown on the dashboard is different from what is used in the comments in the IC codebase (e.g., the dashboard uses System Canister Management instead of Network Canister Management for proposal topic 8). I’m a little concerned that the same may be true for proposal types since I can’t find any Update Elected HostOS Versions proposals. Hence, it could be a challenge to recognize which proposals under the newly proposed IC OS Version Election topic are of the Revise Elected GuestOS Version type and which are of the Revise Elected HostOS Version type. I recognize that this might not be a change in workload for CodeGov because the IC-OS Verification is already looking for matching hashes for both GuestOS and HostOS and the changes for both are listed in the Release Notes.

The CodeGov team uses cron jobs to trigger automatic messages to our reviewers so they know when it is time to go to work reviewing Replica Version Management proposals. We are also building a proposal review app and OpenChat bot features that will eventually do the same. In all cases, it will be relatively easy to filter the new proposal topic and type to ensure we focus on the legacy Replica Version Management proposal topic. However, I’m concerned about the inability to easily recognize the difference in the dashboard. Even though we get automated messages about Replica Version Management proposals, I still go to the dashboard to get an overall view of all open proposal types where CodeGov and Synapse are active voters. So I guess what I would like to know is how many proposals do you expect will occur each week that are submitted under the new IC OS Version Election proposal topic? Is it still 1-2 on average and typically on Fridays? Will you please clarify how many (or what percentage) of Node Admin proposals would map to the new IC OS Version Election topic?

Also tagging @Dylan on the dashboard team just in case there is a discrepancy between the Node Admin proposal types listed in the status quo column in the table and what is displayed on the dashboard for Node Admin proposal type.

Will you please clarify if this new proposal topic called IC OS Version Election will be mapped to the All Topics Except Governance and SNS catch all category (aka topic 0 - Unspecified)? Is the goal to stimulate all neuron owners to intentionally select a Followee for this new proposal topic? It seems like a potential opportunity to advance decentralization of the ICP on technical proposals, given that this proposal topic is the starting point for all changes to the replica and DFINITY intentionally makes it easy for developers to review, understand, and vote to adopt / reject the changes. If this new proposal topic remains under the All Topics catch all, then there is no driver for anyone to make intentional choices regarding their Followee. Neurons will continue with their default following, which means DFINITY will trigger 93% of total voting power when they vote on replica updates (proposal topic 13). I’m sure this is the intent, but I just wanted to verify. At some point it would be nice to see changes that give people reasons to consider diversifying their following on technical topics.

Hi @wpb, you wrote a lot in your message, so I’m not exactly sure what you’re asking. If you ask me specific questions I can answer.

Oh I understand now, you are asking about the Status Quo column in Arshavir’s message. Yes, all of those proposal topic and type names match the names used by the ICP Dashboard.

Hello @wpb, thank you for looking into these proposed changes and asking good questions. I would like to start by acknowledging that CodeGov is playing an important role in the IC ecosystem. We hope that the proposed changes are not going to be too disruptive for CodeGov’s replica review process. However, as IC OS upgrades become more decentralized (e.g., not just GuestOS but also HostOS is now upgradable via governance proposals), I believe that the proposed changes actually simplify the work for all parties.

Identifying proposals by type

You mentioned that CodeGov is currently specifically interested in replica upgrade proposals (currently under the topic Replica Version Management). Since you mention that the topics Node Admin and Subnet Replica Version Management are not being monitored by CodeGov, I conclude that, conceptually, you are interested in reviewing newly elected replica versions (currently: Update Elected Replica Versions; proposed new type: Revise Elected GuestOS Version) as opposed to deployment-related proposals. This makes sense because the deployment protocol does not change from one replica version to another.

If the changes proposed in this thread are adopted, we would indeed not have one separate topic specifically for Replica / GuestOS, because, e.g., HostOS election would also fall under the same topic. Still, one could check the type of proposals under the topic IC OS Version Election to distinguish GuestOS vs. HostOS. The Dashboard shows the proposal type in the Overview section of each proposal. (Indeed, it is not currently possible to filter proposals by type directly on the Governance page of the ICP Dashboard; would it make sense to add another column for this, @Dylan?)

For now, if you would like to have more automation for your proposal monitoring, it’s possible to use DFX to list all pending proposals of a particular type:

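# Canister ID of NNS Governance
NNS_GOVERNANCE=rrkah-fqaaa-aaaaa-aaaaq-cai
# Integer identifier of the proposal type to filter for
REVISE_ELECTED_GUESTOS_VERSIONS=38
# Download the Governance Candid interface so dfx can decode the response
curl https://raw.githubusercontent.com/dfinity/ic/master/rs/nns/governance/canister/governance.did \
    > governance.did
# Fetch pending proposals, convert them to JSON, keep those of the chosen
# type, and print a dashboard URL for each
dfx canister --network ic \
    call $NNS_GOVERNANCE \
    --candid governance.did \
    get_pending_proposals '()' \
    | idl2json \
    | jq "map(
        select(
            .proposal[0].action[0].ExecuteNnsFunction.nns_function
                == $REVISE_ELECTED_GUESTOS_VERSIONS
        ) | \"https://dashboard.internetcomputer.org/proposal/\" + .id[0].id)"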

Example output:

[
  "https://dashboard.internetcomputer.org/proposal/129081",
  "https://dashboard.internetcomputer.org/proposal/129084"
]

Note that REVISE_ELECTED_GUESTOS_VERSIONS=38 works now and will still work if the proposed changes are adopted. This is because the labels for proposal types (and topics) are not part of the canister API, only their integer identifiers are, and these identifiers can be retained in case of a simple renaming (i.e., whenever a proposal’s topic remains the same).
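
Because the integer identifier is the stable part, the command above could be wrapped in a function that takes the identifier as a parameter. This is only a sketch under assumptions: it presumes dfx, idl2json, and jq are installed and that governance.did has been downloaded as shown above; the function names `jq_filter_for_type` and `pending_proposals_of_type` are hypothetical. The only identifier value confirmed in this thread is 38.

```shell
#!/bin/sh
# Sketch only: same pipeline as above, with the proposal type ID as a parameter.
# Assumes dfx, idl2json, and jq are installed and governance.did is present.

NNS_GOVERNANCE=rrkah-fqaaa-aaaaa-aaaaq-cai

# Hypothetical helper: build the jq program that keeps proposals whose
# nns_function equals the given integer ID and maps them to dashboard URLs.
jq_filter_for_type() {
    printf '%s' "map(select(.proposal[0].action[0].ExecuteNnsFunction.nns_function == $1) | \"https://dashboard.internetcomputer.org/proposal/\" + .id[0].id)"
}

# Hypothetical helper: list dashboard URLs of pending proposals of one type.
# Needs network access, so it is only defined here, not called.
pending_proposals_of_type() {
    dfx canister --network ic \
        call "$NNS_GOVERNANCE" \
        --candid governance.did \
        get_pending_proposals '()' \
        | idl2json \
        | jq "$(jq_filter_for_type "$1")"
}

# Usage: pending_proposals_of_type 38
```

Keeping the filter in its own function makes it possible to inspect the generated jq program without contacting the network.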

Addressing your specific concerns

However, I’m concerned about the inability to easily recognize the difference in the dashboard

If one opens the above URLs, they would see the following information in the Overview section:

[Images: the Overview section of a proposal under the Status Quo and under the Proposed New State]

So, for those interested in specifically replica upgrades, the proposal type information is still going to be there on the Dashboard.

Sometimes the topic title shown on the dashboard is different than what is used in the comments in the IC codebase (e.g the dashboard uses System Canister Management instead of Network Canister Management for proposal topic 8).

Right: currently we do not have full consistency in how proposal types and topics are labeled (this is one of the reasons I personally believe that the change discussed in this thread is worth adopting, as a more consistent grouping of proposals by topics would make it simpler for frontends to follow a consistent naming scheme). Note that this labeling problem cannot be easily solved on the backend, as the backend would have to assume that all frontends are set to a specific language, but we don’t necessarily want to make such assumptions in the canister smart contract. However, we strive to make the internal code conventions consistent with the naming conventions, and a lot of improvements to that end are part of the changes proposed in this thread (in the NNS Governance and Registry canisters).

I’m a little concerned that the same may be true for proposal types since I can’t find any Update Elected HostOS Versions proposal types.

No worries; here’s an example Update Elected HostOS Versions proposal: proposal 125506 on the ICP Dashboard (it isn’t among the first 100 Node Admin proposals, so one indeed needs to look further back).

…it could be a challenge to recognize which of the newly proposed IC OS Version Election proposal topic are for the Revise Elected GuestOS Version and which are for the Revise Elected HostOS Version. I recognize that this might not be a change in workload for CodeGov because the IC-OS Verification is already looking for matching hashes for both GuestOS and HostOS and the changes for both are listed in the Release Notes.

We are also building a proposal review app and OpenChat bot features that will eventually do the same. In all cases, it will be relatively easy to filter the new proposal topic and type to ensure we focus on the legacy Replica Version Management proposal topic.

It’s great that you are building tools for aiding the community to review the replica software. I fully agree that the changes proposed in this thread should not change the workload for CodeGov. You just need a way to filter by proposal type, not just topic, right? Could you use the above-mentioned DFX command (or, similarly, go via the API) as part of the CodeGov process?

I guess what I would like to know is how many proposals do you expect will occur each week that are submitted under the new IC OS Version Election proposal topic? Is it still 1-2 on average and typically on Fridays? Will you please clarify how many (or what percentage) of Node Admin proposals would map to the new IC OS Version Election topic?

If these changes are adopted, there will be just two proposal types under the topic IC OS Version Election: one GuestOS-related and one HostOS-related. I do not expect the overall flow of new IC OS Version Election proposals to significantly increase. I think HostOS-related proposals will be submitted less frequently than GuestOS-related ones; I’ll check with the Node team and get back to you with a more precise answer.

Questions regarding following

Will you please clarify if this new proposal topic called IC OS Version Election will be mapped to the All Topics Except Governance and SNS catch all category (aka topic 0 - Unspecified).

Yes, if the changes discussed in this thread are adopted, the topic IC OS Version Election will indeed fall under All Topics Except Governance and SNS. Note this is also the case for the current topic Replica Version Management, so there will be no change in how the following mechanism works for GuestOS/replica election proposals.

If this new proposal topic remains under the All Topics catch all, then there is no driver for anyone to make intentional choices regarding their Followee. Neurons will continue with their default following, which means DFINITY will trigger 93% of total voting power when they vote on replica updates

At some point it would be nice to see changes that give people reasons to consider diversifying their following on technical topics.

I think there are two orthogonal issues here: (1) voting and following incentives and (2) consistent proposal topics. The changes proposed in this thread do not affect the incentives (1); they only aim to improve on consistency (2). In my view, making intentional choices would become easier if proposal types and topics had clearer names and were organized according to the proposed convention. While there would be two proposal types under the topic that used to cover only GuestOS/replica version election, one could argue that the status quo is much more confusing; IMHO, there’s really no rationale why electing new versions of one component of IC OS (HostOS) should fall under Node Admin, while electing another (GuestOS) falls under its own topic.

Expanding the spectrum of proposals reviewed by CodeGov

You mentioned a few good reasons why currently CodeGov doesn’t review non-replica releases. But if CodeGov is interested in expanding on this, perhaps there’s an opportunity to build a seamless process.

Since boundary node-related software is elected via the exact same proposals that elect new replica versions, it’s possible that the schedule is already defined by that; is that right, @rbirkner?

I don’t know yet how this would work for HostOS upgrades (distributed via a separate disk image and elected via a separate proposal type). But if there’s interest, I’d be happy to set up a call to discuss this with the Node team who owns HostOS.

Please let me know if you have further questions or suggestions, and thanks again for your feedback!

The main reason we don’t include a (filterable) Type column is due to space constraints, and as far as I know it’s never been requested, so it hasn’t seemed like a feature that users are interested in.

Note that one can filter on proposal type by using the ICP Dashboard API directly. @wpb, is using the API to filter proposals by type sufficient for the CodeGov use case?

I don’t see the URLs mentioned. If you are talking about proposals 129081 and 129084, then they both look like the status quo example since this change hasn’t been implemented yet. Would you please add the URLs?

If you are talking about proposals 129081 and 129084, then they both look like the status quo example since this change hasn’t been implemented yet

Exactly. My point was that if the proposed changes are adopted, one would instead see something like “Proposed New State.” Does that make sense?

I agree. This proposal seems reasonable and a step in the right direction. It would be nice if frontends had a way to label each proposal topic and type consistently. It seems like those labels should be defined on the backend, but I understand there is a language issue.

Wow, that was a nice trip down memory lane. The link to the forum post in that proposal included all three proposals that came out that week (125503, 125504, and 125506). There were many questions from the CodeGov team and a lot of different DFINITY team members provided great responses. It was one of the most active discussions we’ve had on these types of proposals.

Regarding my original concern, it appears that the Update Elected HostOS Version proposal type is rare. Hence, grouping it into the new IC OS Version Election proposal topic shouldn’t be a problem. In fact, we should probably be intentionally reviewing that proposal type anyway. It just hasn’t been high on our radar since these proposals currently fall under Node Admin and have been rare.

Yes, it would be helpful if you could verify that HostOS updates are still expected to be low frequency. It would also be helpful if they could be issued on Fridays, just like the GuestOS updates. The work process that DFINITY has been using for Replica Version Management proposals has been working very well, so it would be great if all proposals under the new IC-OS Version Election topic could follow that same work process.

I’m not too concerned about being able to filter proposal topics and types with the automation tools we have set up and that are in development. I passed the DFX recommendation you provided to our CodeGov team just in case it is helpful. My main concern was with the inability to easily filter on the dashboard for proposal type. Given that Revise Elected HostOS Version is expected to be infrequent, it probably won’t be a problem after all. It’s more likely that we will include these proposal types in our reviews moving forward.

Thanks for clarifying this detail. It makes sense.

I’d be interested in knowing @Manu’s thoughts on this one. Particularly, why did HostOS updates end up in Node Admin instead of Replica Version Management? There really wasn’t any distinction between GuestOS, HostOS, and SetupOS back when Manu first proposed creating the Replica Version Management proposal topic with proposal 80639 (forum discussion). Perhaps there is an interesting historical context.

Thank you for the excellent explanations on this proposal @aterga. I’m not seeing any major issues. I would vote to support this proposal.

Would it make sense to allow a filter to include topic and type, but only show the topic column? You could use nested lists that expand to show proposal type when the user first selects the proposal topic in the filter dialog.

Hi,

I have now finished reading it all. I second this change.

I think it’s good and can confirm that this is an important step of “refactoring” worth doing. The new names and organization are much better and much clearer about the context / actions.

I, for example, despite being on the CodeGov team, wasn’t aware of the “all nodes” / “some nodes” subtleties on Deployment.

Naming is important, but even more important is that Host OS election now falls under IC OS Version Election, so that our known neuron and even the community neurons can follow without confusion. It was always hard to set up / explain the voting on just the Host OS type of Node Admin. After this change, CodeGov can announce that we can easily be followed for all election types (both Guest and Host).

I can also confirm that it should cause very little disruption to our automations; we only need to update our filters (if needed).

Thank you and looking forward to these changes.

I agree as well. Even though it is outside the scope of CodeGov, I have personally been keeping an eye on the Node Admin and management topics too, so this proposed change makes sense to me.

Exactly. Since ic-boundary is part of GuestOS, it is already part of the GuestOS proposal (currently: Update Elected Replica Versions; proposed new type: Revise Elected GuestOS Version) and covered by CodeGov’s review.

The target cadence for HostOS upgrades is once every 4 months; currently, these upgrades happen even less frequently than that.

If these changes are adopted, the Node and DRE teams agree to have the Revise Elected HostOS Versions proposals submitted on Fridays.

Cc @sat

It’s a cool idea, but we don’t have the development resources at this time for a custom solution like that.
