Proposal: Opt-In Mechanism for Automatic SNS Target Version Advancement

TL;DR. The Governance team proposes extending the AdvanceSnsTargetVersion mechanism to allow SNSs to opt in to automatic updates to the latest version approved by the NNS community.

Background

In recent months, the ICP community has published an average of over 1.6 SNS upgrades per week. For each upgrade, an eligible SNS neuron holder from each SNS community had to manually submit an UpgradeSnsToNextVersion proposal. In 2024, this process resulted in approximately 85 proposals across 30 SNSs, requiring over 2,500 DAO decisions to keep SNSs updated.

The introduction of the AdvanceSnsTargetVersion proposal type has significantly improved this situation. Most SNSs now upgrade once per release cycle rather than once per individual SNS canister upgrade. However, with nearly weekly release cycles, this still amounts to roughly 1,500 decisions per year — an effort that will only grow as more projects launch.
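The decision counts above follow from simple multiplication. Here is a rough sketch of that arithmetic. It reads the 85 proposals as a per-SNS figure and assumes about 50 release cycles per year, which is only our interpretation of "nearly weekly"; the post does not give an exact number.

```python
# Back-of-the-envelope check of the decision counts quoted in the post.
SNS_COUNT = 30                  # from the post
UPGRADE_PROPOSALS_2024 = 85     # from the post, read here as proposals per SNS
RELEASE_CYCLES_PER_YEAR = 50    # assumption: our reading of "nearly weekly"


def yearly_decisions(proposals_per_sns: int, sns_count: int = SNS_COUNT) -> int:
    """Every proposal is one DAO decision in each SNS."""
    return proposals_per_sns * sns_count


legacy = yearly_decisions(UPGRADE_PROPOSALS_2024)        # one proposal per canister upgrade
per_release = yearly_decisions(RELEASE_CYCLES_PER_YEAR)  # one proposal per release cycle
print(legacy, per_release)
```

Under these assumptions the two results line up with the "over 2,500" and "roughly 1,500" figures quoted above.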

New feature

We propose enhancing the AdvanceSnsTargetVersion mechanism to allow SNSs to opt in to automatic upgrades to the latest available SNS version that has been approved by the NNS community.

This opt-in feature aims to simplify the upgrade process and reduce the maintenance burden of the SNS communities, ensuring SNSs seamlessly adopt community-approved improvements and features. Opting in would require submitting a single ManageNervousSystemParameters proposal. Similarly, SNSs could opt out of automatic updates at any time by submitting another ManageNervousSystemParameters proposal.

We look forward to hearing your feedback and engaging in discussions around this proposed design!

With https://dashboard.internetcomputer.org/proposal/134989 being adopted, this feature is now live.

Via https://dev.ic-toolkit.app it's also possible to enable automated upgrades using the following URL (replace the canister ID in the URL with the SNS root canister ID):

https://dev.ic-toolkit.app/sns-management/tw2vt-hqaaa-aaaaq-aab6a-cai/proposals/new?action=enable-auto-upgrade
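For anyone scripting this, a minimal sketch of building that link for a given SNS root canister ID could look as follows. The URL pattern is copied from the example above; the helper name is made up for illustration and is not part of the toolkit.

```python
from urllib.parse import quote

# URL pattern taken from the example link in this post.
TOOLKIT_BASE = "https://dev.ic-toolkit.app/sns-management"


def enable_auto_upgrade_url(root_canister_id: str) -> str:
    """Build the toolkit link for one SNS (hypothetical helper, not a toolkit API)."""
    root = root_canister_id.strip()
    if not root:
        raise ValueError("root canister ID must not be empty")
    return f"{TOOLKIT_BASE}/{quote(root, safe='')}/proposals/new?action=enable-auto-upgrade"


print(enable_auto_upgrade_url("tw2vt-hqaaa-aaaaq-aab6a-cai"))
```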

Submitting via the above link will result in the following proposal

Note that the preview step is missing, so I would suggest using NFID because of its built-in transaction approval step, which lets you inspect the payload.

It doesn’t quite look like that proposal will enable the feature. And there is a proposal linked to this forum post that seems to have been created that way, and I don’t know that it will have the intended effect: NNS Dapp

NNS Dapp - The relevant field in the proposal can be seen here. It is not present in the one for DOLR AI I posted above, as that is likely not upgraded far enough to have this parameter available.

Yes this is correct, it seems that their SNS isn’t upgraded to the version that allows setting the automated upgrades. I wanted to add a note for this with a reference to the correct WASM hash but I was told the proposal would fail if the option was missing. So I didn’t.

Thanks for the note. I will add it to prevent further confusion for other SNSes that try to submit the proposal.

Also for completeness, the minimum version that is required for this feature has the following wasm hash

5c43913c77f922a21f54b3422abf7cb43d369677e7668b7c8f91a429acd5c864

Hey @aterga @msumme, could I direct your attention to an interesting conversation regarding the opt-in option for SNS upgrade automation that is being considered by the WaterNeuron SNS community? The conversation is pretty focused on this topic, so there are not a lot of extra messages to wade through to find the points being made by the various folks involved. However, it is quite lengthy at this point. I think you would find it interesting and I would love to know your thoughts and responses.

I think one of the interesting points in opposition to this opt-in option is that every SNS that opts in will be a canary release target that enables DFINITY to find flaws in the upgrade. The WaterNeuron dev team (@EnzoPlayer0ne @1eo) has concerns about WaterNeuron being first in line for these upgrades. @Lorimer has drawn parallels to IC-OS Version Deployment and how there is a cadence to IC-OS version upgrades to the subnets, which appears to be that all subnets are upgraded within 4 days by what appears to be a mix of automation with human oversight. What do you think are the risks of automating updates to the SNS framework for each SNS? Is there going to be a deployment cadence to SNS upgrades or will they all happen in parallel? Is there any way that WaterNeuron can opt for an automatic upgrade that happens 4 days after all other SNS projects are automatically updated? There is concern about automatic updates when WaterNeuron controls more than 2MM ICP. Perhaps if WaterNeuron could go last in automated updates it would alleviate concerns.

Also, in what ways does the opt-in for automatic SNS framework updates help protect the SNS instead of putting it at risk? Right now the devs involved in the conversation seem to think automatic updates are blind updates that carry high risk. What are the relevant points that need to be made in response to these concerns?

It should be noted that every SNS project that has completed its voting on automating these SNS framework upgrades so far has done so by a very large majority. No SNS has voted to reject yet. Out of 34 total SNS, 14 have already opted in to automated updates, 13 are currently considering it and all of them have large majorities to adopt that are still shy of an immediate majority decision, and 7 have not started considering it (probably because they are too far behind to implement). It seems that most SNS project leaders and communities find these automated upgrades to be a significant benefit. Hence, I don’t want anyone reading this post to think the concerns expressed by the WaterNeuron dev team / community so far are representative of the greater SNS community. Nevertheless, the WaterNeuron conversation is very relevant and I encourage everyone to check it out in order to understand the discussion points and how they might be applicable to other SNSs.

Personally, I don’t really understand why there is even an opt-in option. We are talking about SNS framework updates that are owned and managed by the NNS and developed by DFINITY, not the code that is owned and developed by the SNS community and the SNS dev teams. It seems to me that these SNS framework updates should be automatic for all SNS projects, but I do want to learn more about realistic conditions where that might not be a good idea. It seems to me that the higher risk is not automating these updates.

Thank you @wpb, I much appreciate you bringing this to our attention. We don’t have the bandwidth to follow on all community discussions on all the platforms, so discussing the main points in this thread is a good idea.

Background — How the Governance team prepared SNS framework upgrades

Let me first point out that the Governance team does extensive, systematic testing to qualify each new SNS Wasm for a release. Afterwards, the NNS community carefully reviews the respective proposal before deciding if it should be blessed. This ensures that most categories of bugs don’t make it to production. Of course, testing depends on the data, not just the code, which makes it inherently incomplete; as a result, bugs might occur in production.

Before releasing a feature, the team also assesses the worst-case scenario (e.g., how would the system react to failures in each of its components?) and devises a recovery plan to mitigate such unlikely events if they still occur, so we’re well equipped to support you when necessary. In particular, we have tests that ensure that upgrades are never permanently stuck, so even if a bug slipped through, it could be resolved via a hotfix release (roll forward), which is generally safer than rolling back.

What do you think are the risks of automating updates to the SNS framework for each SNS?

in what ways does the opt-in for automatic SNS framework updates help protect the SNS instead of putting it at risk?

True, it’s possible that a newly introduced bug discovered in one SNS would help another SNS avoid it by not upgrading for some time.

On the other hand, postponing upgrades also delays when potentially vital fixes can be delivered to your SNS. For example, last year a subtle, high-impact bug was discovered in ic-cdk (a library used by the SNS framework) causing memory leaks, and some SNSs could not quickly upgrade to the latest version that contained the fix. DFINITY could not really help them, either, as the community needed to pass a large number of legacy upgrade proposals, which only upgrade one SNS framework canister at a time. These days, just one AdvanceSnsTargetVersion proposal would need to pass, but even that adds up to 4 days to the hotfix delivery time.

… every SNS that opts in will be a canary release target that enables DFINITY to find flaws in the upgrade.

I don’t think this is accurate. If a hypothetical flaw isn’t already caught during release testing, there’s no guarantee that it would actually be observed in just a few days after some SNS’s upgrade. For example, recently a subtle bug was triggered in Alice SNS due to the interleaved execution of two UpgradeSnsControlledCanister proposals, blocking SNS Root upgrades. I believe this bug has been around since the beginning, but it took years before someone stumbled upon it in production, which is to say that allowing for a limited number of days to pass before an SNS decides it’s safe to upgrade does not prevent the possibility of bugs, and does not enable DFINITY to reliably discover new flaws.

Is there going to be a deployment cadence to SNS upgrades or will they all happen in parallel?

Not quite in parallel — all SNSs check for available upgrades every hour and, if opted in for the automation, advance their SNS target versions to the latest available one. After that, each upgrade step is expected to take 2-10 minutes.
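To make the timing concrete, here is a toy model of one hourly check. It is only a sketch of the behaviour described above, not the actual governance canister logic. The version labels are placeholders, and counting one upgrade step per version is a simplification made for this example.

```python
from typing import List, Optional, Sequence, Tuple

CHECK_INTERVAL_MINUTES = 60   # from the post: SNSs check for upgrades every hour
STEP_MINUTES = (2, 10)        # from the post: expected duration of one upgrade step


def next_target(deployed: str, blessed: Sequence[str], auto_advance: bool) -> Optional[str]:
    """Target picked by one hourly check, or None if nothing changes.

    `blessed` lists NNS-approved versions from oldest to newest.
    """
    if not auto_advance or deployed not in blessed:
        return None
    latest = blessed[-1]
    return latest if latest != deployed else None


def pending_steps(deployed: str, target: str, blessed: Sequence[str]) -> List[str]:
    """Versions between `deployed` (exclusive) and `target` (inclusive)."""
    start, end = blessed.index(deployed), blessed.index(target)
    return list(blessed[start + 1 : end + 1])


def step_time_bounds(step_count: int) -> Tuple[int, int]:
    """Lower and upper bound, in minutes, for running `step_count` steps."""
    return (step_count * STEP_MINUTES[0], step_count * STEP_MINUTES[1])


blessed_versions = ["v1", "v2", "v3"]   # placeholder labels
target = next_target("v1", blessed_versions, auto_advance=True)
steps = pending_steps("v1", target, blessed_versions)
print(target, steps, step_time_bounds(len(steps)))
```

In this model, an opted-in SNS picks up a newly blessed version at its next hourly check, and an SNS that has not opted in is left alone until it passes its own AdvanceSnsTargetVersion proposal.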

concerns about WaterNeuron being first in line for these upgrades

There is concern about automatic updates when WaterNeuron controls more than 2MM ICP.

I understand that this SNS handles a large flow of financial transactions and is thus especially cautious about risks associated with upgrading. Luckily, the WaterNeuron SNS community is not required to opt in, which is why we made this automation configurable when designing the new upgrade feature.

The new AdvanceSnsTargetVersion proposals enable streamlined upgrades without giving up any control over which upgrades are installed when. Even without opting in for the full automation, the WaterNeuron folks are already using this proposal to simplify their operations.

Perhaps if WaterNeuron could go last in automated updates it would alleviate concerns.

Is there any way that WaterNeuron can opt for an automatic upgrade that happens 4 days after all other SNS projects are automatically updated?

This is not currently supported. In addition, I don’t think it would be a sustainable way to set things up, as if SNS-A triggers upgrades only, say, 4 days after SNS-B, nothing would stop SNS-B from setting up similar rules, resulting in a cyclic dependency.
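The cyclic dependency can be illustrated with a small toy model in which each SNS lists the SNSs it wants to wait for. The "wait for" rule is hypothetical (as stated above, it is not supported), and the function below is only an illustration of why such rules could deadlock.

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of SNSs waiting on each other, or None if there is none."""
    done = set()  # SNSs already explored without finding a cycle

    def visit(node: str, path: List[str]) -> Optional[List[str]]:
        if node in path:
            # We came back to an SNS that is already on the current path.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        for dependency in waits_for.get(node, []):
            cycle = visit(dependency, path + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for sns in waits_for:
        cycle = visit(sns, [])
        if cycle:
            return cycle
    return None


# SNS-A waits for SNS-B and SNS-B sets up the same rule for SNS-A.
print(find_wait_cycle({"SNS-A": ["SNS-B"], "SNS-B": ["SNS-A"]}))
```

If a cycle is found, none of the SNSs on it would ever upgrade automatically, because each one is waiting for another one on the same cycle to go first.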

devs involved in the conversation seem to think automatic updates are blind updates that carry high risk.

Please refer them to the Background section above. To summarize:

  • Not being on the latest version is risky, as that may slow down the delivery of hotfixes.
  • SNS framework upgrades are proposed and blessed by the NNS only after meeting strict security requirements, including code reviews, holistic security reviews for new features, unit and integration testing.
  • Special care is taken to keep upgrades backward compatible; breaking changes are rare and should always be discussed with the community developers up front.

Out of 34 total SNS … 7 have not started considering it (probably because they are too far behind to implement).

This by itself puts those 7 SNSs at risk: if a critical vulnerability were discovered (e.g., in one of the libraries used to implement SNS), we would have no estimate of how long it would take to deliver the hotfix to those SNSs.

Therefore, I believe that the SNSs that are currently behind are those that would likely benefit from automatic target SNS version advancement the most, ironically.

Personally, I don’t really understand why there is even an opt-in option.

While I agree that the automatic target SNS version advancement feature is very useful for most SNSs (otherwise we wouldn’t have prioritized building it), it makes sense that different SNS communities may have different opinions on how to best deliver upgrades for their project. In particular, WaterNeuron seems like a very active voting community, and having to vote on a single additional AdvanceSnsTargetVersion proposal per week likely won’t add significant overhead for their voters.

If the current ways for upgrading the SNS framework are still too inconvenient for someone, I invite the stakeholders to start a separate forum thread, describing what exact problem they would like to solve.

In the meantime, I recommend that all SNS teams closely follow the SNS Upgrade Aggregation Thread, in which the Governance team announces the SNS framework upgrades being proposed.

Please let me know if I missed any questions!

Hey @aterga. I just wanted to say thank you for providing a very clear explanation of the work process that is followed for SNS framework updates. It’s nice to better understand the testing and backup plans, upgrade policies, rollout strategy, risks of not updating (with examples), advantages and disadvantages of auto updating, and the logic behind allowing the choice to opt-in.

It appears that the WaterNeuron team believes it is best to manually implement SNS framework updates using the Advance SNS Target Version proposal type 4 days after the NNS blesses the latest Service Nervous System Management proposal, which means it will likely be a week behind. Security-critical updates would be addressed by expediting the proposal and getting the word out to vote quickly. It means someone will need to be paying attention to the latest releases. It is what it is. It’s a model that will work if we stay on top of the work process.

Hey @aterga. I also noticed that BoomDAO has opted in to automatic updates with proposal 386. However, someone has been using the toolkit (an awesome app by @rem.codes) to submit additional SNS update proposals 387 and 389. All of these proposals have fewer votes cast than what is normal for that SNS, which makes me wonder if the dev team (e.g. @icpmaximalist @atomikm) is aware of these changes. According to the BoomDAO upgrade journal, all SNS canisters were updated to the latest version after proposal 386 was executed (they matched Dragginz, OpenChat, etc), but they were changed to something else after proposal 387 was executed. They will be changed to something else if proposal 389 is executed. Do these changes make sense? What happens if the SNS version is changed to a much older version? Is it possible for someone to attack an SNS with these proposals if they are doing it maliciously? Of course, the SNS can and should reject these proposals if they are indeed malicious. Now that the SNS version has been changed, how does it get reverted back to the latest? Does that happen automatically after the next update is blessed by the NNS? Is it possible to change the SNS to a version that is so old that it won’t auto update?

Update: There are several additional open proposals related to this including…
Upgrade SNS to Next Version (proposal 61) for Origyn (@skilesare)
Advance SNS Target Version (proposal 270) for Seers AI (@Seers)
Advance SNS Target Version (proposal 416) for DOLR AI
All of these SNS projects have already opted in to automatic SNS updates, so what would be a reason why they all would need to switch to something else?

Catalyze has an open proposal to Transfer SNS Treasury Funds (proposal 207), but the summary claims it is an Advance SNS Target Version proposal. This one definitely does not look legitimate, so I hope the dev team (@rlaracue) sees it in time and can respond appropriately. Catalyze has also opted in to automatic updates. I think this is likely unrelated, but I mention it because of the claim to be an Advance SNS Target Version proposal type.

I have been submitting most of these proposals. The reason for the follow-up proposals is that I had assumed that submitting the auto-upgrade proposal would fail if the SNS doesn’t yet support this new field, but that is not the case. Instead, the proposal is submitted and executed successfully, yet changes none of the parameters.
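Because an executed proposal is not proof that the setting changed, it can help to compare the parameters after execution. The sketch below is a toy model of that silent no-op, not real governance code. The field name `automatically_advance_target_version` is an assumption about how the parameter is named, and `some_existing_parameter` is a made-up placeholder.

```python
from typing import Any, Dict, Set

# Assumption: name of the opt-in field in the nervous system parameters.
AUTO_ADVANCE_FIELD = "automatically_advance_target_version"


def apply_parameters(
    current: Dict[str, Any], proposed: Dict[str, Any], supported_fields: Set[str]
) -> Dict[str, Any]:
    """Toy model: fields unknown to this SNS version are dropped, not rejected."""
    updated = dict(current)
    for name, value in proposed.items():
        if name in supported_fields:
            updated[name] = value
    return updated


def auto_advance_enabled(parameters: Dict[str, Any]) -> bool:
    """True only if the opt-in field is present and set to True."""
    return parameters.get(AUTO_ADVANCE_FIELD) is True


old_version_fields = {"some_existing_parameter"}   # hypothetical placeholder
before = {"some_existing_parameter": 1}
after = apply_parameters(before, {AUTO_ADVANCE_FIELD: True}, old_version_fields)
print(after == before, auto_advance_enabled(after))
```

In this model the proposal "succeeds" on the old version while the parameters stay identical, which matches the behaviour described above. Checking the resulting parameters is therefore a more reliable signal than the proposal status alone.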

In order to enable automatic upgrades, the SNS needs to be on a version that supports it. One can upgrade to this version by submitting an Advance Target Version proposal and supplying all Wasm hashes to upgrade to. If the SNS doesn’t support Advance Target Version yet, I first need to submit a series of Upgrade SNS to Next Version proposals, and only then can I submit Advance Target Version and enable auto-updates.
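The sequence described here can be sketched as a small planner. Versions are modelled as positions in the upgrade history, and the two thresholds (the first version that supports Advance Target Version and the first that supports the auto-upgrade field) are inputs one would have to look up per SNS; the numbers in the example are invented.

```python
from typing import List


def plan_proposals(
    current: int, advance_supported_from: int, auto_supported_from: int
) -> List[str]:
    """List the proposal types needed, in order, to enable automatic upgrades.

    All arguments are positions in the SNS version history (0 = oldest).
    """
    plan: List[str] = []
    # Legacy path: one proposal per version step until Advance Target Version exists.
    while current < advance_supported_from:
        plan.append("UpgradeSnsToNextVersion")
        current += 1
    # A single proposal can then jump across all remaining versions.
    if current < auto_supported_from:
        plan.append("AdvanceSnsTargetVersion")
    # Finally, opt in to automatic target version advancement.
    plan.append("ManageNervousSystemParameters")
    return plan


# Invented example: SNS at version 0, thresholds at versions 2 and 5.
print(plan_proposals(current=0, advance_supported_from=2, auto_supported_from=5))
```

Knowing the exact thresholds up front avoids submitting proposals that execute without any effect.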

My proposal strategy isn’t optimal, as I didn’t check the SNS version history to find the exact version breaking points. Once currently open proposals are adopted, I will do that.
