We at Motoko DAO have encountered an issue during the SNS upgrade process. The upgrade failed with the following error message:
Unable to stop the target canister: (Some(2), "Stop canister request timed out")
It appears the target canister cannot be stopped, and this is preventing the upgrade from proceeding. Has anyone else encountered this issue? If so, are there any known resolutions or workarounds?
It could be that the target canister has an open call context.
What downstream calls does this canister make (to which other canisters on which subnet)?
Could one of them not be replying for some reason?
TL;DR: The Governance team is investigating the issue, and we have a working hypothesis for the root cause. There is no immediate risk: the SNS is fully operational (except for its temporary inability to upgrade itself). More details and a concrete recommendation will follow early next week.
We’ve looked into this issue, and here’s the current hypothesis:
Context: Upgrading SNS-controlled canisters involves SNS Root first stopping that target (the canister being upgraded), then installing the code, and finally starting it again.
There were two Upgrade SNS-Controlled Canister proposals (proposal/114, proposal/113) executed on 2025-01-19 at 1:10:35 PM and 1:10:44 PM UTC, i.e., less than 10 seconds apart.
This caused an interleaving in the SNS Root canister, which tries to stop the target (oeee4-qaaaa-aaaak-qaaeq-cai).
Both proposals 113 and 114 are marked as successful, but that’s just an artifact of this proposal type (they are completed before the SNS knows if the upgrade succeeded).
One of the proposals has actually succeeded, so the module f4f3b738eeab0b15527b29b9697e612a08985560b3edcb41fcf5ec904dcd569e was successfully installed onto the target, and the target was successfully started again.
But the other proposal’s task was to wait for the target to be stopped before it would be allowed to proceed with installing the code. However, this was never observed.
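To make the hypothesis above easier to follow, here is a toy simulation of the stop, install, start sequence run by two upgrade flows at once. This is only a sketch, not SNS Root code: the names `Target`, `upgrade_flow`, `run`, the `max_polls` limit, and the idea that a stop request takes effect immediately are all assumptions made for illustration, and the real mechanism may differ in its details. Each character in the schedule string gives one flow a single turn, which stands in for ICP message scheduling.

```python
class Target:
    """Hypothetical stand-in for the canister being upgraded."""
    def __init__(self):
        self.status = "running"
        self.module = None


def upgrade_flow(name, target, wasm, max_polls, log):
    """One upgrade flow; each `yield` is a point where another flow may run."""
    target.status = "stopped"  # stop request (takes effect at once in this toy model)
    log.append(f"{name}: stop requested")
    yield
    polls = 0
    while target.status != "stopped":  # wait until the target is seen as stopped
        polls += 1
        if polls > max_polls:
            log.append(f"{name}: stop timed out")
            return
        yield
    target.module = wasm  # install the code
    log.append(f"{name}: installed")
    yield
    target.status = "running"  # start the target again
    log.append(f"{name}: started")


def run(schedule, max_polls=3):
    """Step the flows in the order given by `schedule`, e.g. "ABAABBBB"."""
    target = Target()
    log = []
    flows = {
        name: upgrade_flow(name, target, "wasm", max_polls, log)
        for name in sorted(set(schedule))
    }
    for name in schedule:
        next(flows[name], None)  # a finished flow simply does nothing
    return target, log


if __name__ == "__main__":
    # Interleaved: A restarts the target before B ever observes it as stopped.
    print(run("ABAABBBB")[1])
    # One after the other: both flows complete.
    print(run("AAABBB")[1])
```

In the interleaved schedule, flow A installs the module and restarts the target, while flow B keeps waiting for a stopped state it never sees and eventually times out. When the two flows run one after the other, both complete, which is the intuition behind submitting a single proposal at a time.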
Could you please re-submit the last proposal to upgrade the SNS-controlled canister (oeee4-qaaaa-aaaak-qaaeq-cai)?
We think that re-submitting proposal/114 (just a single one this time, please) would likely resolve the issue. Please be sure to specify the same Wasm (and all other parameters) as you had for 113 and 114.
In the unlikely event that this doesn’t do the trick (there’s a small chance, due to the uncertainty in ICP message scheduling), we have a more robust solution, but that would require a bit more work, so we’d suggest that you give this option a try first.