Summary
All Blockchains work on consensus. The typical protocol upgrade mechanism (e.g. BTC, ETH) is that projects’ respective communities convince node providers to agree to perform a certain protocol upgrade at a certain point in time. This is typically done by a lot of conversations and off-chain agreement.
One of the things that makes the Internet Computer adaptable is that it can upgrade itself on-chain by allowing neuron holders to submit and vote on proposals submitted to the governance canister (which resides in the NNS subnet).
That means that changes to the Internet Computer happen via proposals submitted and executed via the governance canister - this requires over 50% of neuron voting power (token holders, essentially). This is completely on-chain consensus.
But what happens if the governance canister is broken or has a bug? How can it be upgraded to fix itself if it cannot accept or execute proposals? Who Watches the Watchmen?
In this case, the Internet Computer has to fall back to the typical blockchain mechanism for upgrades: the community has to agree on a new binary and convince a supermajority of node providers to deploy it. This can be a heavy and slow process and it is not auditable as cleanly as voting on proposals of the governance canister.
This proposed solution is to create an “emergency fallback” mechanism in case the governance canister is ever down, but not fall back all the way to “asking a supermajority node providers to deploy a new version of the binary manually.” The proposed solution is a mechanism to have an alternative proposal submission, voting, and execution exclusively to update just the Wasm code of the governance canister in case it is ever down that collects votes from node providers instead of neurons (the latter being impossible since the governance canister is down) and requires ⅔ + 1 of nodes to agree
Status
Discussing
Key people involved
Johan Granstrom (@johan ), David Ribeiro Alves, Lara Schmid
Pros and Cons
The pros and cons of the feature being proposed:
Pros:
- There would be an auditable on-chain trail of emergency proposals that use this mechanism.
- This solution would affect the upgrading of the governance canister only, not any other parts of the network
- This solution would require the same Byzantine Fault Tolerance (BFT) security assumptions: at least ⅔ of nodes in the NNS subnet need to agree. This is a higher bar than the simple majority of regular proposals and the same BFT assumptions the IC holds for consensus.
- This solution is meant to be used in case of emergencies so the IC community does not have to fall back all the way to the more basic method of convincing nodes and manual upgrading of nodes. So this is an easier, safer, and faster safety net than which currently exists.
Cons
- This design solution would reduce the friction for node providers to update the governance canister. One could see the current pain of having node providers agree to update binaries as beneficial.
Basic Questions
1. Has the governance canister ever been down so nodes needed to manually agree to update the governance canister (⅔ node providers agreeing)?
a. Yes, only once. The first week after Genesis, there was a bug where NNS had reached a max number of outstanding proposals, and was not clearing the backlog. This led to an Incident report and various bugs reported in the forums and socials. The NNS was ultimately updated by asking node providers to manually upgrade the software running. Because it is a decentralized system, the Foundation could only ask node providers to do it for the sake and health of the network.
b. Because it would require at least ⅔ of the nodes in the NNS subnet to agree (which is the security assumption of the entire network), this manual upgrade fell within the existing security parameters.
2. Can nodes collude to take my ICP?
a. This change would update the Wasm of the governance canister, not any other canister (such as Ledger Canister) or part of the network.
b. The harsh truth about all consensus protocols is that all blockchains (including the Internet Computer) have the property that if a supermajority of nodes agree on the state of the network, then that is the state of the network. In practice, both game theory and the increasing number of independent parties make this extremely improbable. Though the Ethereum blockchain’s state was famously rolled back after the DAO hack, this has only happened once, and at a high cost. In this case, it would be a supermajority of nodes in the NNS subnet.
c. The IC minimizes this risk by maximizing the number of independent node providers. The IC is still young and will continue to add new independent node providers over time.
3. Why does the IC community need this now?
a. This is a relatively easy change to make and an action item from our lesson at Genesis where it was very hard to update the network. This change will make the network more resilient. The governance canister is too important to risk making it difficult to upgrade. If it is down, all other proposals and upgrades stop.
4. Will this feature take a lot of time better spent elsewhere?
No, it is fairly simple, and as part of the lesson from 3.5 months ago, DFINITY foundation engineers already have the implementation done, but not deployed yet.
5. What is the DFINITY Foundation asking the community?
a. The Foundation is asking the community to engage in this thread, ask questions and provide feedback. The DFINITY Foundation will also schedule a community conversation on the week of September 27 (which will be recorded). The intent is to submit an NNS motion proposal to gauge the community’s reaction. If the motion passes, then the Foundation will submit an NNS proposal with the actual code change. If the motion fails, it’s back to the drawing board.
b. Ask questions on the proposed mechanism
6. Has this feature been security vetted?
a. This feature has been vetted by the NNS team and the Security team at DFINITY.