Regarding the referenced thread, I want to share some technical explanations to help clarify the discussion, and also provide some points for consideration. I will start by explaining how and why Generation I & II blockchains support the local processing of blocks and transactions, then explain how and why the Internet Computer, as a Generation III blockchain, uses a different model, which does not require the local processing of blocks using e.g. “local nodes.” Finally, I’ll raise some technical issues that would be involved in adding that functionality for use in special cases, such as with the Network Nervous System DAO, and raise some societal, security and regulatory issues involved with making all its data, and the interactions with it, totally transparent.
Why Generation I & II blockchains involve local processing of blocks
Generation I (e.g. Bitcoin) and Generation II (e.g. Ethereum) blockchains provide a model for secure interaction that cannot work for Generation III blockchains. Their networks maintain copies of all the blocks of transactions added to the chain and processed. When a party wishes to securely interact with the blockchain, in a manner that does not involve trusting some centralized actor, such as a crypto exchange, Infura, or Alchemy, say, they must download a copy of the blocks, and re-run all the transactions locally to construct a copy of its current state. Since the blockchain protocols involved make it possible to verify that the blocks downloaded represent the correct chain, the party then knows they are interacting with a correct copy of the state, and they can also see past cryptocurrency transfers, and the results of smart contract computations, say. This enables them to safely interact, for example by creating new transactions.
A self-hosted (“decentralized”) Bitcoin wallet must download the blockchain’s past blocks, and re-run the transactions they contain to calculate its bitcoin balances and obtain the transactions sent and received. Meanwhile, those creating Web3 services using Ethereum, say, will typically build the website on a cloud account (for example one provided by Amazon Web Services) by installing a web server, database and Ethereum “local node” that downloads its blocks, and re-runs the transactions, to create a trusted local copy of its smart contracts and data that the website code can interact with. A key problem has emerged with this model over time, which is that as these blockchains grow, it has become more and more expensive to download the blocks and re-run their transactions.
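As a toy sketch of this replay model (the block and transaction format below is purely illustrative, not Bitcoin’s or Ethereum’s actual formats), a “local node” reconstructs current balances by applying every transfer in every block, in order:

```python
# Toy sketch of Generation I/II local verification: replay every
# transaction in every block to reconstruct the current state.
# The block/transaction format here is illustrative only.

def replay_chain(blocks):
    """Rebuild account balances by applying all transfers in order."""
    balances = {}
    for block in blocks:
        for tx in block["txs"]:
            sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
            if sender is not None:  # None models a coinbase/mint transaction
                assert balances.get(sender, 0) >= amount, "invalid spend"
                balances[sender] -= amount
            balances[receiver] = balances.get(receiver, 0) + amount
    return balances

blocks = [
    {"txs": [{"from": None, "to": "alice", "amount": 50}]},
    {"txs": [{"from": "alice", "to": "bob", "amount": 20}]},
]
print(replay_chain(blocks))  # {'alice': 30, 'bob': 20}
```

A real node would of course also verify proofs-of-work or signatures at every step; the point is simply that the entire history must be downloaded and processed just to learn the current state.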
As early as 2013, I can remember my self-hosted Bitcoin wallet taking hours to initialize, and today Ethereum local nodes can often take days to initialize when started, even when running on a very powerful computer. As a result of this key problem, few people now use traditional self-hosted Bitcoin wallets, and when they do, they choose a self-hosted wallet that relies on recent state checkpoints, which involves trusting the checkpoint creators, or worse, and much more commonly, they just keep their bitcoin on a centralized crypto exchange like FTX. Meanwhile, the vast majority of Ethereum developers now simply build blockchain websites using code that interacts with centralized blockchain infrastructure services such as Infura and Alchemy, choosing to trust them as a tradeoff against the cost of running a local node, hoping that those services have not been hacked and aren’t malicious, and also trusting the clouds that these services themselves run on.
For this and other similar reasons, the majority of participants in Generation I and Generation II ecosystems no longer verify their interactions with their blockchains in a trustless and decentralized way, and instead rely on trusting centralized actors. This not only reduces their security and resilience (for example, Infura has suffered outages because Amazon Web Services has gone down), but also lays them open to being disrupted, for example by regulators who impose conditions upon those centralized actors. This means the architecture that has become the status quo is far from desirable, and those building Generation III blockchains need to leave this old model behind. However, the referenced thread discusses adding support for this traditional model, which is not currently supported by the Internet Computer, for use in specific cases.
Why Generation III blockchains cannot easily use the old model
Because Generation III blockchains (today, only the Internet Computer) must operate at massive scale, they can’t easily use the old model. Ethereum only processes a handful of transactions a second, while the Internet Computer was already recently processing more than ten thousand transactions a second, and we hope that one day it will process many millions or billions of transactions a second. Indeed, the network has already processed more than 1.5 billion blocks, and one day will have processed quadrillions of blocks. It is hard to see how developers could run local nodes that could download such large numbers of blocks and replay their transactions economically. Moreover, it would be extremely expensive for the blockchain to store every past block for later download by local nodes.
Another reason why the old model isn’t generally suitable is that Generation III blockchains must host smart contracts that can directly and securely serve content to end-users, including transmitting content into web browsers, whose pages then directly interact with hosted smart contracts, closing the loop. This is the only way to obtain the genuine end-to-end decentralization required for security, liveness and censorship resistance, and to allow Web3 services to run fully on-chain under the absolute control of community DAOs, fulfilling a key Web3 promise. Even if some technically feasible way to transparently embed local nodes inside web browsers were found, perhaps by making the local nodes more efficient using zero knowledge proofs, say, so that direct interaction could be enabled using a hybrid of the old model, it’s doubtful anyone would want this!
Forming a blockchain from “subnet blockchains”
The Internet Computer doesn’t need the old model, but understanding why involves an understanding of how it is formed from “subnet blockchains”, and how they work. Each subnet provides the blockchain with additional capacity for hosting smart contracts, allowing it to scale as required. The entire network is directly controlled and managed by a permissionless governance DAO called the “Network Nervous System” (NNS). The NNS forms new subnets from node machines, which are special hardware devices owned and operated by independent “node providers” who install them in independent traditional data centers located in a variety of geographies and jurisdictions around the world.
The NNS creates new subnets very deliberately, by instructing sufficiently decentralized sets of nodes in the Internet Computer network to combine, where sufficiency is judged by considering their node providers, data centers and locations, such that a) when combined by the network protocols involved they can provide the necessary security and liveness guarantees expected of a blockchain, while b) at the same time the minimum possible number of nodes is combined, to reduce the replication of data and computations and thus the cost of running smart contract software. There are different types of subnet, with different replication levels, which host smart contracts with different guarantees and different operational costs. This approach is unique to the Internet Computer and is called “deterministic decentralization.”
The NNS lives on its own subnet, which was, naturally, the first subnet created at mainnet Genesis, since it was responsible for creating every other subnet that exists. A game-changing innovation is that each subnet blockchain has its own unique “chain key,” which it uses to sign its interactions with users and other subnets, and the public chain key never changes, even as the subnet has nodes added and removed. The chain key of the subnet that hosts the NNS serves as the Internet Computer network’s “master chain key,” which like other chain keys never changes, and the NNS uses this master key to sign each new subnet’s chain key, so it can operate as part of the network.
How subnets maintain and use their chain keys
Chain Key Crypto is the name for the special protocol math and cryptography at the heart of the Internet Computer blockchain that makes it possible for subnets to have chain keys. A chain key is produced using threshold cryptography schemes, which involve one public key, and lots of individual “private key shares” that are held by the different nodes in the subnet. This might sound simple, but it’s not: the complexity is in creating the private key shares in the first place, using an NIDKG (Non-interactive Distributed Key Generation) protocol, and key resharing protocols, which allow the subnet to have nodes added and removed without causing a change to its public chain key. The resharing process in fact runs constantly, with the aim of defeating “adaptive adversaries” that wish to incrementally steal a consistent threshold of private key shares from nodes, by constantly remaking the shares.
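To illustrate only the underlying threshold idea (this is plain Shamir secret sharing, a far simpler relative of the actual NIDKG and resharing protocols, which are non-interactive, verifiable and much more involved), here is a minimal sketch:

```python
# Toy Shamir secret sharing over a prime field, to illustrate the idea of
# threshold "private key shares": any t shares reconstruct the secret,
# while fewer reveal nothing about it. This is NOT the real NIDKG protocol,
# only the basic threshold concept underneath it.
import random

P = 2**127 - 1  # a Mersenne prime; the field the polynomial lives in

def make_shares(secret, t, n):
    """Split `secret` into n shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(secret=123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of 5 shares suffice
assert reconstruct(shares[2:5]) == 123456789
```

In a threshold signature scheme the shares are never brought together like this; instead each node produces a signature share, and any t valid shares combine into one signature under the single public key. But the "any t of n" structure is the same.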
So long as a threshold number (i.e. a subset of a certain size) of the nodes are operating correctly, the subnet can create signatures using its chain key. This is important for the functioning of the blockchain consensus protocols, and also for signing the output of consensus, including the results of newly processed update call TXs, and pre-signing (pre-finalizing) a Merkle root of certified query call TX results, which can then be returned without going through consensus. Threshold signing is Byzantine fault tolerant, which means that faulty (i.e. arbitrarily bad) nodes cannot stop a threshold of correct nodes signing, a property that is also leveraged by the blockchain protocols in the way they ensure that subnets are Byzantine fault tolerant (i.e. so they continue operating without any corruption to data or function even when a portion of their nodes are arbitrarily faulty).
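The certified query idea above, where only a Merkle root needs to be pre-signed while each individual result ships with a short inclusion proof, can be sketched like this (a simplified binary Merkle tree; the Internet Computer’s actual certification tree format differs):

```python
# Toy binary Merkle tree: a subnet could threshold-sign only `root`, then
# serve any single result together with a short inclusion proof.
# Simplified sketch; the real certification encoding is different.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])      # duplicate an odd trailing node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes from leaves[index] up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))  # (sibling, we-are-right?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, we_are_right in proof:
        node = h(sibling + node) if we_are_right else h(node + sibling)
    return node == root

results = [b"result-0", b"result-1", b"result-2", b"result-3"]
root = merkle_root(results)          # only this needs a threshold signature
proof = merkle_proof(results, 2)
assert verify(b"result-2", proof, root)
```

A client checking a query result thus verifies one chain key signature on the root plus a logarithmic number of hashes, rather than waiting for full consensus on its individual result.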
Enabling scaling and direct interaction using chain keys
Given the current status quo, it is easy to imagine that a blockchain by its nature must allow users to download past blocks to verify its state, which leads to a misconception that this is a necessary part of blockchain security. However, the real requirement is just that a blockchain provides a tamperproof and unstoppable platform that supports autonomy and is trustless, because no centralized actor can do anything other than submit legal transactions to create updates to the state. While it might seem necessary to download the blocks and replay the transactions to reconstruct the current state, because that is how things have been done in the past, it is possible to devise other systems, provably just as unbreakable mathematically, in which cryptography takes care of the verification. This is what Chain Key Crypto protocols do for the Internet Computer.
The availability of chain keys is core to how this is done. When your software interacts with the Internet Computer, the results returned are signed by the chain key of the subnet hosting the smart contract software involved, which key in turn is signed by the blockchain’s master key. Thanks to the way the protocol math works, if the chain key signature validates, this not only tells you that your interaction has not been tampered with, but also that the subnet blockchain returning the result is running correctly and that neither its state nor its computations have been tampered with (which could only be achieved by corrupting the subnet blockchain involved, for example by overwhelming the fault bounds of its Byzantine Fault Tolerant protocols). This means that it is not necessary for you to download its blocks and re-run the transactions to be sure of what is happening.
Chain key signing thus makes three wonderful things possible. Firstly, when two smart contracts interact that are hosted on two different subnets, the subnets can securely pass the traffic involved without needing to download and process the other’s blocks, since they can simply check the chain key signatures on the traffic instead to know that a) the traffic has not been tampered with, and b) the sender is also running correctly. This means that one unified blockchain environment can be created by combining any number of subnets, and that capacity can be scaled by adding new subnets. Secondly, user software, such as JavaScript in a web page, can securely and directly interact with smart contracts hosted on the Internet Computer by checking the chain key signatures on results, and indeed, the assets composing such web pages can also be signed and directly served by smart contracts. Thirdly, subnet blockchains can discard blocks they have already processed when the protocol no longer needs them, saving node machines from having to store them – which might quickly exhaust their memory if they are processing a block a second, say.
In summary, a blockchain is defined by the properties it provides to hosted ledgers and smart contracts, such as making them unhackable (tamperproof), unstoppable (always keeping them live) and supporting autonomy (i.e. tokens and smart contracts that exist independently of any centralized party, which provides for sub-properties, such as being censorship resistant). It is the provision of such properties that defines what a blockchain is, not the technical mechanisms used to create a blockchain network. This means that the Generation I/II model, in which participants locally download and process copies of a network’s blocks to secure interaction and ensure copies of the “correct” chain are maintained, is just one possible technical solution. What matters are the provable mathematics of the protocols involved, and the associated security analyses, which show that the required properties are provided. Chain Key Crypto, while undeniably complex, meets these requirements while allowing for the production of Generation III blockchains that can play the role of a World Computer.
Thinking more about what have been called “public subnets” in the referenced thread
Firstly, it’s worth mentioning that things went a little awry because of the nomenclature used by the topic, which talked about “public subnets,” implying that today’s subnets are somehow private, when in fact they are all public, can be accessed by anyone, and run under the control of the permissionless Network Nervous System DAO. What was meant was that today’s subnets do not allow arbitrary access to their current state, and that smart contracts can keep their data private (i.e. your only route to the data hidden inside a smart contract you do not control is through the logic it makes available to you). To gain “unauthorized” access to such smart contract data you need to obtain physical access to a node machine in the subnet that hosts it (and misusing such physical access will soon become harder when node machines switch on their AMD SEV-SNP hardware privacy technology). Furthermore, subnets do not provide access to the previously processed blocks of transactions that created the current state.
The referenced post was only concerned with providing access to smart contract state, and the transaction history that created that state, for specific subnets, such as the subnet that hosts the NNS. Meanwhile, the use of the term “public subnet,” which implied existing subnets are not public, created rather a lot of discussion. Regarding the proposed changes, I do not personally have super strong opinions one way or the other at the moment, but I will highlight some important things to think about.
The technical challenges involved
- The NNS subnet is processing around 1 block of transactions a second. Should it make those blocks available for download, so that people could try to verify its state using the Generation I/II “local node” model, then each participant running some kind of local node would have to download and process 86,400 blocks a day. Depending upon how many people wished to do that, an enormous bandwidth overhead would be created, which would be expensive, and it’s not clear how that could be measured and node providers compensated via the protocol or NNS. No doubt it could be worked out, but the network would still have to bear a very substantial additional expense.
- Thanks to Chain Key Crypto math, Internet Computer subnets do not need to keep old blocks around for long, which avoids filling up the memory of the node machines unnecessarily. With some modifications, however, nodes could keep, say, the last 3,000 blocks around (50 minutes worth), which would allow those running “local nodes” to resync if they get disconnected momentarily. Every sync/resync would be very expensive though: a) first a syncing “local node” would have to begin downloading a checkpoint “snapshot” of the entire state, while storing every streamed block thereafter, then applying the collected blocks to the fully downloaded snapshot state to catch up, and b) any “local node” that got disconnected for more than 50 minutes would have to restart this expensive procedure. This magnifies the bandwidth expense involved in streaming the blocks, and might create a DDoS vulnerability that attackers could go after. Moreover, even if only one snapshot were created an hour (and the previous one discarded), this would halve the amount of state that could be stored on a subnet.
- One of the most amazing things about the Internet Computer is that it is self-updating. That is, a proposal can be submitted to the NNS DAO to update the protocol using a new binary image produced by some referenced source code, and if the proposal is adopted, the image is then used to update the node machines in the network – entirely automatically. The problem here is that the meaning and processing of blocks can change as the protocol design and implementation evolves. This means it would be impossible (without insane engineering effort) to create a “local node” system that could record a state snapshot today, say, and then every block after, for weeks, months or years, that would be sufficient to re-create the current state, since the logic would have to change whenever a block height is hit where a protocol upgrade occurred. Therefore, syncing a “local node” will always involve downloading a snapshot that is less than an hour old, and applying subsequent blocks – which I feel isn’t what people really want here.
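The snapshot-plus-blocks catch-up procedure these points describe can be sketched with a toy model (state as a plain dict, blocks as lists of key/value writes; everything here is illustrative, not the actual state or block formats):

```python
# Toy catch-up: start from a recent state snapshot, then apply only the
# blocks finalized after the snapshot's height. Illustrative model only;
# real state transitions are full smart contract executions.

def apply_block(state, block):
    for key, value in block["writes"]:
        state[key] = value
    return state

def catch_up(snapshot_state, snapshot_height, blocks):
    """Rebuild the current state from a snapshot plus newer blocks."""
    state = dict(snapshot_state)
    for block in sorted(blocks, key=lambda b: b["height"]):
        if block["height"] > snapshot_height:
            state = apply_block(state, block)
    return state

snapshot = {"x": 5}  # snapshot taken at height 10, so block 10 is included
blocks = [
    {"height": 10, "writes": [("x", 5)]},   # already reflected in snapshot
    {"height": 11, "writes": [("y", 2)]},
    {"height": 12, "writes": [("x", 7)]},
]
print(catch_up(snapshot, 10, blocks))  # {'x': 7, 'y': 2}
```

The expense discussed above comes from the snapshot download itself, and from the fact that a node disconnected past the retention window must restart from a fresh snapshot rather than just fetching missed blocks.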
Questions about “opening” the NNS
The essential idea is that by creating a new kind of subnet, people will be able to see the insides of the Network Nervous System/NNS, and every transaction (smart contract function call) that updates it.
- Given the aforementioned technical challenges involved in creating a new kind of subnet that can support “local nodes”, if it were decided that the insides of the NNS must be made available to all, then there is an easier way forwards that involves a tiny fraction of the work: simply add functions to the NNS’s smart contracts that allow people to obtain any information inside that they want. Of course, alone this would not reveal the transactions (i.e. how users have interacted with the NNS) that created the information being made available. However, with some rather more substantial work, we could also add logging to the NNS, so that a record of anything and everything anyone did within the NNS, such as staking a neuron, configuring a follow, or voting, would be maintained, say for a couple of days, and also made available. Because the Internet Computer’s Chain Key Crypto protocols create subnet blockchains that are tamperproof, there would be no risk that information returned through such APIs was modified by malicious actors, and we could securely make everything available to anyone.
- We must ask serious questions about whether doxxing people’s NNS interactions and data would really contribute to a healthy democracy. For example, there are reasons that when you go to a polling booth in an election, your vote is anonymous and private. Arguably, for similar reasons, how much you’ve got staked in neurons, and for how long, who you follow, how you vote on motions, and the links to e.g. your balances of ICP and other governance tokens, should also be kept private. Our community is not immune to toxicity, and those with substantial voting power might find themselves being pressured by zealots keen to get their way, who wish them to change their neuron follows and/or vote for/against specific proposals. Moreover, they might even be blamed and persecuted for how they voted in the past. The danger is that quiet and reasonable people might be forced out of the governance community, and the useful inputs they provide through voting would be lost.
- Doxxing people’s NNS interactions and data could cause both security and regulatory risks. Firstly, it would provide attackers with accurate information on exactly who they had to nobble, blackmail or extort, to push through proposals that further their goals. Where gangsters are involved, and sadly nowhere is completely free of dangerous actors, participants in the governance community could find themselves in horrible and dicey situations, and bad actors might cause some real harm to the network by pushing through bad proposals, for example to profit from shorts on ICP. Secondly, aggressive regulators out to make a name for themselves by harming the Internet Computer ecosystem might seek to hold participants in the governance community responsible for the adoption of proposals that they do not like, according to how they voted. Thus arguably, revealing this information could create serious risks for both individuals and the network.
Summary
As is often the case in blockchain, unpacking a problem reveals it to be more complex than it seemed at first sight. It is necessary to understand how the technology works in detail, and how it might be modified, and to consider the potential regulatory, security, game theoretic, tokenomic, socionomic and economic implications. The potential R&D costs and distraction involved in making NNS interactions and data transparent, should we wish to do that, also have to be weighed against the need to ship and polish many other things, such as the SNS functionality, or Ethereum chain key integration.
A last comment is that we can more easily travel in the direction discussed in the referenced thread with the governance token ledgers. Currently, they are implemented in the mode of a blockchain-within-a-blockchain. That is, transactions are recorded in a hashed chain, in which each transaction forms one block, that lives on the Internet Computer. This, for example, is what enables crypto exchanges to interact with the ICP ledger via the Rosetta API, and meet regulatory demands that require them to have knowledge of every transaction. We could look at ways to modify this so that the signatures on the interactions that ultimately caused a transaction are stored with the transaction. This would make it possible to re-run every transaction on the ledger, and prove that some user did not transfer some tokens that they owned, and therefore should still possess them, independently of what the Chain Key Crypto protocol of the subnet hosting the ledger says. Note that even here there are non-obvious challenges though – what would be the situation if the user had originally received the balance in question via a transfer made by an autonomous smart contract invoked by a blockchain heartbeat!? The tl;dr is that leaning on the hard math of Chain Key Crypto is much easier.
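The blockchain-within-a-blockchain ledger structure can be sketched as follows (field names here are illustrative, not the actual ICP ledger schema): each transaction is one “block” that commits to the hash of its predecessor, making the whole history tamper-evident.

```python
# Toy hash-chained ledger: each transaction forms one "block" that includes
# the hash of the previous one, so the entire history is tamper-evident.
# Field names are illustrative, not the real ICP ledger schema.
import hashlib
import json

def block_hash(block) -> str:
    """Deterministic hash of a block's canonical JSON encoding."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_tx(chain, tx):
    parent = block_hash(chain[-1]) if chain else None
    chain.append({"parent_hash": parent, "tx": tx})

def verify_chain(chain) -> bool:
    """Check every block commits to the hash of its predecessor."""
    for prev, cur in zip(chain, chain[1:]):
        if cur["parent_hash"] != block_hash(prev):
            return False
    return True

chain = []
append_tx(chain, {"from": "alice", "to": "bob", "amount": 10})
append_tx(chain, {"from": "bob", "to": "carol", "amount": 4})
assert verify_chain(chain)

chain[0]["tx"]["amount"] = 1000          # tamper with history...
assert not verify_chain(chain)           # ...and verification fails
```

Storing the originating signatures alongside each transaction, as suggested above, would let a verifier both check this hash chain and independently re-validate who authorized each transfer.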