Shuffling node memberships of subnets: an exploratory conversation

Have there been any updates on this topic? It seems to be scheduled for Q4 2022.

5 Likes

Is this feature related to the feature on the roadmap titled “Nodes can be reassigned to a different subnet”? That one is marked as Deployed.

1 Like

That feature, “nodes can be reassigned to a different subnet”, is what the name suggests: previously, a node could only ever join one subnet and had to be redeployed before it could join a different one (for no good reason). That limitation has been removed. This feature is indeed done.

Yes, you could say they are related: being able to switch subnets is a prerequisite for subnet shuffling.

5 Likes

Is this something DFINITY is still working on? With randomized node reshuffling, it would actually achieve shared security that increases with each new subnet.

When it comes to decentralization and security, it would be the holy grail for ICP.

6 Likes

Any response on this, DFINITY? It has become a major criticism from analysts, with a seemingly simple solution.

2 Likes

Why does @dominicwilliams choose not to do node shuffling and insist that 13 nodes are the most decentralized?

But this conclusion is not convincing at all.
The fact is that 13 nodes cannot achieve the level of security required for issuing stablecoins.

2 Likes

Fiduciary subnets, which would host a native stablecoin ledger, have 34 nodes.

The concept of node shuffling does not increase security. Instead, it decreases security. With node shuffling an adversary simply has to wait until enough of his nodes happen to be shuffled onto the same subnet, at which point he can attack that subnet. Without shuffling he has only one shot - the initial assignment of his nodes.

Node shuffling significantly reduces the cost for an attacker, because he gets a free re-assignment of his nodes with each shuffling event. Without node shuffling, the only way for an attacker to get a node newly assigned is to go through the onboarding process of new nodes (or new providers).

Also note that onboarding new node providers infrequently, in batches, works in favour of the security of the IC. If a new batch of nodes is onboarded, assigned to subnets, and never moved, then the attacker has to make up the majority of the new batch in order to exceed the attack threshold.

There may be misconceptions out there about what node shuffling can achieve. Regarding that one “analyst” who came out on Twitter: he simply had malign intentions. He essentially said “I told you to do node shuffling years ago and you haven’t done it”, while all this time there was a reason not to do it.
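To make the “free re-assignment” point concrete, here is a small Monte Carlo sketch. All numbers (pool size, attacker share, subnet size, attack threshold) are made up for illustration and are not real IC parameters:

```python
import random

def compromise_prob(pool=500, attacker=100, subnet_size=13,
                    threshold=5, draws=1, trials=20000, seed=0):
    """Estimate the probability that at least one of `draws` random
    membership assignments gives the attacker `threshold` or more
    seats in one fixed subnet. Illustrative parameters only."""
    rng = random.Random(seed)
    # 1 marks an attacker-controlled node, 0 an honest one.
    nodes = [1] * attacker + [0] * (pool - attacker)
    hits = 0
    for _ in range(trials):
        for _ in range(draws):
            # One shuffle event: sample a fresh subnet membership.
            if sum(rng.sample(nodes, subnet_size)) >= threshold:
                hits += 1
                break  # one success is enough for this trial
    return hits / trials
```

With these toy numbers, a single assignment (`draws=1`) succeeds for the attacker far less often than many re-draws (e.g. `draws=52`, one per week for a year), which is exactly the free-re-assignment effect described above.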

4 Likes

Does node shuffling reduce the attacker’s cost because the subnet size and the pool of nodes to pick from are relatively small?

Part of the innovation of the original protocol design was using a VRF to pick a random subset of nodes each block. That is not a 1:1 equivalent of node shuffling, since with the latter membership changes periodically rather than every round, and joining nodes have to trust whatever state the subnet provides them as the source of truth. Still, it supposedly provided good security guarantees, assuming a significant majority of nodes weren’t malicious.
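For illustration, here is a generic sketch of deterministic committee selection from a shared random beacon, the general idea behind per-round committees. This is not the IC’s actual VRF construction; the hashing scheme and names are hypothetical:

```python
import hashlib

def select_committee(beacon: bytes, node_ids, committee_size):
    """Rank nodes by H(beacon || node_id) and take the lowest ranks.
    Every honest node that knows the beacon computes the same
    committee, with no extra communication. Illustrative only."""
    ranked = sorted(
        node_ids,
        key=lambda n: hashlib.sha256(beacon + n.encode()).digest(),
    )
    return ranked[:committee_size]
```

Because the beacon changes every round, the committee is re-drawn each block, whereas node shuffling re-draws the (stateful) subnet membership only periodically.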

I notice that Chainlink, whose DON networks have a similar node structure to IC subnets, don’t do node shuffling either, AFAIK. And they are extremely security focused, so I don’t see any evidence in the industry that node shuffling is a good idea.

1 Like

I think there’s a choice between two situations, if we’re to assume that attackers have sybil’d their way into being recognised as many distinct entities (such that we can’t rely on the IC Target Topology restrictions for security).

  • With node shuffling: Attackers would have a limited window to exploit their >2/3 control over a subnet, but their windows will occur more frequently
  • Without node shuffling: Attackers would have a practically unlimited window (more or less) to exploit their >2/3 control over a subnet (perhaps going unnoticed for some time), but they may have to wait longer for these circumstances to emerge

The utility of node shuffling seems to rest on the expectation that attackers cannot do as much damage if they have a limited window over which to exploit the subnet. I’m not sure how true that is in the general case, but you could certainly imagine scenarios where this might be true.
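The trade-off in the two bullets above can be made concrete with a toy model (all parameters hypothetical): assume each shuffle independently yields a compromised membership with probability p, and that membership persists until the next shuffle.

```python
def window_tradeoff(p_draw, shuffle_interval_days, horizon_days):
    """Toy model: each membership draw is compromised with
    probability p_draw and lasts until the next shuffle.
    Returns (probability of at least one compromised window,
    expected total compromised days, length of a single window)."""
    draws = horizon_days // shuffle_interval_days
    p_any = 1 - (1 - p_draw) ** draws          # chance of any window
    expected_days = draws * p_draw * shuffle_interval_days
    return p_any, expected_days, shuffle_interval_days
```

An interesting property of this toy model: the expected total compromised time works out to roughly `horizon_days * p_draw` regardless of shuffle frequency. Shuffling trades a higher chance of *some* window occurring for a bound on how long any single window lasts, which is precisely the choice described in the bullets.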

Eventually subnet membership changes will not require an NNS proposal, and will be automated by the NNS, with the decentralisation calculation and decision taking place on chain. At that point I can imagine node shuffling being useful. It would be easy to implement at that stage, and while I’m not sure it gives you that much, I don’t think it hurts.

3 Likes

You raise a valid concern about a node operator tampering with a node to access sensitive data. I agree that a malicious operator would gain an advantage through node shuffling.

However, when it comes to subnet takeovers, such as multiple node operators colluding, node shuffling would significantly enhance security. With each new subnet, this security measure would further strengthen the network.

After reading your post, I agree that node shuffling should be put on hold until a reliable solution is in place to prevent malicious operators from accessing node data. While AMD SEV enhances security, it is not bulletproof.

1 Like

I’m not sure what you’re stating is correct here. Wouldn’t it depend on the size of the subnets, the total pool of nodes, the shuffling frequency, and the nature of the adversaries (adaptive vs static)?

I think the answer is yes. We’ve done a good basic exploration of the math; you can look earlier in this thread for that.

I think this statement is quite incorrect in the general sense, and it’s not as easy as just saying that node shuffling is insecure.

4 Likes

As I understand it, if a malicious operator has compromised a node to access data, they could continue their attack when their node is later shuffled into another subnet, granting them access to additional sensitive data.

Subnet shuffling helps prevent subnet takeover by colluding operators, but it also introduces an additional security risk when it comes to data access by a single malicious operator.

1 Like

My focus in this conversation has been on collusion to break integrity, not confidentiality. All node operators can see all data, yes.

TEEs, FHE, and MPC will hopefully eventually allow us to solve those other issues.

There is currently no BFT on confidentiality whatsoever.

2 Likes

I totally agree that node shuffling should be the goal for real shared security. But data sensitivity is a big concern, and a malicious node operator pulling data from every subnet their node gets assigned to isn’t a great idea either. That’s why I think this issue needs to be solved first.

From a technical perspective, node shuffling itself isn’t the problem, but it would eat up bandwidth and resources.

2 Likes

The original design you are referring to predates sharding. It was a design where there are thousands of nodes that all have the full state, and in each round a committee of 400 nodes is randomly selected to produce the next VRF output, etc. This works fine for small state, where each node can hold the full state. It could be used for a small-state, low-throughput (small blocks) “main” subnet that merely coordinates other subnets.

The bottleneck here is not the VRF or the consensus. The bottleneck is a) gossiping the new transactions and block proposals to all the nodes in time, and b) sharing the same state among all the nodes. Here, a) is slow and b) is expensive. That’s why “worker” subnets are smaller. That being said, it might be possible to use this design for the NNS subnet, or at least for part of the work done by the NNS subnet (if we were to split the NNS subnet further into two layers).

Worker subnets exist because the subnet state is so large (terabytes). It would not scale if all nodes had the state of all subnets. Node shuffling then faces the challenge that when a node moves from one subnet to a different one, it has to download the entire state of the new subnet.
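As a back-of-the-envelope illustration of that download cost (all figures hypothetical, including the efficiency factor):

```python
def sync_time_hours(state_tb, link_gbps, efficiency=0.7):
    """Rough time for a newly shuffled node to download a subnet's
    full state. `efficiency` is a made-up factor for protocol
    overhead and link contention; all inputs are illustrative."""
    bits = state_tb * 1e12 * 8                 # terabytes -> bits
    throughput = link_gbps * 1e9 * efficiency  # effective bits/sec
    return bits / throughput / 3600
```

For example, a few terabytes of state over a shared 10 Gbps link already takes on the order of hours, and the cost scales linearly with state size, which bounds how frequently shuffling could realistically happen.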

3 Likes

Thanks for pointing this out. This is probably the reason why people disagree on the utility of node shuffling. I was always writing under the assumption that any exploitation, even if for a limited time, is fatal to the whole IC. Damage done in that limited time could be potentially unbounded. And it could have already proliferated to other subnets and could be impossible to roll back without halting and restarting the entire IC. The reputational damage would also be impossible to recover from. Hence, I see no benefit in being able to limit the time of damage by shuffling new nodes onto the subnet. Everything I wrote above has to be understood as being under this assumption.

1 Like

I would say:
Node shuffling per se (vs not shuffling) increases the probability that the subnet is compromised at some point in time. That statement is true regardless of the size of the subnet.

This statement is relevant only under the assumption that we consider a one-time compromise of the subnet as fatal.
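That monotonicity is easy to see with a one-line model: if each independent membership draw is compromised with probability p, then the chance of a compromise at some point over T draws is 1 - (1 - p)^T, which grows with T for any p > 0, whatever the subnet size that p came from.

```python
def p_compromised_ever(p_single, memberships):
    """Probability that at least one of `memberships` independent
    membership draws is compromised, given a per-draw compromise
    probability p_single. Strictly increasing in `memberships`."""
    return 1 - (1 - p_single) ** memberships

# e.g. with a hypothetical per-draw probability of 0.001:
# one draw (no shuffling) keeps the risk at 0.001, while
# 520 draws (weekly shuffles for ten years) push it to roughly 0.41.
```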

1 Like

It seems that the participants in this discussion take the potential for NP collusion seriously. There’s a discussion related to this taking place on another thread. I’d be interested to hear more people chip in with their thoughts if you have the time :slightly_smiling_face:

I’m not convinced this is true. Would you mind having a look at this post, @timo →

The attack vector described there would be a lot harder to action under a periodic node shuffling regime. I might have a quick stab at writing a simulation that demonstrates this.