Shuffling node memberships of subnets: an exploratory conversation

Attack: Censorship

If some organization wanted to stop a canister they didn’t like, they would currently only have to take down a few identified nodes.

If nodes were shuffled daily at random times, it would be very difficult to keep up with attacking the right nodes quickly enough before the canister gets passed on to new ones.

10 Likes

It seems the main question being posed is “Which attacks are potentially mitigated and which new vulnerabilities are opened by implementing shuffling node membership of subnets?”

The main attack vectors that could be mitigated seem to be well described by the others here. I see daily (or similarly frequent) node shuffling as significantly raising the difficulty of node operator collusion or censorship of nodes. The net effect seems to be to raise the security of any individual subnet toward that of a subnet with all existing node operators as members, rather than just the smaller node membership of any one subnet.

To me it seems the concern for new vulnerabilities centers on the discoverability of other subnet members. As @diegop explains, a contingent of compromised (threatened, corrupted, hacked, etc.) but ostensibly honest nodes could eventually be randomly shuffled into a subnet together (which, given enough shuffling, should eventually happen), discover the joint membership of the other corrupt nodes to be >2/3, and coordinate to tamper with their subnet’s canisters, especially if those canisters hold valuable digital assets or information. The best protection against this vulnerability would be to shield the identities of the node operators in a subnet from one another, though I do not know whether that is feasible.

However, the status quo is that any motivated actor or group of nodes could work to corrupt known nodes on known subnets whose membership is static. I would rather have the moving-target problem of shuffled subnets occasionally being assigned compromised members than the fixed-target problem of static subnets with fixed (although distributed) points of failure.

4 Likes

I don’t have any new attack vectors to contribute to the discussion, but I think the ones that have been laid out so far seem valid, are concerning to me, and are worth preventing. Obviously the Dfinity foundation has to juggle competing concerns about what to prioritize next, so this may not be at the top of the list, but I believe these should be addressed at some point.

I’ve spoken with @lastmjs before about some of these attack vectors and they seem theoretically sound, and potentially not that difficult to even pull off. What worries me even more is that even though Jordan is very smart and has obviously thought about this a lot, he isn’t even a security expert by trade. It makes me wonder what other even more obscure vectors might be discovered by those who have studied attack vectors for years.

I’m not saying that node/canister shuffling has to be the solution to these problems. Maybe there’s a better way to tackle them. Or maybe there’s some reason that these aren’t valid. But if these attack vectors are as valid as they seem, and we know about them, it would be foolish to ignore them, regardless of how low the probability seems. As the internet computer grows, so too will the incentive to exploit these. Once again, we may not need to fix them as the very next roadmap item. Hopefully for now the care we’ve put into selecting node providers will keep us safe. But we should have a game plan for these, and shuffling seems like it would do the trick nicely.

Also, thanks for everyone’s comments so far. I’m not in the weeds enough yet on this, so this thread has been really enlightening.

6 Likes

Thanks everybody for the insightful comments! I understand that it’s a lot of effort to write all these thoughts down in detail, but I think this really helps the discussion. To keep an overview, people have so far brought up the following possible advantages of shuffling:

  • node operator collusion (a majority of the nodes of a subnet are malicious and tamper with the subnet state)

  • security as a network effect (very related to the point above)

  • miner extractable value / front running

  • attacking enclaves

  • censorship resistance

I’ll first reply to the ones that I find less convincing, and then comment on the ones that I think are the best arguments in favor of shuffling.

miner extractable value / front running

A block maker in any blockchain can order messages in any way it likes. That is because we haven’t agreed on an ordering yet; we are placing this on a blockchain precisely to reach agreement on an ordering. This means that a malicious block maker can always try to order messages in a way that is favorable for itself. A concrete example of such an attack: if the block maker sees a big sell order to some dex coming in, it can quickly place a sell order itself and make sure it appears in the block before the other sell order. Note that this attack only requires a single malicious block maker; it does not need collusion between multiple node operators, because the MEV block maker will still propose a block that only contains valid messages, so all other replicas will accept this block. They cannot know that the MEV block maker is actually front running, because there is no agreed-upon ordering of messages yet.

Another solution is…you guessed it, node shuffling! If as a node operator you never know which subnet you’ll belong to, and thus which canisters you’ll be hosting, it would become difficult to install the correct modified replica to take advantage of MEV. And 2/3 of the other nodes in the subnet would also need to install this software, so even if you convinced a cohort of buddies to join you in running the modified replica, you’d have to hope you’d all be shuffled into the same subnet that hosts the canisters the modified replica is designed for.

See my point above: this attack can be done by a single malicious replica, you don’t need a majority. The idea that the node operator/replica doesn’t know which subnet it belongs to and which canisters it hosts is very difficult to realize: a replica must only accept messages to canisters that are hosted on its subnet, so it couldn’t properly validate blocks if it didn’t know which canisters it is running.

How can this be mitigated? Secure enclaves are one solution, assuming we can get attestations from the enclave that the replica has not been tampered with.

I think this is a promising way to address MEV, and the foundation is already investigating how this can be done. So if we run this in some TEE and require attestations, it will be way harder to run a malicious block maker. Other countermeasures can be taken in the dapp itself. For example, if you are a dex, you could think of batching transactions and using secure randomness to shuffle transactions and execute them in a random order, making front running much harder.
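To make that dapp-level countermeasure concrete, here is a minimal sketch of batch-and-shuffle ordering. This is Python pseudocode for the idea, not an actual canister; `execute_order` and the sample orders are hypothetical, and on the IC the unpredictable seed could for example come from the management canister’s `raw_rand`.

```python
import random

def execute_order(order: str) -> None:
    """Placeholder for the dex's actual matching/execution logic."""
    print("executing:", order)

def execute_batch_in_random_order(batch: list[str], beacon_randomness: bytes) -> None:
    """Execute a batch of orders in an order derived from shared randomness.

    `batch` holds the orders collected during one batching interval, and
    `beacon_randomness` stands in for unpredictable randomness obtained only
    *after* the batch is closed, so nobody can choose their submission time
    or position to gain a favorable place in the execution order.
    """
    rng = random.Random(beacon_randomness)  # same seed => same order on every replica
    shuffled = list(batch)
    rng.shuffle(shuffled)
    for order in shuffled:
        execute_order(order)

# The final execution order is independent of the order in which orders arrived.
execute_batch_in_random_order(
    ["sell 100 X", "buy 50 X", "sell 7 X"],
    beacon_randomness=b"\x13\x37" * 16,
)
```

The design point is simply that within one batch, arriving first no longer guarantees executing first, so a front runner gains nothing from ordering within the batch.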

Attacking enclaves

That’s an interesting point that I hadn’t thought about yet. I’m not an expert on these things, but I would imagine that the best way to get information out of the TEE is to target the encryption key via side channels, not to attack canisters directly. Changing subnets would not change your TEE, so shuffling wouldn’t “reset” such a side-channel attack. I’ll try to get some of the experts involved in this one.

censorship resistance

Right, so an attacker targeting a certain canister can figure out which subnet that canister is on and attack nodes from that subnet. @MalcolmMurray, what type of attack are you thinking of here? And how quickly do you imagine nodes would be shuffled? If a couple of nodes are swapped out, say, every hour or so, and the attack is just a simple DoS attack, then I’m not sure shuffling helps: an attacker could easily keep up with the subnet membership changes because it’s easy to target a new node with a DoS attack. For attacks that take a long time to perform, shuffling could indeed help.

Now, I think the main ones are left.
node operator collusion and security from the network effect

This is a super important one and I think the core of this discussion. The argument of “security from the network effect” (the bigger the ICP grows, the more secure it becomes) is obviously super compelling.

So the internet computer could constantly switch nodes between subnets, such that, for example, a node never spends more than 1 week on a given subnet. If we further assume that attacking a node takes > 1 week, then the adversary can no longer target specific canisters/subnets, which is a nice property to have.

Designed appropriately, I think the laws of probability would prevent this exact situation from happening. I remember earlier DFINITY materials which explained that, with sufficiently large committees, the random beacon driving probabilistic slot consensus (or some other complicated mechanism) would ensure with outrageous (I think the more correct term is overwhelming) probability that no individual committee would be more than 1/3 malicious. The committee size was something like 400, and committees were chosen out of the total number of nodes participating in consensus. Each committee was in charge of validating for a certain period of time or something like that.

Right, you’re referring to our old whitepaper. This is where things get more tricky. We now essentially sample a new subnet membership every week, and here we want to draw on the security of the overall number of nodes. Suppose our total pool of nodes is very large (so I can get away with using binomial distributions). Let’s do some computational examples. Suppose up to probability_corrupt of all nodes are corrupt and we sample a subnet of size subnet_size; then we compute (using the binomial distribution) how likely it is that we select a subnet with more than 1/3rd of the nodes being corrupt, which I call p_insecure. Below you see some examples; here is the google sheet I used.

So what you see is that if, for example, we assume 1/10th of all nodes in the IC are malicious, and we randomly select a 28-node subnet, there is a 2^-12 probability that the subnet is unsafe because more than 1/3rd of its nodes are corrupt. This is a small chance, so that is good, but if we regularly choose new subnet members, then the sampling needs to be successful every time. If we have 50 subnets and do this every week, we do this ~2500 times per year, and each one of those must be successful. I think this is the main price of the reshuffling. If we don’t reshuffle, we don’t need to get it right 2500 times per year, but just once, which is obviously a better chance, so theoretically we could tolerate more corruption overall. How do people feel about these numbers?
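To make the calculation reproducible without the spreadsheet, here is a small sketch of the binomial model just described. The names `probability_corrupt`, `subnet_size` and `p_insecure` follow the post; the threshold assumes a subnet is broken once at least ⌈subnet_size/3⌉ nodes are corrupt, which matches the 28-node example above.

```python
from math import ceil, comb, log2

def p_insecure(probability_corrupt: float, subnet_size: int) -> float:
    """Probability that a randomly sampled subnet contains at least
    ceil(subnet_size/3) corrupt nodes, treating each pick as an independent
    Bernoulli trial (the binomial approximation used above)."""
    threshold = ceil(subnet_size / 3)
    return sum(
        comb(subnet_size, k)
        * probability_corrupt ** k
        * (1 - probability_corrupt) ** (subnet_size - k)
        for k in range(threshold, subnet_size + 1)
    )

# Example from the post: 1/10 of all nodes malicious, 28-node subnet.
p = p_insecure(0.10, 28)
print(f"one sampling is insecure with probability ~2^{log2(p):.1f}")  # roughly 2^-12

# 50 subnets resampled weekly is ~2500 samplings per year, all of which must succeed.
samplings_per_year = 2500
p_at_least_one_bad = 1 - (1 - p) ** samplings_per_year
print(f"P(at least one insecure subnet per year) = {p_at_least_one_bad:.2f}")
```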

So to recap, based on these numbers, I think we can say:

  • shuffling nodes is nice against “adaptive” adversaries that target one subnet, assuming it takes some time to corrupt a node
  • shuffling nodes is not so nice in the sense that we select new subnet memberships every time, and each time we must select a secure set of nodes. We can get unlucky and combine too many malicious nodes, which can then collude to break the subnet
  • static subnets are weak against adaptive adversaries
  • static subnets are better against static adversaries

10 Likes

I think the question comes down to what is the more realistic modelling assumption. An adaptive adversary seems much more realistic to me, where there is some probability that a node is corrupted over time, say 5% in any time interval. In this real-world scenario you may expect a node that has been corrupted to stay corrupted, so you would effectively be “running repeated trials” anyway; the only difference is that by not shuffling you have no mechanism to prevent the build-up of colluding node providers.

Also, a factor of 2500 doesn’t sound too high. I would imagine that this could be compensated for by a small increase in the number of nodes running consensus? It should only need a logarithmic increase in the number of node providers?

7 Likes

Beautiful points. To further crystallize the implications of your fault analysis @Manu, I added some columns and resulting graphs that forecast the odds of staying at or below the Byzantine fault tolerance threshold in a subnet. This is all as I best understand it, but someone please correct me if I am wrong. I take the time interval to be one hundred years for various shuffling intervals (daily, weekly, monthly, quarterly, yearly, and never) and take the odds of a halt-free (no faults) subnet for one sampling (1-p_insecure) raised to the power of the total number of shuffles (100 years × #ofShuffles/year), shown below, but all available at this google sheet.

Basically, the message of these graphs is that the less shuffling, the better, to avoid matching corrupted nodes into a subnet, assuming a given level of corruption. Having more nodes in a subnet obviously helps reduce the likelihood of a fault. (In the spreadsheet I calculate up to 50 and 100 nodes too, but omitted them here for the sake of space.)
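For reference, here is a sketch of the kind of calculation behind those graphs, under the same assumptions: a fixed fraction of corrupt nodes and a fresh, independent sampling of the whole subnet at every shuffle. The subnet sizes and the interval-to-shuffles-per-year mapping below are illustrative, not taken from the spreadsheet.

```python
from math import ceil, comb

def p_insecure(probability_corrupt: float, subnet_size: int) -> float:
    """P(a freshly sampled subnet has at least ceil(n/3) corrupt nodes)."""
    t = ceil(subnet_size / 3)
    return sum(comb(subnet_size, k)
               * probability_corrupt ** k
               * (1 - probability_corrupt) ** (subnet_size - k)
               for k in range(t, subnet_size + 1))

YEARS = 100
SHUFFLES_PER_YEAR = {"daily": 365, "weekly": 52, "monthly": 12,
                     "quarterly": 4, "yearly": 1, "never": 0}

for subnet_size in (13, 28):
    for corrupt in (0.05, 0.10):
        p = p_insecure(corrupt, subnet_size)
        for interval, shuffles in SHUFFLES_PER_YEAR.items():
            trials = max(1, YEARS * shuffles)  # "never" keeps only the initial sampling
            halt_free = (1 - p) ** trials      # every sampling must stay below the 1/3 threshold
            print(f"{subnet_size:3d} nodes, {corrupt:.0%} corrupt, {interval:>9}: "
                  f"P(halt-free over {YEARS} years) = {halt_free:.4f}")
```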

However, I think the model assumes a fixed percentage of corrupted nodes regardless of shuffling, even though intuitively the temptation to collude in a static subnet should be greater and would grow over time. If we assume that staying on a subnet for a long time comes with a greater temptation to collude, then we would have to run some new numbers. The problem is, the math gets more complicated and requires its own batch of assumptions. Maybe someday soon.

9 Likes

Very nice @rubenhorne!

I think Rubenhorne’s data above actually shows that it still hurts quite a bit. We need subnets to be secure with overwhelming probability. We see that if we assume at most 5% of all nodes are malicious, and we sample a 28-node subnet, the probability that it will be insecure is 2^-20.78, so almost 1 in 2 million. That seems pretty good. But if we do this 2500 times and need to be right every time, the probability that at least one sampling fails is about 2^-9.5, so more than 1 in a thousand. I think that probability is not small enough to call secure.

We can indeed increase the subnet size to counter this. As an example: sampling a 28-node subnet once (with overall corruption 5%) has roughly the same success probability as sampling a 46-node subnet 2500 times.
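As a sanity check, the same binomial model reproduces those figures; the 5% corruption level, the 2500 samplings and the 28- versus 46-node comparison are the ones quoted above.

```python
from math import ceil, comb, log2

def p_insecure(probability_corrupt: float, subnet_size: int) -> float:
    """P(a freshly sampled subnet has at least ceil(n/3) corrupt nodes)."""
    t = ceil(subnet_size / 3)
    return sum(comb(subnet_size, k)
               * probability_corrupt ** k
               * (1 - probability_corrupt) ** (subnet_size - k)
               for k in range(t, subnet_size + 1))

CORRUPT = 0.05
SAMPLINGS = 2500

once_28 = p_insecure(CORRUPT, 28)
print(f"28 nodes, one sampling:    2^{log2(once_28):.2f}")       # about 2^-20.78

many_28 = 1 - (1 - once_28) ** SAMPLINGS
print(f"28 nodes, 2500 samplings:  2^{log2(many_28):.2f}")       # about 2^-9.5

many_46 = 1 - (1 - p_insecure(CORRUPT, 46)) ** SAMPLINGS
print(f"46 nodes, 2500 samplings:  2^{log2(many_46):.2f}")       # roughly comparable to the single 28-node sampling, per the post
```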

7 Likes

I agree with the importance of the assumption of adaptive adversaries. I’m more concerned with node operators that start out honest, or seem honest at first, and then become corrupted over time.

I have some rebuttals to arguments made and some deeper information I hope to get out next week.

7 Likes

I will point out that 4 minutes is very much a theoretical lower limit. In practice, it would be on the order of hours rather than minutes, for a number of reasons:

  1. Some node operators only have 1 Gbps and that may only be burst traffic, not sustained. I believe we currently require 10 Gbps from new node operators, but again that may be for burst, not sustained traffic. As a specific example, a few months ago when I was testing using QUIC for XNet messaging, the highest sustained throughput I saw between Europe and US West Coast was 400-500 Mbps.

  2. Most subnets are well below the 300 GB limit right now, but that’s exactly because we explicitly tried to delay this for as long as possible (by adding subnets to spread the load), specifically so we don’t require many, many hours to state sync a node should it need replacement. In the long term we will want to use as much of that 300 GB capacity as possible in order to make the IC economically viable.

  3. An admittedly less important, but still worth mentioning, point is that transferring 300 GB assumes the state of the subnet does not change (much) for the duration of the state sync. This may be a good enough approximation if state sync actually takes 4 minutes, but if we go with a more realistic 1-hour estimate, then by the time we’ve transferred the whole 300 GB it is likely that some of it will have changed. So we’ll need to state sync those changes, and then do that again and again, until we’re fully caught up. Which means we may well need to transfer quite a bit more than 300 GB.

  4. Even given an actual 10 Gbps of sustained bandwidth per data center, there is an underlying assumption here that nothing else is using that bandwidth. I.e. no gossip, no ingress or canister-to-canister traffic, no replica or OS upgrades and (more importantly) only a single node doing state sync at a time. Some discussion here suggested swapping out a node every hour. If so, you would definitely have more (and possibly a lot more) than 1 node state syncing at any given time, resulting in only a fraction of this (very much theoretical) 10 Gbps available to each node. If you now have to do an emergency swap of a node on a subnet (because some other node went up in smoke or whatever) the situation becomes even worse (particularly for this emergency swap).

All of the effects above are cumulative: if you only have half the theoretical 10 Gbps available for sustained traffic; the subnet state is very close to 300 GB; you need to transfer 150% of the state size before you’re fully caught up; half of the available bandwidth is used by higher priority traffic (gossip, ingress, XNet); and one other replica is doing state sync; then you suddenly have 1/8th of the bandwidth to transfer 450 GB, i.e. you’re 12x under the theoretical maximum.
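For concreteness, here is the arithmetic behind that 12x figure, using the illustrative degradation factors listed above (a back-of-the-envelope sketch, not measured values):

```python
GBITS_PER_GBYTE = 8

def sync_time_minutes(state_gbytes: float, bandwidth_gbps: float) -> float:
    """Minutes to transfer `state_gbytes` at `bandwidth_gbps` gigabits per second."""
    return state_gbytes * GBITS_PER_GBYTE / bandwidth_gbps / 60

# Theoretical lower limit: 300 GB at a full 10 Gbps.
ideal = sync_time_minutes(300, 10)
print(f"ideal state sync:    {ideal:.0f} minutes")  # ~4 minutes

# Cumulative degradations listed in the post:
bandwidth = 10 * 0.5      # only half of 10 Gbps is sustainable
bandwidth *= 0.5          # half of that goes to gossip / ingress / XNet traffic
bandwidth /= 2            # one other replica is state syncing at the same time -> 1.25 Gbps
state = 300 * 1.5         # ~150% of the state transferred before fully caught up -> 450 GB

degraded = sync_time_minutes(state, bandwidth)
print(f"degraded state sync: {degraded:.0f} minutes "
      f"({degraded / ideal:.0f}x the theoretical minimum)")  # ~48 minutes, 12x
```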

10 Likes

I’m thinking more along the lines of political or even military attack. The more often the better, subject to practical constraints.

3 Likes

“Adaptive” adversaries who are tempted by a specific target seem more realistic and relevant to me than “static” adversaries who are malicious on principle and would attack indiscriminately.

Also, if I understand correctly, this analysis relies on the assumption that these universally malicious nodes would instantly know when they have been grouped in a subnet with collaborators. I guess this is possible if they know each other ahead of time, but I think it’s more likely that they would need to discover each other, which is a risky and time-consuming process made less likely by node shuffling.

5 Likes

What if we could transfer canisters between subnets? E.g. have a special API that, once called, makes the canister reappear on some other random (and available) subnet. The canister itself can cover all the expenses. The developer behind the canister can decide for themselves whether they want to use this mechanism to provide a more secure experience or not.

Like an update_code call, but update_subnet.

2 Likes

This is so much better.

This could even create a positive side effect. Since node providers are publicly known, it is in their interest to protect their reputation. Otherwise devs would just migrate their canisters to better subnets, leaving the bad actors without money.

1 Like

This looks like a good idea, but I can see that it would also open a new set of problems:

  1. What if many projects decided to move to a specific subnet at once? Wouldn’t that congest this particular subnet?
  2. If you allow a canister to be moved at will, then a canister could select an attractive subnet which holds a lot of value and try to DDoS it, or congest it by taking up all of its traffic or state.
  3. What if a canister that’s not allowed in some jurisdictions decided to move to one of those jurisdictions? Would it cause legal issues for the node provider?

2 Likes

Thanks a lot for all this info @free! So I was a bit optimistic with my concrete numbers. It also points to the fact that we currently allow subnet sizes that might pose challenges for state sync, which need to be addressed separately. For the purpose of this topic, would you say it’s reasonable that we could at least swap out a couple of nodes per subnet per day?

That’s a good idea; this is definitely something that the internet computer will need soon (e.g. to load balance between subnets). I know some of my colleagues are already thinking about this. However, I don’t think this really addresses the same problem as shuffling subnet membership, and it has some downsides:

  • moving canisters does not make the subnet more secure. In particular, we don’t get the property of “every subnet becomes more secure the larger the internet computer grows”
  • moving a canister between subnets will very likely involve downtime for the canister (or at least, allowing downtime will make the problem a lot easier). Adding and removing nodes to/from a subnet does not stop the subnet, since it’s a fault-tolerant system.

4 Likes

With the current data center sizes (<20 nodes) and bandwidth (10 Gbps), that sounds doable. At a subnet size of 13, it would mean 3 nodes (on average) doing state sync within a 24-hour period.

But given larger data centers (without a commensurate increase in bandwidth), smaller subnets or more than a small handful of nodes switched out per subnet per day, you would run into bandwidth limitations at some point.

3 Likes

@Manu but if we frame the attack as “I want to minimize the expected amount of time that a malicious node may have access to a canister” (as @skilesare suggested), would that not be a relevant mitigation tactic?

That being said, I do not have an opinion on how highly that attack lands in the stack ranking of risks.

2 Likes

The examples of shuffling mentioned so far mostly take a completely random sample of the population. Wouldn’t we want to do this semi-randomly, based on other factors like:

  • Optimising for the biggest number of different data centers
  • Optimising for the biggest number of different node owners (if that is possible)

4 Likes

That’s a great point @Fulco, I think we should definitely take that into consideration.

It would; all I meant was that increasing the security of subnets is nicer, because then all canisters automatically get more security. If you can only get this increased security by the developer moving their canister all the time, and canister migration incurs some downtime, then you need to choose between convenience + availability and increased security.

3 Likes

Thank you for clarifying the consensus mechanism. Even if a single malicious replica can reorder transactions, I imagine that replica would need to be modified to reorder transactions for specific canisters.

It seems reasonable to me that MEV-specific replicas would need to be created with individual canisters in mind. So I think shuffling would still help, as the node operator would only be able to reorder transactions and reap MEV while a member of a subnet that hosts the targeted canisters.

Perhaps MEV can still be programmed into a replica without targeting specific canisters, but just classes of canisters…that seems less likely. And perhaps if a sufficient number of replicas are MEV-modified for 100s or 1000s of canisters, then shuffling wouldn’t help.

But it seems that shuffling would still make MEV relatively more difficult to pull off than without shuffling. Combine that with enclave attestations and in-canister mitigations, and I see MEV as nearly impossible to achieve.

But perhaps shuffling offers little benefit relative to attestations and in-canister mitigations; I am not sure.

2 Likes