Can ICP achieve stronger data protection than centralized cloud providers?

Hi everyone!

Centralized cloud providers are constantly targeted and breached, which leads to data loss and downtime. Data may be encrypted, but ciphertext stolen today could still be decrypted later if future technology breaks the encryption.

With ICP we effectively get 100% uptime—but how do we reach the highest level of data protection?

If we put all data on one subnet and a single node is compromised, the whole dataset is at risk. I don’t see a strong reason to use that model.

So I’ve been thinking: what if we shard data across multiple subnets? That would significantly lower the risk of compromise, since an attacker would need access to at least one node in each subnet. That’s harder, but the data could still be compromised when reassembled.
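To put rough numbers on that intuition, here is a back-of-the-envelope sketch. It assumes every subnet is breached independently and with the same probability `p`; both are simplifying assumptions of mine, and the values below are purely illustrative, not measurements of any real subnet.

```python
def all_subnets_breached(p: float, k: int) -> float:
    """Probability that an attacker breaches at least one node in each of
    k subnets, assuming independent breaches with probability p per subnet
    (hypothetical model, not an ICP figure)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return p ** k


# Illustrative values only: compare one subnet against three.
single = all_subnets_breached(0.1, 1)
sharded = all_subnets_breached(0.1, 3)
```

Under this model the risk of a full compromise shrinks exponentially with the number of subnets, but each individual shard is still exactly as exposed as before, so plain sharding only helps if a single shard reveals nothing useful on its own.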

Is it even possible to achieve a higher level of protection with ICP than with classic centralized cloud providers?

I dream of a world where each actor owns their own data, and the only weak point IS the actor, not the system. That would greatly reduce the damage a single breach could cause.

I’d love feedback on whether this direction makes sense or if it’s unrealistic for ICP or in general.

Thank you

This is what I’m waiting for. Right now you can encrypt the data you store, but you don’t get encrypted state; that is what TEEs will bring. Side-channel attacks are still possible on TEEs, though. FHE and MPC prevent this but add massive overhead; see Arcium for MPC or Fhenix for FHE (just examples, not an ad). I, for one, am especially waiting for the Swiss subnet TEE.

Yes, TEEs mitigate the risk, but as I understand it they’re still just a coating: they don’t fix the underlying design. And as stated in the vetKeys paper (https://eprint.iacr.org/2023/616.pdf):
“TEEs do not provide a very strong level of security for user data, and in particular, fail to provide any relevant protection against an adversary that has physical access to the node machines.”

I think we need a new architectural vision for how canisters should store data, so that even if a hacker gains access to a node and sees the state or data, the design doesn’t give them anything usable. It feels like a system-architecture problem I’m trying to solve.

I don’t have the answer, but I suspect it would require some form of cross-validation and data transformation—data spread and reshaped so that no single view reveals the whole picture.
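One existing technique that matches this description is n-of-n XOR secret sharing, so here is a minimal sketch of it. It is not an ICP or canister API; the function names `split_secret` and `combine_shares` are hypothetical, and the idea of placing each share on a different subnet is my assumption for illustration.

```python
import secrets


def split_secret(data: bytes, n: int) -> list[bytes]:
    """Split data into n shares; all n are needed to rebuild it.

    Any subset of fewer than n shares is indistinguishable from random bytes.
    """
    if n < 2:
        raise ValueError("need at least 2 shares")
    # n - 1 shares are pure randomness of the same length as the data.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The last share is the data XORed with every random share.
    last = data
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    shares.append(last)
    return shares


def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the original data."""
    result = bytes(len(shares[0]))  # all-zero start value
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result


# Hypothetical usage: one share per subnet.
shares = split_secret(b"user-record-42", 3)
recovered = combine_shares(shares)
```

With this scheme a node that sees only its own share learns nothing about the data. The limits are the ones already raised in this thread: the data still has to be reassembled somewhere before it can be used, and losing any one share loses the data, which is why computing on shares directly (MPC) or threshold schemes come up as the next step.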

I don’t know—what do you think? Is that even necessary?

Yes, a TEE is just a hardware trust layer: it can be attacked, but it is still an additional layer of security.

I think we need a new architectural vision for how canisters should store data, so that even if a hacker gains access to a node and sees the state or data, the design doesn’t give them anything usable. It feels like a system-architecture problem I’m trying to solve.

It is architectural. Based on my research, an MPC network is the best way. MPC (or FHE) has high overhead, but if we’re willing to pay for the cycles we could get it. Another option would be an outcall to Arcium to handle and store data fully privately, but this increases complexity because of circuits.
As you can see, I have already tried to find a solution, without finding a feasible path. For now, data and state are either public (and can leak) or encrypted end to end.
In my example, skillsic.com, I tried to outsource the sensitive data (the AI provider’s API keys) to the TEE via dual-key encryption, so that only the TEE can decrypt the API key from the canister data.

Interesting, thanks for highlighting that bit.

It therefore seems fair to state that, in their current state, TEEs do not provide a very strong level of security for user data, and in particular, fail to provide any relevant protection against an adversary that has physical access to the node machines.

@rbirkner, is this statement consistent with the latest state of affairs? My understanding was that the IC is moving towards TEEs specifically in order to protect against potential bad actors having physical access to the nodes.