Why are Internet Computer's storage costs so low?

@free
Currently, blocks are not published for regular subnets either. And as I understand it, it is still undecided whether blocks will be published or not.

ICP transactions are ledger transactions. The ledger is a canister deployed on the NNS subnet. I.e. ICP transactions are transactions against a specific NNS canister. That canister (the ledger canister) provides an API to retrieve all transactions.

NNS blocks (and subnet blocks in general) are part of the low level protocol that makes the IC tick.

While one could argue that the ledger canister is also technically part of the protocol (as the protocol covers ICP tokens, neurons and so on), ledger canister ICP transactions are more like database records that can be retrieved by interacting with an application (the ledger canister). Similar to e.g. querying Amazon for your past orders.

Whereas subnet blocks are more like raw network packets, i.e. the raw data that your network hardware sees. And I guess this is a pretty apt analogy, for two reasons:

  1. Your network hardware (e.g. Ethernet card or WiFi adapter) handles A LOT more data (even if we only consider your interactions with amazon.com) than just your list of Amazon orders. It’s the same with subnet blocks: they contain a lot more than just the set of successful ICP transactions.
  2. This extra data that your network hardware sees will contain highly sensitive information (e.g. username/password combinations or “merely” all the websites you visited). In the same way, NNS blocks will contain all your (and everyone else’s) Internet Identity logins (from before the II canister was migrated) but also your (and everyone else’s) voting history and interaction with neurons.

So while publishing all blocks (NNS or otherwise) would provide extra verification (in addition to that provided by likely independent nodes running known software), it will also provide information that users may have assumed to be (mostly) private between them and the IC. E.g. while I’m aware that a malicious IC node may log all user interactions and make them available to the highest bidder, this is still significantly more privacy than having all my DSCVR posts connected to my IP address and easily searchable by anyone (including my employer/bank/abusive ex).

Canister history is being implemented, but it will simply be a list of all upgrades/re-installs/controller changes of a given canister. Not all requests and responses handled by the canister.

As per the above, there are still points that are unclear regarding publishing NNS blocks. I for one am in favor of publishing NNS blocks (and maybe those of a few other select subnets, but not all). The main opposition I heard to this is that NNS blocks contain every principal’s voting history and whether directly (threats, intimidation) or indirectly (fear of the above) it may have a negative effect on voting participation and democracy in general.

1 Like

@free
I now have a better understanding. Very interesting! Thank you very much.

I believe we need to start comparing prices with Amazon and other cloud hosting services. It would be an interesting comparison for a social networking site, for example with, say, 5 million users. I believe OpenChat currently has 1 canister per user, so that's basically $5 for the year for each of those canisters. I am wondering how they deal with, or plan on dealing with, something like that…
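As a rough back-of-the-envelope check of what a canister-per-user design could cost: the $5/GB/year price and the one-canister-per-user model come from the posts above, but the 10 MB of state per user is a made-up assumption for illustration.

```python
# Back-of-the-envelope hosting cost for a canister-per-user app.
# Assumptions (hypothetical): $5/GB/year storage price, 5 million users,
# and an illustrative 10 MB of state per user canister.
PRICE_PER_GB_YEAR = 5.0
USERS = 5_000_000
MB_PER_USER = 10  # made-up figure for illustration

total_gb = USERS * MB_PER_USER / 1024
yearly_cost = total_gb * PRICE_PER_GB_YEAR
print(f"{total_gb:.0f} GB stored, ~${yearly_cost:,.0f}/year")
```

Under these toy numbers the whole user base works out to under $250k/year in storage; the real figure obviously depends entirely on how much state each canister actually holds.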

1 Like

I agree, Internet Computer has higher storage costs than Amazon S3.
I am looking forward to Storage Subnet.

1 Like

Interesting stuff. Is it possible for the cost of storage to get even cheaper over time as the system evolves, or is this as low as it gets? Let's take the example of storing the phone photo for 1.6 cents: could it get even cheaper over time? Is it comparable to the cost of storing on AWS, or is AWS still a lot cheaper? I understand the benefits that come with storing on the blockchain, though; this is just for comparison.

1 Like

Actually, the $5/GB/year cost is somewhat optimistic, as it was (among other things) computed for a subnet of only 4 or 7 nodes, not the 13 nodes that are standard now.
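To make the point above concrete, here is a naive linear rescaling of the quoted price by replication factor. Assuming cost scales proportionally with node count is a simplification (it ignores node rewards, overheads, and how the price was actually derived), so treat this purely as an order-of-magnitude sketch.

```python
# Naive linear rescaling of the $5/GB/year figure by replication factor.
# Assumes cost is proportional to the number of replicas, which is a
# simplification; real pricing involves more than raw replication.
price_at_7_nodes = 5.0  # $/GB/year, the figure quoted in the thread
for nodes in (4, 7, 13):
    print(f"{nodes} nodes: ~${price_at_7_nodes * nodes / 7:.2f}/GB/year")
```

Under that assumption, the 13-node figure lands closer to $9/GB/year than $5.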

That being said:

  1. Storage costs in general keep dropping. So if the past is any indication, it’s likely that IC storage will also become cheaper over time.
  2. We are talking about the cost of replicating every piece of data 13x, in such a way that it can be retrieved and mutated within less than a second (it's all either canister heap or stable memory). We have been discussing (although we haven't actually gotten past the discussion stage) cheaper, immutable storage. Something like Amazon S3, but based on e.g. IPFS. A subnet would be able to generate some piece of content and store it in IPFS; and it would be able to "pin" some content uploaded to IPFS by a third party and use that content in replicated transactions (same as e.g. ingress messages). As long as the content is pinned, it is mutable and costs as much as other canister memory. But when unpinned, it would cost a lot less, as it would require fewer replicas (e.g. 4 instead of 13); it would definitely not be mutable; and it would maybe not even need to be available within seconds. So one could e.g. use spinning disks instead of SSDs. But one would still pay for storage using cycles, and the IC would guarantee persistence and immutability. Just an idea, though.
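The pinned/unpinned idea above can be sketched as a toy pricing model. All replica counts and prices here are illustrative assumptions derived from the figures in this thread, not real IC parameters.

```python
# Toy model of the pinned/unpinned storage idea discussed above.
# Replica counts and prices are illustrative assumptions, not real
# IC parameters.
FULL_REPLICATION = 13   # replicas for pinned (mutable) canister memory
COLD_REPLICATION = 4    # hypothetical replicas for unpinned, immutable data
PRICE_PER_REPLICA_GB_YEAR = 5.0 / FULL_REPLICATION  # from the $5/GB/year figure

def storage_cost(gb: float, pinned: bool) -> float:
    """Yearly cost in dollars for `gb` of data under this toy model."""
    replicas = FULL_REPLICATION if pinned else COLD_REPLICATION
    return gb * replicas * PRICE_PER_REPLICA_GB_YEAR

print(f"pinned 100 GB:   ${storage_cost(100, pinned=True):.2f}/year")
print(f"unpinned 100 GB: ${storage_cost(100, pinned=False):.2f}/year")
```

Even this crude model shows the appeal: dropping from 13 replicas to 4 cuts the storage bill by roughly two thirds, before even accounting for cheaper hardware like spinning disks.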
7 Likes

Sounds good even though I'm not very technical. With that being said, isn't IPFS something that one would avoid in blockchain? I think I heard that said in some speeches, mentioning the storage of AWS and IPFS in the same breath.
For example, when people reply to the statement that building on AWS is not how it should go, they say, "well, you can use IPFS instead", which is another form of basically the same thing.

That's just what I've picked up in some speeches so far; I'm probably all the way wrong about it.

1 Like

IPFS is a decentralized protocol for storing, looking up and retrieving immutable content. Similar to Bittorrent in many ways. I don’t know in what context it was compared to AWS, but I don’t see much in common between an open protocol and a big tech company.

IPFS by itself does not provide any guarantees that your data is accessible at all times; or even that it won’t disappear altogether. But there are networks built on top of it (such as Filecoin) that provide those guarantees (by rewarding nodes to store specific pieces of content). The IC could rely on something similar to Filecoin, but e.g. using cycles for payments and ICP for node rewards, so it more seamlessly integrates with canisters.

5 Likes

“IPFS by itself does not provide any guarantees that your data is accessible at all times; or even that it won’t disappear altogether” Ok, I heard that said before about IPFS. I probably got it mixed up with something else then.

Sounds great; nice how you all are looking ahead.
One last quick unrelated question about the transaction speed that is said to be 100-200 ms. Is it possible for it to get even faster over time, for example down to around 15 ms, to run multiplayer games that require that kind of latency? Probably not right now, but I believe DFINITY can make it happen in the future!

Transaction latency is on the order of 1-2 seconds. This is because, in order to achieve consensus and have all replicas execute the exact same transactions in the exact same order, an ingress message (user request) has to go through a number of steps:

  1. Be received by a replica and gossiped around to others.
  2. The block maker (a randomly chosen replica out of the 13+ on a subnet) needs to include it into a block and gossip that block around to all other replicas.
  3. All replicas (or at least two thirds) must validate and notarize the block; and gossip their notarization to all other replicas.
  4. As soon as a replica has notarizations from 2/3 of all replicas, it can proceed executing the message (plus all other messages in the block).
  5. Once execution is complete, the replica creates a canonical representation of the state of all canisters; certifies it; and gossips the certification to all other replicas.
  6. Once a replica has identical certifications from 2/3 of replicas, it can “publish” the response, certified by the subnet.
  7. The client agent polls a random replica and sees the certified response.

As you can see there are a number of steps where some piece of data must be gossiped around to all replicas before any of them may progress to the next step. Meaning that this data needs to travel across the Atlantic and/or Pacific a few times before the transaction is executed and its result certified (so the user may be certain that a majority of replicas agree on the response).

One could cut down on the latency by putting all replicas on the same continent / in the same country / in the same data center. But this has obvious effects on the level of trust and (de)centralization. You could even have a single replica subnet, and skip all the network latency. But the protocol still requires a number of synchronization points (a block must be built; then executed; then certified), meaning that it is unrealistic to expect 15 ms latency regardless of how far the subnet has been stripped down.
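The steps above can be roughly modelled as accumulated gossip round trips plus execution time. The round-trip time, step counts, and execution/polling delays below are illustrative assumptions, not measured IC values; the point is only that several sequential gossip rounds make sub-100 ms update latency structurally implausible.

```python
# Toy model of ingress (update) latency: each gossip round adds roughly
# one inter-replica round trip. All numbers are illustrative assumptions,
# not measured IC values.
RTT_MS = 150  # assumed round trip between geographically distant replicas

gossip_rounds = {
    "gossip ingress message": 1,
    "block proposal gossip": 1,
    "notarization gossip": 1,
    "certification gossip": 1,
}
execution_ms = 200  # assumed time to execute the block
polling_ms = 300    # assumed average delay until the client agent polls

total = sum(gossip_rounds.values()) * RTT_MS + execution_ms + polling_ms
print(f"~{total} ms end-to-end")
```

With four sequential gossip rounds at a 150 ms RTT, the floor is already around 600 ms before any execution or polling delay, which is consistent with the observed 1-2 second latency.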

4 Likes

Thanks for the detailed answer, very insightful; I really appreciate it! Interesting how this works, and actually pretty impressive that something like this can work within 1-2 seconds.
Recently, in a Twitter Spaces with HashKey, I think it was mentioned that a query call can reach between 100-200 ms for gaming, and I was wondering if there is an option for this to get even faster, because you know how hardcore gamers value their low latency. But personally I am proud of and satisfied with the stats as they currently are, and for me it would be more than enough.

Queries are read-only requests, executed by a single replica. They can be as fast as the average Web 2.0 query, but because they are read-only, you cannot make an interactive game out of them alone.

Updates / transactions are executed by all (or most) replicas and they are slower, as per the above.

1 Like

Ok, thanks for clearing it up. This was very helpful, and I appreciate the time and patience.

We can find different words to try to make it look like it’s the same as having the storage on the IC, but it isn’t like that.

Once you store data on IPFS it is not yours anymore, and I don’t see how ICP can provide guarantees over that data. Everyone needs to align with Dominic’s vision and stop trying to find quick solutions or alternatives. A lot of investors are here because of the promises Dominic has been making, so I don’t want to see us taking a “similar approach to Filecoin”. This just demotivates me, to be honest.

Also, if it wouldn’t be on chain anymore, is that the future we want?

Controlled by DAOs (no central control), or blackholed so it becomes immutable.