Yes. Anyone can record anything. The next unit that picks up these messages will have to decide which ones should be used.
As with all other off-chain execution models, things recorded on-chain are only there for availability. How to interpret them is up to the code running the off-chain execution.
That's in the first year. Costs grow quadratically over time. So in 10 years you'd have to pay 50x more per year. Storage costs may decrease over time, but quite possibly not at that rate.
Why quadratically? Sure, with more use, but is there something else that makes it grow faster? Costs also decline according to Moore's law (or have historically), so 5000 should be less than 78 in 10 years.
Yes… this completely depends on if there is a use case. I think I have one for tokens, but it's likely less important for websites and other static files. Although it can be hard to know what might be interesting eventually. Data is getting more and more valuable with AI. I guess the question to ask is what is in the block data? And what is in the block data that a contract couldn't record itself if it were interested in eternal persistence? (Trying to understand the actual bytes… I don't think a bunch of trace info about who has signed would be interesting. Calldata very well might be.)
To me this begs the question of why aren't we setting up some kind of staking for boundary nodes? The certification dance is a pain (especially for dynamic queries) and some financial guarantees would be interesting… maybe this needs its own thread.
My dots could have been better connected and better laid out. I even screwed up a vocab word (thanks @timo, fixed above). But to be fair, I did it on an iPhone while drinking beer by the river with my kids throwing rocks far too close to me.
As for why I think an EVM on AO would be slower: it's due to the messaging going through the Arweave layer. Depending on your trust assumptions, this is going to add latency over the IC model (maybe the IC model needs to come down on the security axis a bit due to a lack of economic slashing security).
My CU running from the IC is going to be slower than what may be possible on an unbounded CU, but mostly because of the tECDSA signing (if you are OK with an on-node key then this could be faster… but less secure… enclaves would make this go away). But I think it's going to be fast enough for "most" applications that are typical in crypto land. It won't be able to pick up and run any old CU wasm. But you'll get 13x node agreement out of the box. So some trade-offs… we'll see once I'm done if it is practical.
Unless the nodes writing the output are all on the IC and agreeing to sign with a common tECDSA key, and that is the security (you only trust messages signed with that key). :) (The eventual question this begs is why use the AO layer then? My very early theory is only memetics, or a very specific kind of compute that makes the other hoop jumping worth it.)
Because I'm stupid. I was thinking of the total cost. After 10 years you would have paid 1 + 2 + 3 + … + 10 = 55 times as much as what you paid after one year. But the growth is linear, not quadratic. Only the total cost is quadratic.
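A quick sketch of that arithmetic (in Python, using an arbitrary base cost of 1 unit for the first year, purely for illustration):

```python
# Illustrative sketch of the cost model discussed above: the annual cost
# grows linearly with the accumulated data, so the cumulative cost is quadratic.
base_annual_cost = 1  # cost of year 1 in arbitrary units (an assumption for illustration)

total = 0
for year in range(1, 11):
    annual = base_annual_cost * year      # linear growth per year
    total += annual                       # quadratic growth of the total
    print(f"year {year:2d}: annual = {annual:2d}x, cumulative = {total:3d}x")

# After 10 years: the annual cost is 10x the first year's,
# and the cumulative cost is 1 + 2 + ... + 10 = 55x the first year's.
```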
Regardless, trying to predict the future (especially when it comes to costs) is fraught. E.g. looking at this chart I found online (and assuming it's vaguely accurate), the cost of disk storage only dropped from $37 to $13 between 2011 and 2022. That's less than 3x. More than one order of magnitude away from what Moore's law would have predicted.
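For reference, the rough math behind that comparison (assuming a doubling of capacity per dollar every ~2 years as the Moore's-law baseline, and taking the chart's $37 and $13 figures at face value):

```python
# Observed disk-price drop vs. a naive Moore's-law projection.
# The ~2-year doubling period is an assumption for illustration.
years = 2022 - 2011                 # 11 years
moore_factor = 2 ** (years / 2)     # ~45x cheaper predicted
observed_factor = 37 / 13           # ~2.8x cheaper observed, per the chart

print(f"Moore's-law prediction: ~{moore_factor:.0f}x cheaper")
print(f"Observed:               ~{observed_factor:.1f}x cheaper")
# ~45x vs. ~2.8x: more than an order of magnitude apart.
```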
I don't think that staking addresses the issue that Paul brought up: yes, you can speed things up, but if you have a long chain of transactions spanning multiple canisters/subnets and one of them towards the beginning turns out to have been incorrect, what can you do about it?
And if you're going to wait for certification whenever a chain of transactions is involved, how is that different from issuing an update and a query at the same time and temporarily using the output of the query (e.g. displaying it in a front-end) until you get the certified response from the update?
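A minimal sketch of that "optimistic query, then reconcile with the certified update" pattern (the `query_call` and `update_call` helpers here are hypothetical stand-ins, not a real agent API):

```python
import asyncio

# Hypothetical stand-ins for an agent library's query/update calls;
# the sleeps just mimic "fast, uncertified" vs. "slow, certified" latency.
async def query_call(canister_id: str, method: str) -> str:
    await asyncio.sleep(0.1)
    return "tentative-result"

async def update_call(canister_id: str, method: str) -> str:
    await asyncio.sleep(2.0)
    return "certified-result"

def render(result: str, certified: bool) -> None:
    print(f"{'certified' if certified else 'tentative'}: {result}")

async def main() -> None:
    # Fire the update immediately, but don't wait for it yet.
    update_task = asyncio.create_task(update_call("my-canister", "get_state"))
    # Show the fast query result right away...
    render(await query_call("my-canister", "get_state"), certified=False)
    # ...then replace it once the certified update response arrives.
    render(await update_task, certified=True)

asyncio.run(main())
```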
Lots of great discussion on many topics here! I'll quickly reply wrt the instructions limit:
I don't think you can certify things per canister; this might work for 10 canisters on a subnet but seems entirely unfeasible for 100k canisters on a subnet, because creating threshold signatures with all subnet nodes is still significant work.
I agree this would be great. Currently I don't think we know how we could extend DTS to go over checkpoint boundaries, so that's one limitation we have now. We are planning to propose to double the DTS instructions limit (from 20B to 40B), so that should already help here. Increasing beyond that would likely be more involved, so I am not aware of any concrete plans there.
It's only sort-of transparent. If your process does actually exhaust the RAM, the system will likely collapse for all practical purposes due to thrashing. Another example is supercomputers; in theory, they expose petabytes of RAM, but generally use a NUMA (non-uniform memory access) design, so you have to design your app quite carefully, otherwise it's going to be extremely slow. So straight-up "infinite memory" doesn't exist in Web2, even in the supercomputer world, and that's a world with highly specialized (and expensive) hardware and zero maliciousness. I don't think it's realistic to expect it in a Web3 world either (and I don't see how Arweave solves that, for example).
That's not to say that we can't have better support for storing lots of data on the IC (for example, what CanDB was doing; I'm not sure what the state of the project is now). But even with this support you can't expect to treat it the same way as you'd treat your Wasm heap.
I'm aligned with what you are saying in general. Knowing the data is certified is certainly better than betting that it is correct. But in extreme circumstances, where you could stream data faster than you could certify it or predict its requests, it might be interesting to at least have some financial assurance from the boundary nodes that they aren't swapping bits on you. Maybe for something like live-streaming video?
I assumed that'd be the case, hence why I'm proposing to have it only on dedicated "heavy processing" subnets, where instead of running lots of canisters doing light work, there are fewer but more demanding ones. In order to achieve the vision of a world computer the IC needs higher throughput; it is unlikely to deliver on its promises if all services running on it are subject to homogeneous constraints.
Even if the threshold sigs weren't a bottleneck, I wouldn't expect such subnets to have a high count of actively running canisters anyway; if there are too many and their execution is time-sliced too often, it'd kinda defeat the purpose.
It'd certainly benefit canisters likely to run into the instruction limit, whether very small ones, e.g. HPL ledgers, or subnet-sized ones, e.g. Bitfinity. The latter kind takes up an entire subnet anyway, and the former could be load balanced by quickly moving them between subnets individually or by subnet splitting.
Though it is true that if the limit of canisters per subnet under this model is too low and the costs have to be increased by many orders of magnitude to make up for it, then they probably wouldn't be used at all, or not enough to justify the engineering effort.
Is ~10 canisters per subnet actually in the ballpark of what we could expect?
Perhaps the way I phrased it made it more dramatic than I intended; I wasn't implying there are hard constraints in the protocol, as it'd be worrying if that were the case.
Nonetheless, if the community suddenly decided to create a 100-node subnet, it wouldn't be possible. Sure, the foundation could prioritize the work to make it happen sooner, but even then, nobody knows how well it'd run; it might function, but further optimization could be required to make it actually usable.
I can understand low-node subnets not being compelling enough to justify the work needed to safely add them, but it is somewhat concerning that almost 3 years after mainnet release, we still don't have even one >=100-node subnet, nor any clue of what kind of performance we can expect when they eventually become a thing.
Imho these too are symptoms of the protocol being developed primarily based on a set of assumptions dictated by the network structure unilaterally chosen for it.
If there had been no a priori bias on which configurations are more desirable, with any form of specialization taking place only after usage patterns spontaneously formed on mainnet, these safeguards likely would already have been implemented, as they'd be mandatory for a genesis release.
Assuming the vision is still to offer a crypto cloud, capable of covering the decentralization spectrum as much as possible, doesn't it make more sense to start by accounting for the "worst case" scenario first? This would entail implementing high-replication, permissionless subnets, and only later optimizing by granting more favorable conditions, such as low-replication, permissioned subnets with server-grade hardware.
As a result the end product would be more robust, as it would need to account for more adversities. Generally speaking, it's easier to optimize a system by providing a less harsh environment (see Hyperledger) than doing it the other way around.
Btw, the cycle issue should at some point be addressed regardless of any new subnet types; the tokenomics are potentially in constant jeopardy of one subnet being taken over.
It's super interesting that it seems to be flattening out. I wonder if Moore's law still holds if you take latency, access, and reliability into account? Likely it looks steeper, but I wonder by how much. What do the replicas use for disk space?
Replicas use data center SSDs in a RAID configuration (not sure which). This is necessary because orthogonal persistence requires the ability to read and write GBs per second. And do so over and over (at rates that would likely cause consumer SSDs to fail within weeks or months).
For an actual example of the disks used in our Gen2 hardware replica nodes, we use the following SSD model: 6.4TB Micron 7450 MAX Series U.3 PCIe 4.0 x4 NVMe Solid State Drive
Each node server has 5 per the specification, so 5 × 6.4 TB = 32 TB, roughly 30 TB usable (some is used by the IC HostOS, etc.). Total cost for the 5 SSDs is around US$6,000, so that gives you a rule-of-thumb cost of about US$200/TB.
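For reference, the arithmetic behind that rule of thumb (using the figures quoted above, which will obviously drift over time):

```python
# Rule-of-thumb storage cost for a Gen2 node's SSDs, using the figures above.
ssd_count = 5
ssd_capacity_tb = 6.4
total_cost_usd = 6000            # approximate total for the 5 drives

raw_capacity_tb = ssd_count * ssd_capacity_tb    # 32 TB raw, ~30 TB usable
cost_per_tb = total_cost_usd / raw_capacity_tb   # ~$188/TB, i.e. roughly $200/TB

print(f"raw capacity: {raw_capacity_tb:.1f} TB")
print(f"cost per TB:  ~${cost_per_tb:.0f}")
```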
Also on the topic of disk price changes, the pricing of these datacentre-grade NVMe SSDs (at least) jumped about ten percent at the end of last year, apparently due to flash module supply constraints. So pricing doesn't always trend down in the near term.
Agree - these are the strongest new blockchain designs. I think they do keep the entire history of the chain though, at least for full nodes… are you sure about this? The TPS figures are a bit overblown though.