Let's solve these crucial protocol weaknesses

Some social context: https://x.com/lastmjs/status/1766471613594083348

ICP has a number of crucial weaknesses that I believe work together to inhibit it from achieving the vision of an unboundedly scalable virtual machine, the blockchain singularity, infinite scalability, or any other way of saying that ICP will be a limitless compute platform.

The premise of prioritizing the complete removal of these weaknesses stems from that vision. If you don’t believe in that vision, believe it’s impossible, or believe it’s unwise to pursue, that’s a separate topic. I also believe that you then go against the vision of ICP that has been sold for years.

These are the crucial weaknesses. We should resolve them to stop inhibiting achievement of the vision. I have put them somewhat in order of my perceived priority or importance:

  1. Instruction limits
  2. Memory limits
  3. High latencies
  4. Message size limits
  5. Storage limits
  6. High costs
  7. Rigid network architecture (subnets static, canisters unable to choose replication/security with flexibility, can’t move between replication factors, homogeneous hardware required)
  8. Centralizing DAO governance (one entity able to gain write access to the provisioned protocol, lack of checks and balances)

From my own point of view and desires, I ask for DFINITY to focus and prioritize with relentless effort to completely resolve each of these issues to the satisfaction of developers and community members.

Wisdom is of course required to weigh these concerns with the many other concerns. But I urge you to consider the possibility of great harm being done to ICP’s growth and success by continuing to prioritize other matters over these crucial use-case-inhibiting weaknesses of the protocol.

46 Likes

Thank you! ICP needs this to become more powerful in the future!

2 Likes

We don’t have much time to waste. More and more players are entering the fight against ICP.

I personally totally agree with you, and I hope the DFINITY team solves these problems as soon as possible.

ICP would be the best if it could clear up these problems.

6 Likes

Thanks for being the voice of this, Jordan. It feels like this summary of requests is a broken record of community-requested priorities.

My personal project could have greatly benefited from higher instruction, memory, and storage limits. Having lost hope in timely progress on this, I’ve moved these operations off-chain, and I wish I had made that decision months ago. I don’t plan on migrating off the IC, but since my dapp is now an ‘off-chain hybrid’, it forces me to wonder why I’m building here.

7 Likes

Hope those issues get prioritized as well.

1 Like

I appreciate you taking the time to express your thoughts both on Twitter and here in the forum. I do have a few questions.

As someone who works in computer science professionally, regardless of the DFINITY marketing material, do you believe it is actually possible to build a world computer that provides “unbounded” processing and “infinite scalability”?

I have other questions/feedback but I was hoping you might respond to this question first.

3 Likes

My concern is: will there be an official person involved in the discussion?

I hope so. My original hopes for ICP were a protocol like TCP/IP. Would you consider TCP/IP infinitely scalable? I would.

For example, instead of subnets being exposed to users and storage being limited to a single subnet, I envision a virtual memory system, perhaps on top of an underlying subnet system, that abstracts away the underlying storage access mechanisms. Operating systems do this when physical RAM is reaching capacity, allowing the hard drive to be used as overflow for RAM, all of this, as I understand it, transparent to the processes. They don’t need to be programmed with any understanding of the virtual memory system.

Is there any reason why a virtual memory system could not scale infinitely, at least to the needs of the current population of Earth?
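To make this concrete, here’s a minimal TypeScript sketch, purely illustrative and assuming nothing about the actual protocol: `SubnetStore`, the page table, and the placement policy are all made up, not real IC interfaces. The point is only that a page table can map one flat address space onto many backing stores so the caller never sees the boundaries.

```typescript
// Purely illustrative: none of these types are real IC interfaces. A page
// table maps a single flat address space onto a handful of "subnet" stores,
// so the caller only ever sees one contiguous memory.

const PAGE_SIZE = 64 * 1024; // 64 KiB pages, mirroring the Wasm page size

type SubnetId = string;

interface SubnetStore {
  readPage(pageIndex: number): Uint8Array;
  writePage(pageIndex: number, data: Uint8Array): void;
}

// An in-memory stand-in for one subnet's storage.
class InMemorySubnet implements SubnetStore {
  private pages = new Map<number, Uint8Array>();
  readPage(pageIndex: number): Uint8Array {
    return this.pages.get(pageIndex) ?? new Uint8Array(PAGE_SIZE);
  }
  writePage(pageIndex: number, data: Uint8Array): void {
    this.pages.set(pageIndex, data);
  }
}

class VirtualMemory {
  // Page table: global page index -> subnet that physically holds the page.
  private pageTable = new Map<number, SubnetId>();
  private nextSubnet = 0;

  constructor(private subnets: Map<SubnetId, SubnetStore>) {}

  write(address: number, data: Uint8Array): void {
    let written = 0;
    while (written < data.length) {
      const { store, pageIndex, offset } = this.locate(address + written);
      const page = store.readPage(pageIndex);
      const n = Math.min(PAGE_SIZE - offset, data.length - written);
      page.set(data.subarray(written, written + n), offset);
      store.writePage(pageIndex, page);
      written += n;
    }
  }

  read(address: number, length: number): Uint8Array {
    const out = new Uint8Array(length);
    let done = 0;
    while (done < length) {
      const { store, pageIndex, offset } = this.locate(address + done);
      const page = store.readPage(pageIndex);
      const n = Math.min(PAGE_SIZE - offset, length - done);
      out.set(page.subarray(offset, offset + n), done);
      done += n;
    }
    return out;
  }

  // Translate a flat address into (subnet, page, offset); unmapped pages are
  // placed round-robin. A real placement/migration policy is the hard part.
  private locate(address: number) {
    const pageIndex = Math.floor(address / PAGE_SIZE);
    const offset = address % PAGE_SIZE;
    let subnetId = this.pageTable.get(pageIndex);
    if (subnetId === undefined) {
      const ids = [...this.subnets.keys()];
      subnetId = ids[this.nextSubnet++ % ids.length];
      this.pageTable.set(pageIndex, subnetId);
    }
    return { store: this.subnets.get(subnetId)!, pageIndex, offset };
  }
}

// Two fake "subnets" behind one flat address space; a write that crosses a
// page boundary is split transparently.
const stores = new Map<SubnetId, SubnetStore>([
  ['subnet-a', new InMemorySubnet()],
  ['subnet-b', new InMemorySubnet()],
]);
const vm = new VirtualMemory(stores);
vm.write(5 * PAGE_SIZE - 10, new TextEncoder().encode('hello across a page boundary'));
console.log(new TextDecoder().decode(vm.read(5 * PAGE_SIZE - 10, 28)));
```

All the hard questions (consensus, atomicity, placement, migration) live behind that `locate` call, but the developer-facing surface is just read and write against one address space.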

1 Like

Thanks for the response. I believe TCP/IP as a protocol is infinitely scalable, sure. Are we just talking about scaling the logical components of ICP? If that’s the case then I think your argument could be presented differently.

But even in this case the amount of swap space is bounded by the configuration of the system, or at the very least, the storage capacity of your drive.

As an engineer, I just have a hard time wrapping my head around unbounded things. When we have unbounded requirements it leads to a lot of risk, as well as potentially wasted effort and resources.

It may seem silly, but even setting this expectation seems more reasonable to me. One of my other questions was going to be whether you were willing to set some measurable expectations for the following bullets; even if they seem crazy, I think it would make for a more productive conversation with the DFINITY engineers.

3 Likes

I’m saying that the virtual memory system would use subnets or any other abstraction or division necessary under the hood to present the developer with a single contiguous address space.

I understand the ask. I’m just not clear on the expected outcome. Am I correct in my understanding that you want this change so that you don’t have to worry about hitting any storage limits? If so, how does creating a virtual memory system spanning multiple subnets prevent that? There aren’t an infinite number of subnets so eventually the protocol will hit a limit there too.

Instruction limits: there should be no limit as long as it is paid for. Hours of computation should be possible, days, even weeks. GitHub Actions, for example, times out after a few days I think.

Memory limits: probably 100s of GiBs, maybe less

High latencies: the same as Web2; I don’t know, under one second for most requests? As fast as the network allows, please.

Message size limits: at least abstract chunking entirely away from the clients and increase the throughput so that it isn’t noticeable. I should be able to upload data at the speed of the network and whatever other processing is required; the actual limit probably doesn’t matter if throughput and chunking are never seen by a dev or library author (a rough sketch of what that could look like is at the end of this post).

Storage limits: unbounded as long as you pay, like I assume Filecoin and Arweave give you. Make a virtual memory system for this

High costs: as cheap as the replication allows? As cheap as possible, barely any margin over the actual resource costs… I don’t know, maybe BFT can be obtained in other ways besides replication.

I want this to feel like a Web2 app but with extra-high levels of security. That’s really what we’re after. And security in these senses: confidentiality, integrity, availability, verifiability
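To illustrate the chunking point above: here’s a hypothetical client helper, where `uploadChunk` stands in for whatever per-chunk update call a canister would expose (it is not a real agent or canister API). The developer calls one function and never sees the per-message limit.

```typescript
// Hypothetical client helper: the developer calls uploadBlob once and never
// sees the per-message size limit. uploadChunk stands in for whatever
// per-chunk update call a canister would expose; it is not a real agent API.

const CHUNK_SIZE = 1_900_000; // stay safely under a ~2 MB ingress message limit

type UploadChunk = (fileId: string, index: number, chunk: Uint8Array) => Promise<void>;

async function uploadBlob(
  fileId: string,
  data: Uint8Array,
  uploadChunk: UploadChunk,
  concurrency = 8
): Promise<number> {
  // Split the payload into chunks that each fit in a single message.
  const chunks: { index: number; bytes: Uint8Array }[] = [];
  for (let offset = 0, index = 0; offset < data.length; offset += CHUNK_SIZE, index++) {
    chunks.push({ index, bytes: data.subarray(offset, offset + CHUNK_SIZE) });
  }

  // Upload with bounded parallelism; a real library would add retries and
  // could fan out across multiple boundary nodes.
  let next = 0;
  async function worker(): Promise<void> {
    while (next < chunks.length) {
      const { index, bytes } = chunks[next++];
      await uploadChunk(fileId, index, bytes);
    }
  }
  await Promise.all(Array.from({ length: Math.min(concurrency, chunks.length) }, worker));
  return chunks.length; // the canister can use this count to verify completeness
}
```

If something like this lived in the standard client libraries, the "limit" would only matter to library authors, which is the whole point.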

6 Likes

Maybe not necessary if adding storage space within a subnet is simple enough.

I want to get rid of the subnet abstraction, though: canisters should just exist on the network, have memory and storage, choose their level of security, and communicate with other canisters or the Internet, all of it super secure and very fast.
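Purely as a thought experiment (none of these fields exist in today’s canister settings), this is roughly what "a canister chooses its own replication/security level" could look like if subnets were abstracted away:

```typescript
// Purely hypothetical: none of these fields exist in the current canister
// settings. This only sketches what per-canister security/replication choices
// might look like if subnets were abstracted away.
interface HypotheticalCanisterSettings {
  replicationFactor: number;        // e.g. 1 for a throwaway game lobby, 40+ for a ledger
  geographicSpread: 'single-dc' | 'multi-region' | 'global';
  storageQuotaBytes: bigint;        // pay-as-you-go, no hard subnet ceiling
  maxInstructionsPerCall: bigint;   // bounded only by what the controller will pay for
}

const highSecurityLedger: HypotheticalCanisterSettings = {
  replicationFactor: 40,
  geographicSpread: 'global',
  storageQuotaBytes: 1_000_000_000_000n,       // ~1 TB
  maxInstructionsPerCall: 10_000_000_000_000n,
};
```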

1 Like

I think I’m starting to understand.

Let’s say that ICP develops into a system that appears, from the developer’s perspective, to be infinitely scalable. A dev can run a process as long as they like, store as much as they want, and all that. Now let’s say that in reality there is an army of people at DFINITY or some other org actively managing the network in the background, actively adding and removing nodes in real time to meet the needs of the network so that this appearance of infinite scalability is maintained. In that scenario, would your expectations be met?

Edit: I should’ve added that there would obviously be a (managed) risk that the network could run out of resources in the event of some catastrophe or organizational failure. But 99.999% of the time you as a dev wouldn’t have to worry about that.

1 Like

Sure…from the dev side that’s what I want and I feel like that’s what was promised.

As for how that actually works under the hood, I would hope it would be decentralized, transparent, relatively permissionless, and automated, not one or two orgs manually adding and removing resources.

2 Likes

ICP is pretty good at some kinds of horizontal scaling right now, but vertical scaling of even a single app is very bad. You can’t get many single apps that work just fine on Web2 to work on ICP because of all of these vertical/single-app limits.

So we can have a lot of subnets right now filled with crippled applications, and it seems we can keep scaling that out pretty far.

We need to get rid of the limitations on single applications so that we can actually build real-world solutions.

5 Likes

I think this is a reasonable expectation.

I do wonder though, if you and I were to continue workshopping this, how soon we’d come up with something similar to what is being done today with ICP. Like yourself, I have concerns about the network. But from an architecture standpoint I think DFINITY has developed something that can, theoretically, support the sort of decentralized administration you desire. For me, the biggest risk is the decentralization of the governance.

I do agree.

  1. Instruction limits

My understanding is that this is already in production with Deterministic Time Slicing. Does this not already cover this issue?

Do they just need to increase/remove the limit or is something else required?

  2. Memory limits

Do you mean WASM memory and ease of sharing across multiple canisters?

  3. High latencies

Queries can be cached, but for updates I really can’t see how this can be solved given the need for BFT. The consensus latency is about as low as it can go barring some novel cryptography or completely rearchitecting as rollups. And from my POV subnet replication factors are about as low as acceptable anyway.

What it sounds like you are suggesting, though, is that there is a need for more flexibility to choose very low replication factors for low-risk dapps, and I recall a game dev saying something similar. But at some point that just becomes a single server without any replication or correctness guarantees.

One left-field idea might be to allow pass-through P2P communication so that applications can just work with low latency by exchanging operational transforms/CRDTs, only updating the consensus state to save snapshots or where trust is important.

Thus, for example, a game would proceed with players sharing signed messages, updating their game state locally, and bypassing the IC consensus nodes most of the time, with the IC saving a snapshot at random intervals. In most cases play would happen with very low latency, but in case of dispute you would replay, say, the last 30 seconds of signed messages since the last snapshot and let the canister decide the state.

A variant would have pass-through nodes just notarise the messages as seen, without updating state.
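A rough sketch of the dispute path described above, where `verifySignature` and `applyMove` are placeholders for a real signature check (e.g. against each player’s registered public key) and the game’s own state-transition function; neither is an IC API:

```typescript
// Rough sketch of the dispute path: replay signed moves since the last
// snapshot and let canister-side logic decide the final state.

interface SignedMove {
  player: string;        // player identifier / public key
  sequence: number;      // per-game monotonically increasing counter
  payload: Uint8Array;   // game-specific move data
  signature: Uint8Array;
}

type GameState = { [key: string]: unknown };

type VerifySignature = (move: SignedMove) => boolean;
type ApplyMove = (state: GameState, move: SignedMove) => GameState;

function resolveDispute(
  snapshot: GameState,
  lastSnapshotSequence: number,
  submittedMoves: SignedMove[],
  verifySignature: VerifySignature,
  applyMove: ApplyMove
): GameState {
  // Keep only moves that are newer than the snapshot and correctly signed.
  const validMoves = submittedMoves
    .filter((m) => m.sequence > lastSnapshotSequence && verifySignature(m))
    .sort((a, b) => a.sequence - b.sequence);

  // Deterministically replay them on top of the snapshot; whatever state
  // results is the canister's ruling on the dispute.
  return validMoves.reduce(applyMove, snapshot);
}
```

The key property is determinism: given the same snapshot and the same set of signed moves, every replica computes the same ruling, so the happy path never needs consensus at all.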

  4. Message size limits

I think (from your Twitter thread) you are specifically referring to message payload size here, with reference to file uploads. Chunking is already implemented for canister Wasm uploads, so making that available for general file uploads and providing grants for tooling around it would seem to fix a lot of the issues, except that it would still be slow (a rough canister-side sketch follows the bullets below).

  • Perhaps there is a way to upload and download in parallel using multiple boundary nodes or perhaps specialised large file server nodes.
  • I worry about DoS attacks if the size is increased unless some metering is applied.
  • Perhaps there needs to be a separate pure storage system.
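For the canister side, generalising the Wasm-style chunking approach to arbitrary files might look roughly like this. It is written as plain TypeScript rather than against any particular CDK, and all names are hypothetical:

```typescript
// Sketch of the canister side, not tied to any CDK: chunks are stored keyed
// by (fileId, index) and reassembled once the client reports the upload
// complete. A real canister would also meter cycles per chunk to address the
// DoS concern above.
class ChunkStore {
  private chunks = new Map<string, Map<number, Uint8Array>>();

  putChunk(fileId: string, index: number, bytes: Uint8Array): void {
    const file = this.chunks.get(fileId) ?? new Map<number, Uint8Array>();
    file.set(index, bytes);
    this.chunks.set(fileId, file);
  }

  finalize(fileId: string, expectedChunks: number): Uint8Array {
    const file = this.chunks.get(fileId);
    if (file === undefined || file.size !== expectedChunks) {
      throw new Error(`upload of ${fileId} is incomplete`);
    }
    const total = [...file.values()].reduce((sum, c) => sum + c.length, 0);
    const out = new Uint8Array(total);
    let offset = 0;
    for (let i = 0; i < expectedChunks; i++) {
      const chunk = file.get(i);
      if (chunk === undefined) {
        throw new Error(`missing chunk ${i} of ${fileId}`);
      }
      out.set(chunk, offset);
      offset += chunk.length;
    }
    this.chunks.delete(fileId);
    return out;
  }
}
```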
  5. Storage limits

I think something could be done about making file storage a transparent service across multiple subnets, but again this seems to speak to the need for specialised file storage. I would note, however, that Filecoin and Arweave are both subsidising storage with issuance, so I’m not really convinced their model is sustainable.

Perhaps the play here is not for the IC itself to provide large scale storage but to deeply integrate with existing file storage networks like Filecoin, Arweave and Ethswarm.

BTW, it might be an interesting play for @dfinity to team up with Ethswarm, as they are almost completely overlooked and underused, so they should in theory welcome both funds and collaboration. Though perhaps they are politically too cypherpunk and Ethereum-orientated to accept a deal.

  6. High costs

Elephants in the room here are:

  • Most networks subsidise via issuance. ICP doesn’t, but it has unnecessarily high NNS rewards.
  • Costs have to be multiples due to replication.
  • Subnets are not actually net burners of ICP, so if anything devs are being undercharged given node rewards. (There are potential pricing models which square this circle, but they mean more uncertainty about costs.)
  • The dev-pays model means costs fall on devs, and also that we don’t benefit from MEV burn.
  7. Rigid network architecture (subnets static, canisters unable to choose replication/security with flexibility, can’t move between replication factors, homogeneous hardware required)

This is true, but nothing stops more flexible systems from being built on top of the IC. I think there is potential for incentivised service workers and storage services to be built on top. That this isn’t happening speaks to a cultural problem: the IC is positioned as a one-stop full stack, and this discourages infrastructure investment by parties other than DFINITY.

  8. Centralizing DAO governance (one entity able to gain write access to the provisioned protocol, lack of checks and balances)

This is huge and fundamental and solving it would require not just technical breakthroughs but a change in philosophical direction. It would also mean confronting the moral issues around building an actually uncensorable network.

10 Likes

Here is some additional context on instruction limits that may help clarify things.

Reviewing the DTS thread shows that:

  1. DTS will allow at most another 2.5x increase over the current update instruction limit (for a total of 50 billion instructions); at that point DTS will be at its limit of 500 rounds, and more fundamental changes to the protocol would be needed.
  2. DTS is not required for queries, however:

In another thread, we tried to address another aspect of the issue, which is that queries are currently free. This leads to DDoS vulnerability, network cost asymmetry, and a lack of sustainability. The DeAI community has suggested enabling an optional fee for queries in order to address those shortcomings.

4 Likes

Great thread!

Unless I’ve misunderstood the IC’s vision, this should be the main focus. Otherwise, what’s the point?

I really hope we can get some answers from DFINITY on how they plan to tackle those limitations, even if it’s long term. @lastmjs says he’s in contact with them on a regular basis, so I’m confident they’re aware of those issues, but it’d be great to let the community know what’s being said. The fact that @lastmjs has to come out publicly suggests he might not be listened to as much as he’d hope. Which is fine if DFINITY have their own convictions and priorities. But having some transparency about all this would be awesome.

Again, this is the only thing that matters to me (and many in the community I’m sure):