This forum thread is meant to be a place where devs can post every time they run into the instruction limit. The purpose is to show to DFINITY how important raising that limit is while also providing an opportunity for feedback on improving code/practices to perhaps overcome the limit in extra-protocol ways.
I invite all devs, whenever they hit this limit, to post a few details describing what caused it.
You’ll know you’ve reached the limit when you see something like this: Replica Error: reject code CanisterError, reject message IC0522: Canister bkyz2-fmaaa-aaaaa-qaaaq-cai exceeded the instruction limit for single message execution., error code Some("IC0522")
I have a simple Express app setup in Azle that uses the static middleware to serve static files. This has generally been working very well for simple frontends.
I loaded a ~5.3 MiB audio file into the static directory of the Azle filesystem and tried to play it with an HTML audio element. When the browser tried to load the file with a GET request (GET requests are treated as queries by the Azle Server Canister), the first thing that came back was: Replica Error: reject code CanisterError, reject message IC0522: Canister bkyz2-fmaaa-aaaaa-qaaaq-cai exceeded the instruction limit for single message execution., error code Some("IC0522").
For now it looks like I can work on range requests. I thought they would be automatic, but I might need to do some configuration to ensure that the entire file isn’t returned in one request. Ideally, though, it would still be nice if the entire file could be processed in a single request.
Hey Jordan. Thanks for starting this thread. I am looking forward to seeing more real-world cases.
As we were discussing offline, in some cases hitting the instruction limit is not the root cause but rather a symptom of an underlying performance problem.
Let’s take your example of serving a 5.4MB audio file. That is 5.4 million bytes. Let’s be very generous and say that the program needs 100 instructions for each byte (I would actually expect 10 instructions per byte). This gives us 540 million instructions.
However, the program has hit the limit of 5 billion instructions. That is 10x higher than an already super generous estimate.
I would say let’s investigate what causes the program to use 5B instructions. There seems to be some performance bottleneck. The first step would be to see how many instructions the program needs to complete. You can check that by calling the same function in an update method, since updates have a 20B instruction limit. I would also suggest using a performance counter. If you share the code, I can also take a look when I get some free time.
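As a rough illustration of the performance-counter suggestion, here is a minimal sketch of a wrapper that reads a counter before and after a piece of work. Everything in it is hypothetical: `measureInstructions`, `InstructionCounter`, and `makeFakeCounter` are made-up names, and `readCounter` only stands in for whatever instruction counter your CDK exposes. It is not a confirmed Azle API, and the counter is modelled as a plain `number` to keep the sketch runnable off-chain.

```typescript
// Hypothetical stand-in for the CDK's instruction counter (an assumption,
// not a real Azle API). A real counter would likely return a 64-bit value.
type InstructionCounter = () => number;

interface Measurement<T> {
    result: T;
    instructions: number;
}

// Reads the counter before and after `work` and reports the difference.
function measureInstructions<T>(
    readCounter: InstructionCounter,
    work: () => T
): Measurement<T> {
    const before = readCounter();
    const result = work();
    const after = readCounter();
    return { result, instructions: after - before };
}

// Fake counter so the sketch runs outside a canister:
// every read advances the count by a fixed step.
function makeFakeCounter(step: number): InstructionCounter {
    let ticks = 0;
    return () => {
        ticks += step;
        return ticks;
    };
}

const measured = measureInstructions(makeFakeCounter(1000), () => 'served');
console.log(`${measured.result}: ${measured.instructions} instructions`);
```

Wrapping individual stages (reading the file, building the response, serializing it) in the same way should show which stage dominates the total.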
In this particular case, blindly raising the limit would not do any good for the developer or the users, because executing 5B instructions takes 2 seconds. The server that the developer is writing would therefore have a throughput of 0.5 requests per second per thread.
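The back-of-the-envelope numbers in this post can be checked with a few lines of arithmetic. The figures are the ones quoted above (5.4 million bytes, 100 and 10 instructions per byte, a 5B query limit, 2 seconds per 5B instructions). The variable names are just for illustration.

```typescript
// Figures quoted in the post above.
const fileBytes = 5.4e6;                 // 5.4 million bytes
const generousInstructionsPerByte = 100; // deliberately generous
const expectedInstructionsPerByte = 10;  // the more realistic guess
const queryInstructionLimit = 5e9;       // limit that was hit
const secondsForLimit = 2;               // time to execute 5B instructions

// Estimated cost of serving the file.
const generousEstimate = fileBytes * generousInstructionsPerByte; // 540 million
const expectedEstimate = fileBytes * expectedInstructionsPerByte; // 54 million

// How far the observed usage exceeds the generous estimate (a bit over 9x).
const overshoot = queryInstructionLimit / generousEstimate;

// Throughput if every request burned the full limit.
const requestsPerSecondPerThread = 1 / secondsForLimit; // 0.5

console.log({ generousEstimate, expectedEstimate, overshoot, requestsPerSecondPerThread });
```

Against the more realistic 10 instructions per byte, the gap is closer to 90x, which is why the limit looks like a symptom here rather than the cause.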
And, shameless plug, we’ve developed canbench to help analyze performance bottlenecks such as this. It currently supports Rust, but supporting other languages is quite simple (on the order of days of work, I’d guess).
I reckon some of the IC devs working on LLM inference in a canister would have clear examples of hitting the wasm instruction limit that they could reproduce and share details about.
I am thinking of you @icpp @jeshli @hokosugi and several others who have shared their work in the DeAI Working Group meetings.
Being able to just run code from anywhere on the IC as a distributed app is a panacea for those of us working with the IC. I’ve made the fundamental assumption that it is inherently not possible given the compute pattern the IC uses.
Software written outside the IC makes fundamentally different architectural assumptions. You’ll never be able to run ‘the best’ database out there on the IC because it makes assumptions about linear compute availability and writes its indexers and supporting protocols in a way that requires us to fundamentally rewrite most of the pieces to fit the IC.
Given that most software (outside of known-quantity compute tasks) needs to be rearchitected to fit into the IC paradigm, I’ve been focusing my time on fundamental architectural pieces that make it easier to write general-purpose programs in the round-based architecture that the IC is going to demand (as compute demand will always outstrip consensus restricted by the speed of light and available bandwidth). Building a “New Internet” likely requires stripping down to the wall studs and not bringing the old bad habits with us.
I’d define my base assumption as: There are very few future universes where IC survives another 40 years, but in almost all of them the software running on it is written in a language and set of frameworks ‘shaped like the Internet Computer’ that were purposely architected for the platform.
Given this assumption, there is far more value in steering the thread of time through this small eye of a needle by focusing on the base infrastructure that makes that possible rather than trying to shove the elephant of everything in cargo, npm, pip, and github through it.
(Don’t despair though as we humans have been in the business of needle threading since we’ve been around…I think we can do it.)
Or maybe more applicable to this thread so far: instead of trying to run express, we should be trying to build a web server that cannot violate the cycle limit. This seems easier to do from the ground up than by refactoring the pieces of express that don’t fit the pattern (although we should borrow the good ideas liberally, and AI may 100x reduce the time necessary to do this over the next couple of years).
I’m absolutely ecstatic that teams like Demergent are pushing the envelope and making different assumptions because I’m probably wrong and I want to keep working on the IC even if it surpasses my best guesses.
Thanks for the comment and that little part at the end haha!
I fundamentally disagree with this point of view as a guiding principle, and I want to see the IC upgrade itself to handle the world’s software mostly as it already exists, otherwise it will fail in its loftiest visions, which are to run the world’s software.
The amount of pain and development time and learning curve slope required to rewrite everything for the IC is IMO a non-starter or just a very bad problem to have.
I will continue to push for the IC to do what it was promised to do essentially from the beginning, and that is to be an unbounded virtual machine, that’s the vision I was sold and that I signed up for.
100% agree.
The significant blocking barrier of having to rewrite everything with a new paradigm, along with the time and monetary investment it’d require, is on its own a major impediment for widespread adoption.
But even if it weren’t the case, imho there is no way the IC can realistically even come close to running the world’s software with its current limitations.
The scaling solutions offered still aren’t enough to provide for all use cases and even in the scenarios where they might be, the rate at which the network would need to scale in terms of nodes is unsustainable.
What do you think about the notion that the current software stack has become increasingly complex due to several decades of incremental improvements and integration of many components and platforms and the idea to have another “swing at the ball” with a system like the IC?
On topic:
Every time I reached the limit, I resolved it by making the code more efficient.
Hmm…well I think it is that, but the opcodes don’t necessarily line up with what runs on an AWS server or your laptop. The biggest issue is solving consensus. If you want consensus on your updates, you’re bound by the speed of light. If you don’t want consensus, then you can just run standard app and db servers behind a load balancer. Unless I’m mistaken about time-slicing, you’re currently blocking all other threads if you just run a linear process. Even if you do 32x time-slicing, you block updates on that canister for the duration. I don’t know if parallel time-slicing is even on the table. Of course, this all begs the question of whether consensus is necessary with zk stuff, and maybe (hopefully) there are some answers in there.
“The amount of pain and development time and learning curve slope required to rewrite everything for the IC is IMO a non-starter or just a very bad problem to have.”
A year ago I was a lot more pessimistic about this proposition than I am today. There is likely an AI solution here. “Break this module compute into committable chunks that take up no more than 1/10 of a processing round given what you know about the Internet Computer”
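To make the “committable chunks” idea concrete, here is a minimal sketch of a resumable computation that stops before it exceeds a per-step budget and hands back a cursor to continue from. All of it is assumed for illustration: `processChunk`, `ChunkState`, the flat per-item cost, and the budget values are made up, and on the IC each step would presumably run as its own message with the state persisted in between.

```typescript
// State that survives between steps: where we are and what we have so far.
interface ChunkState {
    cursor: number;
    total: number;
}

// Processes as many items as fit in `budget`, assuming a flat, hypothetical
// `costPerItem`, then returns the state to resume from.
function processChunk(
    items: number[],
    state: ChunkState,
    budget: number,
    costPerItem: number
): ChunkState {
    if (costPerItem > budget) {
        throw new Error('budget too small to make progress');
    }
    let { cursor, total } = state;
    let spent = 0;
    while (cursor < items.length && spent + costPerItem <= budget) {
        total += items[cursor];
        cursor += 1;
        spent += costPerItem;
    }
    return { cursor, total };
}

// Off-chain driver: each loop iteration stands in for one separate message.
const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let state: ChunkState = { cursor: 0, total: 0 };
let steps = 0;
while (state.cursor < items.length) {
    state = processChunk(items, state, 300, 100); // 3 items fit per step
    steps += 1;
}
console.log({ steps, total: state.total });
```

The hard part in real code is the one this sketch glosses over: estimating the cost of each unit of work well enough to stop in time.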
Okay in this case I have mostly overcome the issue. I’ve since implemented range requests and done some optimization of how Azle handles responses, and now I can handle ranges up to 3 MiB (which is strange because I thought the message limit was 2 MiB, maybe the response limit is a bit higher?), all from within a query call.
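For anyone wondering what capping a range request might look like, here is a minimal sketch of parsing a single-range `Range` header and clamping it to a maximum chunk size. It is not Azle’s actual implementation. `resolveRange`, `contentRange`, and `MAX_CHUNK_BYTES` are hypothetical names, the 3 MiB cap is taken from the figure mentioned above, and multi-range headers are simply rejected.

```typescript
const MAX_CHUNK_BYTES = 3 * 1024 * 1024; // 3 MiB, the figure mentioned above

// Inclusive byte range, as used in Content-Range.
interface ByteRange {
    start: number;
    end: number;
}

// Resolves a header such as "bytes=0-" or "bytes=-500" against the file size,
// then shortens the range so it never exceeds `maxChunk` bytes.
// Returns null for anything unparseable or unsatisfiable.
function resolveRange(header: string, fileSize: number, maxChunk: number): ByteRange | null {
    const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
    if (match === null || fileSize <= 0) return null;
    const rawStart = match[1];
    const rawEnd = match[2];
    if (rawStart === '' && rawEnd === '') return null;

    let start: number;
    let end: number;
    if (rawStart === '') {
        // Suffix form: the last N bytes of the file.
        const suffixLength = Number(rawEnd);
        if (suffixLength === 0) return null;
        start = Math.max(fileSize - suffixLength, 0);
        end = fileSize - 1;
    } else {
        start = Number(rawStart);
        end = rawEnd === '' ? fileSize - 1 : Math.min(Number(rawEnd), fileSize - 1);
    }
    if (start >= fileSize || start > end) return null;

    // Serve at most `maxChunk` bytes; the client can ask for the rest next.
    end = Math.min(end, start + maxChunk - 1);
    return { start, end };
}

// Value for the Content-Range header of a 206 response.
function contentRange(range: ByteRange, fileSize: number): string {
    return `bytes ${range.start}-${range.end}/${fileSize}`;
}

const fileSize = 5557453; // made-up exact size, roughly 5.3 MiB
const firstChunk = resolveRange('bytes=0-', fileSize, MAX_CHUNK_BYTES);
if (firstChunk !== null) {
    console.log(contentRange(firstChunk, fileSize)); // bytes 0-3145727/5557453
}
```

The client reads `Content-Range` to see which bytes it actually received and requests the next range itself, so a file of this size would arrive in two requests.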
I encountered the same issue in the rgbonic project code: the IC0522 error occurs when importing an RGB20 asset into Stock.
How can I increase the instruction limit to address this? Since Stock is an initialization operation and cannot be split up, does anyone have any suggestions? Thank you.
The code: rgbonic/actors/rgb/src/lib.rs at main · lshoo/rgbonic · GitHub
A ~600 KiB file served through an Express GET request (a query method under the hood) using the static middleware in Azle is causing the instruction limit to be reached for some reason. I haven’t tracked down why yet.
It seems like this is actually just a relic of the issue referenced earlier in this thread, which we had already resolved. We just haven’t released the new version of Azle that includes the fix yet, so someone hit it in the wild. We should be good, but I will update if it’s still a problem.