Technical Working Group: Scalability & Performance

I’m not an expert on load balancing in the boundary nodes. I’ll ping the relevant team to answer this.

Yes, queries are limited to 5B instructions, so they will run quickly. In an edge case it might still be possible for them to run against an old state, though. Imagine a node which starts a query but then stalls for a while: the subnet could still progress, and the node could eventually return a response computed against the older state.
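One way to make that failure mode observable is to expose the state time from the canister itself. Below is a minimal sketch of a hypothetical Rust canister method (using ic-cdk; the method name and the counter field are made up) that returns the state time alongside the data, so a client can detect a response computed against an old state and retry.

```rust
// Hypothetical canister query (Rust, ic-cdk) that lets clients spot stale
// responses: it returns the state time next to the data, and the client can
// compare it against wall-clock time and retry if the gap is too large.

#[ic_cdk::query]
fn get_counter_with_time() -> (u64, u64) {
    // ic_cdk::api::time() returns the time of the state the query executes
    // against, in nanoseconds since the Unix epoch.
    let state_time_ns = ic_cdk::api::time();
    let counter = 42; // placeholder for real canister state
    (counter, state_time_ns)
}
```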

Well, we currently run 4 queries at a time on each node, so a single canister could have 4 queries and an update running against it at the same time. But that would also mean no other canisters are running queries. In general, we can have 4 updates and 4 queries running at the same time, and the system can handle that. In terms of memory usage, I don’t think it makes a difference whether these executions run against the same canister or different ones.
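As an illustration of that kind of cap (not the actual replica code), here is a small Rust sketch that bounds concurrent query execution at 4 using a semaphore. The use of tokio and all the names are assumptions made for the example.

```rust
// Illustrative sketch: cap concurrent query execution at 4 per node with a
// semaphore. A 5th query waits until one of the 4 running queries finishes.
use std::sync::Arc;
use tokio::sync::Semaphore;

const MAX_CONCURRENT_QUERIES: usize = 4;

async fn execute_query(permits: Arc<Semaphore>, query_id: u64) {
    // Blocks (asynchronously) while 4 queries are already in flight.
    let _permit = permits.acquire().await.expect("semaphore closed");
    println!("executing query {query_id}");
    // ... run the query against the (possibly shared) canister state ...
}

#[tokio::main]
async fn main() {
    let permits = Arc::new(Semaphore::new(MAX_CONCURRENT_QUERIES));
    let handles: Vec<_> = (0..10)
        .map(|id| tokio::spawn(execute_query(permits.clone(), id)))
        .collect();
    for h in handles {
        h.await.unwrap();
    }
}
```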

We know that most use cases wouldn’t want to read the entire memory on every message, but we need to ensure that a few buggy canisters can’t take down a subnet. There’s no limit on the memory a sandbox can use, but we need to prevent canisters from using up all the RAM on the node.
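One common way to enforce such a node-wide cap is admission control on memory growth. The sketch below is a hypothetical illustration, not the replica’s implementation; the cap value and function names are made up. A sandbox may reserve more memory only while total usage across all sandboxes stays under the node-wide limit.

```rust
// Illustrative sketch of node-wide memory admission control: any single
// sandbox may grow, but only while total usage stays under a global cap,
// so one buggy canister cannot exhaust the node's RAM.
use std::sync::atomic::{AtomicUsize, Ordering};

const NODE_MEMORY_CAP: usize = 512 * 1024 * 1024 * 1024; // hypothetical cap

static TOTAL_SANDBOX_MEMORY: AtomicUsize = AtomicUsize::new(0);

/// Try to reserve `bytes` for a sandbox; deny the request rather than let
/// total sandbox memory exceed the node-wide cap.
fn try_reserve(bytes: usize) -> bool {
    let mut current = TOTAL_SANDBOX_MEMORY.load(Ordering::Relaxed);
    loop {
        let Some(new_total) = current.checked_add(bytes) else { return false };
        if new_total > NODE_MEMORY_CAP {
            return false;
        }
        match TOTAL_SANDBOX_MEMORY.compare_exchange_weak(
            current, new_total, Ordering::Relaxed, Ordering::Relaxed,
        ) {
            Ok(_) => return true,
            Err(observed) => current = observed, // another sandbox grew; retry
        }
    }
}

fn main() {
    assert!(try_reserve(4 * 1024 * 1024 * 1024)); // 4 GiB sandbox: fine
    assert!(!try_reserve(usize::MAX / 2));        // absurd request: denied
}
```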

  1. You mentioned that query throughput increases with the number of nodes in a subnet. However, in practice on mainnet, are the boundary nodes aware of how the ingress load is distributed among the nodes, and do they then automatically distribute the load evenly across the subnet? Or would the boundary nodes still route requests to the closest node regardless of load? In that case throughput would increase with the number of nodes if requests were perfectly geographically distributed, but realistically it wouldn’t make a difference if all the requests were coming from a single location, say New York City.

Hey @icme,

Rüdiger from the boundary node team here.

The boundary nodes constantly health-check all replica nodes in a subnet. For each request, the boundary node picks one random healthy node and forwards the request to it. So, the load should be quite evenly distributed.

The natural follow-up is “why a random node and not something smarter, like the closest node?”. Routing to the closest node could make for a much better user experience, as latencies would be lower and more stable. However, since any node could be malicious, random routing ensures that with retries you will eventually reach an honest node.

There is another interesting point: if the request to the picked node fails (e.g., the connection times out or the response is broken), the request is retried up to 2 times against different nodes. This helps hide some issues from the user (e.g., a node became unhealthy, but our health check hasn’t discovered it yet).
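Putting the pieces together, the routing described above amounts to something like the following Rust sketch. The `NodeId` type and the `forward` function are placeholders, and the use of the rand crate (0.8-style API) is an assumption; the real boundary node logic is of course more involved.

```rust
// Back-of-the-envelope sketch of the routing described above: pick a random
// healthy replica, forward the request, and on failure retry up to 2 more
// times against different nodes.
use rand::seq::SliceRandom;

#[derive(Clone, Copy, Debug)]
struct NodeId(u32);

fn forward(node: NodeId) -> Result<String, String> {
    // Placeholder for the actual HTTP forwarding; here every node succeeds.
    Ok(format!("response from node {:?}", node))
}

fn route(healthy_nodes: &[NodeId]) -> Result<String, String> {
    let mut rng = rand::thread_rng();
    // Shuffle so the original attempt and each retry hit a different,
    // uniformly chosen node.
    let mut candidates: Vec<NodeId> = healthy_nodes.to_vec();
    candidates.shuffle(&mut rng);

    let mut last_err = "no healthy nodes".to_string();
    for node in candidates.iter().take(3) { // 1 attempt + up to 2 retries
        match forward(*node) {
            Ok(resp) => return Ok(resp),
            Err(e) => last_err = e, // e.g., timeout or broken response
        }
    }
    Err(last_err)
}

fn main() {
    let nodes: Vec<NodeId> = (0..13).map(NodeId).collect();
    println!("{:?}", route(&nodes));
}
```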

I hope this helps answer your question.


The “scalability tests for ICGPT” is a valuable analysis of IC performance (at least in the domain of AI applications). I’m not sure what generalizations can be drawn from it, or how it could be cross-referenced with other applications that have burdened subnets, such as BOB, for a more complete picture.

I also wonder what the best and easiest place to gather this data is. Is there an API to examine subnet traffic, or do I just need to build network performance statistics into my canister logs?

Additionally, this gets into the priority fees that I have seen you discuss in the past, which are found in all blockchain environments. So it is more than just a question of technical feasibility; it is also an economic one when other dapps are competing for the queue.


Unfortunately, no. Particularly on heavily backlogged subnets we often have replicas that are a couple of rounds behind the certified height. (Remember that we only need 2f+1 out of 3f+1 replicas to certify a state.) So if you happen to query one of the f slowest replicas, your query may run against an older state.

That being said, this mostly tends to happen when the subnet is under persistent heavy load for extended periods of time, where tiny differences in replica hardware, configuration, or network latency accumulate over time, resulting in a more or less fixed “running order” (like in a marathon), with a more or less fixed set of f slowest replicas behind the certified height. Otherwise, this can also happen momentarily for one replica or another during a network glitch, but you’re less likely to hit that one replica during the brief glitch.
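For a sense of scale, here is a quick back-of-the-envelope calculation: with n = 3f + 1 replicas and the f slowest lagging behind the certified height, a uniformly routed query hits a lagging replica with probability f/n, which approaches 1/3 on large subnets. The subnet sizes below are just examples.

```rust
// Worst-case odds of querying a lagging replica: f out of n = 3f + 1
// replicas may be behind the certified height, so a uniformly routed
// query is stale with probability f / n.
fn main() {
    for f in [1u32, 4, 10] {
        let n = 3 * f + 1;
        let p = f as f64 / n as f64;
        println!("n = {n:2} (f = {f:2}): P(stale query) = {:.1}%", 100.0 * p);
    }
}
```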


Hi everyone,
We’ll have a meeting tomorrow where we’ll discuss some of our recent optimizations to support higher throughput on a single subnet. Note that we have a new Zoom link.

Thursday, February 20th at 5:00 pm CET: Zoom link

The recording and slides from last week have been added here. Let me know if you have any issues accessing the recording.
