Increasing Ingress Message Throughput

TL;DR: As announced in Hashed Block Payloads - #3 by lastmjs, DFINITY has been working on an approach that increases consensus throughput by putting references to ingress messages in blocks instead of the full messages, without introducing security holes.
We've now completed the implementation of this feature, and DFINITY will propose enabling it on a couple of subnets.

Background
When a node creates a block proposal, it currently includes the full ingress messages that users submitted to make replicated canister calls. Nodes receive these messages either directly from boundary nodes or from peer nodes in their subnet, and store them in their ingress pool. Thus, the block proposals a node sends to its peers contain the full ingress messages, even though the peers already have most of those messages in their own pools. This is a waste of time and bandwidth.
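
To put rough numbers on that overhead, here is a back-of-the-envelope sketch in Python comparing the ingress payload of one proposal when it carries full messages versus one fixed-size reference per message. Every number and name in it (`HASH_BYTES`, `payload_bytes`, the message count and sizes) is an illustrative assumption, not an IC protocol constant or a measurement.

```python
# All values are hypothetical and chosen only to illustrate the arithmetic.
HASH_BYTES = 32  # assumed size of one message reference (e.g. a 256-bit digest)


def payload_bytes(message_sizes, hashed):
    """Bytes of ingress payload carried by a single block proposal."""
    if hashed:
        # One fixed-size reference per message, regardless of message size.
        return HASH_BYTES * len(message_sizes)
    # Full messages: the proposal carries every byte of every message.
    return sum(message_sizes)


sizes = [1_000] * 100  # hypothetical: 100 ingress messages of 1,000 bytes each
full = payload_bytes(sizes, hashed=False)
refs = payload_bytes(sizes, hashed=True)
print(f"full messages: {full} bytes, references only: {refs} bytes")
```

The saving grows with message size, since the reference stays the same size no matter how large the message is.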

Proposed design
To mitigate this, the proposed implementation includes only the hashes of ingress messages in the blocks. Under good conditions, since all the ingress messages are broadcast anyway, replicas should be able to reconstruct the block proposal by looking up each ingress message in their own ingress pool by its hash. If a node is missing some of the messages it needs to validate and/or execute a proposal, it can request them from the peers that advertised the proposal.

To minimize changes to the overall protocol, we created a new component between P2P and Consensus that is responsible for stripping ingress messages from proposals on the sender side and adding them back on the receiver side. The remaining Consensus and P2P logic is largely unchanged.
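
As a rough illustration of the strip-and-reassemble idea, here is a minimal Python sketch. It is not the replica's actual code: the names `message_id`, `StrippedProposal`, `strip`, `reassemble` and `fetch_from_peer` are hypothetical, the ingress pool is modeled as a plain dict, and SHA-256 over the raw bytes stands in for however message IDs are really computed. Checking fetched messages against the requested hash is also an assumption about how a receiver could guard against a peer returning the wrong bytes.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


def message_id(msg: bytes) -> bytes:
    """Stand-in for a message ID: SHA-256 of the raw bytes (assumption)."""
    return hashlib.sha256(msg).digest()


@dataclass
class StrippedProposal:
    """Hypothetical proposal that carries only ingress message hashes."""
    ingress_ids: list[bytes]


def strip(ingress_messages: list[bytes]) -> StrippedProposal:
    """Sender side: replace each full ingress message with its hash."""
    return StrippedProposal([message_id(m) for m in ingress_messages])


def reassemble(
    stripped: StrippedProposal,
    ingress_pool: dict[bytes, bytes],
    fetch_from_peer: Callable[[bytes], bytes],
) -> list[bytes]:
    """Receiver side: rebuild the full message list, in proposal order."""
    messages = []
    for mid in stripped.ingress_ids:
        msg = ingress_pool.get(mid)
        if msg is None:
            # Not in the local pool: ask a peer that advertised the proposal.
            msg = fetch_from_peer(mid)
            if message_id(msg) != mid:
                raise ValueError("fetched message does not match requested hash")
            ingress_pool[mid] = msg
        messages.append(msg)
    return messages


# Usage: the receiver already holds two of three messages and fetches the third.
msgs = [b"call-a", b"call-b", b"call-c"]
stripped = strip(msgs)
pool = {message_id(m): m for m in msgs[:2]}
peer_store = {message_id(m): m for m in msgs}
requested = []


def fetch(mid: bytes) -> bytes:
    requested.append(mid)
    return peer_store[mid]


rebuilt = reassemble(stripped, pool, fetch)
```

In the common case every lookup hits the local pool and no extra round trip is needed; only the messages a node has not yet seen cost an additional request.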

Proposed rollout plan
We want to provide value quickly with as simple an implementation as possible and without introducing new risks. This is also why we propose to use this new approach on ingress messages only in the beginning. Moreover, we suggest keeping the same message/block limits at first, and exploring larger messages later.
According to our measurements, this alone should already increase throughput capacity significantly. In experiments with 13-node subnets, under some artificially imposed network conditions, we see an improvement from ~2 MBps to ~6 MBps.

DFINITY plans to submit proposals to gradually roll out this feature over the coming weeks, and use this forum thread to keep everybody informed of the progress. We also plan to collect throughput and latency metrics, so hopefully we’ll see these numbers improve soon.

We are working on a blog post with more details, which we'll share in the next couple of weeks.

Thanks,
Kamil


Voting is now open for the IC release with the feature enabled: https://dashboard.internetcomputer.org/proposal/133799


This is amazing news!

I have a couple questions:

  1. Just to be sure, you’re saying that just by implementing this feature now with the same message and block constraints, you expect to see an increase in throughput?
  2. You plan to gradually increase the actual message size limit for blocks? As in the 2 MB incoming message size limit? What about the 3 MB outgoing message size limit?
  3. What do you expect the final message size limit for blocks to be? And can you anticipate more or less how long it will take to get there?
  4. In its end state does this solution essentially remove message size limits on ICP?

  • Just to be sure, you’re saying that just by implementing this feature now with the same message and block constraints, you expect to see an increase in throughput?

Yes! We achieve the improvements Kamil mentions above without adjusting message and block constraints.

  • You plan to gradually increase the actual message size limit for blocks? As in the 2 MB incoming message size limit? What about the 3 MB outgoing message size limit?

The changes from this feature are a first step towards increasing the block size (some improvements in other areas will be needed in addition).
By "outgoing message size limit", do you mean what you can obtain with read_state /request_status/<request_id>?

  • What do you expect the final message size limit for blocks to be? And can you anticipate more or less how long it will take to get there?

I hope the protocol will keep increasing things like that, without a final limit, thanks to protocol and HW improvements 🙂
The speed will depend on prioritization of this vs other demands.
Sorry for not being able to give a more concrete answer right now…


Not sure it’s that…it’s whatever was discussed here: Message size limit of 3145728 bytes?

Looks like it’s just the query response size.

This improvement is strictly about inbound messages to a subnet (i.e. what normally goes into a block: ingress messages, canister messages from other subnets, responses to HTTP outcalls, etc.). So it wouldn’t affect query response sizes.

OTOH, the sizes of individual messages may increase in the future, partly because of this. But the primary goal for the near future is to increase inbound bandwidth, not message size limits. There are other reasons why we may not want larger messages: they take up more memory, they make it harder to balance load, they imply higher latencies for nodes that don’t have the message and have to retrieve it before they can make progress, etc.


Excited to see it being rolled out! Great job Kamil, Yvonne-Anne and many others! 🤗
