Scalable Messaging Model

The NNS motion proposal is live, and voting is open for the next 4 days.

I agree with this change and endorse it.

I come from a Web2 background with a lot of integration work, and much of the “pain” came from handling all the possible errors and edge cases. We could never “prepare well enough”, so it was crucial to have proper alerting and logging systems.

For the “best effort” system to be a success, we definitely need “canister logging on traps” and proper handling of these failures (sleep + number of retries, as mentioned).

Please kindly prioritise these before the release of wide changes in networking. :pray:
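
To make the “sleep + retry” idea concrete, here is a minimal sketch of a bounded retry with exponential backoff. Everything in it is hypothetical and for illustration only: `CallError`, `backoff_delay_ms`, and `retry_with_backoff` are not part of any CDK or system API, and the wait step is passed in as a closure because how a canister actually waits (for example via a timer) is left open here. It also assumes the retried operation is idempotent, since a call with a best-effort response may have been executed even though no response arrived.

```rust
/// Hypothetical error classification for one attempt (not a real CDK type).
#[derive(Debug, PartialEq)]
pub enum CallError {
    /// Transient failure such as a timeout; retrying is only safe for idempotent calls.
    Transient,
    /// Permanent failure; retrying will not help.
    Permanent,
}

/// Delay before the next attempt: base_ms * 2^attempt, capped at max_ms.
pub fn backoff_delay_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    // checked_shl returns None for shifts of 64 bits or more; saturate in that case.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    base_ms.saturating_mul(factor).min(max_ms)
}

/// Runs `op` until it succeeds, fails permanently, or `max_attempts` is reached.
/// At least one attempt is always made. `wait` receives the delay in milliseconds.
pub fn retry_with_backoff<T>(
    max_attempts: u32,
    base_ms: u64,
    max_ms: u64,
    mut op: impl FnMut() -> Result<T, CallError>,
    mut wait: impl FnMut(u64),
) -> Result<T, CallError> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(CallError::Permanent) => return Err(CallError::Permanent),
            Err(CallError::Transient) => {
                attempt += 1;
                if attempt >= max_attempts {
                    // Retry budget exhausted: surface the last transient error.
                    return Err(CallError::Transient);
                }
                wait(backoff_delay_ms(attempt - 1, base_ms, max_ms));
            }
        }
    }
}
```

Capping both the number of attempts and the delay keeps the worst-case time spent on one logical operation bounded, which also makes it easier to alert on.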

There is already work in progress to preserve and expose logs, with explicit coverage for traps.

For alerting, you can expose standard Prometheus metrics via an HTTP endpoint (e.g. here are the NNS governance canister metrics). The one thing to be careful about is to attach an explicit timestamp to every sample, so that if you hit a replica that is significantly behind the rest of the subnet, you get a gap instead of an out-of-order sample.
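
As a rough illustration of the timestamp point, here is a sketch that encodes one gauge sample in the Prometheus text exposition format, where the optional trailing field is the sample timestamp in milliseconds since the Unix epoch. The `Sample` struct, the function names, and the metric name in the usage note are made up for this example, and the sketch assumes the help text contains no newlines or backslashes (which would need escaping). It also assumes the canister reads its time in nanoseconds, as the IC system time is reported, and converts it to milliseconds.

```rust
/// One gauge sample. `timestamp_ms` is milliseconds since the Unix epoch.
/// Hypothetical type for this sketch, not taken from any existing canister.
pub struct Sample<'a> {
    pub name: &'a str,
    pub help: &'a str,
    pub value: f64,
    pub timestamp_ms: u64,
}

/// Converts a nanosecond timestamp to the milliseconds Prometheus expects.
pub fn nanos_to_millis(nanos: u64) -> u64 {
    nanos / 1_000_000
}

/// Encodes the sample as `# HELP`, `# TYPE`, then `<name> <value> <timestamp_ms>`.
/// The explicit timestamp is what lets a lagging replica produce a gap
/// rather than a sample that appears to go back in time.
pub fn encode_gauge(sample: &Sample) -> String {
    format!(
        "# HELP {name} {help}\n# TYPE {name} gauge\n{name} {value} {ts}\n",
        name = sample.name,
        help = sample.help,
        value = sample.value,
        ts = sample.timestamp_ms,
    )
}
```

For a hypothetical `queue_len` gauge with value 42 taken at 1700000000000 ms, the sample line would read `queue_len 42 1700000000000`.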

@here - if you’re interested in this topic, we’ll have a presentation and discussion about it this Thursday.

Hello everybody,

It has been quite a while since our last update on the new messaging model, so here is a quick update on the ongoing implementation of messages with best-effort responses.

  1. The system API changes required to expose the new message type to canisters are done and hidden behind a feature flag.
  2. On top of this, we plan to expose the feature in CDKs behind a similar feature flag, so that developers can give early feedback before the feature is available on mainnet.
  3. The core changes to support best-effort responses are also progressing well. This includes (quite fundamental) changes to canister queues and other related data structures to support the new message types. The strategy is to develop data structures that are functionally equivalent to the current canister queues but also support the new message types. They exist in parallel to the current ones but remain unused until everything is sufficiently tested, at which point there will be a switch from the old ones to the new ones. This way, changes can land on master gradually, so watch for queue-related changes if you want to follow the progress.

Finally, note that 2 is not blocked by 3, so if everything goes according to plan, canister devs will be able to prepare their canisters and provide feedback even before the feature implementation is fully done.
