We hope this message finds you well. We would like to share a proposal aimed at providing trustworthy metrics directly on the Internet Computer. Your feedback is invaluable as we consider implementing these changes.
The core idea is to introduce Trustworthy Node Metrics that provide greater visibility into node performance. These metrics could be used to influence node rewards in the future.
- The Consensus layer will now expose information on which nodes have succeeded or failed in the role of block makers. This data will be sent to the Message Routing layer via an extension to the existing Batch data structure. Notably, no special handling is required for empty blocks.
- The Message Routing (MR) layer will integrate this information into the replicated state. Specifically, it will map each node ID to a counter of successes and failures over the last 60 days. A double-ended queue will be used for efficient storage, so that new entries are appended at one end and expired entries are dropped from the other. Importantly, the data will be queryable within specific date ranges.
- A new function will be introduced for fetching these metrics. This function will allow users to pull data from the replicated state across different subnets. Moreover, users will have the ability to specify the date range for the metrics they wish to query.
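To make the storage idea above more concrete, here is a minimal sketch of how per-node block maker counters could be kept in a double-ended queue with a 60-day retention window and queried by date range. It is only an illustration under stated assumptions, not the draft specification: the names `NodeMetricsHistory`, `BlockmakerStats`, `record` and `query` are hypothetical, node IDs are simplified to `u64`, and entries are assumed to be bucketed per day (counted in days since the Unix epoch).

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical stand-in for a node ID; simplified to an integer here.
type NodeId = u64;

/// Retention window from the proposal: 60 days.
const RETENTION_DAYS: u64 = 60;

/// Success and failure counters for one node in the block maker role.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct BlockmakerStats {
    succeeded: u64,
    failed: u64,
}

/// Counters for all nodes for one day (assumed granularity).
struct DailyEntry {
    day: u64, // days since the Unix epoch
    per_node: BTreeMap<NodeId, BlockmakerStats>,
}

/// History kept in the replicated state: oldest day at the front,
/// newest day at the back.
#[derive(Default)]
struct NodeMetricsHistory {
    days: VecDeque<DailyEntry>,
}

impl NodeMetricsHistory {
    /// Record one batch: the node that made the block and the nodes that
    /// failed to do so. Assumes batches arrive in non-decreasing day order.
    fn record(&mut self, day: u64, succeeded: NodeId, failed: &[NodeId]) {
        // Start a new daily entry if this batch belongs to a new day.
        if self.days.back().map_or(true, |e| e.day != day) {
            self.days.push_back(DailyEntry { day, per_node: BTreeMap::new() });
        }
        let entry = self.days.back_mut().expect("entry was just ensured");
        entry.per_node.entry(succeeded).or_default().succeeded += 1;
        for node in failed {
            entry.per_node.entry(*node).or_default().failed += 1;
        }
        // Drop entries that have fallen out of the retention window.
        while self.days.front().map_or(false, |e| e.day + RETENTION_DAYS <= day) {
            self.days.pop_front();
        }
    }

    /// Sum the counters per node over the inclusive range [from_day, to_day].
    fn query(&self, from_day: u64, to_day: u64) -> BTreeMap<NodeId, BlockmakerStats> {
        let mut totals: BTreeMap<NodeId, BlockmakerStats> = BTreeMap::new();
        for e in self.days.iter().filter(|e| e.day >= from_day && e.day <= to_day) {
            for (node, stats) in &e.per_node {
                let t = totals.entry(*node).or_default();
                t.succeeded += stats.succeeded;
                t.failed += stats.failed;
            }
        }
        totals
    }
}
```

In this sketch, appending at the back and expiring from the front are both cheap, which is the reason a double-ended queue suits a sliding retention window. A caller would invoke `record` once per batch and `query(from_day, to_day)` to aggregate a date range.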
A draft specification for these changes is available for review here.
- A future feature could adjust node rewards based on these metrics, so that misbehaving nodes are penalized. The current proposal, however, will not cause any changes in the node rewards.
- Details on changes in the replicated state will be clarified during the implementation phase.
- Open-source (CLI) tools will be developed to facilitate in-depth analysis of these metrics.
To prevent DoS attacks, a nominal fee will be charged for each metrics retrieval operation, which will be processed through the wallet canister. We would like to hear your thoughts on this matter.
Besides the CLI tools, follow-up work will expose these metrics on the Public Dashboard and on a canister, which will be open to the public for querying.
- The addition of these metrics will require approximately 50 KB of extra storage per subnet, which we consider minimal.
We are particularly interested in your thoughts on the proposed architectural changes and the proposed charges for metrics retrieval.
Once we’ve gathered your feedback and received approval, we’ll start implementing these changes.