Trustworthy Node Metrics for useful work

sat · September 13, 2023, 9:52am

Greetings, Internet Computer Community and Node Providers,

We hope this message finds you well. We would like to share a proposal aimed at providing trustworthy metrics directly on the Internet Computer. Your feedback is invaluable as we consider implementing these changes.

Proposal Overview🎯

The core idea is to introduce Trustworthy Node Metrics that provide greater visibility into node performance. These metrics could potentially be used to influence node rewards in the future.

Architectural Changes

Consensus Layer:

The Consensus layer will now expose information on which nodes have succeeded or failed in the role of block makers. This data will be sent to the Message Routing layer via an extension to the existing Batch data structure. Notably, no special handling is required for empty blocks.
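To make the Batch extension concrete, here is a minimal sketch of what such a record could look like. Everything in it is an assumption for illustration: the names `BlockmakerMetrics`, `Batch`, `make_batch`, and their fields are hypothetical, and so is the rule that nodes ranked ahead of the successful block maker count as failures. The draft specification is the authority on the real structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class BlockmakerMetrics:
    """Hypothetical record: who made the block and who failed to."""
    blockmaker: str
    failed_blockmakers: List[str] = field(default_factory=list)


@dataclass
class Batch:
    """Hypothetical stand-in for the Batch passed from Consensus to Message Routing."""
    height: int
    messages: List[bytes]
    blockmaker_metrics: BlockmakerMetrics


def make_batch(height, messages, ranked_nodes, successful_rank):
    """Build a batch; nodes ranked before the successful one are assumed to have failed."""
    return Batch(
        height=height,
        messages=list(messages),
        blockmaker_metrics=BlockmakerMetrics(
            blockmaker=ranked_nodes[successful_rank],
            failed_blockmakers=list(ranked_nodes[:successful_rank]),
        ),
    )


# An empty block (no messages) goes through the same path as any other block.
batch = make_batch(7, [], ["node-a", "node-b", "node-c"], successful_rank=1)
print(batch.blockmaker_metrics)
```

The point of the sketch is only that the metrics travel alongside the batch, so Message Routing needs no separate channel to learn about block maker outcomes.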

Message Routing Layer:

The MR layer will integrate this information into the replicated state. Specifically, it will map each node ID to a counter of successes and failures over the last 60 days. A double-ended queue will be used for efficient storage. Importantly, the data will be queryable within specific date ranges.
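As a rough illustration of the storage idea, the sketch below keeps one entry per day in a double-ended queue, drops entries older than 60 days from the front, and answers range queries. The class and method names (`NodeMetricsHistory`, `record`, `query`), the per-day granularity, and the integer day index are assumptions made for this example, not the actual replicated-state layout.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple

RETENTION_DAYS = 60  # retention window stated in the proposal


@dataclass
class DailyCounters:
    day: int  # hypothetical integer day index
    # node_id -> [successes, failures]
    per_node: Dict[str, List[int]] = field(default_factory=dict)


class NodeMetricsHistory:
    """Hypothetical per-subnet history; days must be recorded in non-decreasing order."""

    def __init__(self) -> None:
        self._days: Deque[DailyCounters] = deque()

    def record(self, day: int, node_id: str, succeeded: bool) -> None:
        # Open a new entry at the back when a new day starts.
        if not self._days or self._days[-1].day != day:
            self._days.append(DailyCounters(day))
        # Expire entries that fell out of the retention window (cheap at the front).
        while self._days and self._days[0].day <= day - RETENTION_DAYS:
            self._days.popleft()
        counters = self._days[-1].per_node.setdefault(node_id, [0, 0])
        counters[0 if succeeded else 1] += 1

    def query(self, first_day: int, last_day: int) -> Dict[str, Tuple[int, int]]:
        """Sum (successes, failures) per node over the inclusive day range."""
        totals: Dict[str, List[int]] = {}
        for entry in self._days:
            if first_day <= entry.day <= last_day:
                for node_id, (ok, failed) in entry.per_node.items():
                    total = totals.setdefault(node_id, [0, 0])
                    total[0] += ok
                    total[1] += failed
        return {node_id: (t[0], t[1]) for node_id, t in totals.items()}


history = NodeMetricsHistory()
history.record(0, "node-a", True)
history.record(0, "node-b", False)
history.record(1, "node-a", True)
history.record(60, "node-a", False)  # day 0 now falls outside the 60-day window
print(history.query(0, 60))
```

A deque suits this access pattern because new data is only appended at the back and old data is only evicted from the front, both in constant time.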

Management Canister:

A new function will be introduced for fetching these metrics. This function will allow users to pull data from the replicated state across different subnets. Moreover, users will have the ability to specify the date range for the metrics they wish to query.

A draft specification for these changes is available for review here.

Additional Points

A future feature could adjust node rewards based on these metrics, so that misbehaving nodes are penalized. The change proposed here, however, will not itself alter node rewards.
Details on changes in the replicated state will be clarified during the implementation phase.
Open-source (CLI) tools will be developed to facilitate in-depth analysis of these metrics.

Fees for Metrics Retrieval

To prevent denial-of-service (DoS) attacks, a nominal fee will be charged for each metrics retrieval operation, which will be processed through the wallet canister. We would like to hear your thoughts on this matter.

Besides the CLI tools, follow-up work will expose these metrics on the Public Dashboard and on a canister, which will be open to the public for querying.

Data Storage Consideration

The addition of these metrics will require approximately 50 KB of extra storage per subnet, which we consider minimal.

Your Feedback is Welcome

We are particularly interested in your thoughts on the proposed architectural changes and the proposed charges for metrics retrieval.

Next Steps

Once we’ve gathered your feedback and received approval, we’ll start implementing these changes.

ckMood · November 22, 2023, 2:28am

Bookmarked and will come back to this after I’m done with homework lol but any tools to help empower users to understand metrics are appreciated

sat · January 22, 2024, 9:40am

I wrote and published docs for this feature:

and

There is even an example for a Jupyter notebook in which the metrics are analyzed:

github.com

dfinity/dre/blob/main/docs/trustworthy-metrics/TrustworthyMetricsAnalytics.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d2be3475-d313-4359-ad98-8ee595dd2057",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import pathlib\n",
    "import json\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dbbc6db6-faf4-4690-9a38-055a84c48ee9",
   "metadata": {


yvonneanne · February 7, 2024, 5:22pm

Hi everyone

Over the past few months we’ve been working on the implementation of the functionality outlined above. DFINITY teams from almost all layers (consensus, message routing, execution, DRE, security, SDK) have been involved in this project. We’re happy to announce that we’re very close to completing this effort.

The management canister now offers a node_metrics_history endpoint (see Interface Spec entry). In the DRE tooling linked to in the post by Sasha, you can find examples of how to use this new feature.
More CDK and agent support will be added in the next few weeks.
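For readers who want to analyze the endpoint's output, here is a small sketch of one possible post-processing step. It assumes the response has already been decoded into a list of snapshots, each with a timestamp and cumulative per-node counters. The field names used here (`timestamp_nanos`, `node_metrics`, `node_id`, `num_blocks_proposed_total`, `num_block_failures_total`) are assumptions and should be checked against the Interface Spec entry. The helper `failure_rates` and the sample data are invented for illustration.

```python
from typing import Dict, List


def failure_rates(history: List[dict]) -> Dict[str, float]:
    """Per-node failure rate between the earliest and latest snapshot.

    Assumes the counters are cumulative, so the difference between two
    snapshots gives the activity in between.
    """
    if len(history) < 2:
        return {}
    ordered = sorted(history, key=lambda snap: snap["timestamp_nanos"])
    first = {m["node_id"]: m for m in ordered[0]["node_metrics"]}
    zero = {"num_blocks_proposed_total": 0, "num_block_failures_total": 0}
    rates: Dict[str, float] = {}
    for m in ordered[-1]["node_metrics"]:
        base = first.get(m["node_id"], zero)
        proposed = m["num_blocks_proposed_total"] - base["num_blocks_proposed_total"]
        failed = m["num_block_failures_total"] - base["num_block_failures_total"]
        attempts = proposed + failed
        if attempts > 0:  # skip nodes that had no block maker turns in the interval
            rates[m["node_id"]] = failed / attempts
    return rates


# Invented sample data, not real measurements.
sample = [
    {"timestamp_nanos": 1, "node_metrics": [
        {"node_id": "node-a", "num_blocks_proposed_total": 100, "num_block_failures_total": 0},
        {"node_id": "node-b", "num_blocks_proposed_total": 50, "num_block_failures_total": 50},
    ]},
    {"timestamp_nanos": 2, "node_metrics": [
        {"node_id": "node-a", "num_blocks_proposed_total": 190, "num_block_failures_total": 10},
        {"node_id": "node-b", "num_blocks_proposed_total": 50, "num_block_failures_total": 100},
    ]},
]
rates = failure_rates(sample)
print(rates)
```

The DRE tooling and the notebook linked above remain the reference for how the metrics are actually fetched and analyzed.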

The functionality is described in more detail in the following blog post.

Should you have any questions or feedback (do you understand the metrics, does the history endpoint make sense, which other metrics could be interesting), let us know!
If there are many questions, we can also organize an interactive session or walkthrough.

Cheers
Yvonne-Anne
