[Example] Canister Monitoring with Prometheus

domwoe · June 3, 2022, 6:57am

Hey,

there were some threads recently about canisters that max out the available WASM memory and asset canisters that got “suddenly” deleted because they only served query calls which are (currently) not affected by the freezing_threshold.

We also have similar issues with the cycles faucet which is unavailable way too often. Some of the engineers at DFINITY pointed me towards how we monitor the canisters on the NNS, and I created a minimal example project that you might find useful.

The example project is a Motoko canister that exposes some metrics (balance, memory sizes) via an HTTP endpoint, and a Prometheus instance that regularly fetches those metrics. In addition, a few alerts are configured and there are many integrations for Prometheus to deliver those alerts.

The metrics endpoint can optionally be secured with an API key. In the future it would be useful to have a Prometheus plugin that can directly talk to the IC using an agent to allow cryptographic authentication.

Another useful avenue for future work would be to include a Grafana instance to create a nicer dashboard.

If you’re interested in monitoring canisters you should also have a look at CanisterGeek for on-chain monitoring.

Hope you find this useful and looking forward to your comments!

cyberowl · December 18, 2022, 12:31am

So for the Prometheus part, that is not hosted on the IC correct. I’m going to try this out as well. I plan only only hosting public metrics data. I will try to get grafana connected as well.

domwoe · December 19, 2022, 8:21am

Hey @cyberowl,

yes, this is a way to provide off-chain monitoring. Prometheus and Grafana run on a single server and periodically fetch metrics data from an HTTP endpoint of the canister.

Topic		Replies	Views
Get assets canister cycles balance Developers	9	1884	October 1, 2022
Cycle balance monitoring service? Programs & Applications	4	1056	January 16, 2022
Understanding Canisters Live on the IC Developers	2	59	January 3, 2025
[Discussion] Approaches for preventing canisters from hitting memory limits Developers Discussing	5	609	October 5, 2022
Canister-side usage metrics, good solutions? Developers Discussing , community-consideration	0	453	February 14, 2023

[Example] Canister Monitoring with Prometheus

Related topics