Canister Insights – Richer Metrics for Developers

We’re excited to share a proposal for Canister Insights, a new feature that gives developers and stakeholders deeper visibility into how cycles and resources are consumed by their canisters.

Motivation

Today, tracking cycle consumption is cumbersome. While developers can check the total cycle balance or memory usage via the existing canister_status endpoint, there is no clear visibility into where those cycles are being spent, nor an easy way to gain deeper insight into canister behaviour. This lack of transparency makes it challenging to optimize canister performance or to proactively manage resources.

By exposing detailed metrics, developers can turn raw data into actionable insights, enabling:

  • Real-time and historical monitoring: Track memory, compute usage, and cycle costs over time, identify spikes or unusual patterns, and assess overall canister capacity.

  • Smarter cycles management: Use detailed cycle metrics to predict when a canister will run low on cycles and trigger automated top-ups or alerts, preventing downtime.

  • Faster debugging and troubleshooting: Correlate system-level events with application logs to pinpoint performance bottlenecks or unexpected resource usage.

  • Detailed reporting and analytics: Generate reports showing exactly how cycles are spent, memory is used, and which methods or features are most costly – enabling data-driven decisions for optimization and budgeting.
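To make the cycles-management use case above concrete, here is a minimal Python sketch of a linear burn-rate projection. Everything in it is hypothetical: it assumes a client periodically samples some cycle-balance metric, and the function name, sampling interval, and alert threshold are invented for the example.

```python
from typing import Optional

def seconds_until_low(balance_then: int, balance_now: int,
                      interval_s: int, threshold: int) -> Optional[float]:
    """Linear projection from two balance samples taken `interval_s`
    seconds apart; returns None if cycles are not being consumed."""
    burned = balance_then - balance_now
    if burned <= 0:
        return None  # balance is stable or growing; no top-up needed
    rate = burned / interval_s            # cycles burned per second
    remaining = balance_now - threshold   # headroom before the alert level
    return max(remaining, 0) / rate

# Example: balance dropped from 5T to 4T cycles over one hour, with an
# alert threshold of 1T cycles -> roughly three hours of headroom left.
eta = seconds_until_low(5_000_000_000_000, 4_000_000_000_000,
                        3600, 1_000_000_000_000)
```

A real monitor would use more samples and smoothing, but even this two-sample version is enough to trigger an automated top-up before the threshold is hit.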

Proposed Solution

We propose a solution that balances immediate value with long-term scalability. The idea is to extend the range of metrics collected during execution. Over time, this will naturally evolve into:

  • A standardized core set of metrics available across all canisters.

  • Support for extensible custom metrics to capture per-method or feature-specific insights.

These metrics will be made available through a new management canister API: canister_metrics, enabling stakeholders to better understand their canister’s performance, usage patterns, and resource consumption.

Accessing the Metrics

The new API would follow this schema (note: naming, fields and overall structure are subject to refinement):

// Request arguments for retrieving canister metrics.
type canister_metrics_args = record {
    canister_id : principal;
};

// Response containing the named or custom metrics.
type canister_metrics_result = record {
    // ...

    // Named field metrics, e.g. cycles_received, cycles_sent, etc.
    canister_metrics : opt canister_metrics; // Defined in the section below.

    // Custom metrics,
    // e.g. per exported method or feature-specific metrics.
    // Additional custom groups can be added later.
    cycles_custom_metrics : opt custom_metrics; // Defined in the section below.
};

service ic : {
    // ... existing methods ...

    // Returns the canister's metrics.
    canister_metrics : (canister_metrics_args) -> (canister_metrics_result);
};
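To give a feel for how a client might consume the proposed result, here is a hedged Python sketch. Since the `canister_metrics` endpoint does not exist yet, the management-canister call is mocked as a local function returning a dict shaped like the draft Candid schema; the canister id and metric values are invented, and the main point is that both top-level fields are `opt` and may be absent.

```python
# Stand-in for the proposed management canister call; the real thing would
# go through an IC agent. Field names follow the draft schema above.
def fetch_canister_metrics(canister_id: str) -> dict:
    return {
        "canister_metrics": {
            "cycles_metrics": {"cycles_sent": 42, "cycles_received": 7},
        },
        "cycles_custom_metrics": None,  # `opt` fields may be absent
    }

result = fetch_canister_metrics("example-canister-id")  # id is a placeholder
core = result.get("canister_metrics") or {}
cycles = core.get("cycles_metrics") or {}
# Net cycle flow: positive means the canister received more than it sent.
net_cycles = cycles.get("cycles_received", 0) - cycles.get("cycles_sent", 0)
```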

Core Metrics

The proposed solution introduces replica support for capturing more detailed metrics. In the first phase, cycle consumption will be the primary focus. Later iterations may extend to other areas (e.g., instructions, memory, throughput).

The list below is meant to give stakeholders an idea of the kind of data that could be provided.

type cycles_metrics = record {
    memory_usage_cycles : nat;                          // cycles consumed for memory
    compute_allocation : nat;                           // cycles consumed for compute allocation
    ingress_induction : nat;                            // cycles for inducting ingress messages
    instruction_executed_cycles : nat;                  // cycles charged for instruction execution
    request_response_transmission_cycles_cost : nat;    // cycles used for message transmission
    cycles_sent : nat;
    cycles_received : nat;
    cycles_refunded : nat;
    cycles_reserved : nat;
    // ... etc.
};


// Core set of named, well-defined metrics.
type canister_metrics = record {
    cycles_metrics : opt cycles_metrics;
    // Future expansion: instructions, memory usage, throughput, etc.
};
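As a rough illustration of the reporting use case, the core cycle counters lend themselves to simple aggregation. The Python sketch below uses invented sample values for the draft `cycles_metrics` fields to compute a total and a per-category cost breakdown.

```python
# Invented sample values for the draft `cycles_metrics` fields.
cycles_metrics = {
    "memory_usage_cycles": 1_000,
    "compute_allocation": 500,
    "ingress_induction": 250,
    "instruction_executed_cycles": 2_000,
    "request_response_transmission_cycles_cost": 750,
}

total_cycles = sum(cycles_metrics.values())
# Per-category share of total consumption, for a simple cost report.
shares = {name: value / total_cycles for name, value in cycles_metrics.items()}
top_category = max(shares, key=shares.get)
```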

Custom Metrics

In addition to the well-defined metrics, developers may also want customizable metrics and statistics, such as per-method counters or feature-specific data where names are not predefined.

type canister_counter = record {
    name : text;
    value : nat;
    // Optional: additional metadata, such as category or method type.
};



// Custom metrics (e.g., per-method or feature-specific).
type custom_metrics = record {
    counters : vec canister_counter;
};
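A client-side sketch of consuming the custom counters might look like the following. The counter names (`method:...`, `feature:...`) are invented examples of the per-method and feature-specific grouping described above, not a prescribed naming scheme.

```python
# Invented counter names illustrating the per-method / feature grouping.
counters = [
    {"name": "method:transfer", "value": 1200},
    {"name": "method:balance_of", "value": 4800},
    {"name": "feature:http_outcalls", "value": 35},
]

# Sort descending by value to surface the hottest methods first.
ranked = sorted(counters, key=lambda c: c["value"], reverse=True)
most_called = ranked[0]["name"]
```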

What we are asking the community

Please let us know if the solution provided would work and bring value to your development experience on the Internet Computer, and if there are other types of metrics you’d find particularly useful.

We welcome any suggestions that you may have. Your input will help us refine the design and ensure that Canister Insights delivers the visibility and control developers truly need. Thank you for taking the time to share your thoughts with us!


Cool initiative :+1:

Might just be me but I’m a bit confused about the proposed definition of canister_metrics_result, particularly the field canister_metrics.

type canister_metrics_result = record {
    canister_metrics: opt canister_metrics; 
    cycles_custom_metrics: opt custom_metrics; 
};

The field name feels a bit redundant with its parent as I’m not entirely sure what it contains and how it differs from cycles_custom_metrics. Are these meant to be the “core metrics” you mentioned? If so, would it make more sense to call the field core_metrics instead?

type canister_metrics_result = record {
    core_metrics: opt core_metrics; 
    cycles_custom_metrics: opt custom_metrics; 
};

The enriched execution/cycle information is appealing, but if it's not optional it raises a small concern: some IC applications are heavily dependent on a single canister or a small number of canisters. Is there any delta from smoke/load testing to understand the concurrent impact on a validator's resources once this API carries even minimal load across all canisters? i.e., what does a validator actively collecting richer metrics for some [or all] queries/updates see compared to one with no usage of the metrics API?