I am using HTTP outcalls (Motoko) to send some emails directly from a canister. I noticed that even sending a very small email (for example, just a single line of text) increases the canister memory usage by approximately 200 KB.
Based on this observation, sending around 15–20 emails per day increases the canister memory by roughly 2 MiB.
Is this expected behavior, or could there be an issue in my implementation?
Some memory usage and growth is normal. Sending an email certainly creates objects that are placed on the heap. Motoko has a GC that periodically kicks in to remove unneeded data, but how it functions depends on which backend you use. For example, do you use enhanced orthogonal persistence or classical persistence?
In any case, the GC will not necessarily kick in immediately after your canister performs the HTTP outcalls, so data may linger for a while. There is also the question of what you do with the responses of the HTTP outcalls: do you save anything, etc.
I don’t think the increase should be any cause of concern for now, but couldn’t tell much more without seeing the code or learning more about what you exactly do.
Your canister heap grows initially to have some space to work with, and that won't shrink unless you upgrade the canister. Once you have sent the first emails it will likely stay the same for a while, so subsequent ones won't be taking 2 MiB per day. If you manage to test this with PocketIC you will see how it works.
Ok, thanks, I understand that. However, from my observation it appears that when performing an HTTP outcall, the canister memory increases with each request, and the allocated memory is never released afterwards.
I monitored this behavior over several days with roughly the same number of outcalls per day. Even when nothing is stored persistently, the reported memory size still increases linearly with every outcall.
You are allocating quite a few objects in that method you’re showing. My guess is that bodyBytes could for example be one of the largest ones, depending on what you send in that email.
Anyhow, all these objects remain on the heap until a full GC phase is done. With enhanced orthogonal persistence, GC runs depend on the load of the canister (because we piggy-back GC with user calls to make it scalable). If the load is relatively low (20 emails per day), it is also possible that the canister has not even gone through a full GC phase.
Keep an eye on it and report back if anything strange happens, but for now nothing seems out of the ordinary to me.
The memory size reported by canister settings is probably not what you want to monitor, since it reflects the memory the canister has requested from the system and will generally only grow over time.
This AI generated summary of the runtime stats available to Motoko might be more useful.
Motoko rts_ Functions
The rts_ (Runtime System) functions in Motoko provide introspection into the canister’s memory and performance statistics. They are accessible via the Prim pseudo-library:
```motoko
import Prim "mo:prim";

// Example usage
Prim.rts_memory_size()
```
Available rts_ Functions
| Function | Description |
| --- | --- |
| `rts_version` | Returns the runtime system version as `Text` |
| `rts_memory_size` | Total memory size in bytes |
| `rts_heap_size` | Current heap size in bytes |
| `rts_total_allocation` | Total amount of memory allocated over the canister's lifetime, in bytes |
| `rts_reclaimed` | Amount of memory reclaimed by the garbage collector, in bytes |
| `rts_max_live_size` | Maximum live (in-use) heap size observed, in bytes |
| `rts_mutator_instructions` | Approximate IC instruction cost of the last completed message due to computation/mutation |
| `rts_collector_instructions` | Approximate IC instruction cost of the last completed message due to garbage collection |
The hidden method __motoko_runtime_information() is only authorized to canister controllers and self-calls of the canister. [Motoko changelog]
Important Notes
rts_memory_size vs rts_heap_size: rts_memory_size reflects the total Wasm memory allocated (in pages), while rts_heap_size reflects only the live heap objects. For example, an array of 1,000,000 Nat8 values may show rts_heap_size ≈ 4,000,052 bytes because Motoko uses a uniform array representation to support generics. [forum post]
rts_mutator_instructions / rts_collector_instructions: These report stats for the last completed message, not the current one, making them useful for post-hoc profiling. [Motoko changelog]
The Prim module is not officially intended for end-user use and its API may change. Use with discretion. [Prim functions; forum]
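Put together, a canister could expose a few of these stats through a small query method. The following is just a sketch: the `stats` name and record shape are mine, and since Prim is unofficial its API may change between compiler versions.

```motoko
import Prim "mo:prim";

persistent actor StatsExample {
  // Sketch: surface a handful of runtime stats in a single query call.
  public query func stats() : async {
    memorySize : Nat;
    heapSize : Nat;
    maxLiveSize : Nat;
    reclaimed : Nat;
  } {
    {
      memorySize = Prim.rts_memory_size();   // total Wasm memory requested
      heapSize = Prim.rts_heap_size();       // live + garbage heap objects
      maxLiveSize = Prim.rts_max_live_size();// peak live data seen by GC
      reclaimed = Prim.rts_reclaimed();      // total memory reclaimed so far
    }
  };
};
```

Comparing `heapSize` against `memorySize` over time is what tells you whether the canister is actually leaking or has just pre-reserved memory.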
Currently, I need to monitor several Motoko and asset canisters, and I have not implemented a dedicated monitoring function within the canisters themselves, something that is not possible for asset canisters anyway.
Therefore, I rely on the Management Canister to retrieve the canister status.
```motoko
// IC management canister for querying canister status
transient let IC = actor "aaaaa-aa" : actor {
  canister_status : { canister_id : Principal } -> async {
    status : { #running; #stopping; #stopped };
    memory_size : Nat;
    cycles : Nat;
  };
};
```
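Calling it might then look like this (a sketch; note that `canister_status` can only be invoked by a controller of the target canister, and `checkMemory`/`monitoredId` are placeholder names):

```motoko
// Sketch: sample the reported memory size of a monitored canister.
// The caller must be a controller of `monitoredId` for this to succeed.
public func checkMemory(monitoredId : Principal) : async Nat {
  let status = await IC.canister_status({ canister_id = monitoredId });
  status.memory_size // total allocated Wasm memory; grows monotonically
};
```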
From my observation, the memory size appears to increase with every request and does not decrease afterward. However, I'm not entirely sure whether this is expected behavior or an issue on my side; it's simply what I've observed so far.
Unless you have a user code memory leak, I would expect the wasm memory_size to steadily increase until it reaches a stable plateau from which the GC can recycle memory without requiring further wasm memory.
A user-level memory leak might be a global collection that keeps growing (say, a log of every message), etc.
Ok, it turns out it’s pretty easy to call the hidden method by using a custom did file (info.did below) that declares the hidden query and supplying it to dfx canister call.
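For reference, such an info.did could look roughly like the sketch below. This is a hedged reconstruction: the field names follow the Motoko changelog, only a subset is listed, and the exact shape may differ between compiler versions. Candid's record subtyping means declaring only the fields you care about should still decode.

```candid
// info.did -- sketch of a Candid interface declaring the hidden query
service : {
  __motoko_runtime_information : () -> (record {
    compilerVersion : text;
    rtsVersion : text;
    garbageCollector : text;
    memorySize : nat;
    heapSize : nat;
    reclaimed : nat;
    maxLiveSize : nat;
  }) query;
}
```

It could then be supplied with something like `dfx canister call --candid info.did <canister-id> __motoko_runtime_information '()'`.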
Since version >= 0.15.0 you are using the default 64-bit enhanced orthogonal persistence with incremental GC. (Before 0.15.0 the default was 32-bit with the copying GC.)
The runtime system has so far requested memorySize bytes of main memory from ICP.
Of this heapSize bytes are currently occupied by Motoko data, both reachable and unreachable.
The maximum live (reachable) data detected in a past GC was maxLiveSize bytes.
I’m trying to understand why there is such a large gap between heapSize and memorySize. We use some indexes and stemming for text search. Is this expected behavior? What could be the underlying reasons?
The only plausible explanation I can think of is our text search feature. We store documentation directly in the canister, and over time this content grows, typically around the size of an A4 page per entry in some cases.
The text is also processed (stemming) and indexed in a separate canister.
However, the memory increase I observe is not continuous. It reaches a plateau and remains stable, for example, it has been steady for the last 16 days. Before that, since the beginning of the year, the canister was consistently around 370 MB, and then it suddenly jumped to the current plateau.
However, I currently don't have a clear strategy to determine whether this is actually an issue or how to investigate it further. Below is the stemming function, e.g.:
```motoko
public func getStemmedWords(stemConfig : Types.StemConfig) : async [Text] {
  // Step 1: Clean HTML content
  var cleanText = removeHtmlTags(stemConfig.message);
  // Replace special characters
  cleanText := removeSpecialCharacters(cleanText);

  // Step 2: Split into words and detect language for each word individually
  let words = Text.split(cleanText, #char ' ');
  let stemmedWords = List.empty<Text>();
  for (word in words) {
    let trimmed = Text.trim(word, #char ' ');
    if (Text.size(trimmed) > 0) {
      // Detect language for each individual word for better accuracy
      let wordLanguage = LanguageDetector.detectWordLanguage(trimmed);
      let stemmed = stemWord(trimmed, wordLanguage);
      if (stemmed != "") {
        List.add(stemmedWords, stemmed);
      };
    };
  };
  // Debug.print("Indexing stemmed words: " # debug_show(List.toArray(stemmedWords)));
  List.toArray(stemmedWords);
};
```
Do removeHtmlTags and removeSpecialCharacters remove one portion at a time, creating an intermediate Text each time, or decide which chars to remove from the original text and then create a single Text from that?
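The single-pass variant could look something like this (a sketch using mo:base; the allowed-character predicate is a placeholder, not the thread's actual removal logic):

```motoko
import Text "mo:base/Text";
import Char "mo:base/Char";
import Iter "mo:base/Iter";

// Sketch: decide per character whether to keep it, then build the result
// Text once, instead of allocating an intermediate Text per removal.
func removeSpecialCharactersSinglePass(t : Text) : Text {
  Text.fromIter(
    Iter.filter<Char>(
      t.chars(),
      // Placeholder predicate: keep letters, digits, and spaces
      func(c : Char) : Bool {
        Char.isAlphabetic(c) or Char.isDigit(c) or c == ' '
      }
    )
  )
};
```

The repeated-concatenation variant allocates a fresh Text on every removal, which is exactly the kind of short-lived garbage that sits on the heap until the next full GC phase.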
This function also looks stateless, in the sense that it doesn't change any global state. In that case, you could declare it as a query. Then memory growth will be temporary and discarded after the query returns its result; the GC wouldn't even run.
Bit of a hack but could work here (assuming actually stateless) in this very special case.
Be aware that the cycle limits for queries are lower so you might not have enough budget for your computations.
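As a sketch, that would amount to a one-keyword change to the declaration (assuming the body truly touches no global state):

```motoko
// Sketch: same signature, declared as a query. Any state changes or heap
// allocations made during a query are discarded when it returns.
public query func getStemmedWords(stemConfig : Types.StemConfig) : async [Text] {
  // ... body unchanged ...
};
```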