Unexpected Canister Memory Growth When Sending Emails via HTTP Outcalls

I am using HTTP outcalls (Motoko) to send some emails directly from a canister. I noticed that even sending a very small email (for example, just a single line of text) increases the canister memory usage by approximately 200 KB.

Based on this observation, sending around 15–20 emails per day increases the canister memory by roughly 3–4 MB.

Is this expected behavior, or could there be an issue in my implementation?

hello @rbole , thanks for your question.

Some memory usage and growth is normal. Sending an email certainly creates objects that are placed on the heap. Motoko has a GC that periodically kicks in to remove unneeded data, but how it functions depends on which backend you use. For example, do you use enhanced orthogonal persistence or classical persistence?

Anyhow, the GC will not necessarily kick in immediately after your canister makes the HTTP outcalls, so data may linger for a while. There is also the question of what you do with the responses of the HTTP outcalls: do you store anything, etc.?

I don’t think the increase is any cause for concern for now, but I couldn’t tell much more without seeing the code or learning more about what exactly you do.


Hey, this is the function I use.

  // Sends one email via an HTTP outcall to the mail API. `url`, `host`,
  // `api_key`, `transform` and `mailLogMap` are actor-level declarations
  // that are not shown in this excerpt.
  public func send(mailMsg : Types.MailMsg, caller : Principal) : async Types.Result2<Text, Text> {

    let balanceAtStart : Nat = Cycles.balance();

    // One key per request, used for idempotent delivery and for logging.
    let idempotency_key : Text = Principal.toText(caller) # "-" # Int.toText(Time.now());
    let bodyBytes : [Nat8] = Blob.toArray(Text.encodeUtf8(Helper.createRequestBodyJson(mailMsg)));
    let logMsgObject : Types.MailMsg = { mailMsg with mailMsg = "" }; // copy with the body blanked, presumably for logging (not used in this excerpt)
    let ic : Types.IC = actor ("aaaaa-aa");

    let (result, cyclesUsed) = try { // cyclesUsed is consumed elsewhere (not shown)
      let http_response : Types.HttpResponsePayload = await (with cycles = 230_850_258_000) ic.http_request({
        url = url;
        max_response_bytes = ?2048;
        headers = [
          { name = "Host"; value = host # ":443" },
          { name = "User-Agent"; value = "http_post_sample" },
          { name = "Content-Type"; value = "application/json" },
          { name = "Idempotency-Key"; value = idempotency_key },
          { name = "Authorization"; value = " " # api_key }
        ];
        body = ?bodyBytes;
        method = #post;
        transform = ?{ function = transform; context = Blob.fromArray([]) };
      });
      let balanceAtEnd : Nat = Cycles.balance();
      let used : Int = Int.max(0, Int.fromNat(balanceAtStart) - Int.fromNat(balanceAtEnd));
      let response_body : Blob = Blob.fromArray(http_response.body);
      let decoded_text : Text = switch (Text.decodeUtf8(response_body)) {
        case (null) { "No value returned" };
        case (?y) { y };
      };
      (decoded_text, used)
    } catch (e) {
      // Still account for the cycles spent when the outcall traps.
      let balanceAtEnd : Nat = Cycles.balance();
      let used : Int = Int.max(0, Int.fromNat(balanceAtStart) - Int.fromNat(balanceAtEnd));
      ("http outcall failed: " # Error.message(e), used)
    };

    Debug.print("send: adding log key=" # idempotency_key # " result=" # result # " mapSize before=" # debug_show (Map.size(mailLogMap)));

    if (result == "ok") {
      return #ok { key = idempotency_key; result = result };
    } else {
      return #err { key = idempotency_key; result = result };
    };
  };

I use these dfx.json parameters for the canister:

"type": "motoko",
"args": "--enhanced-orthogonal-persistence",
"optimize": "cycles",

dfx version: 0.29.2 and moc = 0.16.2

It is a POST HTTP outcall to a sendMail API. I have observed that the canister's memory grows by approximately 200 KB per request on average.

This means that if we send around 20 emails per day, the canister memory increases by roughly 4 MB per day.

I am monitoring the memory usage by running dfx canister status and checking the Memory Size value reported there.

Your canister's heap initially grows to give it some space to work with, and that won't shrink unless you upgrade the canister. Likely once you send the first emails it will stay the same for a while, so subsequent emails won't keep adding memory at that rate. If you manage to test this with PocketIC you will see how it works.

Ok, thanks, I understand that. However, from my observation it appears that when performing an HTTP outcall, the canister memory increases with each request and the allocated memory is never released afterwards.

I monitored this behavior over several days with roughly the same number of outcalls per day. Even when nothing is stored persistently, the reported memory size still increases linearly with every outcall.

That is the behavior I am currently observing.

hi @rbole, and thanks for sharing the code.

You are allocating quite a few objects in that method you’re showing. My guess is that bodyBytes could for example be one of the largest ones, depending on what you send in that email.

Anyhow, all these objects remain on the heap until a full GC phase is done. With enhanced orthogonal persistence, GC runs depend on the load of the canister (we piggy-back GC on user calls to make it scalable). If the load is relatively low (20 emails per day), it is also possible that the canister has not even gone through a full GC phase yet.

Keep an eye on it and report back if anything strange happens, but for now nothing seems out of the ordinary to me.

The memory size reported by canister settings is probably not what you want to monitor, since it reflects the memory the canister has requested from the system and will generally only grow over time.

This AI-generated summary of the runtime stats available to Motoko might be more useful.

Motoko rts_ Functions

The rts_ (Runtime System) functions in Motoko provide introspection into the canister’s memory and performance statistics. They are accessible via the Prim pseudo-library:

import Prim "mo:prim";

// Example usage
Prim.rts_memory_size()

Available rts_ Functions

  • rts_version: returns the runtime system version as Text
  • rts_memory_size: total memory size, in bytes
  • rts_heap_size: current heap size, in bytes
  • rts_total_allocation: total amount of memory allocated over the canister's lifetime, in bytes
  • rts_reclaimed: amount of memory reclaimed by the garbage collector, in bytes
  • rts_max_live_size: maximum live (in-use) heap size observed, in bytes
  • rts_mutator_instructions: approximate IC instruction cost of the last completed message due to computation/mutation
  • rts_collector_instructions: approximate IC instruction cost of the last completed message due to garbage collection

[Prim functions; forum]
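
As a hedged sketch, a canister can also expose these stats through its own endpoint (the function name and field selection below are illustrative, not from the thread):

import Prim "mo:prim";

actor {
  // Hypothetical monitoring endpoint built on the Prim rts_ functions.
  public query func memoryStats() : async {
    memorySize : Nat;
    heapSize : Nat;
    totalAllocation : Nat;
    reclaimed : Nat;
    maxLiveSize : Nat;
  } {
    {
      memorySize = Prim.rts_memory_size();
      heapSize = Prim.rts_heap_size();
      totalAllocation = Prim.rts_total_allocation();
      reclaimed = Prim.rts_reclaimed();
      maxLiveSize = Prim.rts_max_live_size();
    }
  };
};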

Extended Runtime Information

A more comprehensive privileged query function is also available:

__motoko_runtime_information : () -> {
    compilerVersion : Text;
    rtsVersion : Text;
    garbageCollector : Text;
    sanityChecks : Bool;
    memorySize : Nat;
    heapSize : Nat;
    totalAllocation : Nat;
    reclaimed : Nat;
    maxLiveSize : Nat;
    stableMemorySize : Nat;
    logicalStableMemorySize : Nat;
    maxStackSize : Nat;
    callbackTableCount : Nat;
    callbackTableSize : Nat;
}

This function (__motoko_runtime_information) is privileged: it can only be called by the canister's controllers or by the canister itself. [Motoko changelog]

Important Notes

  • rts_memory_size vs rts_heap_size: rts_memory_size reflects the total Wasm memory the canister has allocated from the system (reserved in 64 KiB pages, reported in bytes), while rts_heap_size reflects only the heap occupied by Motoko objects. For example, an array of 1,000,000 Nat8 values may show rts_heap_size ≈ 4,000,052 bytes because Motoko uses a uniform array representation to support generics. [forum post]
  • rts_mutator_instructions / rts_collector_instructions: These report stats for the last completed message, not the current one, making them useful for post-hoc profiling. [Motoko changelog]
  • The Prim module is not officially intended for end-user use and its API may change. Use with discretion. [Prim functions; forum]

Thanks for jumping in.

Currently, I need to monitor several Motoko and asset canisters, and I have not implemented a dedicated monitoring function within the canisters themselves (something that is not possible for asset canisters anyway).

Therefore, I rely on the Management Canister to retrieve the canister status.

// IC management canister for querying canister status
transient let IC = actor "aaaaa-aa" : actor {
  canister_status : { canister_id : Principal } -> async {
    status : { #running; #stopping; #stopped };
    memory_size : Nat;
    cycles : Nat;
  };
};
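
For completeness, a usage sketch (assuming the usual Principal/Debug imports; the principal below is a placeholder, and the caller must be a controller of the target canister for canister_status to succeed):

// Usage sketch: poll the status of one monitored canister.
let s = await IC.canister_status({
  canister_id = Principal.fromText("aaaaa-aa") // replace with the monitored canister's id
});
Debug.print("memory_size = " # debug_show (s.memory_size));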

From my observation, the memory size appears to increase with every request and does not decrease afterward. However, I'm not entirely sure whether this is expected behavior or an issue on my side; it's simply what I've observed so far.

You could use a similar pattern to access the (hidden) __motoko_runtime_information, either from a controller or from each canister itself, if that helps.
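
A minimal, untested sketch of the self-call variant (assuming mo:base's Principal; the record type is abbreviated here, relying on Candid subtyping to ignore the remaining fields):

import Principal "mo:base/Principal";

actor Self {
  public func myRuntimeInfo() : async { heapSize : Nat; memorySize : Nat; maxLiveSize : Nat } {
    // Self-calls are authorized to use the hidden query.
    let self = actor (Principal.toText(Principal.fromActor(Self))) : actor {
      __motoko_runtime_information : () -> async {
        heapSize : Nat;
        memorySize : Nat;
        maxLiveSize : Nat;
      };
    };
    await self.__motoko_runtime_information()
  };
};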

Unless you have a user code memory leak, I would expect the wasm memory_size to steadily increase until it reaches a stable plateau from which the GC can recycle memory without requiring further wasm memory.

A user-level memory leak might be a global collection that keeps growing (say, a log of every message), etc.
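
For illustration only, such a leak could look like this hypothetical pattern:

import Array "mo:base/Array";

actor {
  // A global log that is appended to on every call and never pruned:
  // the live heap grows without bound, so memory_size keeps climbing too.
  stable var messageLog : [Text] = [];

  public func record(entry : Text) : async () {
    messageLog := Array.append(messageLog, [entry]);
  };
};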

When memory_size reaches a stable plateau, what size are we talking about? Is there a typical or expected range?

Ok, it turns out it’s pretty easy to call the hidden method by using a custom did file (info.did below) that declares the hidden query and supplying it to dfx canister call.

 crusso@crusso-Virtual-Machine:~/sample$ cat info.did
 type RuntimeInformation =
  record {
    callbackTableCount: nat;
    callbackTableSize: nat;
    compilerVersion: text;
    garbageCollector: text;
    heapSize: nat;
    logicalStableMemorySize: nat;
    maxLiveSize: nat;
    maxStackSize: nat;
    memorySize: nat;
    reclaimed: nat;
    rtsVersion: text;
    sanityChecks: bool;
    stableMemorySize: nat;
    totalAllocation: nat;
  };
 service : {
   __motoko_runtime_information: () -> (RuntimeInformation) query;
 }
 crusso@crusso-Virtual-Machine:~/sample$ dfx canister call sample_backend __motoko_runtime_information --candid info.did
 (
   record {
     sanityChecks = false;
     heapSize = 5_281_024 : nat;
     maxLiveSize = 37_936 : nat;
     rtsVersion = "0.1";
     callbackTableSize = 0 : nat;
     maxStackSize = 4_194_304 : nat;
     compilerVersion = "1.0.0";
     totalAllocation = 5_281_024 : nat;
     callbackTableCount = 0 : nat;
     garbageCollector = "default";
     reclaimed = 0 : nat;
     logicalStableMemorySize = 0 : nat;
     stableMemorySize = 0 : nat;
     memorySize = 134_217_728 : nat;
   },
 )

If you are using the default copying collector, the memory size should be around twice the maximum heap size (tracked by max live size, I think).

How much heap you use depends on your app.


Yes, that works, but how can I interpret the results correctly?

sim-backend-motoko % ./scripts/motoko-runtime-memory.sh backend-prod
canister: backend-prod
heapSize: 174594216 bytes (166.51 MiB)
memorySize: 1811939328 bytes (1.69 GiB)

Full JSON:
{
  "callbackTableCount": "0",
  "callbackTableSize": "256",
  "compilerVersion": "0.16.2",
  "garbageCollector": "default",
  "heapSize": "174_594_216",
  "logicalStableMemorySize": "0",
  "maxLiveSize": "67_823_600",
  "maxStackSize": "4_194_304",
  "memorySize": "1_811_939_328",
  "reclaimed": "23_130_146_776",
  "rtsVersion": "0.1",
  "sanityChecks": false,
  "stableMemorySize": "0",
  "totalAllocation": "23_304_740_992"
}

I think this means:

Since moc >= 0.15.0 you are using the default 64-bit enhanced orthogonal persistence with the incremental GC. (Before 0.15.0 the default was 32-bit with the copying GC.)

The runtime system has so far requested memorySize bytes of main memory from ICP.

Of this, heapSize bytes are currently occupied by Motoko data, both reachable and unreachable.

The maximum live (reachable) data detected in a past GC was maxLiveSize bytes.

I’m trying to understand why there is such a large gap between heapSize and memorySize. We use some indexes and stemming for text search. Is this expected behavior? What could be the underlying reasons?

Allocating a lot of temporary data that ultimately becomes garbage could push up the memory usage.

Culprits could be things like the use of Array.append in a loop.

Hard to know without seeing the code really.
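
To make the Array.append point concrete, here is a sketch (using mo:base; the function names are illustrative) of the quadratic anti-pattern next to a linear alternative:

import Array "mo:base/Array";
import Buffer "mo:base/Buffer";

// Anti-pattern: Array.append copies both arrays, so appending n items in a
// loop allocates O(n^2) bytes of short-lived garbage for the GC to reclaim.
func collectSlow(items : [Text]) : [Text] {
  var out : [Text] = [];
  for (item in items.vals()) {
    out := Array.append(out, [item]); // full copy on every iteration
  };
  out
};

// Alternative: a Buffer grows in amortized constant time and is converted
// to an array once at the end.
func collectFast(items : [Text]) : [Text] {
  let buf = Buffer.Buffer<Text>(items.size());
  for (item in items.vals()) {
    buf.add(item);
  };
  Buffer.toArray(buf)
};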

The only plausible explanation I can think of is our text search feature. We store documentation directly in the canister, and over time this content grows, typically around the size of an A4 page per entry in some cases.

The text is also processed (stemming) and indexed in a separate canister.

However, the memory increase I observe is not continuous. It reaches a plateau and remains stable; for example, it has been steady for the last 16 days. Before that, since the beginning of the year, the canister was consistently around 370 MB, and then it suddenly jumped to the current plateau.

However, I currently don't have a clear strategy to determine whether this is actually an issue or how to investigate it further. Below is the stemming function, for example:

// Cleans, splits, language-detects and stems the words of a message.
// removeHtmlTags, removeSpecialCharacters, LanguageDetector and stemWord
// are defined elsewhere in the actor.
public func getStemmedWords(stemConfig : Types.StemConfig) : async [Text] {

  // Step 1: Clean HTML content
  var cleanText = removeHtmlTags(stemConfig.message);

  // Replace special characters
  cleanText := removeSpecialCharacters(cleanText);

  // Step 2: Split into words and detect language for each word individually
  let words = Text.split(cleanText, #char ' ');
  let stemmedWords = List.empty<Text>();

  for (word in words) {
    let trimmed = Text.trim(word, #char ' ');
    if (Text.size(trimmed) > 0) {
      // Detect language for each individual word for better accuracy
      let wordLanguage = LanguageDetector.detectWordLanguage(trimmed);
      let stemmed = stemWord(trimmed, wordLanguage);
      if (stemmed != "") {
        List.add(stemmedWords, stemmed);
      };
    };
  };

  //Debug.print("Indexing stemmed words: " # debug_show(List.toArray(stemmedWords)));
  List.toArray(stemmedWords);
};

Do removeHtmlTags and removeSpecialCharacters remove one portion at a time, creating an intermediate text each time, or do they decide which characters to remove from the original text and then create a single text from that?
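
For illustration, a single-pass variant of the latter approach could look like this (a hypothetical sketch; the thread doesn't show the real helpers, and the keep-predicate is an assumption):

import Text "mo:base/Text";
import Iter "mo:base/Iter";
import Char "mo:base/Char";

// Decide per character whether to keep it and build one result Text,
// instead of creating an intermediate Text for every removed portion.
func removeSpecialCharacters(t : Text) : Text {
  Text.fromIter(
    Iter.filter<Char>(t.chars(), func(c : Char) : Bool {
      Char.isAlphabetic(c) or Char.isDigit(c) or c == ' '
    })
  )
};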

Maybe ask an AI to analyse your code.

This function also looks stateless, in the sense that it doesn't change any global state. In that case, you could declare it as a query. Then memory growth will be temporary and discarded after the query returns its result; the GC wouldn't even run.

Bit of a hack, but it could work here (assuming it really is stateless) in this very special case.
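
A sketch of that change (the body stays as above; only the declaration gains the query keyword):

// Any heap the call allocates is discarded when the query returns,
// so it never accumulates in the canister's memory.
public query func getStemmedWords(stemConfig : Types.StemConfig) : async [Text] {
  // ... same body as the update version shown earlier
};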

Be aware that the cycle limits for queries are lower, so you might not have enough budget for your computations.