Yeah, I looked at the code for [Nat8] and found just one, and it's probably harmless.
I strongly suspect the aggressive heartbeat is a big culprit: it creates lots of call contexts, and by always opening new call contexts it also prevents upgrades and causes callbacks to be stored on the heap, as suggested above.
Could you just do nothing on most heartbeats, acting only on every nth one, and provide a method to stop the heartbeat before an upgrade (if there isn't one already)?
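Something along these lines, as a rough sketch (the names `tick`, `interval`, `heartbeatEnabled`, and `doPeriodicWork` are made up for illustration; in practice you'd also restrict the stop/start methods to controllers):

```motoko
import Debug "mo:base/Debug";

actor ThrottledHeartbeat {

  stable var tick : Nat = 0;
  stable var heartbeatEnabled : Bool = true;
  let interval : Nat = 10; // only do real work on every 10th beat

  // Call this before an upgrade so the heartbeat stops opening new
  // call contexts while the upgrade is in flight.
  public func stopHeartbeat() : async () {
    heartbeatEnabled := false;
  };

  public func startHeartbeat() : async () {
    heartbeatEnabled := true;
  };

  system func heartbeat() : async () {
    if (not heartbeatEnabled) { return };
    tick += 1;
    if (tick % interval != 0) { return }; // do nothing on most beats
    await doPeriodicWork();               // only every nth beat awaits real work
  };

  func doPeriodicWork() : async () {
    Debug.print("periodic work");
  };
};
```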
Some of the array appends could be replaced by a more efficient Array.map over a source array, avoiding the quadratic behaviour. Others can be made more efficient by using a Buffer.
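For example, a minimal sketch (doubling each element is just a stand-in for whatever per-element work the real code does):

```motoko
import Array "mo:base/Array";
import Buffer "mo:base/Buffer";

// Quadratic: Array.append copies the whole accumulator on every iteration
// (this is why base marks Array.append as deprecated).
func doubleAllSlow(xs : [Nat]) : [Nat] {
  var acc : [Nat] = [];
  for (x in xs.vals()) {
    acc := Array.append(acc, [x * 2]); // O(size of acc) copy each time
  };
  acc
};

// Linear: map directly over the source array.
func doubleAllWithMap(xs : [Nat]) : [Nat] {
  Array.map<Nat, Nat>(xs, func (x : Nat) : Nat { x * 2 })
};

// Amortised linear: accumulate in a Buffer, convert to an array once at the end.
func doubleAllWithBuffer(xs : [Nat]) : [Nat] {
  let buf = Buffer.Buffer<Nat>(xs.size());
  for (x in xs.vals()) {
    buf.add(x * 2);
  };
  buf.toArray() // Buffer.toArray(buf) in newer versions of base
};
```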
I believe TrieMaps are an almost drop-in replacement for HashMaps that scale better and don't have the rehashing cost of HashMaps. Others have reported big improvements after switching from HashMaps to TrieMaps.
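The swap is mostly mechanical; a small sketch, assuming a hypothetical map keyed by Text:

```motoko
import HashMap "mo:base/HashMap";
import TrieMap "mo:base/TrieMap";
import Text "mo:base/Text";

// Before: a HashMap rehashes every entry each time its backing array grows.
let byIdHash = HashMap.HashMap<Text, Nat>(16, Text.equal, Text.hash);

// After: TrieMap exposes the same get/put/delete/entries interface but never
// rehashes, so large maps avoid the periodic bursts of copying.
let byId = TrieMap.TrieMap<Text, Nat>(Text.equal, Text.hash);

byId.put("alice", 100);
assert (byId.get("alice") == ?100);
```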
dfx 0.10.0 shipped with Motoko 0.6.26, which doesn’t have the streaming implementation of stable variable serialization.
dfx 0.10.1 shipped with Motoko 0.6.28, which does have it (and fixes a bug in TrieMap that might affect you if you swap HashMap for TrieMap).
It might be worth upgrading to dfx 0.10.1 if possible.
Yes, that’s exactly what I had hoped to test out this evening. I will also follow your other suggestions.
Again, thank you @claudio, @domwoe, and @PaulLiu. You've all been very helpful and I've learned a lot. I will follow up with the results of my changes soon.
In the future, HashMap will avoid this rehashing (as Trie and TrieMap already do), but not in the very short term. Even when that happens, it will still have a worst-case linear insertion time (when the underlying array has to grow). So, depending on the use case, it may be better to avoid HashMap even in the long term.
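To make that growth cost concrete, here is a small sketch (the `users` map and the sizes are hypothetical): base's HashMap keeps a backing array that doubles once the entry count reaches its current size, and at that moment every existing entry is rehashed into the new array.

```motoko
import HashMap "mo:base/HashMap";
import Nat "mo:base/Nat";
import Text "mo:base/Text";

// Hypothetical map with a small initial capacity of 8.
let users = HashMap.HashMap<Text, Nat>(8, Text.equal, Text.hash);

var i : Nat = 0;
while (i < 100) {
  // Whenever the entry count passes the current table size (8, 16, 32, ...),
  // the table doubles and every existing entry is rehashed, so that one
  // put costs time linear in the map's current size.
  users.put("user-" # Nat.toText(i), i);
  i += 1;
};
```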
And finally, here is a chart that shows the cycle burn rate during the past week. I've tried to circle a few spots that align with periods when the heap memory showed steady linear growth rather than oscillating between 1.2GB and 1.8GB. During those steady periods the burn rate was almost negligible.
Edit: The chart below reflects the total cycles balance of our canister over a week’s time. The spikes in the balance are our attempts to keep the canister from running out of cycles.
The only correlation I've found is that after I go through the steps to deploy the canister, the heap memory goes back down to 1.2GB and then steadily increases for a period of time.
During that period the canister burns cycles at the expected rate of 0.5T per day; that is why the chart shows a flat line at those times.
Perhaps the chart is not useful. I was just trying to show that the canister does not burn 2T cycles/hour right after being deployed; it takes some time before it starts burning that fast.
The graph depicts the total cycles balance of the canister over the past week (6/15 - 6/21), not the rate of burn. Apologies for not making that clear. The spikes you see are our attempts to keep the canister alive by topping it up. I will update my post to clarify this.
The CanisterGeek method appears to collect the data every 5 minutes.
I did not link back to this forum conversation, but I had it in mind when I wrote that issue just now. Unfortunately, it's not the only example out in the wild. I think the naming proposal will help future devs.
But what do others think? Please comment, either here or (ideally) on GitHub.