We have an issue with a canister containing NFTs: it is eating through cycles like no tomorrow, about 2T an hour. We also noticed that even though we loaded around 1.2 GB of assets, the canister size is hitting 4 GB.
Canister is t2mog-myaaa-aaaal-aas7q-cai. What are we doing wrong?
Hi @domwoe I maintain the canister. The code is not public but I will make it public when I get back to my laptop.
We are using the same code as @bob11 but are seeing a much higher burn rate.
0.5T cycles/day for them vs. 2T cycles/hour for us (I originally wrote 1T/hour; 2T is correct).
I recently upgraded the canister to include some log/monitor methods, and that’s when we noticed it was reporting a memory size of 4 GB. It’s my understanding that canisters have a strict 4 GB memory limit, so I was thinking that might be contributing to the increased burn rate somehow.
As @jonit mentioned, we uploaded our assets (jpegs) directly to the canister. We aren’t using an asset canister at the moment. This collection of images only totaled 1.3GB in size but the canister is about 3x that size.
One last piece of context: after a deployment, our canister will eventually enter a state in which it can’t be upgraded. This usually happens right around the time heap memory hits 2 GB. Regular method calls still work (transfer, list, settle, etc.), but no upgrade and no ability to stop the canister (it hangs in a “stopping” state). We can only resolve this by raising the freezing threshold to the point the canister stops; then we lower it again and deploy/start the canister. This brings it back to a state where the heap is at 1 GB and upgrades succeed.
There is no non-linear increase in (storage) costs when you get close to the 4 GB limit.
I don’t know if there are Garbage Collector related costs that might be relevant here.
One thing I noticed:
You might have multiple xnet inter-canister calls (ledger, cap) in your heartbeat which could open a lot of call contexts, in particular if the callee needs time to respond because it is busy. I think (and I might be wrong) that these call contexts are also stored on your available heap.
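One common mitigation, sketched here as a pattern rather than your actual code (your canister is presumably Motoko, but the idea translates directly): gate the heartbeat so the expensive xnet calls fire only every N ticks, and skip a tick entirely while a previous call is still awaiting its callback, so call contexts don’t pile up. A minimal Python sketch of the pattern, with hypothetical names:

```python
class ThrottledHeartbeat:
    """Fire expensive inter-canister calls only every `interval` ticks,
    and never while a previous call is still awaiting its callback."""

    def __init__(self, interval: int):
        self.interval = interval
        self.tick_count = 0
        self.call_in_flight = False
        self.calls_made = 0

    def on_heartbeat(self) -> bool:
        """Return True if this tick actually issued the xnet call."""
        self.tick_count += 1
        if self.call_in_flight:
            return False  # don't stack up open call contexts
        if self.tick_count % self.interval != 0:
            return False  # throttle: skip most ticks
        self.call_in_flight = True
        self.calls_made += 1
        return True

    def on_callback(self) -> None:
        """Invoked when the ledger/cap call responds."""
        self.call_in_flight = False


hb = ThrottledHeartbeat(interval=10)
for _ in range(100):
    if hb.on_heartbeat():
        hb.on_callback()  # assume the callee responds promptly here
print(hb.calls_made)  # 10 calls instead of 100
```

This cuts both the cycle burn from the heartbeat itself and the number of simultaneously open call contexts.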
Do you hit the cycle limit in the preupgrade hook or do you get another error?
I must have misread this the first time; I thought you were saying there is a non-linear storage cost. That is unfortunate, I was hoping it would be something obvious like that.
Do you have any idea why the canister is so large when the assets on disk are a third of that size?
Thank you for pointing out the xnet inter-canister calls. I will keep this in mind as I continue troubleshooting the issue. One of the things I want to test out is reducing the frequency of these calls.
I don’t have the exact error on hand, but I believe it was something along the lines of “canister_pre_upgrade attempted with outstanding message callbacks (please stop the canister and attempt the upgrade again)”.
The “memory” as reported by the CanisterGeek interface is constant at 4 GB. The “heap memory” starts out at 1 GB after an upgrade, fluctuates between 1 GB and 2 GB for a bit, and then holds steady at 2 GB.
(quotes are intended to convey that I’m not familiar enough with the terms to understand the difference in this context)
Also, which version of dfx or Motoko are you using? We have optimized the memory requirements of upgrades. These used to take up to 2x space during an upgrade (stable variables were serialized into ordinary memory before being copied to stable memory). The new releases serialize directly to stable memory in a streaming fashion.
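On the earlier “memory” vs. “heap memory” question, my mental model (happy to be corrected): “memory” is the total Wasm memory the canister has ever grown to, allocated in 64 KiB pages and never shrunk back, so it’s a high-water mark; “heap memory” is the live Motoko data after garbage collection. That would also explain why upgrades start failing right around a 2 GB heap on the old, non-streaming serialization path. A back-of-envelope sketch (illustrative numbers only):

```python
GIB = 1024 ** 3
WASM_PAGE = 64 * 1024  # Wasm memory grows in 64 KiB pages and never shrinks

# "memory" = high-water mark of allocated pages; "heap" = live data after GC
reported_memory = 4 * GIB
reported_heap = 2 * GIB
print(reported_memory // WASM_PAGE)  # 65536 pages: the wasm32 maximum

# Pre-streaming upgrades serialized stable variables into a full
# in-memory copy first, needing roughly 2x the stable data's footprint.
old_upgrade_peak = 2 * reported_heap
print(old_upgrade_peak >= 4 * GIB)  # True: hits the wasm32 limit near a 2 GiB heap
```

So a 4 GB “memory” reading with a 2 GB heap need not mean 4 GB of live data; it may just mean the canister briefly needed that much at some point (e.g., during an earlier upgrade attempt).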
I am using the ic-py Python agent to call the ‘addAsset’ method. The ‘asset’ type defined in the canister expects a ‘blob’ entry, but the ic-py agent doesn’t provide a ‘blob’ type, so I had to structure it as ‘vec vec nat8’.
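For what it’s worth, in Candid ‘blob’ is just an alias for ‘vec nat8’, so encoding it that way on the wire should be compatible with a ‘blob’ parameter. What matters for heap size is the Motoko-side type the bytes land in: a Blob stores roughly one byte per byte, while a [Nat8] array stores each element in a word-sized slot (4 bytes on wasm32), plus per-array headers. That is only a hypothesis for the ~3x blow-up, and worth checking against your actual types. Rough numbers:

```python
GIB = 1024 ** 3
assets_on_disk = 1.3 * GIB  # total size of the uploaded jpegs

# Hypothetical heap footprints, depending on the Motoko representation:
blob_estimate = assets_on_disk * 1   # Blob: ~1 byte per byte
array_estimate = assets_on_disk * 4  # [Nat8]: ~1 word (4 bytes) per byte on wasm32

print(round(blob_estimate / GIB, 1))   # 1.3
print(round(array_estimate / GIB, 1))  # 5.2 -> would blow past the 4 GB limit
```

If your stored type really is Blob, this particular overhead shouldn’t apply, and the extra space would have to come from somewhere else (chunking overhead, GC headroom, or the upgrade copy discussed above).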