At Yral, we spend a significant amount on cycles to keep our entire fleet of canisters topped up. These canisters reside across all of the available general app subnets.
Over the last month, we’ve topped up our entire fleet with around 120K trillion cycles.
For cycle top-ups, we’ve implemented our own solution that monitors canister cycle balances and estimates the amounts needed to top up canisters that are running low.
Previously, we calculated the required cycles from the idle_cycles_burned_per_day metric plus an assumption about the maximum number of calls to our canisters. However, with the recent proposal to maintain reserved cycles according to subnet usage, we’re finding it more difficult to estimate the required cycles for our canisters.
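Concretely, our old heuristic boiled down to something like the sketch below (a simplified illustration with our own parameter names, not our production code):

```rust
/// Old-style top-up estimate (illustrative sketch): idle burn over the
/// desired runway plus an assumed cost for the expected call volume.
fn required_cycles(
    idle_cycles_burned_per_day: u128, // from canister_status
    runway_days: u128,                // how long one top-up should last
    max_calls: u128,                  // assumed maximum calls over the runway
    cycles_per_call: u128,            // assumed average cost per call
) -> u128 {
    idle_cycles_burned_per_day * runway_days + max_calls * cycles_per_call
}
```

This worked while idle burn and call volume were the only unknowns; the reservation mechanism adds a component these inputs don’t capture.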
We’re repeatedly running into a situation on subnets with high memory utilisation where the cycle reservation mechanism kicks in and causes calls to fail with an “Out of cycles” error message, even for canisters with balances of over 2-3T cycles.
Could we get more clarity on how the reserved cycles are calculated in relation to subnet usage? We need to be able to calculate this precisely so we can estimate how much to top up our canisters with.
I believe a simple fix would be to offset the balance of your canister by the reserved_cycles (which should also be returned by canister_status) to account for the amount that is currently reserved due to the storage reservation mechanism. In other words, your effective cycles balance should be reduced by the reserved amount; this is what you really have available.
This should allow you to estimate safely, taking into account the worst case of what is used up for reservations.
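In code, that adjustment might look like the sketch below (assuming ic-cdk management-canister bindings where CanisterStatusResponse exposes a reserved_cycles field; exact module paths vary by ic-cdk version):

```rust
use candid::{Nat, Principal};
use ic_cdk::api::management_canister::main::{canister_status, CanisterIdRecord};

/// Effective spendable balance: the raw cycles balance minus the cycles
/// currently locked by the storage reservation mechanism.
async fn effective_balance(canister_id: Principal) -> Nat {
    let (status,) = canister_status(CanisterIdRecord { canister_id })
        .await
        .expect("canister_status call failed");
    let balance = status.cycles.0;           // raw balance (BigUint)
    let reserved = status.reserved_cycles.0; // already reserved for storage
    // Floor at zero: what you can actually spend right now.
    Nat(if balance > reserved { balance - reserved } else { 0u8.into() })
}
```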
Out of curiosity, are you receiving an “out of cycles” error message or a “freezing threshold hit” error message?
@dsarlis A straight out of cycles error message suggests that maybe the error message needs improving/more clarity?
Now that many subnets are reaching a level of memory utilization where the storage reservation kicks in, I’d expect more developers to be hitting this issue, especially new developers when they deploy their first canister (UX hurdle).
Out of curiosity, for the guys at Yral who are using a significant portion of memory on many different subnets: how much of this memory utilization is from metadata stored in the canister, and how much of it is from duplicate wasms being stored?
Thanks, @dsarlis, for the quick response. I have a follow-up question: if I calculate cycles this way, it assumes that the canister’s reserved cycles are sufficient and that it won’t need to draw from the main balance to replenish them. If it does need to draw additional cycles, we risk dropping calls due to an insufficient balance.
I don’t think the message is just “Out of cycles”. We typically add some extra information, like “the canister was trying to do X, Y cycles were needed but only Z available, please top up”.
> Now that many subnets are reaching a level of memory utilization where the storage reservation kicks in, I’d expect more developers to be hitting this issue, especially new developers when they deploy their first canister (UX hurdle).
There’s one subnet in that range and a second one that is getting close. I don’t think the rest are within range yet. Let’s wait and see how things evolve; if indeed more users start facing UX issues, we can consider our options.
So, IIUC, you check once in a while and attempt to top up for a long-ish period ahead of time? In that case, you can instead use reserved_cycles_limit, which is the maximum that your canister can ever have reserved (this is a canister setting that is also readable from canister_status).
It’s a more conservative estimate than using reserved_cycles and checking more frequently, but it should be a correct estimate that ensures smooth canister operations.
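In code, that worst-case estimate might look roughly like this (a sketch assuming ic-cdk bindings where reserved_cycles_limit is readable from the settings returned by canister_status; the runway target is your own burn estimate, not something the API provides):

```rust
use candid::Principal;
use ic_cdk::api::management_canister::main::{canister_status, CanisterIdRecord};
use num_bigint::BigUint;

/// Worst-case top-up amount: assume the canister may eventually have
/// `reserved_cycles_limit` cycles locked away, so only
/// `cycles - reserved_cycles_limit` is spendable in the worst case.
async fn worst_case_topup(canister_id: Principal, target_runway: BigUint) -> BigUint {
    let (status,) = canister_status(CanisterIdRecord { canister_id })
        .await
        .expect("canister_status call failed");
    let balance = status.cycles.0;
    let limit = status.settings.reserved_cycles_limit.0; // canister setting
    let zero = BigUint::from(0u8);
    let spendable = if balance > limit { balance - limit } else { zero.clone() };
    // Top up whatever is missing to cover the desired runway.
    if target_runway > spendable { target_runway - spendable } else { zero }
}
```

Since reserved_cycles can never exceed reserved_cycles_limit, this bound holds regardless of how much the reservation mechanism locks up between checks.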