Frozen Canister

Hello,

I’m facing a strange issue with a canister that seems to have been frozen for 3 months.

Canister logs:

dfx canister logs o3ee2-yyaaa-aaaak-akp3a-cai --network=ic

[318344. 2024-07-05T09:38:52.009524994Z]: in canister_global_timer: SysTransient: Couldn't send message
[318345. 2024-07-05T09:38:52.757444998Z]: in canister_global_timer: SysTransient: Couldn't send message

Canister status:

dfx canister status o3ee2-yyaaa-aaaak-akp3a-cai --network ic

Canister status call result for o3ee2-yyaaa-aaaak-akp3a-cai.
Status: Running
Controllers: 2i2iz-yf3eb-afa5w-jxrzx-dhpwk-jebo5-3p5ea-4xw7i-axi7w-syfex-fqe drpqy-eiaaa-aaaak-afbpq-cai
Memory allocation: 0
Compute allocation: 0
Freezing threshold: 2_592_000
Memory Size: Nat(44014919)
Balance: 5_000_792_427_829 Cycles
Reserved: 0 Cycles
Reserved cycles limit: 5_000_000_000_000 Cycles
Wasm memory limit: 3_221_225_472 Bytes
Module hash: 0x19621bac434528737e2c44d1f27cb05b5081954da58e59520b211047e3e5d0b5
Number of queries: 290_420
Instructions spent in queries: 121_334_257_993
Total query request payload size (bytes): 173_076_423
Total query response payload size (bytes): 36_536_162
Log visibility: controllers

Based on the documentation, it appears that the SysTransient error is related to the canister being frozen. The freezing threshold is set at 30 days, but it’s unclear to me how the canister’s cycle consumption is calculated. As a result, I don’t know how many cycles are required to reactivate it. I have already provided 5T cycles, but the canister is still not active.

The error is not necessarily related to the canister being frozen. There could be other cases where you cannot send a message, e.g. the output queue of the canister is full.

I believe the canister has enough cycles not to be frozen (given the other settings I see in the output you shared). The easiest way to confirm that it’s not frozen is to check whether you can do any queries or updates (besides the timer). If these work, then the canister is definitely not frozen.
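To make that concrete, here is a rough back-of-the-envelope check. It assumes the documented storage fee of roughly 127_000 cycles per GiB per second on a 13-node subnet and a compute allocation of 0 (as in your status output); treat it as an estimate, not the exact system formula.

// Rough estimate of the freezing limit: the cycles the canister must hold
// to cover idle consumption for the whole freezing threshold.
// Assumes ~127_000 cycles per GiB-second of storage (13-node subnet) and
// zero compute/memory allocation, as in the status output above.
fn approx_freezing_limit(memory_bytes: u128, freezing_threshold_secs: u128) -> u128 {
    const GIB: u128 = 1 << 30;
    const STORAGE_FEE_PER_GIB_SECOND: u128 = 127_000;
    memory_bytes * STORAGE_FEE_PER_GIB_SECOND * freezing_threshold_secs / GIB
}

fn main() {
    // 44_014_919 bytes over a 2_592_000 s threshold comes out to roughly
    // 1.3e10 cycles (~0.013T), far below the ~5T balance, so the canister
    // should not be frozen.
    println!("{}", approx_freezing_limit(44_014_919, 2_592_000));
}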


I am able to call both query and update methods. The canister is designed to track the state of a contract on Solana via Solana RPC calls. It searches for new signatures every minute, and reading the data inside a transaction requires two additional calls, occurring every three minutes.
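For reference, this is roughly the shape of that polling loop; a minimal sketch using ic_cdk_timers with hypothetical names, not the actual galactic-bridge code.

use std::time::Duration;

// Minimal sketch of the polling described above (hypothetical names, not the
// actual galactic-bridge code): one timer looks for new signatures every
// minute, another reads transaction data every three minutes.
#[ic_cdk::init]
fn init() {
    ic_cdk_timers::set_timer_interval(Duration::from_secs(60), || {
        ic_cdk::spawn(async {
            // Solana RPC call via HTTPS outcall would go here.
        });
    });
    ic_cdk_timers::set_timer_interval(Duration::from_secs(180), || {
        ic_cdk::spawn(async {
            // Two additional outcalls to read the transaction data.
        });
    });
}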

Could you advise on how to debug this issue? How can I clear the output queue, and what steps can I take to prevent this issue from occurring again in the future?

The frequency of calls you’re making is not high enough to reach the canister queue limit (it’s 500 messages between a pair of canisters).

The canister is designed to track the state of a contract on Solana via Solana RPC calls.

This should be happening via HTTPS outcalls. These are more expensive than plain inter-canister calls because they require an extra cycles payment attached to each request. You can double-check whether the canister has enough cycles to support this (it may have enough for 1 or 2 concurrent outcalls but not for more).
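If it helps, this is roughly what that payment looks like; a minimal sketch assuming the ic-cdk 0.12 http_request signature that (iirc) takes the cycles to attach as a second argument, with made-up values, so check it against your version’s docs.

use ic_cdk::api::management_canister::http_request::{
    http_request, CanisterHttpRequestArgument, HttpMethod,
};

// Minimal sketch (iirc the ic-cdk 0.12 signature takes cycles explicitly):
// the call fails if the canister cannot attach enough cycles.
async fn call_solana_rpc(url: String, body: Vec<u8>) -> Result<Vec<u8>, String> {
    let arg = CanisterHttpRequestArgument {
        url,
        method: HttpMethod::POST,
        body: Some(body),
        max_response_bytes: Some(2_000),
        headers: vec![],
        transform: None,
    };
    // Hypothetical, deliberately generous amount; the actual fee formula is
    // quoted further down in this thread.
    let cycles: u128 = 100_000_000;
    match http_request(arg, cycles).await {
        Ok((response,)) => Ok(response.body),
        Err((code, msg)) => Err(format!("{code:?}: {msg}")),
    }
}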

Could you advise on how to debug this issue? How can I clear the output queue, and what steps can I take to prevent this issue from occurring again in the future?

I think you could add some more logs to help you debug, e.g. log the canister’s cycle balance right before making the call. Unfortunately, the exact reason (queue full or not) is not propagated back to you in detail, so I don’t think you’ll get anything better than Couldn't send message.
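Something like this would do it; a minimal sketch using ic_cdk::api::canister_balance128 and ic_cdk::api::print so the value ends up in the system log.

// Minimal sketch: record time and cycle balance right before the outcall so
// `dfx canister logs` shows whether the canister was short on cycles when
// the timer fired.
fn log_balance_before_outcall() {
    let balance = ic_cdk::api::canister_balance128();
    let now_ns = ic_cdk::api::time(); // nanoseconds since the Unix epoch
    ic_cdk::api::print(format!("[{now_ns}] balance before outcall: {balance} cycles"));
}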

You could also try to increase your cycles balance a bit more, say 2-3T extra and see if it changes anything.

I added 3T more cycles, but the canister is still inactive.

I’m using http_request from ic-cdk = "0.12.1"
Galactic-bridge-icp

Perhaps my cycle calculations are incorrect, though I recall reusing the same logic as ckBTC and overestimating all the other values:
Math_1 Math_2

One thing I find strange is that the most recent log is from July 5th. I have plenty of logs using ic_canister_log::log, but none of them are showing up in dfx canister logs.

I believe this computation is out of date. The most recent one is described in the following section in the developer docs. I quote the relevant part:

  • HTTPS outcalls: The cost for an HTTPS outcall is calculated using the formula (3_000_000 + 60_000 * n) * n for the base fee, 400 * n for each request byte, and 800 * n for each response byte, where n is the number of nodes in the subnet. These costs are included in the chart found below.

Btw, the canister is deployed on subnet k44fs. This is a 13-node subnet, so you should also make sure you use that node count in your calculations (I see in the code you shared that you’re using 34 nodes instead).
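Plugging the quoted formula into code with n = 13 gives something like the following; a small sketch for estimating the per-call fee, with made-up request/response sizes.

// Per-call HTTPS outcall fee following the formula quoted above,
// with n = 13 for subnet k44fs.
fn https_outcall_fee(n: u128, request_bytes: u128, max_response_bytes: u128) -> u128 {
    let base = (3_000_000 + 60_000 * n) * n;
    base + 400 * n * request_bytes + 800 * n * max_response_bytes
}

fn main() {
    // Example with made-up sizes: a 1 KiB request and max_response_bytes = 2 KiB
    // comes out to roughly 76M cycles per call on a 13-node subnet.
    println!("{}", https_outcall_fee(13, 1_024, 2_048));
}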

One thing I find strange is that the most recent log is from July 5th. I have plenty of logs using ic_canister_log::log, but none of them are showing up in dfx canister logs.

ic_canister_log is an application level library that maintains logs in the canister’s heap. I’m not very familiar with it so I don’t know how you can see those logs.

The logs that are visible with dfx canister logs are the ones stored by the system when the canister calls the ic0.debug_print System API. ic-cdk exposes this through the print function iirc. I suggest you add some of those and then you should be able to see them using dfx canister logs.
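For example, something along these lines should show up under dfx canister logs (a sketch; your existing ic_canister_log entries stay separate, living in the canister’s own heap).

// Sketch: ic0.debug_print via ic-cdk; lines printed this way appear in
// `dfx canister logs`, unlike ic_canister_log entries.
fn report(msg: &str) {
    ic_cdk::api::print(format!("timer: {msg}"));
}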