How I optimized cycle consumption by changing the state serialization strategy

Hello everyone! In my latest work on a transaction logger canister, which extracts bridge transactions from various minters (including the ckETH minter canister and a couple of other Appic minters that connect EVM chains to ICP), I faced challenges in optimizing cycle consumption. This logger canister calls the different minters at 1-minute intervals to scrape minter events, transforms the data, records it into stable memory, and prepares it for user queries.

Initially, the canister consumed 200 billion cycles daily, mainly due to the heavy cost of state serialization and deserialization when reading from and writing to stable memory. Here’s how I solved the issue and drastically cut cycle consumption.


The Problem

I was using ciborium for serialization and deserialization (I’ve seen many DFINITY canisters use CBOR for this purpose as well). While functional, it was slow and heavy on CPU and memory in both encoding and decoding, leading to high cycle consumption. Query responses for fetching transactions by address also took longer than desired.


The Solution

Switching to bincode for serialization and deserialization resulted in massive efficiency gains. Here’s a performance comparison:

---- state::tests::compare_bincode_and_ciborium stdout ----  
Bincode - Serialization: 9.23µs, Deserialization: 15.318µs  
Ciborium - Serialization: 10.127µs, Deserialization: 39.593µs  

As you can see, deserialization is almost 2.5x faster.

Results

  1. Cycle Consumption:

    • Dropped from 200 billion cycles/day to 40 billion cycles/day (an 80% reduction).

    • That said, this cut in cycle consumption wasn’t achieved by the state encoding and decoding change alone; other factors contributed as well (avoiding unnecessary clones, better algorithms for mapping data, etc.).

  2. Query Performance:

    • Transaction query responses are now 2x faster.

  3. State Size:

    • Stable memory usage is roughly the same as with ciborium; bincode takes slightly more space, which is not significant.

---- state::tests::compare_bincode_and_ciborium_size stdout ----
Bincode -  Size: 80 bytes
Ciborium - Size: 77 bytes

Why This Matters

Serialization and deserialization directly impact the efficiency of stable memory operations, especially in high-frequency canisters like mine. Optimizing these processes can drastically reduce costs and improve user experience.

If you’re building high-performance canisters, bincode might be worth considering for your serialization needs.

Drawbacks

Using bincode comes with its own hassles. For example, you have to define custom serializers and deserializers for types that serde doesn’t support out of the box. Also, bincode is native to Rust only, so if you need interoperability with other languages or tools, CBOR is a better choice.

I would love to know your thoughts on this.

10 Likes

bincode is native to Rust only, and if you need interoperability with other languages or tools it’s better to use CBOR.

It’s not only about interoperability with other languages but also between different canister versions. A quote from the official FAQ:

Bincode is suitable for storing data. Be aware that it does not implement any sort of data versioning scheme or file headers, as these features are outside the scope of this crate.

This means that handling versioning and backward compatibility is entirely on you now. CBOR doesn’t eliminate this concern but alleviates some of the pain and simplifies debugging.
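To make this concrete, here is a std-only sketch (with hypothetical helper names) of one way to take on that responsibility yourself: prefix every stable-memory record with a version byte and dispatch on it when decoding:

```rust
// Known schema versions of the stored record.
const V1: u8 = 1;
const V2: u8 = 2;

// Prepend a one-byte version header to an already-encoded payload.
fn encode_versioned(version: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + payload.len());
    out.push(version);
    out.extend_from_slice(payload);
    out
}

// Split the header off again; the caller dispatches to the right
// decoder (e.g. the V1 or V2 struct definition) based on the version.
fn decode_versioned(bytes: &[u8]) -> Result<(u8, &[u8]), String> {
    match bytes.split_first() {
        Some((&v, rest)) if v == V1 || v == V2 => Ok((v, rest)),
        Some((&v, _)) => Err(format!("unknown version {v}")),
        None => Err("empty record".to_string()),
    }
}

fn main() {
    let encoded = encode_versioned(V1, b"payload");
    let (version, payload) = decode_versioned(&encoded).unwrap();
    assert_eq!(version, V1);
    assert_eq!(payload, b"payload");
    assert!(decode_versioned(&[9]).is_err());
    println!("ok");
}
```

With CBOR, by contrast, unknown or missing fields in a self-describing map can often be skipped or defaulted, which is part of the pain relief mentioned above.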

As you can see, deserialization is almost 2.5x faster.

The comparison would be fairer if you measured the cycle consumption in a canister on a local replica. Runtime improvements in native code do not always (but do often) correlate with reducing cycle consumption.

this cut in cycle consumption wasn’t achieved by the state encoding and decoding change alone; other factors contributed as well (avoiding unnecessary clones, better algorithms for mapping data, etc.)

It would be nice to isolate the effect of each improvement on the canister cycle consumption. In my experience, eliminating unnecessary clones brings the most bang for the buck.

3 Likes

It’s not only about interoperability with other languages but also between different canister versions

Yes, that’s true, and for that reason I designed my canister’s endpoint types in a way that other canisters can still communicate with my canister.
For instance, this is the type that has a custom serializer and deserializer and is stored in the canister’s stable memory:

pub struct IcpToEvmTx {
    pub transaction_hash: Option<TransactionHash>,
    pub native_ledger_burn_index: LedgerBurnIndex,
    pub withdrawal_amount: Erc20TokenAmount,
    pub actual_received: Option<Erc20TokenAmount>,
    pub destination: Address,
    pub from: Principal,
    pub chain_id: ChainId,
    pub from_subaccount: Option<[u8; 32]>,
    pub time: u64,
    pub max_transaction_fee: Option<Erc20TokenAmount>,
    pub effective_gas_price: Option<Erc20TokenAmount>,
    pub gas_used: Option<Erc20TokenAmount>,
    pub total_gas_spent: Option<Erc20TokenAmount>,
    pub erc20_ledger_burn_index: Option<LedgerBurnIndex>,
    pub erc20_contract_address: Address,
    pub icrc_ledger_id: Option<Principal>,
    pub verified: bool,
    pub status: IcpToEvmStatus,
    pub operator: Operator,
}

However, this is the type that is returned from the canister’s endpoints:

pub struct CandidIcpToEvm {
    pub transaction_hash: Option<String>,
    pub native_ledger_burn_index: Nat,
    pub withdrawal_amount: Nat,
    pub actual_received: Option<Nat>,
    pub destination: String,
    pub from: Principal,
    pub from_subaccount: Option<[u8; 32]>,
    pub time: u64,
    pub max_transaction_fee: Option<Nat>,
    pub effective_gas_price: Option<Nat>,
    pub gas_used: Option<Nat>,
    pub total_gas_spent: Option<Nat>,
    pub erc20_ledger_burn_index: Option<Nat>,
    pub erc20_contract_address: String,
    pub icrc_ledger_id: Option<Principal>,
    pub verified: bool,
    pub status: IcpToEvmStatus,
    pub operator: Operator,
    pub chain_id: Nat,
}

The flow is as follows:

  1. User calls canister endpoints.
  2. The canister reads the state to get the data and maps it into a Candid-supported format.
  3. Returns the mapped data.
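A simplified, std-only sketch of step 2 (using u64 and a hex String as stand-ins for the real Erc20TokenAmount, Address, and candid::Nat types) could look like:

```rust
// Trimmed-down stand-in for the stable-memory type above.
#[derive(Debug, PartialEq)]
struct IcpToEvmTx {
    withdrawal_amount: u64, // real code: Erc20TokenAmount
    destination: [u8; 20],  // real code: Address
    verified: bool,
}

// Trimmed-down stand-in for the Candid endpoint type.
#[derive(Debug, PartialEq)]
struct CandidIcpToEvm {
    withdrawal_amount: u64, // real code: candid::Nat
    destination: String,    // real code: hex-encoded address string
    verified: bool,
}

impl From<&IcpToEvmTx> for CandidIcpToEvm {
    fn from(tx: &IcpToEvmTx) -> Self {
        // Convert the raw address bytes into a "0x…" hex string.
        let hex: String = tx.destination.iter().map(|b| format!("{b:02x}")).collect();
        CandidIcpToEvm {
            withdrawal_amount: tx.withdrawal_amount,
            destination: format!("0x{hex}"),
            verified: tx.verified,
        }
    }
}

fn main() {
    let tx = IcpToEvmTx {
        withdrawal_amount: 5,
        destination: [0xab; 20],
        verified: true,
    };
    let out = CandidIcpToEvm::from(&tx);
    assert_eq!(out.destination, format!("0x{}", "ab".repeat(20)));
    println!("{out:?}");
}
```

Keeping the conversion in a single From impl means the endpoint type can evolve independently of the bincode-encoded stable-memory layout.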

This means that handling versioning and backward compatibility is entirely on you now. CBOR doesn’t eliminate this concern but alleviates some of the pain and simplifies debugging.

Can you please elaborate on that?

It would be nice to isolate the effect of each improvement on the canister cycle consumption. In my experience, eliminating unnecessary clones brings the most bang for the buck.

Completely agreed. I actually measured the effect of each optimization, and in the graph above you can see two drastic drops in cycle consumption: the first from removing unnecessary clones, and the second from switching the state serialization format.

I can provide more accurate details on the comparison between Ciborium and Bincode if you’re interested.

1 Like