Stable memory access limit

A few days back I noticed the 221Bravo ICP indexing canister stopped updating new transactions. After a bit of digging I found that the canister is hitting the stable memory access limit (2 GiB) - this is a new one for me. I was aware of instruction limits but wasn’t aware that there is a stable memory access limit as well.

The canister is using IC Stable Memory (Rust) (ic_stable_memory - Rust) and holds just over 4 GB of data.

I’m not sure how IC Stable Memory works under the hood, but in terms of the instruction limit it is more efficient than IC Stable Structures. The canister has a couple of stable BTreeMaps and stable HashMaps which are updated with new transactions - I’m guessing these updates are what is causing the limit to be breached.
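
For readers unfamiliar with the setup, here is a minimal sketch of that kind of layout. Note that this uses the ic-stable-structures crate for illustration rather than ic_stable_memory (whose API differs), and the map name and value types are hypothetical:

```rust
use ic_stable_structures::memory_manager::{MemoryId, MemoryManager, VirtualMemory};
use ic_stable_structures::{DefaultMemoryImpl, StableBTreeMap};
use std::cell::RefCell;

type Memory = VirtualMemory<DefaultMemoryImpl>;

thread_local! {
    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
        RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));

    // Hypothetical transaction index: block number -> serialized transaction.
    static TX_INDEX: RefCell<StableBTreeMap<u64, Vec<u8>, Memory>> =
        MEMORY_MANAGER.with(|mm| {
            RefCell::new(StableBTreeMap::init(mm.borrow().get(MemoryId::new(0))))
        });
}

// Each insert walks root-to-leaf, so it touches O(log n) B-tree nodes;
// a batch of k inserts touches roughly k * depth nodes in stable memory.
fn store_batch(batch: Vec<(u64, Vec<u8>)>) {
    TX_INDEX.with(|map| {
        let mut map = map.borrow_mut();
        for (block, tx) in batch {
            map.insert(block, tx);
        }
    });
}
```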

I’ve tried reducing the batch size per call from 10_000 transactions to 1_000, but this didn’t make a difference. I’m kinda stuck in a tricky situation - if I redesign the canister to process fewer transactions per round, I could end up unable to keep up with the tip of the ICP ledger (especially as ICP grows).

I should mention that the 221B index canister calculates/holds more metrics on each account than the DFINITY SNS indexer - which might be a big part of the issue.

I’m thinking the solution might be to split some of the maps into their own canisters; however, this adds complexity when trying to query an account.
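
One way to keep account queries simple after a split (sketched below with hypothetical names) is to shard deterministically, so the caller can compute which canister owns an account without an extra lookup:

```rust
/// Hypothetical router: deterministically map an account to one of N shard
/// canisters. FNV-1a is used here because its output is fixed across builds,
/// so routing never changes between upgrades.
fn shard_for(account: &str, num_shards: u64) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
    for b in account.bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a prime
    }
    h % num_shards
}

fn main() {
    // A query for an account goes straight to the shard that owns it.
    println!("2vxsx-fae lives on shard {}", shard_for("2vxsx-fae", 4));
}
```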

Interested in people’s thoughts on this - Has anyone hit a similar issue with stable memory?

The limit on stable memory access is imposed by the IC, so no matter which library you use, you could still hit it (the same goes for instruction limits).

If you do access 2 GiB of data, I would reckon it has to do with your access patterns, and there’s probably little that can be done on the stable structures side. I’d recommend trying to redesign your dapp so that you don’t need to do so many memory accesses in a single message execution.

Unfortunately, without knowing the exact architecture of your dapp it’s hard to make any concrete recommendations.

Are you inserting into the HashMap for each transaction, and if so, what are the current size and capacity of the map? The docs say the HashMap will be reallocated once its size reaches 3/4 of its capacity. That reallocation performs work proportional to the current size of the map, so it could easily hit the stable memory access limit if the map is large.
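
To make that concrete with purely hypothetical figures (none of these numbers come from the post), a rehash copies every entry once, so the bytes touched scale with the whole map:

```rust
// Back-of-envelope: bytes touched by one worst-case hash map rehash.
fn main() {
    let entries: u64 = 20_000_000; // assumed number of entries
    let entry_bytes: u64 = 64;     // assumed serialized key+value size
    // Copying every entry reads the old slot and writes the new one:
    let touched = 2 * entries * entry_bytes;
    println!("~{} MiB accessed in one rehash", touched / (1024 * 1024));
    // ~2441 MiB (about 2.4 GiB) - a single rehash can exceed the 2 GiB
    // access limit on its own, regardless of batch size.
}
```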

It’s probably a better idea to just use BTrees, because their access pattern should always be proportional to the operations performed.

I think you’ve hit the nail on the head. There are two HashMaps that form a directory mapping ICRC accounts to a smaller internal account ID and back again (i.e. 2vxsx-fae.0000…00 to just account 0) - this is done to keep the transaction records smaller and avoid duplicating full ICRC addresses.
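
For illustration, the directory logic looks something like this (a storage-agnostic sketch with hypothetical names; on-chain the two maps would live in stable memory):

```rust
use std::collections::BTreeMap;

struct Directory {
    account_to_id: BTreeMap<String, u32>, // full ICRC account -> small ID
    id_to_account: BTreeMap<u32, String>, // small ID -> full ICRC account
    next_id: u32,
}

impl Directory {
    /// Return the existing internal ID for an account, or assign the next one.
    fn intern(&mut self, account: &str) -> u32 {
        if let Some(&id) = self.account_to_id.get(account) {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.account_to_id.insert(account.to_string(), id);
        self.id_to_account.insert(id, account.to_string());
        id
    }
}

fn main() {
    let mut dir = Directory {
        account_to_id: BTreeMap::new(),
        id_to_account: BTreeMap::new(),
        next_id: 0,
    };
    assert_eq!(dir.intern("2vxsx-fae"), 0);
    assert_eq!(dir.intern("2vxsx-fae"), 0); // repeat calls reuse the ID
}
```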

I know that BTreeMaps will shuffle things around to ‘balance’ themselves - from what you’re saying, they are far more efficient?

Would the balancing of a 400 GB map be likely to fall within the 2 GiB limit? (assuming a fairly reasonable individual record size)

It’s not that BTrees are more efficient; it’s that in the worst case an insertion performs O(log n) memory accesses, whereas the worst case for a hash map is O(n). Normally people don’t care much about the difference because the hash map costs get amortized to constant on average. But here you want to make sure each operation fits within the limits, so the amortization doesn’t help.
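
A rough comparison of the two worst cases, with all figures hypothetical:

```rust
// Worst-case bytes touched by a single operation, under assumed parameters.
fn main() {
    let n: f64 = 1e9;             // entries in the map (assumed)
    let fanout: f64 = 1024.0;     // B-tree node fanout (assumed)
    let node_bytes: f64 = 4096.0; // B-tree node size (assumed)
    let entry_bytes: f64 = 64.0;  // hash map entry size (assumed)

    // B-tree insert: walk root-to-leaf, touching one node per level.
    let depth = n.log(fanout).ceil(); // ~3 levels for these numbers
    println!("B-tree insert: ~{:.0} KiB", depth * node_bytes / 1024.0);

    // Hash map worst case (rehash): touch every entry in the table.
    println!("Hash map rehash: ~{:.1} GiB", n * entry_bytes / (1u64 << 30) as f64);
}
```

So even at a billion entries the B-tree insert touches kilobytes, while a single rehash of a comparable hash map touches the entire table.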

Some references: Hash table - Wikipedia and B-tree - Wikipedia

First up, math isn’t my strong point - I’ve been trying to work out how big a B-tree could get while staying under the 2 GiB access limit. Not sure if I’m correct: log(100) = 2, so does this mean that a BTreeMap over 100 GB could have issues with inserts hitting the 2 GiB access limit?

Or have I got the math way wrong? (If so, can you share the theoretical max map size that fits within the 2 GiB limit?)

It’s a bit hard to say exactly how much data is accessed. It’s probably better to do some benchmarking with canbench to get an idea of how much is actually used.
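
For reference, a canbench benchmark is just an annotated function in the canister crate; canbench runs it and reports the instructions used plus heap and stable memory growth. Everything below other than the #[bench] attribute is a hypothetical placeholder:

```rust
use canbench_rs::bench;

// Placeholder stand-ins for the canister's real types and indexing logic.
struct Tx;
fn make_dummy_batch(n: usize) -> Vec<Tx> {
    (0..n).map(|_| Tx).collect()
}
fn index_transactions(_batch: Vec<Tx>) {
    // ...the real insert-into-stable-maps work would happen here...
}

// Hypothetical benchmark: measure the cost of indexing one 1_000-tx batch.
#[bench]
fn index_one_batch() {
    let batch = make_dummy_batch(1_000);
    index_transactions(batch);
}
```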
