Has anyone performance tested stable memory vs. heap memory usage?

icme · September 5, 2022, 12:50am

Regarding this thread → Increased Canister Smart Contract Memory - #118 by icme

I’m curious if anyone has performance tested stable memory vs. heap memory storage?

Assuming the same canister settings and underlying data structure (i.e. Array, BTree, etc.) is used, I’d like to use data to build some ballpark intuition for the differences between stable & heap memory with respect to the following metrics:

Cycles (and cost)
Lookup/access time
Space/memory utilized

Additionally, ideally these tests would be performed on data sets from 50MB-2GB, to see what the performance comparison looks like as the size of the data increases.

I’m additionally curious about the performance of stable memory in the 2-8GB range.

@senior.joinu @lastmjs @ielashi @rckprtr Would love your input, and please feel free to tag others who have experience with the performance of stable memory.

lastmjs · September 6, 2022, 3:38pm

I haven’t done any benchmarking yet, sorry about that

C-B-Elite · September 7, 2022, 2:24am

Stable Memory Performance: ISP Experiment Report — Lauran

Heap memory vs Stable Memory（In Chinese）:

ielashi · September 7, 2022, 6:56pm

I unfortunately don’t have concrete numbers yet, but very soon we’ll be testing syncing the Bitcoin testnet state on a Wasm canister, so that should give us some insights into what it’s like to have a relatively large canister.

Regarding cycles/cost, this file has the costs of various instructions. AFAICT, the cost for stable_write for example has an overhead of 20 instructions + 100 instructions per byte written. (@berestovskyy did I read that correctly?)

I am not sure what you’re referring to by “space/memory utilized”. Are you referring to how stable/heap memory uses internally in the replicas?

icme · September 8, 2022, 1:32am

I’m referring to if you take the same x amount of MB to GB of data, and then you:

Use the standard BTreeMap in collections to store that data in the heap
Use either the IC stable BTreeMap or the ic-stable-memory BTreeMap to store that data in stable memory

What does the overall memory footprint taken up look like between the two (is the BTreeMap in collections used with the heap more compact, is the BTreeMap in stable memory more compact, or are they similar)? How do the heap and stable memory perform as you continue to insert the same data from say 50MB → 4GB?

It might also make sense to perform this same test on a simpler data structure from an implementation standpoint, such as an array or linked list to ensure that we’re not testing the data structure implementation, and are solely focusing on heap memory vs. stable memory.

The overall goal from this conversation is to gain intuitions for the tradeoffs that a developer will gain or lose from when they choose to use the canister heap or stable memory as storage.

berestovskyy · September 8, 2022, 2:06pm

morgen guys,
The latest system API benchmarks are over here:
https://github.com/dfinity/ic/blob/master/rs/execution_environment/benches/SYSTEM_API.md

The benchmarks are very synthetic, so they are really the best case scenario. But still, the numbers greatly depend on the memory block size we operate on, i.e.:

ic0_stable_write()/1B 297M IPS
ic0_stable_write()/8K 33.1G IPS

while the baseline performance is:

baseline/empty loop 7.64G IPS

So the actual performance of a data structure will greatly depend on the block size we read/write, and it varies from a few times slower to a few times faster for large blocks.

It would be really nice to have a set of benchmarks with some load patterns close to reality…

icme · September 9, 2022, 2:36am

I’m assuming IPS is referring to “instructions per second”, where an instruction is a web assembly instruction and is directly then related to cycles utilized.

What are the /1B and /8K suffixes to ic0_stable_write()?

From the benchmarks you provided above, looking at stable read for the same benchmarks:

update/ic0_stable_read()/1B 	85.1ms 	86.6ms 	+1%
update/ic0_stable_read()/8K 	259ms 	167ms 	-36%

How should I think about both the instruction costs and the speed (round time/new time) if I’m trying to find an element and iterating through a List O(n), looking up an element in a HashMap O(1), or descending through a balanced tree and making comparisons along the way O(log(n))?

Are they additive (for each read & comparison along the way), or is there just some initial overhead for instructions accessing stable memory and then all subsequent computations, say in a search with read/compare, these subsequent instructions don’t have that same overhead?

berestovskyy · September 12, 2022, 12:53pm

What are the /1B and /8K suffixes to ic0_stable_write() ?

Those are stable writes for a 1 byte and 8 kilobyte blocks respectively.

How should I think about both the instruction costs and the speed

I’m not sure, does speed beyond number of executed WASM instructions has any value you think?

Are they additive (for each read & comparison along the way)

There is no caching functionality built into the BTreeMaps, so they are pretty much additive.

ielashi · September 17, 2022, 12:24pm

We unfortunately don’t yet have benchmarks for stable structures (stable BTreeMap and others). For the past few weeks, our contributions to stable structures were around making them functional. Examples:

The MemoryManager, which simulates having multiple memories and is necessary to ergonomically use multiple stable structures in a canister.
The Log data structure, which is useful for storing append-only data.

I think we’ve now reached a point where stable structures have enough functionality to be useful, and within a few days we’ll move it into a separate repo, open it up for external contributions, and publish it as a rust crate on crates.io. Now I think would be the time we’ll start looking into its performance and do some benchmarking. Community contributions here would be more than welcome

Samer · December 20, 2023, 4:17pm

Any benchmarks in the meantime?

TusharGuptaMm · April 9, 2025, 5:49pm

Is this benchmarking in place as of today?

sea-snake · April 9, 2025, 6:49pm

There are benchmarks implemented here: stable-structures/benchmarks at main · dfinity/stable-structures · GitHub

Which seem to be used to check for changes as seen in this PR: test: Add a few new benchmarks that exercise the MemoryManager which is often how StableStructures are used by frankdavid · Pull Request #253 · dfinity/stable-structures · GitHub

TusharGuptaMm · April 9, 2025, 7:36pm

Thank you for sharing @sea-snake. This could be helpful as we plan to move some of the data structures from heap to stable memory to optimize canister memory usage. It would be interesting to check the read/write performance differences between heap and stable memory.

sea-snake · April 9, 2025, 8:54pm

Maybe you could run the benchmarks with the VectorMemory implementation, this would run in the heap and is normally used in tests.

Not entirely sure how representative it would be in comparison to e.g. a Rust structure, but maybe it might at least give some form of indication.

Topic		Replies	Views
Stable memory consumed even if not used? Developers	5	491	December 17, 2023
StableBTreeMap in Canisters Rust	17	1866	August 3, 2023
What happens when accessing a StableBTree over 4 GiB? Developers stable-memory	2	233	March 11, 2024
Inconsistent stable memory increases Developers rust	2	127	June 20, 2024
Doubt Regarding Stable memory Pages Rust	2	88	December 5, 2024

Has anyone performance tested stable memory vs. heap memory usage?

Related topics